https://towardsdatascience.com/learn-on-towards-data-science-52245bc91451
Practitioner’s Guide to Data Science
By Hui Lin, Ming Li
1st Edition
First Published 2023
eBook Published 24 May 2023
Based on industry experience, this book outlines real-world scenarios and discusses pitfalls that data science practitioners should avoid. It also covers the big data cloud platform and the art of data science, such as soft skills. The authors use R as the primary tool and provide code for both R and Python.
This book is for readers who want to explore possible career paths and eventually become data scientists. This book comprehensively introduces various data science fields, soft and programming skills in data science projects, and potential career paths. Traditional data-related practitioners such as statisticians, business analysts, and data analysts will find this book helpful in expanding their skills for future data science careers. Undergraduate and graduate students from analytics-related areas will find this book beneficial to learn real-world data science applications. Non-mathematical readers will appreciate the reproducibility of the companion R and Python code.
Key Features:
• It is hands-on. We provide the data and repeatable R and Python code in notebooks. Readers can repeat the analysis in the book using the data and code provided. We also suggest that readers modify the notebook to perform analyses with their data and problems, if possible. The best way to learn data science is to do it!
TABLE OF CONTENTS
Chapter 1|28 pages
Introduction
Chapter 2|18 pages
Soft Skills for Data Scientists
Chapter 3|8 pages
Introduction to the Data
Chapter 4|22 pages
Big Data Cloud Platform
Chapter 5|26 pages
Data Pre-processing
Chapter 6|22 pages
Data Wrangling
Chapter 7|26 pages
Model Tuning Strategy
Chapter 8|16 pages
Measuring Performance
Chapter 9|20 pages
Regression Models
Chapter 10|30 pages
Regularization Methods
Chapter 11|42 pages
Tree-Based Methods
Chapter 12|78 pages
Deep Learning
https://linhui.org/hui's_files/datascientist1#(20)
https://scholar.google.com/citations?user=PAArLQIAAAAJ&hl=en&oi=sra
https://scholar.google.com/citations?user=PAArLQIAAAAJ&hl=en
https://linhui.org/
https://github.com/happyrabbit
https://scientistcafe.com/
A Tour of Data Science: Learn R and Python in Parallel
Nailong Zhang
CRC Press, 11-Nov-2020 - Computers - 216 pages (C) 2021.
A Tour of Data Science: Learn R and Python in Parallel covers the fundamentals of data science, including programming, statistics, optimization, and machine learning, in a single short book. It does not cover everything but rather teaches the key concepts and topics in data science. It also covers two of the most popular programming languages used in data science, R and Python, in one source.
Key features:
Allows you to learn R and Python in parallel
Covers statistics, programming, optimization and predictive modelling, and the popular data manipulation tools data.table and pandas
Provides a concise and accessible presentation
Includes machine learning algorithms implemented from scratch: linear regression, lasso, ridge, logistic regression, gradient boosting trees, etc.
The book appeals to data scientists, statisticians, quantitative analysts, and others who want to learn programming with R and Python from a data science perspective.
A Hands-On Introduction to Data Science
Chirag Shah
Cambridge University Press, 02-Apr-2020 - Business & Economics - 424 pages
This book introduces the field of data science in a practical and accessible manner.
The foundational ideas and techniques of data science are provided, allowing students to easily develop a firm understanding of the subject. The material will have continual relevance even after tools and technologies change.
Using popular data science tools such as Python and R, the book offers many examples of real-life applications, with practice ranging from small to big data. A suite of online material for both instructors and students provides a strong supplement to the book, including datasets, chapter slides, solutions, sample exams and curriculum suggestions. This entry-level textbook is ideally suited to readers from a range of disciplines wishing to build a practical, working knowledge of data science.
https://books.google.co.in/books?id=rljPDwAAQBAJ
Data Science Job: How to become a Data Scientist
Przemek Chojecki, 31-Jan-2020 - Computers - 100 pages
Data scientist is one of the hottest jobs on the market right now. Demand for data science is huge and will only grow, and it seems like it will grow much faster than the actual number of data scientists. So if you want to make a career change and become a data scientist, now is the time.
This book will guide you through the process. From my experience of working with multiple companies as a project manager, a data science consultant or a CTO, I was able to see the process of hiring data scientists and building data science teams. I know what’s important to land your first job as a data scientist, what skills you should acquire, and what you should show during a job interview.
https://books.google.co.in/books?id=h0PZDwAAQBAJ
Foundations of Data Science
Avrim Blum, John Hopcroft, Ravindran Kannan
Cambridge University Press, 23-Jan-2020 - Computers - 432 pages
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks.
Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing.
Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.
https://books.google.co.in/books?id=koHCDwAAQBAJ
Data Science and Intelligent Applications: Proceedings of ICDSIA 2020
Ketan Kotecha, Vincenzo Piuri, Hetalkumar N. Shah, Rajan Patel
Springer Nature, 17-Jun-2020 - Technology & Engineering - 576 pages
This book includes selected papers from the International Conference on Data Science and Intelligent Applications (ICDSIA 2020), hosted by Gandhinagar Institute of Technology (GIT), Gujarat, India, on January 24–25, 2020. The proceedings present original and high-quality contributions on theory and practice concerning emerging technologies in the areas of data science and intelligent applications. The conference provides a forum for researchers from academia and industry to present and share their ideas, views and results, while also helping them approach the challenges of technological advancements from different viewpoints.
The contributions cover a broad range of topics, including: collective intelligence, intelligent systems, IoT, fuzzy systems, Bayesian networks, ant colony optimization, data privacy and security, data mining, data warehousing, big data analytics, cloud computing, natural language processing, swarm intelligence, speech processing, machine learning and deep learning, and intelligent applications and systems. Helping strengthen the links between academia and industry, the book offers a valuable resource for instructors, students, industry practitioners, engineers, managers, researchers, and scientists alike.
p.217 Human activity recognition
https://books.google.co.in/books?id=eSbsDwAAQBAJ
© 2020
Data Science and Productivity Analytics
Editors: Charles, Vincent, Aparicio, Juan, Zhu, Joe (Eds.)
Table of contents (15 chapters)
Data Envelopment Analysis and Big Data: Revisit with a Faster Method Pages 1-34
Khezrimotlagh, Dariush (et al.)
Data Envelopment Analysis (DEA): Algorithms, Computations, and Geometry Pages 35-56
Dulá, José H.
An Introduction to Data Science and Its Applications Pages 57-81
Rabasa, Alex (et al.)
Identification of Congestion in DEA Pages 83-119
Mehdiloo, Mahmood (et al.)
Data Envelopment Analysis and Non-parametric Analysis Pages 121-160
Villa, Gabriel (et al.)
The Measurement of Firms’ Efficiency Using Parametric Techniques Pages 161-199
Orea, Luis
Fair Target Setting for Intermediate Products in Two-Stage Systems with Data Envelopment Analysis Pages 201-226
An, Qingxian (et al.)
Fixed Cost and Resource Allocation Considering Technology Heterogeneity in Two-Stage Network Production Systems Pages 227-249
Ding, Tao (et al.)
Efficiency Assessment of Schools Operating in Heterogeneous Contexts: A Robust Nonparametric Analysis Using PISA 2015 Pages 251-277
Cordero, Jose Manuel (et al.)
A DEA Analysis in Latin American Ports: Measuring the Performance of Guayaquil Contecon Port Pages 279-309
Morales-Núñez, Emilio J. (et al.)
Effects of Locus of Control on Bank’s Policy—A Case Study of a Chinese State-Owned Bank Pages 311-335
Xu, Cong (et al.)
A Data Scientific Approach to Measure Hospital Productivity Pages 337-358
Daneshvar Rouyendegh (B. Erdebilli), Babak (et al.)
Environmental Application of Carbon Abatement Allocation by Data Envelopment Analysis Pages 359-389
Yu, Anyu (et al.)
Pension Funds and Mutual Funds Performance Measurement with a New DEA (MV-DEA) Model Allowing for Missing Variables Pages 391-413
Badrizadeh, Maryam (et al.)
Sharpe Portfolio Using a Cross-Efficiency Evaluation Pages 415-439
Landete, Mercedes (et al.)
https://www.springer.com/gp/book/9783030433833
Special Issue on Data Science for Better Productivity
Data science for better productivity
Vincent Charles, Juan Aparicio & Joe Zhu
Journal of the Operational Research Society
Volume 72, 2021 - Issue 5: Special Issue Data Science for Better Productivity
Afsharian, M. (2019). A frontier-based facility location problem with a centralised view of measuring the performance of the network. Journal of the Operational Research Society, 72(5), 1058–1074. https://doi.org/10.1080/01605682.2019.1639476
Bougnol, M.-L., & Dulà, J. (2020). Improving productivity using government data: The case of US Centers for Medicare & Medicaid's ‘Nursing Home Compare’. Journal of the Operational Research Society, 72(5), 1075–1086. https://doi.org/10.1080/01605682.2020.1724056
Del Vecchio, M., Kharlamov, A., Parry, G., & Pogrebna, G. (2020). Improving productivity in Hollywood with data science: Using emotional arcs of movies to drive product and service innovation in entertainment industries. Journal of the Operational Research Society, 72(5), 1110–1137. https://doi.org/10.1080/01605682.2019.1705194
Grimaldi, D., Fernandez, V., & Carrasco, C. (2019). Exploring data conditions to improve business performance. Journal of the Operational Research Society, 72(5), 1087–1098. https://doi.org/10.1080/01605682.2019.1590136
Ihrig, S., Ishizaka, A., Brech, C., & Fliedner, T. (2019). A new hybrid method for the fair assignment of productivity targets to indirect corporate processes. Journal of the Operational Research Society, 72(5), 989–1001. https://doi.org/10.1080/01605682.2019.1639477
Jiang, R., Yang, Y., Chen, Y., & Liang, L. (2019). Corporate diversification, firm productivity and resource allocation decisions: The data envelopment analysis approach. Journal of the Operational Research Society, 72(5), 1002–1014. https://doi.org/10.1080/01605682.2019.1568841
Li, Y., & Chen, W. (2019). Entropy method of constructing a combined model for improving loan default prediction: A case study in China. Journal of the Operational Research Society, 72(5), 1099–1109. https://doi.org/10.1080/01605682.2019.1702905
Lin, S.-W., Lu, W.-M., & Lin, F. (2020). Entrusting decisions to the public service pension fund: An integrated predictive model with additive network DEA approach. Journal of the Operational Research Society, 72(5), 1015–1032. https://doi.org/10.1080/01605682.2020.1718011
Routh, P., Roy, A., & Meyer, J. (2020). Estimating customer churn under competing risks. Journal of the Operational Research Society, 72(5), 1138–1155. https://doi.org/10.1080/01605682.2020.1776166
Shi, Y., Zhu, J., & Charles, V. (2020). Data science and productivity: A bibliometric review of data science applications and approaches in productivity evaluations. Journal of the Operational Research Society, 72(5), 975–988. https://doi.org/10.1080/01605682.2020.1860661
Summerfield, N. S., Deokar, A. V., Xu, M., & Zhu, W. (2020). Should drivers cooperate? Performance evaluation of cooperative navigation on simulated road networks using network DEA. Journal of the Operational Research Society, 72(5), 1042–1057. https://doi.org/10.1080/01605682.2019.1700766
Zhu, J. (2020). DEA under big data: Data enabled analytics and network data envelopment analysis. Annals of Operations Research, 1–23. In press. https://doi.org/10.1007/s10479-020-03668-8
Zhu, W., Liu, B., Lu, Z., & Yu, Y. (2020). A DEALG methodology for prediction of effective customers of internet financial loan products. Journal of the Operational Research Society, 72(5), 1033–1041. https://doi.org/10.1080/01605682.2019.1700188
https://www.tandfonline.com/doi/full/10.1080/01605682.2021.1892466
Ud. 16.11.2023, 3.45 am Austin, Texas
Pub. 16.7.2021
Read chapter 1 from Chirag Shah's Book.