Wednesday, November 15, 2023

Data Science - Books - Bibliography - Introduction

 

https://towardsdatascience.com/learn-on-towards-data-science-52245bc91451


Practitioner’s Guide to Data Science

By Hui Lin, Ming Li

1st Edition

First Published 2023

eBook Published 24 May 2023




Based on industry experience, this book outlines real-world scenarios and discusses pitfalls that data science practitioners should avoid. It also covers the big data cloud platform and the art of data science, such as soft skills. The authors use R as the primary tool and provide code for both R and Python. 

This book is for readers who want to explore possible career paths and eventually become data scientists. This book comprehensively introduces various data science fields, soft and programming skills in data science projects, and potential career paths. Traditional data-related practitioners such as statisticians, business analysts, and data analysts will find this book helpful in expanding their skills for future data science careers. Undergraduate and graduate students from analytics-related areas will find this book beneficial to learn real-world data science applications. Non-mathematical readers will appreciate the reproducibility of the companion R and Python code.


Key Features:

• It is hands-on. We provide the data and repeatable R and Python code in notebooks. Readers can repeat the analysis in the book using the data and code provided. We also suggest that readers modify the notebook to perform analyses with their data and problems, if possible. The best way to learn data science is to do it!



TABLE OF CONTENTS

Chapter 1: Introduction (28 pages)

Chapter 2: Soft Skills for Data Scientists (18 pages)

Chapter 3: Introduction to the Data (8 pages)

Chapter 4: Big Data Cloud Platform (22 pages)

Chapter 5: Data Pre-processing (26 pages)

Chapter 6: Data Wrangling (22 pages)

Chapter 7: Model Tuning Strategy (26 pages)

Chapter 8: Measuring Performance (16 pages)

Chapter 9: Regression Models (20 pages)

Chapter 10: Regularization Methods (30 pages)

Chapter 11: Tree-Based Methods (42 pages)

Chapter 12: Deep Learning (78 pages)


https://linhui.org/hui's_files/datascientist1#(20)

https://scholar.google.com/citations?user=PAArLQIAAAAJ&hl=en

https://linhui.org/

https://github.com/happyrabbit

https://scientistcafe.com/

A Tour of Data Science: Learn R and Python in Parallel

Nailong Zhang

CRC Press, 11-Nov-2020 - Computers - 216 pages (C) 2021.

A Tour of Data Science: Learn R and Python in Parallel covers the fundamentals of data science, including programming, statistics, optimization, and machine learning in a single short book. It does not cover everything, but rather, teaches the key concepts and topics in Data Science. It also covers two of the most popular programming languages used in Data Science, R and Python, in one source.

Key features:

Allows you to learn R and Python in parallel

Covers statistics, programming, optimization and predictive modelling, and the popular data manipulation tools data.table and pandas

Provides a concise and accessible presentation

Includes machine learning algorithms implemented from scratch: linear regression, lasso, ridge, logistic regression, gradient boosting trees, etc.

The book will appeal to data scientists, statisticians, quantitative analysts, and others who want to learn programming with R and Python from a data science perspective.

A Hands-On Introduction to Data Science

Chirag Shah

Cambridge University Press, 02-Apr-2020 - Business & Economics - 424 pages


This book introduces the field of data science in a practical and accessible manner.

The foundational ideas and techniques of data science are provided, allowing students to easily develop a firm understanding of the subject. The material will have continuing relevance even after tools and technologies change.

Using popular data science tools such as Python and R, the book offers many examples of real-life applications, with practice ranging from small to big data. A suite of online material for both instructors and students provides a strong supplement to the book, including datasets, chapter slides, solutions, sample exams and curriculum suggestions. This entry-level textbook is ideally suited to readers from a range of disciplines wishing to build a practical, working knowledge of data science.

https://books.google.co.in/books?id=rljPDwAAQBAJ

Data Science Job: How to become a Data Scientist

Przemek Chojecki, 31-Jan-2020 - Computers - 100 pages

Data Scientist is one of the hottest jobs on the market right now. Demand for data science is huge and will only grow, and it seems like it will grow much faster than the actual number of data scientists. So if you want to make a career change and become a data scientist, now is the time.

This book will guide you through the process. From my experience of working with multiple companies as a project manager, a data science consultant or a CTO, I was able to see the process of hiring data scientists and building data science teams. I know what’s important to land your first job as a data scientist, what skills you should acquire, and what you should show during a job interview.

https://books.google.co.in/books?id=h0PZDwAAQBAJ


Foundations of Data Science

Avrim Blum, John Hopcroft, Ravindran Kannan

Cambridge University Press, 23-Jan-2020 - Computers - 432 pages

This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. 

Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. 

Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.

https://books.google.co.in/books?id=koHCDwAAQBAJ


Data Science and Intelligent Applications: Proceedings of ICDSIA 2020

Ketan Kotecha, Vincenzo Piuri, Hetalkumar N. Shah, Rajan Patel

Springer Nature, 17-Jun-2020 - Technology & Engineering - 576 pages

This book includes selected papers from the International Conference on Data Science and Intelligent Applications (ICDSIA 2020), hosted by Gandhinagar Institute of Technology (GIT), Gujarat, India, on January 24–25, 2020. The proceedings present original and high-quality contributions on theory and practice concerning emerging technologies in the areas of data science and intelligent applications. The conference provides a forum for researchers from academia and industry to present and share their ideas, views and results, while also helping them approach the challenges of technological advancements from different viewpoints.


The contributions cover a broad range of topics, including: collective intelligence, intelligent systems, IoT, fuzzy systems, Bayesian networks, ant colony optimization, data privacy and security, data mining, data warehousing, big data analytics, cloud computing, natural language processing, swarm intelligence, speech processing, machine learning and deep learning, and intelligent applications and systems. Helping strengthen the links between academia and industry, the book offers a valuable resource for instructors, students, industry practitioners, engineers, managers, researchers, and scientists alike.

p.217 Human activity recognition

https://books.google.co.in/books?id=eSbsDwAAQBAJ


© 2020

Data Science and Productivity Analytics

Editors: Vincent Charles, Juan Aparicio, Joe Zhu


Table of contents (15 chapters)

Data Envelopment Analysis and Big Data: Revisit with a Faster Method Pages 1-34

Khezrimotlagh, Dariush (et al.)

Data Envelopment Analysis (DEA): Algorithms, Computations, and Geometry Pages 35-56

Dulá, José H.

An Introduction to Data Science and Its Applications Pages 57-81

Rabasa, Alex (et al.)

Identification of Congestion in DEA Pages 83-119

Mehdiloo, Mahmood (et al.)

Data Envelopment Analysis and Non-parametric Analysis Pages 121-160

Villa, Gabriel (et al.)

The Measurement of Firms’ Efficiency Using Parametric Techniques Pages 161-199

Orea, Luis

Fair Target Setting for Intermediate Products in Two-Stage Systems with Data Envelopment Analysis Pages 201-226

An, Qingxian (et al.)

Fixed Cost and Resource Allocation Considering Technology Heterogeneity in Two-Stage Network Production Systems Pages 227-249

Ding, Tao (et al.)

Efficiency Assessment of Schools Operating in Heterogeneous Contexts: A Robust Nonparametric Analysis Using PISA 2015 Pages 251-277

Cordero, Jose Manuel (et al.)

A DEA Analysis in Latin American Ports: Measuring the Performance of Guayaquil Contecon Port Pages 279-309

Morales-Núñez, Emilio J. (et al.)

Effects of Locus of Control on Bank’s Policy—A Case Study of a Chinese State-Owned Bank Pages 311-335

Xu, Cong (et al.)

A Data Scientific Approach to Measure Hospital Productivity Pages 337-358

Daneshvar Rouyendegh (B. Erdebilli), Babak (et al.)

Environmental Application of Carbon Abatement Allocation by Data Envelopment Analysis Pages 359-389

Yu, Anyu (et al.)

Pension Funds and Mutual Funds Performance Measurement with a New DEA (MV-DEA) Model Allowing for Missing Variables Pages 391-413

Badrizadeh, Maryam (et al.)

Sharpe Portfolio Using a Cross-Efficiency Evaluation Pages 415-439

Landete, Mercedes (et al.)

https://www.springer.com/gp/book/9783030433833



Special Issue on Data Science for Better Productivity

Data science for better productivity

Vincent Charles, Juan Aparicio & Joe Zhu

Journal of the Operational Research Society 

Volume 72, 2021 - Issue 5: Special Issue Data Science for Better Productivity



Afsharian, M. (2019). A frontier-based facility location problem with a centralised view of measuring the performance of the network. Journal of the Operational Research Society, 72(5), 1058–1074. https://doi.org/10.1080/01605682.2019.1639476   

Bougnol, M.-L., & Dulá, J. (2020). Improving productivity using government data: The case of US Centers for Medicare & Medicaid's ‘Nursing Home Compare’. Journal of the Operational Research Society, 72(5), 1075–1086. https://doi.org/10.1080/01605682.2020.1724056

Del Vecchio, M., Kharlamov, A., Parry, G., & Pogrebna, G. (2020). Improving productivity in Hollywood with data science: Using emotional arcs of movies to drive product and service innovation in entertainment industries. Journal of the Operational Research Society, 72(5), 1110–1137. https://doi.org/10.1080/01605682.2019.1705194   

Grimaldi, D., Fernandez, V., & Carrasco, C. (2019). Exploring data conditions to improve business performance. Journal of the Operational Research Society, 72(5), 1087–1098. https://doi.org/10.1080/01605682.2019.1590136   

Ihrig, S., Ishizaka, A., Brech, C., & Fliedner, T. (2019). A new hybrid method for the fair assignment of productivity targets to indirect corporate processes. Journal of the Operational Research Society, 72(5), 989–1001. https://doi.org/10.1080/01605682.2019.1639477   

Jiang, R., Yang, Y., Chen, Y., & Liang, L. (2019). Corporate diversification, firm productivity and resource allocation decisions: The data envelopment analysis approach. Journal of the Operational Research Society, 72(5), 1002–1014. https://doi.org/10.1080/01605682.2019.1568841   

Li, Y., & Chen, W. (2019). Entropy method of constructing a combined model for improving loan default prediction: A case study in China. Journal of the Operational Research Society, 72(5), 1099–1109. https://doi.org/10.1080/01605682.2019.1702905   

Lin, S.-W., Lu, W.-M., & Lin, F. (2020). Entrusting decisions to the public service pension fund: An integrated predictive model with additive network DEA approach. Journal of the Operational Research Society, 72(5), 1015–1032. https://doi.org/10.1080/01605682.2020.1718011   

Routh, P., Roy, A., & Meyer, J. (2020). Estimating customer churn under competing risks. Journal of the Operational Research Society, 72(5), 1138–1155. https://doi.org/10.1080/01605682.2020.1776166   

Shi, Y., Zhu, J., & Charles, V. (2020). Data science and productivity: A bibliometric review of data science applications and approaches in productivity evaluations. Journal of the Operational Research Society, 72(5), 975–988. https://doi.org/10.1080/01605682.2020.1860661   

Summerfield, N. S., Deokar, A. V., Xu, M., & Zhu, W. (2020). Should drivers cooperate? Performance evaluation of cooperative navigation on simulated road networks using network DEA. Journal of the Operational Research Society, 72(5), 1042–1057. https://doi.org/10.1080/01605682.2019.1700766   

Zhu, J. (2020). DEA under big data: Data enabled analytics and network data envelopment analysis. Annals of Operations Research, 1–23. In press. https://doi.org/10.1007/s10479-020-03668-8 

Zhu, W., Liu, B., Lu, Z., & Yu, Y. (2020). A DEALG methodology for prediction of effective customers of internet financial loan products. Journal of the Operational Research Society, 72(5), 1033–1041. https://doi.org/10.1080/01605682.2019.1700188

https://www.tandfonline.com/doi/full/10.1080/01605682.2021.1892466



Ud. 16.11.2023, 3.45 am Austin, Texas

Pub. 16.7.2021













