Monday, July 26, 2021

Data Science Research - Journals, Papers and Areas

 Data Science - Areas of Research

Top ten areas - MIT Press - 2020

https://hdsr.mitpress.mit.edu/pub/d9j96ne4/release/2

2020

https://www.analyticsinsight.net/top-10-research-challenge-areas-pursue-data-science/


Data Science - Journals

Journal of Management Analytics


Focuses on the theory of data analytics and its application in traditional business disciplines, such as accounting, finance, and supply chain management.



2021

Data Science Methodologies: Current Challenges and Future Approaches

Iñigo Martinez, Elisabeth Viles, Igor G. Olaizola

Preprint submitted to Big Data Research - Elsevier

June 15, 2021

https://arxiv.org/pdf/2106.07287


Research questions:

• RQ1: What methodologies can be found in the literature to manage data science projects?

• RQ2: Are these available methodologies prepared to meet the demands of current challenges?


7. Development Workflows for Data Scientists 

Development Workflows for Data Scientists by GitHub and O’Reilly Media

8. Big Data Ideation, Assessment and Implementation
Big data ideation, assessment and implementation by Martin Vanauer

10. Agile Delivery Framework
Larson and Chang proposed a framework based on the synthesis of agile principles with Business Intelligence (BI), fast analytics and data science. There are two layers of strategic tasks: (A) the top layer includes BI delivery and (B) the bottom layer includes fast analytics and data science.


In this article, a conceptual framework is presented for designing integral methodologies for the management of data science projects. The framework proposes three foundation stones: project, team, and data & information management.


The disciplinary research landscape of data science reflected in data science journals

Lingzi Hong, William Moen, Xinchen Yu, Jiangping Chen

Information Discovery and Delivery (2020)

The research questions for the study are:

RQ1. What is the population of journals that focus on topics of data science?

RQ2. What disciplinary landscape of data science is reveal


Important - Table - Top keywords of disciplines



Saturday, July 24, 2021

Data Science - Techniques

 ________________________




This Edureka Data Science Full Course video will help you understand and learn Data Science Algorithms in detail. This Data Science Tutorial is ideal for both beginners as well as professionals who want to master Data Science Algorithms. Below are the topics covered in this Data Science for Beginners tutorial video:

00:00 Agenda 2:44 Introduction to Data Science 9:55 Data Analysis at Walmart 13:20 What is Data Science? 14:39 Who is a Data Scientist? 16:50 Data Science Skill Set 21:51 Data Science Job Roles 26:58 Data Life Cycle
30:25 Statistics & Probability 34:31 Categories of Data 34:50 Qualitative Data 36:09 Quantitative Data 39:11 What is Statistics? 41:32 Basic Terminologies in Statistics 42:50 Sampling Techniques 45:31 Random Sampling 46:20 Systematic Sampling 46:50 Stratified Sampling 47:54 Types of Statistics 50:38 Descriptive Statistics 55:52 Measures of Spread 55:56 Range 56:44 Inter Quartile Range 58:58 Variance 59:36 Standard Deviation 1:14:25 Confusion Matrix
1:19:16 Probability 1:24:14 What is Probability? 1:27:13 Types of Events 1:27:58 Probability Distribution 1:28:15 Probability Density Function 1:30:02 Normal Distribution 1:30:51 Standard Deviation & Curve 1:31:19 Central Limit Theorem 1:33:12 Types of Probability 1:33:34 Marginal Probability 1:34:06 Joint Probability 1:34:58 Conditional Probability 1:35:56 Use-Case 1:39:46 Bayes Theorem
1:45:44 Inferential Statistics 1:56:40 Hypothesis Testing
2:00:34 Basics of Machine Learning 2:01:41 Need for Machine Learning 2:07:03 What is Machine Learning? 2:09:21 Machine Learning Definitions 2:11:48 Machine Learning Process 2:18:31 Supervised Learning Algorithm
2:19:54 What is Regression? 2:21:23 Linear vs Logistic Regression 2:33:51 Linear Regression 2:25:27 Where is Linear Regression used? 2:27:11 Understanding Linear Regression 2:37:00 What is R-Square?
2:46:35 Logistic Regression 2:51:22 Logistic Regression Curve 2:53:02 Logistic Regression Equation 2:56:21 Logistic Regression Use-Cases 2:58:23 Demo 3:00:57 Implement Logistic Regression 3:02:33 Import Libraries 3:05:28 Analyzing Data 3:11:52 Data Wrangling 3:23:54 Train & Test Data 3:20:44 Implement Logistic Regression 3:31:04 SUV Data Analysis
3:38:44 Decision Trees 3:39:50 What is Classification? 3:42:27 Types of Classification 3:42:27 Decision Tree 3:43:51 Random Forest 3:45:06 Naive Bayes 3:47:12 KNN 3:49:02 What is Decision Tree? 3:55:15 Decision Tree Terminologies 3:56:51 CART Algorithm 3:58:50 Entropy 4:00:15 What is Entropy? 4:23:52 Random Forest 4:27:29 Types of Classifier 4:31:17 Why Random Forest? 4:39:14 What is Random Forest? 4:51:26 How Random Forest Works? 4:51:36 Random Forest Algorithm 5:04:23 K Nearest Neighbour 5:05:33 What is KNN Algorithm? 5:08:50 KNN Algorithm Working 5:14:55 kNN Example 5:24:30 What is Naive Bayes? 5:25:13 Bayes Theorem 5:27:48 Bayes Theorem Proof 5:29:43 Naive Bayes Working 5:39:06 Types of Naive Bayes
5:53:37 Support Vector Machine 5:57:40 What is SVM? 5:59:46 How does SVM work? 6:03:00 Introduction to Non-Linear SVM 6:04:48 SVM Example
6:06:12 Unsupervised Learning Algorithms - KMeans 6:06:18 What is Unsupervised Learning? 6:06:45 Unsupervised Learning: Process Flow 6:07:17 What is Clustering? 6:09:15 Types of Clustering 6:10:15 K-Means Clustering 6:10:40 K-Means Algorithm Working 6:16:17 K-Means Algorithm 6:19:16 Fuzzy C-Means Clustering 6:21:22 Hierarchical Clustering 6:22:53 Association Clustering 6:24:57 Association Rule Mining 6:30:35 Apriori Algorithm 6:37:45 Apriori Demo
6:40:49 What is Reinforcement Learning? 6:42:48 Reinforcement Learning Process 6:51:10 Markov Decision Process 6:54:53 Understanding Q - Learning 7:13:12 Q-Learning Demo 7:25:34 The Bellman Equation
7:48:39 What is Deep Learning? 7:52:53 Why we need Artificial Neuron? 7:54:33 Perceptron Learning Algorithm 7:57:57 Activation Function 8:03:14 Single Layer Perceptron 8:04:04 What is Tensorflow? 8:07:25 Demo 8:21:03 What is a Computational Graph? 8:49:18 Limitations of Single Layer Perceptron 8:50:08 Multi-Layer Perceptron 8:51:24 What is Backpropagation? 8:52:26 Backpropagation Learning Algorithm 8:59:31 Multi-layer Perceptron Demo
9:01:23 Data Science Interview Questions

Edureka Data Science Training & Certifications:

- Data Science Training using Python: http://bit.ly/2P2Qbl8
- Data Science Training using R: http://bit.ly/2u5Msw5
- Python Programming Training: http://bit.ly/2OYsVoE
- Python Masters Program: https://bit.ly/3e640cY
- Machine Learning Course using Python: http://bit.ly/2SApG99
- Data Scientist Masters Program: http://bit.ly/39HLiWJ
- Machine Learning Engineer Masters Program: http://bit.ly/38Ch2MC





_____________________________

Tuesday, July 20, 2021

Deep Learning - Introduction and Bibliography



What is Deep Learning?


Deep learning is a form of machine learning for nonlinear high dimensional data reduction and prediction.

Taking a Bayesian probabilistic perspective on deep learning provides a number of advantages: specifically, statistical interpretation and properties, more efficient algorithms for optimisation and hyper-parameter tuning, and an explanation of predictive performance.

Traditional high-dimensional statistical techniques, such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR), and projection pursuit regression (PPR), are shallow learners.
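To make "shallow learner" concrete, here is a minimal sketch of PCA written as one linear encoding step and one linear decoding step. It is not taken from the paper; the function names and the synthetic data are assumptions chosen for illustration, and NumPy is assumed to be installed.

```python
import numpy as np

def pca_fit(X, k):
    """Return the column means and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_encode(X, mean, components):
    """One linear layer: project the centered data onto the components."""
    return (X - mean) @ components.T

def pca_decode(Z, mean, components):
    """One linear layer back to the original space."""
    return Z @ components + mean

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples in 5 dimensions that vary mostly along 2 directions.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

mean, components = pca_fit(X, k=2)
Z = pca_encode(X, mean, components)        # reduced representation, shape (200, 2)
X_hat = pca_decode(Z, mean, components)    # reconstruction in the original space
reconstruction_error = np.mean((X - X_hat) ** 2)
print(Z.shape, reconstruction_error)
```

The whole reduction is a single matrix multiplication with no nonlinearity, which is the sense in which it is shallow; a deep counterpart would stack several such layers with nonlinear activations in between.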

Their deep learning counterparts exploit multiple layers of data reduction, which leads to performance gains. Stochastic gradient descent (SGD) training and optimisation and Dropout (DO) provide model and variable selection. Bayesian regularization is central to finding the weights and connections in networks and provides a framework for an optimal bias-variance trade-off to achieve good out-of-sample performance.
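As a rough illustration of the Dropout step mentioned above, the sketch below shows only the mechanics of the commonly used "inverted dropout" variant. This variant, the function name, and the parameters are assumptions for illustration and are not taken from the paper; NumPy is assumed to be installed.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and rescale the survivors so the expected activation is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    # Boolean mask: True where the unit is kept.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((4, 1000))                  # hypothetical hidden-layer activations
dropped = dropout(hidden, rate=0.5, rng=rng)
print(dropped.mean())                        # close to 1.0 on average
```

Because a different random subset of units is switched off at every training step, the network cannot rely on any single input or hidden unit. At prediction time the function is called with `training=False` and returns the activations unchanged.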

To illustrate the Bayesian perspective, the paper presents an analysis of first-time international bookings on Airbnb.


https://arxiv.org/pdf/1706.00473.pdf



Deep Learning Introduction
___________________


___________________



How to get started with Deep Learning for Data Science?



-1. Learn Python and R ;)

0. Andrew Ng and Coursera

- https://lnkd.in/eUe9YZE

1. Siraj Raval: YouTube channel. Specifically, these playlists:

- The Math of Intelligence: https://lnkd.in/eYPJbsW

- Intro to Deep Learning: https://lnkd.in/e4Sg9qy

2. François Chollet's book: Deep Learning with Python (and R soon):

- https://lnkd.in/gfV2ery
- https://lnkd.in/e6_YGqx

3. IBM Cognitive Class:

- https://lnkd.in/eNKPSnJ
- https://lnkd.in/eBVRf-R

4. Medium blogs:

- https://lnkd.in/eaUx5aN
- https://lnkd.in/eGaQwts

5. DataCamp:

- https://lnkd.in/eWVz7e5
- https://lnkd.in/ezXBq6M

Info collected from a LinkedIn post

https://www.linkedin.com/feed/update/urn:li:activity:6363784952114401280

--------------------------------------


Updated 21 July 2021,  2 February 2018
5 June 2017

Data Science - Online Study Programs, Notes and Video Courses - Free Also


UC Berkeley School of Information - Master of Information and Data Science (MIDS) - Curriculum

The online Master of Information and Data Science (MIDS) is designed to educate data science leaders.
https://ischoolonline.berkeley.edu/data-science/curriculum/




https://towardsdatascience.com/functions-of-data-science-4afd5341a659

https://towardsdatascience.com/how-youtube-recommends-videos-b6e003a5ab2f

2018

You Need To Keep Learning In Data Science
https://datafloq.com/read/why-you-need-to-keep-learning-in-data-science/

10 Free Must-Read Books for Machine Learning and Data Science
April 2017
https://www.kdnuggets.com/2017/04/10-free-must-read-books-machine-learning-data-science.html 



2016
Learn R Free
https://www.datacamp.com/courses/free-introduction-to-r

http://tryr.codeschool.com/levels/1/challenges/2

Edureka YouTube Video

https://www.youtube.com/watch?v=TGo9F0QyBuE


Businesses Will Need One Million Data Scientists by 2018
International Data Corporation (IDC) predicts a need for 181,000 people with deep analytical skills in the US by 2018 and a requirement for five times that number of positions with data management and interpretation capabilities.
http://www.kdnuggets.com/2016/01/businesses-need-one-million-data-scientists-2018.html
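The headline figure can be checked with a quick calculation. This sketch assumes that "five times that number" means exactly 5 × 181,000 and that the "one million" in the headline is the sum of the two groups; both are readings of the summary above, not figures confirmed by IDC.

```python
deep_analytical = 181_000   # people with deep analytical skills (from the summary above)
multiplier = 5              # "five times that number" (assumed to be exact)

data_management = deep_analytical * multiplier
total = deep_analytical + data_management

print(f"{data_management:,} + {deep_analytical:,} = {total:,}")
# prints "905,000 + 181,000 = 1,086,000"
```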



Data analytics is growing. Computer applications in industry are now broadly divided into transaction applications and intelligence applications. Business intelligence, data mining, data analytics, data science, etc. are subjects in the area of intelligence applications of computers in business organizations.
_______________

_______________



Updated  14 Feb 2016, 7 Feb 2016



NPTEL IIT Madras Course: Introduction to Data Analytics

http://nptel.ac.in/courses/110106064/







Harvard Stat 221 “Statistical Computing and Visualization”:  Online Lecture Links
http://harvarddatascience.com/2013/05/05/harvard-stat-221-statistical-computing-and-visualization-all-lectures-online/


Data Analysis
26 Resources 310+ Hours 24,298 Learners
Learn how to manipulate and analyze data better with this free online curriculum
https://www.springboard.com/learning-paths/data-analysis/





The Open Source Data Science Masters
Curriculum for Data Science
Curriculum author on Twitter: @clarecorthell - Follow the author of this blog on Twitter: @knoltweet

The Open-Source Data Science Masters
The open-source curriculum for learning Data Science.
Foundational in both theory and technologies, the OSDSM breaks down the core competencies necessary to make data useful.
http://datasciencemasters.org/

Updated  21 July 2021
13 July 2018, 2 February 2018
14 Apr 2016,  14 Feb 2016