Deep learning is a form of machine learning for nonlinear high-dimensional data reduction.
Taking a Bayesian probabilistic perspective on deep learning provides a number of advantages:
a statistical interpretation and properties, more efficient algorithms for optimisation and
hyper-parameter tuning, and an explanation of predictive performance. Traditional high-dimensional
statistical techniques, namely principal component analysis (PCA), partial least squares
(PLS), reduced rank regression (RRR), and projection pursuit regression (PPR), are shallow learners.
Their deep learning counterparts exploit multiple layers of data reduction, which leads to
performance gains. Stochastic gradient descent (SGD) training and optimisation, together with
Dropout (DO), provide model and variable selection. Bayesian regularization
is central to finding networks and provides a framework for an optimal bias-variance trade-off
to achieve good out-of-sample performance.
To illustrate the Bayesian perspective, the paper presents an analysis of first-time international bookings on Airbnb.