Saturday, June 24, 2017

Sunday, June 18, 2017

Microsoft Power BI



What is Power BI?

Power BI is a suite of business analytics tools that deliver insights throughout your organization. Connect to hundreds of data sources, simplify data prep, and drive ad hoc analysis. Produce beautiful reports, then publish them for your organization to consume on the web and across mobile devices. Everyone can create personalized dashboards with a unique, 360-degree view of their business. And scale across the enterprise, with governance and security built-in.

https://powerbi.microsoft.com/en-us/


The rise of self service BI

______________________

______________________
Microsoft Power BI

Thursday, June 8, 2017

Cloudera Altus - Data Engineering Solution



A cloud service for engineers

For data engineers
Focus on the workload. Cloudera Altus elevates data pipeline operations over cluster operations

Leave the management to Altus. Data engineering experience is delivered as a service. Altus takes care of cluster management and operations

Analyze and troubleshoot jobs. Don’t waste time sifting through logs to find the root cause of a failed job. Monitor your work as it’s running, or let us help you troubleshoot

Simple migration and backward compatibility. Easily move your on-prem workloads to and from the cloud with minimal risk. Leverage new platform versions without breaking compatibility with existing applications

Data engineering made easy
Cloudera Altus is a managed service that makes it easier than ever to execute data pipelines. Launch your cluster in minutes on AWS, and start exploring and extracting value from all your data. Over time Cloudera plans to expand Altus to support other leading public clouds such as Microsoft Azure, etc.


https://www.cloudera.com/products/altus.html


The initial rollout of Cloudera Altus includes support for Apache Spark, Apache Hive on MapReduce2, and Hive on Spark. It is available today in most Amazon Web Services (AWS) regions.

https://www.cloudera.com/more/news-and-events/press-releases/2017-05-24-cloudera-launches-altus-to-simplify-big-data-workloads-in-the-cloud.html

Wednesday, June 7, 2017

Google BigQuery - Enterprise Cloud Data Warehouse



BigQuery is Google's fully managed, petabyte scale, low cost enterprise data warehouse for analytics. BigQuery is serverless. There is no infrastructure to manage and you don't need a database administrator, so you can focus on analyzing data to find meaningful insights using familiar SQL. BigQuery is a powerful Big Data analytics platform used by all types of organizations, from startups to Fortune 500 companies.

Speed & Scale

BigQuery can scan TB in seconds and PB in minutes. Load your data from Google Cloud Storage or Google Cloud Datastore, or stream it into BigQuery to enable real-time analysis of your data. With BigQuery you can easily scale your database from GBs to PBs.

https://cloud.google.com/bigquery/


Building and scaling new business models to gain insights from disparate data faster, while reducing IT costs, requires an architecture that can go from prototype to petabyte scale as your needs evolve. Google BigQuery’s serverless architecture can help ensure that your enterprise data warehouse withstands growth at any scale. Informatica helps you unlock the power of hybrid data with high performance, highly scalable data management solutions that efficiently move and manage large volumes of data to Google BigQuery. Informatica and Google BigQuery is the best combination for modernizing your data architecture.

https://www.brighttalk.com/webcast/10477/258043/modernize-your-data-architecture-with-google-bigquery-and-informatica

Tuesday, June 6, 2017

Artificial Neural Networks - Introduction



Author: Marek Libra

Posted under creative commons from Knol

The artificial neural network (hereafter NN) has been a popular subject of study in recent years. It has been applied successfully in a wide range of problem domains such as finance, engineering, medicine, geology, physics and control.

    Neural networks are useful especially for solving problems of prediction, classification or control. They are also a good alternative to classical statistical approaches like regression analysis.

    Artificial neural networks try to model some properties of biological neural networks, which form the nervous systems of biological organisms. This inspiration is a commonly known fact, and it is mentioned in most neural network publications.

    An NN is built from a large number of simple processing units called artificial neurons (hereafter just neurons).

    The interface of an artificial neuron consists of n numeric inputs and one numeric output. Some neuron models add one more special input called the bias. Each input is weighted by its own numeric weight. The neuron can perform two operations: compute and adapt.


    The compute operation transforms the inputs into the output. It takes the numeric
inputs and computes their weighted sum. It then applies a so-called activation function
(a mathematical transformation) to this sum. The result of the activation function is set
as the value of the output.
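The compute operation can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function names `compute` and `sigmoid` are hypothetical, and the sigmoid is an assumed choice among the possible activation functions.

```python
import math


def sigmoid(s):
    """Logistic activation (an assumed choice): maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def compute(inputs, weights, bias=0.0, activation=sigmoid):
    """Return activation(weighted sum of inputs + bias) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Two inputs whose weighted sum is 1.0*0.4 + 0.5*(-0.8) = 0.0
output = compute([1.0, 0.5], [0.4, -0.8])
```

Passing the activation as a parameter keeps the weighted sum separate from the transformation applied to it, so a different activation can be swapped in without touching the rest.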

    The adapt operation, based on pairs of inputs and expected outputs specified by the user,
tunes the weights of an NN so that the computed output better approximates the expected
output for the given input.
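As a rough sketch of what adapting can look like for one neuron, the example below uses the perceptron rule with a discrete threshold activation. The training data (the logical AND function), the learning rate and all function names are assumptions chosen for illustration; other adaptation algorithms work differently.

```python
def step(s):
    """Discrete threshold activation: 1 if the sum is non-negative, else 0."""
    return 1 if s >= 0 else 0


def compute(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the threshold."""
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)


def adapt(inputs, expected, weights, bias, rate=1.0):
    """Perceptron rule: shift weights and bias in proportion to the output error."""
    error = expected - compute(inputs, weights, bias)
    new_weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + rate * error
    return new_weights, new_bias


# Pairs of (inputs, expected output) for logical AND, an assumed toy task.
samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = [0.0, 0.0], 0.0
for _ in range(20):  # repeated passes over the training pairs
    for inputs, expected in samples:
        weights, bias = adapt(inputs, expected, weights, bias)
```

After the loop, `compute(inputs, weights, bias)` should reproduce the expected output for each of the four pairs. This works here because AND is linearly separable; the perceptron rule does not converge on tasks that are not.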

    The neurons in an NN are ordered and numbered (with numbers from N) according to that order.
    Many models of NNs are known. These models differ from each other in

    • the domain of numeric input, output and weights (real, integer or finite set like {0,1}),

    • the presence of bias (yes or no),

    • the definition of an activation function (sigmoid, hyperbolic tangent, discrete threshold,
      etc.),

    • the topology of interconnected neurons (feed-forward or recurrent),

    • the ability to change the number of neurons or the network topology during the lifetime of
      the network,

    • the algorithm of the computation flow through the network over neurons,

    • the simulation time (discrete or continuous) or

    • the adaptation algorithm (none, back propagation, perceptron rule, genetic, simulated annealing etc.).
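To make one item of the list above concrete, here is a small sketch of the three activation functions it names. The function names and the default threshold of 0.0 are assumptions made for illustration.

```python
import math


def sigmoid(s):
    """Smooth activation with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def hyperbolic_tangent(s):
    """Smooth activation with outputs in (-1, 1)."""
    return math.tanh(s)


def discrete_threshold(s, threshold=0.0):
    """Binary activation: 1 once the sum reaches the threshold, else 0."""
    return 1 if s >= threshold else 0


# Compare the three functions on the same weighted sums.
for s in (-2.0, 0.0, 2.0):
    print(s, sigmoid(s), hyperbolic_tangent(s), discrete_threshold(s))
```

The first two return real values, while the threshold returns values from the finite set {0,1}, which ties the choice of activation to the output domain mentioned in the first item of the list.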

    A good taxonomy of NN models can be found, e.g., in [1].

    More detailed general descriptions, which are formal and readable, can be found in [2].

References

  • [1] Šíma and P. Orponen. General purpose computation with neural
  • [2] David M Skapura. Building Neural Networks. Addison-Wesley, 1995

Source Knol: http://knol.google.com/k/marek-libra/artificial-neural-networks/5rqq7q8930m0/12#
Knol Nrao - 5193


Further Reading

Simple mathematical steps in Neural Network problem solving



More detailed reading

Artificial Neural Networks: Mathematics of Backpropagation (Part 4)
October 28, 2014 in ml primers, neural networks
http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4



Further Reading

Knols



Updated 8 June 2017, 8 May 2017, 28 April 2012.

Monday, June 5, 2017

Blockchain Technology - Introduction and Bibliography



What is Blockchain Technology?


“The blockchain is an incorruptible digital ledger of economic transactions that can be programmed to record not just financial transactions but virtually everything of value.”
Don & Alex Tapscott, authors of Blockchain Revolution (2016)


Alex Tapscott: "Blockchain Revolution" | Talks at Google
11 July 2016
_____________________________


_____________________________
Talks at Google


http://blockchain-revolution.com/

Don Tapscott

https://www.linkedin.com/pulse/whats-next-generation-internet-surprise-its-all-don-tapscott





What is Blockchain Technology? A Step-by-Step Guide For Beginners
An in-depth guide by BlockGeeks
https://blockgeeks.com/guides/what-is-blockchain-technology/

Deep Learning - Introduction and Bibliography





Deep learning is a form of machine learning for nonlinear high-dimensional data reduction and prediction.

Taking a Bayesian probabilistic perspective on deep learning provides a number of advantages: specifically, statistical interpretation and properties, more efficient algorithms for optimisation and hyper-parameter tuning, and an explanation of predictive performance. Traditional high-dimensional statistical techniques such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR) and projection pursuit regression (PPR) are shallow learners. Their deep learning counterparts exploit multiple layers of data reduction, which leads to performance gains. Stochastic gradient descent (SGD) training and optimisation and Dropout (DO) provide model and variable selection. Bayesian regularization is central to finding networks and provides a framework for the optimal bias-variance trade-off to achieve good out-of-sample performance.
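For readers unfamiliar with Dropout, the following is a minimal sketch of the commonly used "inverted dropout" formulation. It is an assumption that this matches the variant discussed in the paper, and the function name and parameters are hypothetical.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate).

    The scaling keeps the expected value of each unit unchanged
    (the inverted dropout convention, assumed here).
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # seeded generator so runs are repeatable
activations = [1.0] * 1000
dropped = dropout(activations, rate=0.5, rng=rng)
```

With a rate of 0.5, roughly half of the entries in `dropped` should be 0.0 and the rest 2.0. Dropout is normally applied only during training and switched off when predicting.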

To illustrate the use of the Bayesian perspective, the paper presents an analysis of first-time international bookings on Airbnb.


https://arxiv.org/pdf/1706.00473.pdf

What is Data Science? - An Introduction to Data Science


Decision making driven by data or data analysis is age-old. But new data processing technology allows people to process data in ways that were not possible before. Hence data will drive business decisions much more intensively in the next decade.


IT departments are no longer content with just providing technology for processing data. The discipline and profession of IT is getting involved in finding and understanding the relevance of new data sources, big and small.

The practice of business intelligence is expanding to develop capabilities for analyzing and visualizing structured and unstructured data for their relevance to business decision making, and then to build applications that run periodically, at intervals that can be as short as seconds, to support activities such as crime or fraud prevention.

Data science is the name of this emerging discipline.

Data Science Tutorial 1 - Video

__________________________

__________________________
edureka!

More videos are available on YouTube on Data Science




Concise Visual Summary of Deep Learning Architectures
Basically neural network architectures
http://www.datasciencecentral.com/profiles/blogs/concise-visual-summary-of-deep-learning-architectures


http://www.datasciencecentral.com has a number of articles on data science.


Updated on 7 June 2017, 2 September 2014