Thursday, February 11, 2016

Big Data - Introduction

Big data usually refers to data sets too large for commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. What counts as "big" is a constantly moving target: as of 2012, sizes ranged from a few dozen terabytes to many petabytes in a single data set. In response to this difficulty, a new class of "big data" tools has arisen to handle sensemaking over large quantities of data, such as the Apache Hadoop Big Data Platform.

MIKE2.0, an open approach to Information Management, defines big data in terms of useful permutations of data sources, complexity in interrelationships, and difficulty to delete (or modify) individual records.

In 2001, Gartner analyst Doug Laney defined data growth challenges and opportunities as being three-dimensional, i.e., increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). Gartner, and now much of the industry, continue to use this "3Vs" model for describing big data. In 2012, Gartner updated its definition as follows: "Big data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."

Big data - Four dimensions: Volume, Velocity, Variety, and Veracity (IBM document)
Examples of big data in enterprises

Volume: Enterprises are awash with ever-growing data of all types, easily amassing terabytes—even petabytes—of information.

Analyze the 12 terabytes of Tweets created each day to improve product sentiment analysis
Convert 350 billion annual meter readings to better predict power consumption

Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.

Scrutinize 5 million trade events created each day to identify potential fraud
Analyze 500 million daily call detail records in real-time to predict customer churn faster
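
To give a feel for what these daily volumes mean as a stream, the short sketch below converts the two figures above into average per-second rates. It assumes, purely for illustration, that events arrive evenly across a 24-hour day; real traffic is bursty, so peak rates would be higher. The function name and dictionary are hypothetical helpers, not part of the IBM document.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def average_rate_per_second(events_per_day: float) -> float:
    """Average events per second, assuming arrivals are spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


# Daily volumes quoted in the Velocity examples above.
daily_volumes = {
    "trade events": 5_000_000,
    "call detail records": 500_000_000,
}

for name, per_day in daily_volumes.items():
    rate = average_rate_per_second(per_day)
    print(f"{name}: about {rate:,.0f} per second")
```

Under the even-arrival assumption this works out to roughly 58 trade events and roughly 5,787 call detail records every second, which is why such data has to be processed as it streams in rather than in a nightly batch.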

Variety: Big data covers any type of data, structured or unstructured, such as text, sensor data, audio, video, click streams, log files and more. New insights are found when analyzing these data types together.

Monitor hundreds of live video feeds from surveillance cameras to target points of interest
Exploit the 80% data growth in images, video and documents to improve customer satisfaction

Veracity: Establishing trust in big data presents a huge challenge as the variety and number of sources grow.

McKinsey Article on Big Data


11 Feb 2016

Evolution of Big

Analytics 1.0—the era of “business intelligence.”

Analytics 1.0 was a time of real progress in gaining an objective, deep understanding of important business phenomena and in giving managers the fact-based comprehension to go beyond intuition when making decisions. For the first time, data about production processes, sales, customer interactions, and more were recorded, aggregated, and analyzed.

Updated 11 Feb 2016, 28 Feb 2013
