What is big data?

Raul F. Chong
Senior program manager, Big data, DB2, and Cloud
IM Cloud Computing Center of Competence - IBM Toronto Lab, Canada
2011 IBM Corporation

Agenda
- The world is changing
- What is big data?
- The IBM big data platform
- Examples of big data solutions
- Big data solutions and the cloud
- Setting up a Hadoop cluster on the cloud

The world is changing and becoming more
- 2 billion internet users
- 4.6 billion mobile phones

The world is changing and becoming more
Is it really?

What is big data?
Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analytics, and visualization.
Source: Wikipedia

Big data characteristics
Information is growing at a phenomenal rate: 44x as much data and content over the coming decade
- 2009: 800,000 petabytes
- 2020: 35 zettabytes
Source: IDC, The Digital Universe Decade - Are You Ready?, May 2010

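The 44x figure can be checked against the two data points on this slide. The sketch below assumes decimal (SI) units, where 1 zettabyte = 1,000,000 petabytes; the variable names are illustrative only.

```python
# Sanity check of the growth factor quoted from the IDC study.
# Assumption: decimal (SI) units, so 1 zettabyte = 1,000,000 petabytes.
PETABYTES_PER_ZETTABYTE = 1_000_000

data_2009_pb = 800_000                       # 2009: 800,000 petabytes
data_2020_pb = 35 * PETABYTES_PER_ZETTABYTE  # 2020: 35 zettabytes

growth_factor = data_2020_pb / data_2009_pb
print(f"Growth factor 2009 -> 2020: {growth_factor:.2f}x")
```

Under that assumption the ratio is 43.75, which rounds to the 44x quoted above.
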
Big data characteristics
- About 80% of the world's data is unstructured
- It may be data we've been collecting before, but could not process

Two types of big data
Data in movement - streams
- Twitter / Facebook comments
- Stock market data
- Sensors: vital signs of a newborn
Data at rest - oceans
- Collection of what has streamed
- Web logs, emails, social media
- Unstructured documents: forms, claims
- Structured data from disparate systems

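The difference between the two types shows up in how the data is processed. The following is a minimal sketch, not a depiction of any IBM product: streaming data is analyzed incrementally as each reading arrives, while data at rest is analyzed once the full collection is available. The function names and the heart-rate readings are hypothetical, invented for illustration.

```python
from collections import deque
from statistics import mean


def rolling_average(stream, window=3):
    """Data in movement: emit the average of the last `window` readings
    each time a new reading arrives, without storing the whole stream."""
    recent = deque(maxlen=window)
    for reading in stream:
        recent.append(reading)
        yield mean(recent)


def batch_average(stored):
    """Data at rest: the full collection is available, so analyze it at once."""
    return mean(stored)


# Hypothetical heart-rate readings (beats per minute), invented for illustration.
heart_rate = [120, 122, 121, 150, 155]

live = list(rolling_average(iter(heart_rate)))  # one result per incoming reading
overall = batch_average(heart_rate)             # one result for the stored set

print(live)
print(overall)
```

The streaming version produces a result after every reading, so a sudden change can be noticed while the data is still arriving. The batch version only answers after everything has been collected.
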
The IBM Big Data Platform
[Diagram: IBM Big Data Solutions; Client and Partner Solutions]
- Big Data User Environments: Developers, End Users, Administrators
- Big Data Enterprise Engines: InfoSphere BigInsights, InfoSphere Streams
- Open Source Foundational Components: Hadoop, HBase, Pig, Lucene, Jaql
- Integration / Agents:
  - Data Warehouse: InfoSphere Warehouse
  - Warehouse Appliances: Netezza
  - Master Data Mgmt: InfoSphere MDM
  - Information Server
  - Database: DB2
  - Content Analytics: ECM
  - Business Analytics: Cognos & SPSS
  - Marketing: Unica
  - Data Growth Management: InfoSphere Optim

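Hadoop, listed among the open source foundational components, implements the MapReduce programming model. The sketch below imitates that model in plain Python on a single machine, using the classic word-count example. It does not use any Hadoop API. The function names and sample lines are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


# Sample input, invented for illustration.
lines = ["big data is big", "data at rest and data in motion"]

pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

On a real cluster the map and reduce steps would run in parallel across many nodes, with the framework handling the shuffle. Here everything runs sequentially to keep the idea visible.
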
The IBM Big Data Platform
[Diagram: Data Warehouse, Big Data Platform, Enterprise Integration, Traditional Sources, New Sources]

Traditional vs. big data business approaches
Traditional Approach: Structured & Repeatable Analysis
- Business users determine what question to ask
- IT structures the data to answer that question
- Examples: monthly sales reports, profitability analysis, customer surveys
Big Data Approach: Iterative & Exploratory Analysis
- IT delivers a platform to enable creative discovery
- Business explores what questions could be asked
- Examples: brand sentiment, product strategy, maximum asset utilization

Examples of big data solutions
- Multi-channel customer sentiment and experience analysis
- Detect life-threatening conditions at hospitals in time to intervene
- Make risk decisions based on real-time transactional data
- Identify criminals and threats from disparate video, audio, and data feeds
- Predict weather patterns to plan optimal wind turbine usage, and optimize capital expenditure on asset placement

Examples
IBM Watson

Big data solutions and the cloud
Big Data solutions and the Cloud are a good fit. The Cloud provides:
- On-demand access to resources
- Elasticity
- Utility-like billing
Big Data solutions on the Cloud are mainly appropriate for test/development
- E.g., Hadoop was not designed for virtualized environments

Demo
Setting up a Hadoop cluster on the IBM Cloud

INSTRUMENTED
INTERCONNECTED
INTELLIGENT
The resulting explosion of information creates a need for a new kind of intelligence to help build a Smarter Planet

For more information

Thank you! Questions?