A Strategic Approach to Unlock the Opportunities from Big Data




Yue Pan, Chief Scientist for Information Management and Healthcare, IBM Research - China
Contact: panyue@cn.ibm.com

Big Data or Big Illusion?
"Much of the focus on the big data zoo has missed one key point: big or small, it's still data. It must be managed and integrated across the entire enterprise to extract its full value, to ensure its consistent use."
Barry Devlin, The Big Data Zoo: Taming the Beasts
*Source: Gartner,

A Bird's Eye View of Big Data
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
76 million smart meters in 2009, 200M by 2014
100s of millions of GPS-enabled devices sold annually
2+ billion people on the Web by end 2011

A Bird's Eye View of Big Data: the three domains of information*
[Diagram: the same figures as on the previous slide, arranged across the three domains of information.]
*Source: Barry Devlin, The Big Data Zoo: Taming the Beasts

The fourth dimension of Big Data: Veracity, handling data in doubt
Volume: Data at Rest. Terabytes to exabytes of existing data to process.
Velocity: Data in Motion. Streaming data, milliseconds to seconds to respond.
Variety: Data in Many Forms. Structured, unstructured, text, multimedia.
Veracity*: Data in Doubt. Uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations.
* Truthfulness, accuracy or precision, correctness

Tame Big Data, Turn into Insight - Example: IBM Watson
Watson's advanced analytic capabilities sort through the equivalent of 200 million pages of data to uncover an answer in 3 seconds.

Jeopardy Challenge: the Broad Domain
We do NOT attempt to anticipate all questions and build databases.
We do NOT try to build a formal model of the world.
In a random sample of 20,000 questions we found 2,500 distinct types*. The most frequent occurs <3% of the time. The distribution has a very long tail. And for each of these types, 1000s of different things may be asked. Even going for the head of the tail will barely make a dent.
[Chart: share of questions per type, y-axis from 0.00% to 3.00%; x-axis lists types such as he, film, group, capital, woman, song, singer, show, composer, title, fruit, planet, and many more.]
*13% are non-distinct (e.g., it, this, these or NA)
Our focus is on reusable NLP technology for analyzing vast volumes of as-is text. Structured sources (DBs and KBs) provide background knowledge for interpreting the text.
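The long-tail point above can be checked on any labelled sample of questions. The following is a minimal sketch, not Watson's actual analysis: the function name `head_coverage` and the toy labels are assumptions made up for illustration, and the numbers it prints say nothing about the real Jeopardy sample.

```python
from collections import Counter


def head_coverage(type_labels, k):
    """Return the fraction of questions whose type is among the k most frequent types."""
    if not type_labels:
        raise ValueError("type_labels must not be empty")
    counts = Counter(type_labels)
    # most_common(k) returns at most k (type, count) pairs, most frequent first.
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(type_labels)


# Toy, made-up labels purely for illustration (NOT the Jeopardy sample).
sample = ["film"] * 3 + ["capital"] * 2 + ["song", "planet", "fruit", "dog", "month"]

for k in (1, 2, 3):
    print(k, head_coverage(sample, k))
```

On a real long-tailed sample, coverage grows slowly as k increases, which is the sense in which handling only the most frequent types "will barely make a dent".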

Algorithms built in Watson

Most Client Use Cases Combine Multiple Technologies
Pre-processing: ingest and analyze unstructured data types and convert to structured data.
Combine structured and unstructured analysis: augment data warehouse with additional external sources, such as social media.
Combine high velocity and historical analysis: analyze and react to data in motion; adjust models with deep historical analysis.
Reuse structured data for exploratory analysis: experimentation and ad-hoc analysis with structured data.

Advanced analytics requires a robust, comprehensive information platform
Trusted, Relevant, Governed
[Diagram labels: Transactional & Collaborative Applications; Business Analytics Applications; External Information Sources; Integrate; Analyze; Manage; Govern; Master Data; Content; Big Data; Cubes; Warehouse; Data; ODS; Streams; Streaming Information; Data Model; Information Governance; Quality; Lifecycle; Security & Privacy; Standards.]

Big Data for Research and Innovation
Based on empirical research or simulation results
Exploit intensive computation and big data technology
Combine domain experts' knowledge and data scientists' skills
The Fourth Paradigm: Data-Intensive Scientific Discovery

Research: the road from data to foresight is long and expensive
The 4 V's of Data: Volume (Data at Rest), Velocity (Data in Motion), Variety (Data in Many Forms), Veracity (Data in Doubt)
Must acquire, integrate, enhance and align
Must deal with missing and incomplete data
Must store, protect, and manage
Must create models and other analytics and test them
Must run these analyses efficiently over large data volumes
Must understand and share results
Requires significant EXPERTISE in data management, systems, analytics, and the domain
Takes TIME and MONEY

A Plug-and-Play environment could reduce cost and risk
The Institute for Massive Data, Analytics and Modeling will unlock the value of data by providing a plug-and-play environment for exploring massive data:
Pre-integrated data sets to provide context
Powerful infrastructure for data management and analytics
Rich collection of analytics and tools for analysis
Expertise in all aspects of the process
Lets the domain expert focus on their strengths; we handle the data challenges.
Leverage these capabilities across multiple domains, and multiple investigations, to solve important problems for people, industry and the world at large.
Reduce costs, risk, and time to value!
[Diagram: The Institute for Massive Data, Analytics and Modeling]
Domain centers: Center for Energy Optimization; Center for Water Management; Center for Oncology Analytics; Center for Business Risk Exposure; Add'l Projects
Layers: User Services (Visualization, Reporting, Collaboration); Application Layer (Models, Analytics, Applications); Data and Analytic Services & Tools (Libraries, Catalogs); Data Management; Data Preparation & Ingestion; Systems Infrastructure; New (Big) Data Analysis; Traditional Data Analysis; System Management
Expertise: Human-Computer Interaction expertise; BAO consultants; Data scientists; Researchers in information mgmt; Computer systems researchers; IT operations support
Scientific Innovation and Services

The Institute as an Ecosystem: Vision
[Diagram: The Institute for Massive Data, Analytics and Modeling at the center, surrounded by the participants below.]
Universities
Provide: domain expertise; research leadership; students (labor and talent); additional data and analytics
Get: commercialization opportunities; recruitment/training for students; leverage for funding opportunities
IBM
Provides: MADAM core capabilities (analytics, infrastructure, data, expertise); facilities, working space; business development leadership; commercialization vehicles
Gets: access to top talent, trained on IBM tools; leverage for funding opportunities; sales enablement
Data Providers
Provide: data and analytics; path to market; domain expertise
Get: observe users, new use cases; exposure to new clients; sales enablement
Industry
Provides: business needs and challenges; data; funding
Gets: solutions to specific problems; access to talent
Government
Provides: needs and challenges; data; funding
Gets: economic development; talent development (new skills)
All
Provide: expertise; specific data and IP
Get: accelerated innovation; rich research environment; PR opportunities; shared cost, shared risk
Enabling the Benefits of Big Data

Conclusion
Big Data doesn't operate in a silo.
Most client use cases combine multiple technologies.
A Big Data platform and open collaboration could reduce cost and risk.

Thank you!