Are You Ready for Big Data?
Jim Gallo, National Director, Business Analytics
April 10, 2013
Agenda
What is Big Data?
How do you leverage Big Data in your company?
How do you prepare for a Big Data initiative?
Summary
What is Big Data?
What is Big Data?
"Big data" is high-volume, -velocity, -variety and -veracity information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Volume: TB to ZB
Velocity: streaming and large-volume data movement
Variety: relational and non-relational data types
Veracity: managing the reliability and predictability of inherently imprecise data types
Data shown: Twitter, RFID, machine data, monitors, relational, video, Facebook, click stream, trades and transactions, identity, geospatial, text
Diagram labels: Model, Predict and Score; Measure and Analyze; Cost-effective
What might a Big Data platform look like?
Data Warehouse; Hadoop; BI/Reporting; Information Integration; Stream Computing; Content Analytics; Functional Apps; Exploration/Visualization; Industry Apps; Instrumentation Analytics; Predictive Analytics
What is Hadoop?
Open source software project
Distributed processing of large data sets
Leverages clusters of commodity servers
Scales from a single server to thousands of machines
High degree of fault tolerance (detects and handles failures at the application layer)
What are the benefits of Hadoop?
Scalable: new nodes can be added as needed, without changing data formats, how data is loaded, how jobs are written, or the applications
Cost effective: massively parallel computing on commodity servers; sizeable decrease in the cost per terabyte of storage
Fault tolerant: redirects work to another location of the data and continues processing
Flexible: schema-less; can absorb any type of data, structured or not, from any number of sources; data from multiple sources can be joined and aggregated in arbitrary ways
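The "redirects work to another location of the data" point can be pictured with a small sketch. This is an illustration only, not Hadoop's actual scheduler: the function name `pick_live_replica`, the node names, and the three-copy layout are all assumptions made for the example.

```python
from typing import Dict, List, Set


def pick_live_replica(block: str, replicas: Dict[str, List[str]], down: Set[str]) -> str:
    """Return the first node that holds a copy of `block` and is still up."""
    for node in replicas[block]:
        if node not in down:
            return node
    raise RuntimeError(f"no live replica holds block {block!r}")


# Hypothetical layout: one block stored on three nodes.
replicas = {"block-1": ["node-a", "node-b", "node-c"]}

# With every node healthy, the work goes to the first copy.
first_choice = pick_live_replica("block-1", replicas, down=set())

# If that node fails, the same work is redirected to another copy of the data.
after_failure = pick_live_replica("block-1", replicas, down={"node-a"})
```

Because the data already lives in more than one place, the job can continue processing without reloading anything.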
What are the key components of Hadoop?
MapReduce
Hadoop Distributed File System (HDFS)
Pig
Hive
ZooKeeper
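To make the MapReduce component concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps using the classic word-count example. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions chosen for illustration, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would between the phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "big insight"])
```

Each input line is mapped independently of the others, which is the property that lets the work spread across commodity servers.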
What does a Big Data platform do?
Analyze a Variety of Information: novel analytics on a broad set of mixed information that could not be analyzed before
Analyze Information in Motion: streaming data analysis; large-volume data bursts and ad hoc analysis
Analyze Extreme Volumes of Information: cost-efficiently process and analyze petabytes of information; manage and analyze high volumes of structured, relational data
Discover and Experiment: ad hoc analytics, data discovery and experimentation
Manage and Plan: enforce data structure, integrity and control to ensure consistency for repeatable queries
How does a Big Data platform fit?
Diagram labels: Data Warehouse; Big Data Platform; Enterprise Integration; Traditional Sources; New Sources
Is the approach the same?
Traditional Approach: Structured & Repeatable Analysis
Business users determine what questions to ask
IT structures the data to answer the questions
Examples: monthly sales reports, profitability analysis, customer surveys
Big Data Approach: Iterative and Exploratory Analysis
IT delivers a platform to enable creative discovery
Business users explore what questions could be asked
Examples: brand sentiment, product strategy, maximum asset utilization
Leveraging Big Data
What can you do with Big Data?
Analyze Information in Motion: Smart Grid management; multimodal surveillance; real-time promotions; cyber security; ICU monitoring; options trading; click-stream analysis; CDR processing; IT log analysis; RFID tracking and analysis
Analyze Extreme Volumes of Information: transaction analysis to create insight-based product/service offerings; fraud monitoring and detection; risk modeling and management; social media/sentiment analysis; environmental analysis
Manage and Plan: operational analytics; BI reporting; planning and forecasting analysis; predictive analysis
Analyze a Variety of Information: social media/sentiment analysis; geospatial analysis; brand strategy; scientific research; epidemic early warning system; market analysis; video analysis; audio analysis
Discovery and Experimentation: sentiment analysis; brand strategy; scientific research; ad hoc analysis; model development; hypothesis testing; transaction analysis to create insight-based product/service offerings
What are some use cases?
Fraud Detection and Modeling
360 View of the Customer
Smart Grid / Smarter Utilities
Cyber Security
Email, Call Center Transcript Analysis
Risk Modeling & Management
Call Detail Record Analysis
Threat Detection / Multi-modal Surveillance
RFID Tracking and Analysis
Geo-marketing
What are some analytics examples?
Financial Services: improved risk decisions; customer sentiment analysis; AML (anti-money laundering)
Transportation: weather and traffic impact on logistics and fuel consumption
Call Centers: voice-to-text for customer behavior understanding
Telecommunications: operations and failure analysis from device, sensor, and GPS inputs
Utilities: weather impact analysis on power generation; smart meter data analysis
IT: transaction log analysis for multiple transactional systems
E-commerce: Internet behavior and buying patterns; digital asset piracy
Multi-channel Integration: integrated customer behavior modeling
What are some streaming analytics examples?
Transportation: intelligent traffic management
Manufacturing: process control for microchip fabrication
Natural Systems: wildfire management; water management
Health & Life Sciences: neonatal ICU monitoring; epidemic early warning system; remote healthcare monitoring
Telephony: CDR processing; social analysis; churn prediction; geomapping
Stock Market: impact of weather on securities prices; market analysis at ultra-low latencies
Law Enforcement, Defense & Cyber Security: real-time multimodal surveillance; situational awareness; cyber security detection
Fraud Prevention: detecting multi-party fraud; real-time fraud prevention
e-Science: space weather prediction; detection of transient events; genomics research
Other: Smart Grid; text analysis; who's talking to whom?
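Many of the streaming examples above share one pattern: examine each reading as it arrives and react when a recent window of values crosses a limit. The following is a minimal sketch of that pattern, not any vendor's streaming engine; the function name `rolling_alerts`, the window size, the threshold, and the sample readings are all hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def rolling_alerts(
    readings: Iterable[float], window: int, threshold: float
) -> Iterator[Tuple[int, float]]:
    """Yield (index, mean) whenever the mean of the last `window` readings exceeds `threshold`."""
    recent: Deque[float] = deque(maxlen=window)  # old readings drop off automatically
    for index, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            mean = sum(recent) / window
            if mean > threshold:
                yield index, mean


# Hypothetical sensor readings: quiet at first, then a sustained spike.
alerts = list(rolling_alerts([1, 1, 1, 10, 10, 10], window=3, threshold=5.0))
```

Only the last few readings are held in memory, so the same loop works on a stream that never ends.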
Preparing for a Big Data Initiative
Five Practical Questions
What do you want to know?
Business Objectives: improved decision-making; better business performance
Needs; Postulates; Questions
Results: improved customer satisfaction; increased profit margin; expanded social awareness
Big Data or lots of data?
Is there a data source?
Surveys; Twitter; LinkedIn; Foursquare; Sentiment Analysis; Demographics; Sales; Geospatial; Identity; Facial Recognition; Predictive Analytics; License Plate Recognition; Effectiveness; Site Behavior & Experience; Ad Campaigns; Facebook; Blogs; Competitors; Weather; RFID; Monitors; Machine Data; Trades & Transactions; Display Media
Is it worth it?
Labor; Options; ROI; Sourcing; Hardware & Software
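One way to approach the "Is it worth it?" question is a simple return-on-investment calculation: net benefit divided by total cost. The sketch below assumes that definition; the cost categories echo the slide, but every dollar figure is a made-up placeholder, and `simple_roi` is a name chosen for this example.

```python
def simple_roi(total_benefit: float, total_cost: float) -> float:
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


# Hypothetical cost estimates for a proof of concept.
costs = {"labor": 60_000.0, "hardware_software": 30_000.0, "sourcing": 10_000.0}

# Hypothetical benefit estimate over the same period.
expected_benefit = 150_000.0

roi = simple_roi(expected_benefit, sum(costs.values()))
print(f"Estimated ROI: {roi:.0%}")
```

Running the numbers for more than one option (build versus buy, internal versus external resources) makes the comparison explicit before any commitment is made.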
Will it work?
Model, Predict and Score; Options; Resources (Internal & External); Measure and Analyze; Intranet & Extranet; Time & Money
Summary
Summary
Big Data: high-volume, -velocity, -variety and -veracity information assets; cost-effective, innovative forms of information processing; enhanced insight and decision making
Features and Functions: analyze a variety of information; analyze information in motion; analyze extreme volumes of information; discover and experiment; manage and plan
Be Pragmatic: business-driven; provable ROI; proof of concept; not for everyone
Uses: wide applicability; cross-industry; iterative and exploratory; complementary to BI/DW
For More Information
Jim Gallo, National Director, Business Analytics
Information Control Corporation
jgallo@iccohio.com
(614) 523-3070 x192