BAO & Big Data Overview Applied to Real-time Campaign
GSE
Joel Viale
Telecom Solutions Lab
Solution Architect
Agenda
- BAO & Big Data - Overview
- Customer use-cases
- Live Prototypes:
  - Streams for Real-time Campaign
  - Big Insights for Web Log Analysis
The Business Analytics Journey: Transform data into actionable insight!
[Diagram labels: Transactional & Collaborative Applications; Business Analytics Applications; Internal & External Information Sources; Streaming Information; Master Data; Big Data; Cubes; Data; Content; Data Warehouses; Streams]
[Diagram stages: Manage, Integrate, Analyze, Govern (Quality, Lifecycle, Security & Privacy)]
What is Big Data? How could it impact your organization?
"Big Data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery and/or analysis."
- Matt Eastwood, IDC
http://www.tweetdeck.com/twitter/matteastwood/~hhgsu
The Big Data Challenge
- Manage and benefit from massive and growing amounts of data
- Handle varied data formats (structured, unstructured, semi-structured) and increased data velocity
- Exploit Big Data in a timely and cost-effective fashion
[Diagram stages: Collect, Manage, Integrate, Analyze]
Bring Together a Large Volume and Variety of Data to Find New Insights
- Leverage multi-channel customer sentiment and experience analysis to upsell in real time
- Detect life-threatening conditions at hospitals in time to intervene
- Predict weather patterns to plan optimal wind turbine usage, and optimize capital expenditure on asset placement
- Make risk decisions based on real-time transactional data
- Identify criminals and threats from disparate video, audio, and data feeds
The Solution: IBM's Big Data Platform
Bring together any data source, at any velocity, to generate insight
- Analyzing a variety of data at enormous volumes
- Insights on streaming data
- Large volume structured data analysis
[Diagram: IBM Big Data Platform - Variety, Velocity, Volume]
2 ways of creating insights: Streaming and Storing Big Data
Streaming:
- Real-time analysis of data-in-motion
- Structured or unstructured data
- Analytic operations on streaming data in real time
[Diagram labels: Data, Queries, Results]
Storing:
- Managing and analyzing Internet-scale volumes
- Structured or unstructured data
- Based on Google's MapReduce technology
- Inspired by Apache Hadoop; compatible with its ecosystem and distribution
- Well-suited to batch-oriented, read-intensive applications
- Integrated Text Analytics
How to deal with Big Data: Split a big job into smaller pieces
Massively Parallel Processing (MPP)
- Distributed File System (HDFS)
- Map/Reduce
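The "split a big job into smaller pieces" idea can be sketched in a few lines of plain Python. This is only an illustration of the map / shuffle / reduce phases, not Hadoop or BigInsights code; the function names and the sample subscriber data are assumptions made up for this example, and the chunks are processed sequentially here where a cluster would process them in parallel.

```python
from collections import defaultdict

def split_into_chunks(records, n_chunks):
    """Split the big input into roughly equal pieces (one per worker)."""
    return [records[i::n_chunks] for i in range(n_chunks)]

def map_phase(chunk):
    """Map: emit a (key, 1) pair for every record in one chunk."""
    return [(subscriber, 1) for subscriber in chunk]

def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key across chunks."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}

def count_calls(records, n_chunks=3):
    chunks = split_into_chunks(records, n_chunks)
    # On a real cluster each chunk would be mapped on a different node.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))

# Hypothetical input: one subscriber id per call record.
calls = ["alice", "bob", "alice", "carol", "bob", "alice"]
call_counts = count_calls(calls)
print(call_counts)
```

Each phase only needs its own piece of the data, which is what makes the job easy to distribute.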
InfoSphere Streams enables highly scalable stream processing
InfoSphere Streams provides:
- a programming model for defining data flow graphs consisting of data sources (inputs), operators, and sinks (outputs)
- controls for fusing operators into processing elements (PEs)
- infrastructure to support the composition of scalable stream processing applications from these components
- deployment and operation of these applications across distributed x86 processing nodes, when scaled-up processing is required
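The source / operator / sink model above can be pictured with Python generators. This is a conceptual sketch only: it does not use the InfoSphere Streams programming model or API, and every name in it (source, filter_op, map_op, sink, the event fields) is a hypothetical stand-in chosen for illustration.

```python
def source(records):
    """Source: emit tuples one at a time, as a stream would."""
    for record in records:
        yield record

def filter_op(stream, predicate):
    """Operator: pass through only the tuples matching the predicate."""
    for item in stream:
        if predicate(item):
            yield item

def map_op(stream, fn):
    """Operator: transform each tuple."""
    for item in stream:
        yield fn(item)

def sink(stream):
    """Sink: consume the stream and collect the output."""
    return list(stream)

# Hypothetical events flowing through the graph.
events = [
    {"subscriber": "A", "type": "dropped"},
    {"subscriber": "B", "type": "completed"},
    {"subscriber": "C", "type": "dropped"},
]

# Data flow graph: source -> filter -> map -> sink
dropped = filter_op(source(events), lambda e: e["type"] == "dropped")
subscriber_ids = map_op(dropped, lambda e: e["subscriber"])
result = sink(subscriber_ids)
print(result)
```

Because each operator pulls one tuple at a time, nothing is stored between stages; the graph processes data in motion rather than a stored batch.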
Big Data complements traditional warehouse & analytics infrastructure
Moving Beyond the Traditional Warehouse
- Traditional Warehouse: Traditional / Relational Data Sources -> Database & Warehouse -> At-Rest Data Analytics -> Results
- Streams: Non-Traditional / Non-Relational Data Sources -> In-Motion Analytics -> Ultra Low Latency Results
- InfoSphere Big Insights (Internet Scale): Non-Traditional / Non-Relational Data Sources and Traditional / Relational Data Sources -> Data Analytics, Data Operations & Model Building -> Results
Examples of Streams use-cases
- Natural Systems: Wildfire management; Water management
- Transportation: Intelligent traffic management
- Manufacturing: Process control for microchip fabrication
- Health & Life Sciences: Neonatal ICU monitoring; Epidemic early warning system; Remote healthcare monitoring
- Stock market: Impact of weather on securities prices; Analyze market data at ultra-low latencies
- Telephony: CDR processing; Social analysis; Churn prediction; Geomapping
- Law Enforcement, Defense & Cyber Security: Real-time multimodal surveillance; Situational awareness; Cyber security detection
- Fraud prevention: Detecting multi-party fraud; Real-time fraud prevention
- e-science: Space weather prediction; Detection of transient events; Synchrotron atomic research
- Other: Smart Grid; Text Analysis; Who's Talking to Whom?; ERP for Commodities; FPGA Acceleration
Streams in Telecommunications
Data in motion: millions of events per second, microsecond latency
- Sources: CDRs, Billing, CRM, Location, Internet, Network
- Events: Dropped Calls, Outgoing International Calls, Call Duration, Extra Call, Invoice Issued, Contract Expiration, Entered new cell, New Top-Up, Voucher Recharge
- Example detected patterns:
  - 3 dropped calls in the last hour
  - 5 outgoing international calls in the last day
  - 100 SMSs sent in 1 hour
  - $500 top-up on prepaid
  - 5 minutes left on prepaid
  - No game download in last 30 minutes
  - 500 failed SMS deliveries in the last 10 minutes
- Aggregation levels: Data Aggregated at Single Customer Level; Data Aggregated at Cell or Service Level
- Consumers: Campaign Mgt (Real-time Promo, Location-based Promo); Fraud Detection; Account Mgt; Service Assurance; Network Monitoring; Real-time Monitoring (Voucher Recharge & Service Usage)
- Downstream stores: MDM; EDW (Aggregated Data Records at Service Level, Aggregated Data Records at Cell Level)
CDR cleansing and pre-processing
The Pain Point:
- A CSP has 6 billion CDRs per day!
- As much as 500,000 per second peak rate
- Kept up, but re-processing meant waiting for nights/weekends to catch up
The Solution:
- InfoSphere Streams to clean and pre-process CDRs
- De-duplication of CDRs against 15 days of data (90B CDRs)
- Offloading CDR processing to the Streams platform increased the performance of their warehouse for other analytics
- Single platform for mediation and real-time analytics reduced IT complexity
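De-duplication against a 15-day window can be sketched as below. This is not the CSP's implementation: the class name, the string CDR key, and the assumption that timestamps arrive in non-decreasing order are all hypothetical, and an in-memory dictionary is used purely for illustration (it would not hold 90B records on one machine).

```python
from collections import OrderedDict

class CdrDeduplicator:
    """Remember CDR keys for a retention window and flag repeats."""

    def __init__(self, retention_seconds=15 * 24 * 3600):
        self.retention = retention_seconds
        # key -> timestamp first seen; insertion order is oldest first
        self.seen = OrderedDict()

    def _evict(self, now):
        """Drop keys that fell out of the retention window."""
        while self.seen:
            _, first_seen = next(iter(self.seen.items()))
            if now - first_seen <= self.retention:
                break
            self.seen.popitem(last=False)

    def is_new(self, cdr_key, timestamp):
        """Return True for a first occurrence, False for a duplicate."""
        self._evict(timestamp)
        if cdr_key in self.seen:
            return False
        self.seen[cdr_key] = timestamp
        return True

# Hypothetical key: caller, callee and call start time joined into a string.
dedup = CdrDeduplicator()
feed = [("A|B|1000", 1000), ("A|C|1005", 1005), ("A|B|1000", 1010)]
clean = [key for key, ts in feed if dedup.is_new(key, ts)]
print(clean)
```

Evicting old keys on every arrival keeps the remembered set bounded by the window rather than by the total history.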
Real-time marketing campaign
The Pain Point:
- 100M CDRs per day from SMS from 25M subscribers, only used to send bills to customers
The Solution:
- InfoSphere Streams to create real-time marketing promotions and monetize the CDRs
[Diagram: Business value against Business flexibility & responsiveness; levels: Data, Information, Insight; labels: active, prescriptive]
Typical Telco Use-Cases with InfoSphere Streams
- ETL-like for massive data to off-load traditional ETL and Warehouse
- Mediations
- Prediction and prevention of customer churn
  - In-motion analytics can help identify group leaders
- Real-time context-sensitive promotions
  - Location based promotions
  - Customer minutes usage based promotions
  - Customer smartphone browsing pattern based promotions
  - Network equipment utilization based promotions
- Real-time & pro-active network monitoring
- Real-time call traffic & revenue monitoring
- SMS spam filtering
- Fraud detection
Typical Telco Use-Cases with InfoSphere Big Insights
- CDR Use Cases: User behavior analysis; Prediction on the basis of customer churn; Service association analysis
- Social Media Analytics: Get insights, improve campaign, influence opinions
- Fraud Detection: Monitoring/mining transaction logs to detect fraud activities
- Recommendation Engines: Improving user experience and likelihood of purchase
- Network Traffic Logging: Network optimization & fault detection
- Advertising Optimization: In support of advertising based models
- Server Logs: Fault detection / performance related analysis
- Email and Email Logs: Consumer email analysis & decision system
Big Insights at the core of Cognos Consumer Insight
Business Drivers: Competitive Analysis; Corporate Reputation; Customer Care; Campaign Effectiveness; Product Insight
Source Areas: Facebook; Blogs; Discussion Forums; Twitter; Newsgroups
Product Capabilities:
- Comprehensive Analysis: Keyword Search; Dimensional Navigation; Drill Through to Content
- Sentiment: Dimensional Analysis; Filtering; Voice
- Affinity Analytics: Relationship Tables; Relationship Matrix; Relationship Graph
- Evolving Topics: Relevant Topics; Associated Themes; Ranking and Volume
- Multilingual
Real-time Campaign Solution Architecture
- InfoSphere Streams: Real-time Analytics of Data-in-Motion
- Cognos Suite: Advanced BI, Reporting & Real-time Monitoring
- Unica Suite: Enterprise Marketing Management
- SPSS: Advanced Predictive Analytics
- Netezza: High-Performance Data Warehouse Appliance
Outbound campaign, enhanced with Real-time Monitoring and advanced predictive analytics
[Diagram layers: Events, Analytics, Collaboration, Rules, Content, Monitoring]
[Diagram roles: VP of Marketing, Marketing Manager, Product Manager, Customer, Network]
- VP of Marketing: Define Marketing Strategy; Approve Campaign
- Real-time Monitoring: Detect situations and patterns in service usage. Determine the need of a campaign
- Identify Target Customers and Offering; Create Campaign
- Predictive Analytics: Segment customers based on churn prediction, likelihood of campaign response, LTCV, etc.
- Execute Campaign
- Monitor & Evaluate Campaign: Monitor key KPIs (sales/usage increase, % of campaign response)
- Network: Collect and aggregate xDRs
- Real-time xDRs Processing: Collect, correlate and aggregate data records in real time
- Delivery Channel: Presence & rules-based channel selection. Send campaign message on appropriate channel (including social networks), based on customer preferences and availability/presence
Real-time campaign, based on network events
[Diagram flow: CDR Feed -> InfoSphere Streams real-time event detection (CDRs -> Dropped Calls Filtering -> Dropped Calls Aggregation Per Subscriber) -> Event-Based Campaign Processing -> Manage Campaign Delivery -> Monitor Campaign Execution -> Evaluate Campaign Effectiveness]
- Real-time Event Detection: InfoSphere Streams correlates multiple data records in real time down to the customer level. Here, it detects a number of dropped calls within a certain time.
- Event-based Campaign: The event-based campaign processing is triggered: a campaign is executed in Unica Campaign.
- Message Delivery: The campaign message is delivered to the customer on the appropriate channel.
- Campaign Monitoring: Monitor campaign execution and analyze effectiveness.
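The filtering and per-subscriber aggregation steps above can be sketched as a sliding-window counter. This is an illustrative sketch, not the prototype's Streams application: the class name and the CDR field names (subscriber, status, timestamp) are assumptions, and the default rule of 3 dropped calls within one hour is borrowed from the telecommunications example earlier in the deck.

```python
from collections import defaultdict, deque

class DroppedCallDetector:
    """Raise an event when a subscriber has too many dropped calls in a window."""

    def __init__(self, threshold=3, window_seconds=3600):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # subscriber -> timestamps of recent dropped calls, oldest first
        self.drops = defaultdict(deque)

    def process(self, cdr):
        """Return the subscriber id if this CDR triggers the event, else None."""
        if cdr["status"] != "dropped":          # dropped calls filtering
            return None
        window = self.drops[cdr["subscriber"]]  # aggregation per subscriber
        window.append(cdr["timestamp"])
        # Forget dropped calls older than the window.
        while cdr["timestamp"] - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.threshold:
            window.clear()  # reset so one burst triggers one campaign event
            return cdr["subscriber"]
        return None

# Hypothetical CDR feed, timestamps in seconds.
feed = [
    {"subscriber": "S1", "status": "dropped", "timestamp": 0},
    {"subscriber": "S1", "status": "completed", "timestamp": 100},
    {"subscriber": "S1", "status": "dropped", "timestamp": 600},
    {"subscriber": "S2", "status": "dropped", "timestamp": 700},
    {"subscriber": "S1", "status": "dropped", "timestamp": 1200},
    {"subscriber": "S1", "status": "dropped", "timestamp": 5000},
]
detector = DroppedCallDetector()
triggered = [s for s in (detector.process(cdr) for cdr in feed) if s is not None]
print(triggered)
```

In the architecture described here, each returned subscriber id would be the event handed to the campaign system; that hand-off is outside the scope of this sketch.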
The Sample Outdoors Company has many visitors a year to its website, but only a subset of those visits result in purchases. The company would like to know what is happening in the other visits.
Every time the user browses a page, the click-stream data is logged
- User goes to web site
- Search for item
- Add item to cart
- User checks out
Click Stream Data Analysis for Customer Insights
- Web logs capture the click streams of users
- Basic analysis of web logs can provide valuable insights into web site usage and user behavior:
  - How many users are going to my web site?
  - What pages and products are people looking at?
- Additional sessionization and aggregation analysis helps to detect valuable click patterns and provides insights for:
  - Improving customer service
  - Ads promotion and sales improvement
  - Detection of incidents and errors
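Sessionization and a simple aggregation over the resulting sessions can be sketched as follows. This is plain Python for illustration, not the JAQL used in the prototype; the (user, timestamp, page) record layout, the 30-minute inactivity timeout, and the "/checkout" page name are all assumptions made for this example.

```python
from collections import defaultdict

def sessionize(clicks, timeout_seconds=1800):
    """Group (user, timestamp, page) clicks into sessions split by inactivity gaps."""
    by_user = defaultdict(list)
    for user, timestamp, page in clicks:
        by_user[user].append((timestamp, page))

    sessions = []
    for user, events in by_user.items():
        events.sort()  # order each user's clicks by time
        current = [events[0]]
        for previous, following in zip(events, events[1:]):
            if following[0] - previous[0] > timeout_seconds:
                # Gap too long: close the current session, start a new one.
                sessions.append((user, [page for _, page in current]))
                current = []
            current.append(following)
        sessions.append((user, [page for _, page in current]))
    return sessions

def purchase_rate(sessions, checkout_page="/checkout"):
    """Aggregate: share of sessions that reached the checkout page."""
    if not sessions:
        return 0.0
    purchased = sum(1 for _, pages in sessions if checkout_page in pages)
    return purchased / len(sessions)

# Hypothetical click log, timestamps in seconds.
clicks = [
    ("u1", 0, "/home"), ("u1", 60, "/search"), ("u1", 120, "/cart"),
    ("u1", 180, "/checkout"),
    ("u2", 0, "/home"), ("u2", 30, "/search"),
    ("u1", 10000, "/home"),
]
sessions = sessionize(clicks)
rate = purchase_rate(sessions)
print(len(sessions), rate)
```

Sessions that never reach checkout are the "other visits" in question; inspecting their last page is one way to see where visitors drop off.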
Big Insights Web Log Analysis Solution Overview
[Diagram: Web Server Logs -> BigInsights -> Analytics]
Solution Architecture for Web Log Analysis
[Diagram labels: Raw Logs; Big Insights cluster (HDFS, ); Text Analytics (SystemT); JAQL; Aggregated reports on user shopping behavior]