IEEE CQR San Diego May 16, 2012 Timely Decisions with Big Data: Opportunities & Challenges Anukool Lakhina Guavus, Inc. GUAVUS INC. ALL RIGHTS RESERVED
Double Clicking on Big Data Sources of Big Data: Data became big because networks of sensors started generakng data Defining Big Data: Big is not just about volume, it is also velocity at which data is produced, the variety & # of sources which generate data, and the cardinality of the dimensions
Opportunities with Big Data Information Systems Devices & Networks Enterprise Apps Networks Databases Data at Rest Views Data in Motion Flows Data Warehouses Finance & Regulatory Network Engineering Marketing Customer Care Dynamic & Tiered Pricing Traffic Engineering Capacity Planning Peering Optimization Subscriber Segmentation Device Returns Mobile malware Data Overage Dispute Infected Devices Contract/SLA Enforcement Targeted Advertising Network Security DDOS & Anomalies
A Typical Scenario OpportuniKes Scale Challenges Network: How much demand is the new ipad punng on my network now? Marke<ng: How can I profile my subscribers, and target them with just- in- Kme tailored monekzakon offers, e.g. adverksements? Care: How do I respond to customers when they call to complain about data charges? Security: How can I detect & isolate devices that are using malware apps / URLs? 100M+ Customers 250+ Billion records generated per day 200+ Terabytes per day High Dozens of DistribuKon Centers PB of stored data for compliance & forensics High Dozens of disknct feeds from network that need to be fused for e2e view Need for high availability, rapid recovery & fail- over
Challenge: Need for an End-To-End View Decisioning Apps Analytics Platform Reports BI Dashboards Decision Support & Planning systems External analysis workflows Data Feeds Analytics & Reporting Data Ingestion Fabric Probes / Switch feeds Device-side Clients Signaling & Control Network probes RF / Network Data Usage PCMD Signaling Connections Mobility (MIP v4) CDR / SMS BSS IP EDR OSS 3G - CDMA 4G LTE / HSPA MME PCRF Serving Gateway RNC / enodeb PCF Billing Policy Control Service Control PDSN / FA PDN Gateway HA M/S MSC Operator s IP Services
Challenge: Big Data Economics Solutions Purpose-Built for Big Network Data Traditional Solutions, when adopted for Big Network Data Compute Storage Storage Compute Transport Transport Centralized, Store- First Architecture Consolidate data in a repository to analyze To answer a queskon, first need to transport and store the data Transport and storage costs alone may put it over budget Project may not even get started Distributed, Compute- First Architecture Move processing to data edge Spend on analykcs first, then on others, by not transporkng and storing data you do not need ConKnuous processing yields Kmely insights & ackons Reduce overall spend per new analykcs queskon
An Example Distributed Deployment DPI LTE SMS LBS PDSN Edge Analy<cs IngesKon Edge Analy<cs Edge Analy<cs 1:1000 savings in transport costs Region 13 DPI LTE SMS LBS PDSN Data Silos DPI LTE SMS LBS PDSN Region 1 DPI LTE SMS LBS PDSN Region 8 Region 3 Edge Analy<cs Digest Fusion Data Center AggregaKon Aggregate Analy<cs Enriched Stream Dynamic Fusion ConKnuous Machine Learning Regional compressed store for archival and forensics Intelligent Aggrega<on 1. Time series modeling 2. AutomaKc, mulk- feature, profiling & segmentakon 3. Outlier/anomalies 4. PredicKon and forecast Access Control Live GUI Export Reports Compressed store for ad- hoc search (Processed) 7
Decisioning Application: Network, Device, Apps & Subscribers Insights Business Problem How do I plan for a new smartphone or Tablet or M2M device being introduced? Which applicakons are using most bandwidth, tonnage or flows? What parts of my network are most congested? Insights Provided Insights on Network, Device, Subscribers and applicakons Most popular devices or applicakon in a locakon Traffic profile in busy hour Top N subscriber usage pakerns What is the next iphone going do to my network? What applica<ons are causing most load? Which loca<ons in network are most congested? Benefits Beker network planning Target offending users, devices and applicakons PromoKons and campaigns based on usage data 8
9
Decisioning Application: Visibility for Customer Care Business Problem Customer care is blind to data usage Amount of informakon from network is overwhelming Customer need beker explanakon on how their usage limits were used No co- relakon across mulkple silos Insights Provided Customer interackon with SP call center Subscribers usage pakerns Customer calls have changed from focused on Voice to Data usage issues No visibility end- 2- end on user data sessions Benefits Self service portal for users to help them understand the usage in a limited data plans. Pro- ackve idenkficakon of (data) network performance issues and changes Customer retenkon 10
11 Detailed Usage Troubleshooting Customer Care
Opportunities with Big Data Information Systems Devices & Networks Enterprise Apps Networks Databases Data at Rest Views Data in Motion Flows Data Warehouses Finance & Regulatory Network Engineering Marketing Customer Care Dynamic & Tiered Pricing Traffic Engineering Capacity Planning Peering Optimization Subscriber Segmentation Device Returns Mobile malware Data Overage Dispute Infected Devices Contract/SLA Enforcement Targeted Advertising Network Security DDOS & Anomalies