Large-Scale Machine Learning at Verizon. Ashok N. Srivastava, Ph.D. Chief Data Scientist, Verizon



Similar documents
Data Science & Big Data Practice

EVERYTHING THAT MATTERS IN ADVANCED ANALYTICS

Big Data - An Automotive Outlook

How To Understand The Power Of The Internet Of Things

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data

What is Driving Rapid Growth in the Australian Mobile Advertising Market?

2015 Analyst and Advisor Summit. Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist

AUTOMOTIVE REPORT 2015 ADOBE DIGITAL INDEX

A Comparison of Media Sources for Mobile App User Acquisition

2013 SIIA Strategic and Financial Conference New York, NY

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

Big Analytics: A Next Generation Roadmap

This Symposium brought to you by

The 4 Pillars of Technosoft s Big Data Practice

Marketing Automation Solutions Market India Increasing Enterprise Digital Marketing Initiatives Drive Growth at a CAGR of 25% by 2020

the 3 keys to achieving real-time visibility of your customer s experience

Industrial Dr. Stefan Bungart

In-Vehicle Infotainment. A View of the European Marketplace

Data-Driven Decisions: Role of Operations Research in Business Analytics

Mobile Marketing for Customer Acquisition and Retention

Analyzing Big Data with AWS

The Enterprise Data Hub and The Modern Information Architecture

SAP Predictive Analytics

Safe Harbor Statement

AdTheorent s. The Intelligent Solution for Real-time Predictive Technology in Mobile Advertising. The Intelligent Impression TM

Amplify Serviceability and Productivity by integrating machine /sensor data with Data Science

Comprehensive Analytics on the Hortonworks Data Platform

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

Information-Driven Transformation in Retail with the Enterprise Data Hub Accelerator

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Strategic Decisions Supported by SAP Big Data Solutions. Angélica Bedoya / Strategic Solutions GTM Mar /2014

From Big Data to Smart Data Thomas Hahn

Turning Big Data into a Big Opportunity

Microsoft in Automotive and the Future of Connected Vehicle Consumer Experiences

The Big Data Paradigm Shift. Insight Through Automation

How To Handle Big Data With A Data Scientist

Autos Consumer Journey Yahoo Insights Team. November 2013

The Future of Data Management

How To Transform Insurance Through Digital Transformation

Mobile Real-Time Bidding and Predictive

Revenue Enhancement and Churn Prevention

Zero-in on business decisions through innovation solutions for smart big data management. How to turn volume, variety and velocity into value

Operating from the middle of the digital economy: Integrated Digital Service Providers. By Ed Bae, Sumit Banerjee and Tom Loozen

INTRODUCING RETAIL INTELLIGENCE

ByteMobile Insight. Subscriber-Centric Analytics for Mobile Operators

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

ADGEAR TRADER EMPOWERING DIGITAL MEDIA INNOVATORS WITH CROSS-CHANNEL, REAL-TIME DISPLAY SOLUTIONS

Big Data for Banking. Kaleem Chaudhry Senior Director, Sales Consulting, ASEAN. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

BIG DATA: IT MAY BE BIG BUT IS IT SMART?

Audience Management & Targeting

CDK Digital Marketing Websites

The Importance of Local Marketing to Multi-Location Automotive Businesses

Big Data. Fast Forward. Putting data to productive use

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

Developing a successful Big Data strategy. Using Big Data to improve business outcomes

Architecting your Business for Big Data Your Bridge to a Modern Information Architecture

Reach out and touch the future: Accenture connected vehicle solutions

Transforming Big Data Into Smart Advertising Insights. Lessons Learned from Performance Marketing about Tracking Digital Spend

Integrating a Big Data Platform into Government:

Unlocking the Intelligence in. Big Data. Ron Kasabian General Manager Big Data Solutions Intel Corporation

Integrate Big Data into Business Processes and Enterprise Systems. solution white paper

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Smart Cities. Opportunities for Service Providers

The Canadian Realities of Big Data and Business Analytics. Utsav Arora February 12, 2014

Brazilian Mobile Advertising Services Market Application-to-peer SMS Advertising is Expected to Generate a CAGR of 55.

5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK

Big Data Analytics: 14 November 2013

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Consumer Cloud Demand

[x+1] Completes Next-Generation POE; Its Origin Enterprise Data Management Platform for Automated, Big Data-Driven Marketing Optimization

Splunk Company Overview

Turning Big Data into More Effective Customer Experiences. Experience the Difference with Lily Enterprise

Internet of Things. Point of View. Turn your data into accessible, actionable insights for maximum business value.

Big Data / FDAAWARE. Rafi Maslaton President, cresults the maker of Smart-QC/QA/QD & FDAAWARE 30-SEP-2015

Marketing Orchestration. Better Metrics, Happier Customers. 72% Impression Lift 107% CTR Improvement

How To Use Big Data Effectively

Big Data Analytics- Innovations at the Edge

What s New in Analytics: Fall 2015

SAP Makes Big Data Real Real Time. Real Results.

3 Ways Retailers Can Capitalize On Streaming Analytics

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Enabling Manufacturing Transformation in a Connected World. John Shewchuk Technical Fellow DX

Company Category S2S Description

COMP9321 Web Application Engineering

Big Data: Start small for big possibilities

Transcription:

Large-Scale Machine Learning at Verizon Ashok N. Srivastava, Ph.D. Chief Data Scientist, Verizon

A Transition in Roles Enormous data volumes Massive computing infrastructure Significant public benefit that touches daily lives

Organizations started to recognize they are operating with blind spots Factors supporting major decisions 1 in 3 To a little extent business leaders frequently make critical decisions without the information they need 79 % 62 % 53% 52 % don t have access to the information across their organization needed to do their jobs To a great extent Personal Experience Analytics Collective Experience

Mobile in the United States Consumers are Mobile Social Already Mobile Data Traffic Exploding Smartphone penetration 61% 2013 7% CAGR 80% 2017 70% Smartphone consumers using social 40%-45% growth projected per year will 74% Game will watch 45% Videos 58% of social media time 2,000 Terabytes per day 1 Terabyte (TB) = ~ 1,000 GB 21% will listen to 41% Music of 2013 Black Friday ecommerce 4 IN 10 social users bought after sharing on 200 TB metadata/day What site? What page? Last site? Next site? From where? What time? What app? From who? Sources: emarketer, Jumptap & comscore, Vision Critical, Verizon internal data

Revenue Streams due to Analytics 5

The Orion Cluster at Verizon VZ Management Center Vertical Solutions Web Services & APIs Advertising Managed Network Services Cybersecurity (unified network) Network Health Management M2M Reporting and Advanced Analytics BI Reporting & Dashboards Domain-Specific Rules Engine Anomaly & Pattern Detection Prediction Algorithms Recommendation Engine Insight Discovery Real-time & Batch Processing Big Data Infrastructure Hadoop Ecosystem Massive Parallel Processing RDBMS NoSQL DBMS Apache Projects Data Feeds VZW Network, Clickstream, Time, Location Data PIP and other Enterprise Data FiOS Other 3 rd party data Publicly available data

Orion today

Precision Market Insights Connecting marketers with consumers, improving engagement and driving audience response Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 8

New Challenges for Marketers The rise of Big Data & Mobile are complicating marketing activities Data Onslaught of 1 st and 3 rd party information Access to key information from channels Mobile Burgeoning channel for customer interactions The amount of data from mobile is exploding Mobile can be the solution Bridge Digital & Physical Direct customer relationship Context for cross-channel interactions Increased efficiency of customer communications Siloed data stores across internal groups Lack of standards for tracking and targeting

The Online/Mobile Ad Ecosystem AD NETWORKS $0.40 BRANDS $1.00 AGENCIES $0.10 PUBLISHERS $0.35 CONSUMER Buy wholesale inventory for advertisers Streamline multiple ad networks for publishers AD EXCHANGES $0.10 ENABLER (Targeting, Data provider) $0.05

Mobile + Big Data Solutions for Marketing App Usage Clickstream Location Demographics Married 35-40 year old female in DC metro interested in tennis and luxury brands; recently browsed Tory Burch shoes at m.nordstrom.com and visits their store 3 times per month. Precision enables better 1:1 understanding of customers across physical and digital contexts to drive more relevant and personalized interactions

Case Study: Relevant Mobile Ads Drive A Major Sports League OBJECTIVE Engage a targeted audience to click on a mobile banner ad, inviting them to purchase and download the NFL Premium App. PRECISION MEETS THE CHALLENGE Precision can reach a broad audience across multiple properties and devices, accurately targeting specific mobile user segments. AUDIENCE Known interest in sports Previous basic app download Have basic app but not premium app Privacy RESULT Had 64% higher CTR compared to other mobile media properties Precision 0.38% Others 0.23%

Adaptive Architecture for Optimal Advertising Data Driven Customer Model Adjustment Mechanism Automated Decision Maker Observed Consumer Behavior 13

Automatic Profile Discovery 14

Large-scale Machine Learning with Applications to Connected Machines Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement.

Connected Ecosystem Market Size Healthcare By 2017, in the U.S., there will be Finance Energy Over 426 Million Ways to Interact with Customers Manufacturing Transportation 207 Million Smartphones Distribution Source: emarketer, August 2013 and Frost & Sullivan 2013 Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 16

Telematics / Fleet Management Today OEM & AFTERMARKET TELEMATICS COMMERCIAL TELEMATICS ENERGY SECURITY HEALTHCARE EDUCATION MANUFACTURING RETAIL FINANCIAL SERVICES Others 2012 2013 2014 CONSUMER TELEMATICS: O E M E M B E D D E D HUGHES ACQUISITION CONSUMER TELEMATICS: A F T E R M A R K E T I N F O T A I N M E N T S M A R T T R A N S P O R T A T I O N : T R A F F I C PARKING S M A R T T R A N S P O R T A T I O N : A I R M A R I T I M E COMMERCIAL T E LEMATICS: F LEET COMMERCIAL T E L E M A T I CS: W O R K F O R CE W O R K - O R D E R A S S E T M G T. S M A R T T R A N S P O R T A T I O N : CA R S H A R I N G R A I L PUBLIC T R A N S P O R T A S S E T M G T. M U N I T R A N S P O R T S M A R T B U I LDINGS R E T A I L Making Safety and Efficiency Real Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 17

Connected Machine Landscape (I) Car 250M cars on the road in the US, with about 20M new cars sold each year. Location analytics Comparative analytics across vehicle makes/models Prognostic and diagnostic vehicle health management (self-healing autonomous systems) Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 18

Connected Machine Landscape (II) Over 100M mobile subscribers currently on Verizon s network Car + Consumer Driver behavior (safety, remote management) Introduces uncertainty Need for personalization Autonomous control Consumer behavior (infotainment, advertising, etc.) Information retrieval Relevance Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 19

Connected Machine Landscape (III) Car + Consumer + Environment Environmental inputs (Traffic, weather, emergency services, etc.) Data correlation Distributive decision making and optimization 100s of TB of environmental data generated daily Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 20

Advanced Analytics for Connected Machines Verizon has core competency in advanced analytics areas including: Anomaly Detection Diagnostics Predictive Analytics Time Series Analysis and Forecasting Correlations and Association Analysis Optimization Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 21

Connected Machine Health Management Manual intervention or autonomous and selfhealing ( intelligent ) systems Domain rules and expert opinion Data and domain rules 4. Mitigation What actions to take? Predictive Analytics 1. Anomaly Detection Is something different? 3. Prognosis What will happen next and when? Real- and near-realtime monitoring and anomaly detection 2. Diagnosis What is happening? Domain rules and expert opinion Comprehensive and real-time view of big data relevant to diagnosis Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 22

Primary Source: Switches, routers and other machines Multiple Kernel Anomaly Detection Discrete Primary Source: Maintenance Reports Continuous Multiple Kernel Anomaly Detection (MKAD) Textual Primary Source: Network connectivity Networks Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 23

Optimization problem One class SVMs training algorithms require solving the quadratic problem Dual form Q 1 min 2 i j Ki, j i, j Subject to: i i 1 0,1, Linear equality constraint Control parameter 0 1 i, i l Bounds on design variables : Lagrange multipliers of the primal QP problem

Anomaly scores Decision boundary is determined only by margin and non-margin support vectors obtained by solving the QP problem,, f z, K h i i, z i k Data points with will be the support vectors Indicator 0 Sign of h: if negative outlier if positive - normal Value of h: degree of anomalousness

Increasing Degree of Anomalousness The Impact of Big Data Innovations Numeric Inputs Only Numeric and Text Inputs The addition of Numeric and Text Data in the new MKAD algorithm significantly improves (by as much as 7000 points) the ranking of the monitored anomalies. Execution time increased by approximately 5%. 26

Anomaly Detection using NASA MKAD Algorithm Unusual Auto Landing Configuration Reported Exceedance Level 3: Speed low at touchdown Level 2: Flaps at questionable setting at landing Aviation Safety Program Annual Review 2012 SSAT Project 27

Big Data and Telematics Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 28

Telematics F L E E T Monitor fleet location, condition and driver behavior W O R K F L O W Manage work and share updates both internally and externally W O R K F O R C E Align resource capacity with work and direct the right resource to the right location at the right time CLOUD ANALYTICS NETWORK I N V E N T O R Y Monitor the location and condition of cargo What Can We Build? Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 29

Connected Car Solutions After market data collectors (for all modern makes and models) Built-in data collectors for Mercedes and other brands Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 30

1 Billion Miles Dallas, TX United States Detroit, MI New York, NY Chicago, IL Houston, TX 58% City Driving, 42% Highway Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 31

Real-world MPG calculations are sampled from thousands of vehicles, not just a handful. Japanese OEM Vehicles U.S. OEM Vehicles Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 32

Driver behavior is analyzed at multiple levels. Driver behavior can be compared and contrasted to drivers from other OEM vehicles. Volkswagen Vehicles: 33.69 mi/day Mitsubishi Vehicles: 33.32 mi/day Honda Vehicles: 32.65 mi/day Nissan Vehicles: 33.22 mi/day Ford Vehicles: 32.08 mi/day Toyota Vehicles: 31.92 mi/day Cadillac (GM) Vehicles: 29.24 mi/day Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 33

Driver behavior is analyzed at multiple levels. Older model years are driven at significantly lower speeds: More homogeneous driving across model years All Vehicles Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written agreement. 34

Prediction Interval Estimation using Bootstrap Regression Ensemble Learning Approach We have proven that the empirical quantiles of the bootstrap prediction models can be used to consistently estimate the prediction intervals. S. Kumar and A. N. Srivastava, under submission to NIPS 2014.

Anomaly Detection Application S. Kumar and A. N. Srivastava, under review at NIPS 2014.

Acknowledgements + Notes The Entire Verizon Big Data and Analytics Team My former team at NASA Collaborators at Stanford We are hiring for junior and senior roles: Data Scientists Machine Learning experts Platform Engineers Analytics and visualization Application Developers Product Managers