The Future of Big Data
SAS Automotive Roundtable
Los Angeles, CA, 5 March 2015
Mike Olson, Chief Strategy Officer, Cofounder
@mikeolson

A New Platform for Pervasive Analytics
Multiple big data opportunities in one optimized, high-performance, multi-tenant platform.
Process: Ingest (Sqoop, Flume); Transform (MapReduce, Hive, Pig, Spark)
Discover: Analytic Database (Impala); Search (Solr)
Model: Machine Learning (SAS)
Serve: NoSQL Database (HBase); Streaming (Spark Streaming)
Security and Administration: YARN, Cloudera Manager, Cloudera Navigator
Unlimited Storage: HDFS, HBase
Batch, Interactive, and Real-Time. Leading performance and usability in one platform.
End-to-end analytic workflows
Access more data
Work with data in new ways
Enable new users

SAS and Cloudera: The Power to Know at Scale
SAS High Performance Analytics Server
SAS Visual Analytics
SAS/Access
Know your customer:
Offer optimization
Payment risk
Customer link analytics

The Pervasive Analytics Journey
Customer success across industries
Financial Services
Telecom
Healthcare & Life Sciences
Media & Technology
Retail & CP
Public Sector

Ask Bigger Questions: How can we anticipate maintenance associated with specific vehicles?
American multinational automaker captures every touchpoint to provide a seamless customer experience.

Automaker Streamlines the Customer Experience
American multinational automaker improves customer loyalty through proactive care.
The Challenge:
Each vehicle comprises thousands or millions of components, many streaming machine data
Want to build loyalty by minimizing maintenance issues
The Solution:
Cloudera correlates manufacturing data with service and customer interaction data
Predictive analytics & machine learning enable dynamic customer profiles & personalization

Ask Bigger Questions: How can we improve our support team's productivity?
NetApp AutoSupport processes 600,000+ phone-home transactions weekly to offer proactive customer support.

NetApp Delivers Proactive Support
NetApp AutoSupport meets stringent SLAs with 64X faster processing.
The Challenge:
40% of phone-home data is transmitted within 18 hours each weekend, creating bottlenecks that affect SLAs
Data storage footprint doubles every 16 months
Queries take weeks; some don't run
The Solution:
NetApp Open Solution for Hadoop
Processes machine-generated data from 600K+ weekly transactions
Supports 7 TB/month data volume growth

Ask Bigger Questions: Which semiconductor chips will fail?
A semiconductor manufacturer uses predictive analytics to take preventative action on chips likely to fail.

Cloudera Enables Better Predictions
Semiconductor manufacturer can prevent chip failure with more accurate predictive yield models.
The Challenge:
Want to capture more granular and historical data for more accurate predictive yield modeling
Storing 9 months of data in a traditional RDBMS is expensive
The Solution:
Dell Cloudera solution for Apache Hadoop
53 nodes; plan to store up to 10 years of data (~10 PB)
Capturing & processing data from each phase of the manufacturing process

Ask Bigger Questions: Where will the next cyber attack attempt occur?
Multinational ecommerce firm prevents cyber attacks with real-time anomaly detection of log data from hundreds of sources.

Multinational ecommerce firm
Multinational ecommerce firm prevents cyber attacks with real-time anomaly detection.
The Challenge:
Ingesting logs in many formats from 40,000 machines, hundreds of sources
Need to find signals in the data for global threat management
Symantec SIEM: costly, poor performance, doesn't scale
The Solution:
Cloudera Enterprise: real-time log streaming, correlation, & analysis
Splunk: data ingest & daily operational search

Thank you!
Mike Olson
@mikeolson
mike.olson@cloudera.com