Turning Data Into Answers With HP Vertica
Sekher Seshadri
March 2014

Agenda
- Big Data Challenges and Opportunities
- HP Vertica Overview
- Customer Use Cases
- Q&A
Big Data Challenges & Opportunities
Completing Analytical Vision
Data categories: Traditional Enterprise Data, Big Data, Dark Data
Sources shown: CRM, ERP, Data Warehouse, Web, Social, Log Files, Machine Data, Semi-structured
Where Are You On Big Data Continuum?
(Chart: Value / Benefit against Solution Complexity, by Data Types, from Present to Future)
- Level 0, No Big Data: Big Data is not used
- Level 1, Structured - Internal: Minimal use of structured internal data, e.g., from ERP, CRM systems (48%*)
- Level 2, Structured Internal & External: Supplement with structured external data, e.g., paid subscriptions (7%*)
- Level 3, Structured Internal & External, Unstructured - Internal: Include use of unstructured data, e.g., text, graphic (8%*)
- Level 4, Structured Internal & External, Unstructured Internal & External: Add use of external unstructured data, e.g., search, social media (8%*)
- Level 5, Fully Optimized: All data sources are fully integrated and optimized
Over 90% of executives reported that in the next three years their organization plans to incorporate unstructured data into their enterprise insights, processes, and strategy. Source: Coleman Parkes Survey, Nov. 2012
* Percentage of Gartner survey respondents who indicated high usage of that particular class of data. Source: Gartner Research Survey, March 2012
HAVEn: The Leading Big Data Platform
- Hadoop/HDFS: Catalog massive volumes of distributed data
- Autonomy IDOL: Process and index all information
- Vertica: Analyze at extreme scale in real-time
- Enterprise Security: Collect & unify machine data
- n Apps: Powering HP Software + your apps
Data sources: Transactional, Social media, Video, Audio, Email, Texts, Mobile data, Documents, IT/OT, Search engine, Images
hp.com/haven
HP Vertica Analytics Platform
How Do You Measure Return on Information?
Return on information (ROI) = Data value / Total cost
- Data value: Data volume x Depth of analytics x # of users, and Speed (or Time to Value)
- Total cost: Hardware + License + Support
HP Vertica Analytics Platform
High-performance data analytics platform purpose-built for big data
- Blazing fast analytics: Gain insights into your data in near-real time by running queries 50x-1000x faster than legacy products
- Massive scalability: Infinitely and easily scale your solution by adding an unlimited number of industry-standard servers
- Open architecture: Protect your investment in hardware and software, with built-in support for Hadoop, R
- Easy set-up and administration: Get to market quickly with your analytics initiatives at low cost of administration and maintenance
- Optimized data storage: Store 10x-30x more data per server than row databases with patented columnar compression
Speed, scalability, simplicity, and openness at lower TCO
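One reason columnar layouts tend to compress well is that values within a single column repeat, especially once the column is sorted. The sketch below uses plain run-length encoding as a generic illustration only. The deck does not say which encodings Vertica's patented compression uses, so the function names and the sample rows are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical, made-up rows of (country, channel), as a row store would hold them.
rows = [("AU", "web"), ("AU", "store"), ("AU", "web"), ("NZ", "web"), ("NZ", "web")]

# Column-wise view: the country values sit next to each other, so runs form.
country_column = [country for country, _ in rows]
encoded = rle_encode(country_column)
print(encoded)
```

Here five country values collapse into two pairs, `('AU', 3)` and `('NZ', 2)`. Real compression ratios depend on the data and the encoding chosen, so this does not reproduce the 10x-30x figure quoted on the slide.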
Flexible Deployment
Software-Only
- Low-cost Linux-based server hardware
- Scale to as many nodes as required
- Reference architecture available for HP DL380p Gen8 servers
Cloud Solution
- Implement on HP Cloud environment or on Amazon EC2
- Scale to as many virtual nodes as required
- Save hardware and data center costs while taking advantage of cloud-based usage-on-demand
Appliance
- Comes in different models/sizes
- All built and ready for plug/play
- Scale up as needed
HP cloud, Amazon cloud, or any other cloud solution
Elastic Cluster Scale-Out
Simple process to add more servers
- Add nodes to increase performance or capacity
- Vertica automatically redistributes data in the background
No database downtime
- Database continues to support query requests while rebalance is in progress
High performance redistribution
- Elastic cluster and local segmentation enable fast cluster scaling
- E.g., one customer expanded their 11 TB database cluster from 16 nodes to 32 nodes in 65 minutes!
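To put the 16-to-32-node example in perspective, a rough back-of-the-envelope estimate is sketched below. It assumes the data is spread evenly before and after the rebalance, that only the minimum necessary share of data moves, and that 1 TB means 1000 GB. None of these assumptions is stated in the deck, and the function name is hypothetical.

```python
def rebalance_estimate(total_tb, old_nodes, new_nodes, minutes):
    """Estimate data moved and transfer rates for an ideal, even rebalance."""
    # With even distribution, the new nodes end up holding this share of the data.
    moved_fraction = 1 - old_nodes / new_nodes
    moved_tb = total_tb * moved_fraction
    seconds = minutes * 60
    aggregate_gb_per_s = moved_tb * 1000 / seconds
    per_new_node_mb_per_s = aggregate_gb_per_s * 1000 / (new_nodes - old_nodes)
    return moved_tb, aggregate_gb_per_s, per_new_node_mb_per_s


moved_tb, aggregate, per_node = rebalance_estimate(11, 16, 32, 65)
print(f"moved about {moved_tb:.1f} TB")
print(f"aggregate rate about {aggregate:.2f} GB/s")
print(f"per new node about {per_node:.0f} MB/s")
```

Under these assumptions roughly half of the 11 TB moves, at about 1.4 GB/s across the cluster, while the database keeps serving queries.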
Vertica Supports Analytics
SQL Analytics
- Time series gap filling and interpolation
- Event window functions and sessionisation
- Social Graphing
- Statistical & Geospatial functions
R Analytics
- SQL to call User Defined Extensions in R
- Optimised data transfer between Vertica and R
Benefits
- High performance - Keep data close to CPU
- Low cost - Industry-standard building blocks
- Ease of use - Automated + Available
Use Cases
- CDR/VOD data analysis
- Clickstream sessionisation
- Monte Carlo simulation
- Graph algorithms
- Sensor data
- Mobile check-in and gaming services
- Asset management and insurance
- Public sector and intelligence
- Data mining algorithms
- In-database scoring
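Two of the SQL analytics named above, sessionisation and time series gap filling with interpolation, can be illustrated in a few lines. This is a minimal sketch of the general techniques in plain Python, not Vertica's SQL functions. The function names, the timeout rule, and the sample values are assumptions made for illustration.

```python
def sessionize(timestamps, timeout):
    """Label each timestamp (sorted ascending) with a session id.

    A new session starts whenever the gap to the previous event exceeds timeout.
    """
    session_ids = []
    current = 0
    previous = None
    for ts in timestamps:
        if previous is not None and ts - previous > timeout:
            current += 1
        session_ids.append(current)
        previous = ts
    return session_ids


def gap_fill_linear(samples, step):
    """Fill gaps between sorted (time, value) samples by linear interpolation.

    Assumes a non-empty list whose times are aligned to multiples of step.
    """
    filled = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        t = t0
        while t < t1:
            filled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
            t += step
    filled.append(samples[-1])
    return filled


# Hypothetical click times in seconds; a gap above 30 s starts a new session.
sessions = sessionize([0, 10, 20, 100, 110], timeout=30)

# Hypothetical sensor readings at t=0 and t=4, filled at every 2 time units.
series = gap_fill_linear([(0, 0.0), (4, 8.0)], step=2)
```

The clicks at 100 and 110 fall into a second session, and the missing reading at t=2 is interpolated halfway between its neighbours. Running this inside the database rather than in application code is what the "keep data close to CPU" benefit refers to.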
HP Vertica Flex Zone: How It Works
The Richest, Most Open SQL on Hadoop
Challenge: Extracting data from Hadoop requires complex and brittle ETL processes
Solution: Hadoop Navigation and Analytics
Benefits:
- Navigate Hadoop data using its native catalog
- Quickly and easily load native data types from Hadoop to Vertica
- Avoid recreating schemas to explore external tables
- Use the full power of Vertica SQL and Analytics
- Choose your own Hadoop distribution
Vertica Analytics Platform SDK
A framework for Open Source and 3rd Party plug-in Analytics
- Simple: concise APIs and examples accelerate deployment
- Flexible: operate on Structured and Unstructured data sets
- Efficient: In-process, fully parallel; fully leverage CPUs, Disks, Memory investments
- Supports Java, C++ and R
HP Vertica Application Integration
HP Vertica Customers
Thousands of Customers Finding Answers with HP Vertica
- Promotional testing
- Behavior analytics
- Claims analyses
- Click stream analyses
- Patient records analyses
- Network analyses
- Clinical data analyses
- Customer analytics
- Fraud monitoring
- Compliance testing
- Financial tracking
- Loyalty analysis
- Tick data back-testing
- Campaign management
Retail Sales Insights in Real Time
GUESS, Inc.
- Replaced legacy POS data warehouse that wouldn't scale
- Essential daily store reports generated in 1-2 minutes instead of 3 hours
- 90-180x faster using HP Vertica, 24x7 loads, minimal admin overheads
- Every store manager has access to the data on a mobile device with MicroStrategy
"Retail moves at lightning speed so we needed a high-performance analytics platform that could handle our fast-paced requirement for information." - Mike Relich, CIO, Guess, Inc.
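The 90-180x figure follows directly from the report times quoted on the slide. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def speedup_range(old_minutes, new_minutes_best, new_minutes_worst):
    """Return (lowest, highest) speedup factor for a range of new run times."""
    return old_minutes / new_minutes_worst, old_minutes / new_minutes_best


# 3 hours before, 1 to 2 minutes after, as stated for the daily store reports.
low, high = speedup_range(3 * 60, 1, 2)
print(f"{low:.0f}x to {high:.0f}x faster")
```

This prints `90x to 180x faster`, matching the claim.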
The Knowledge from a Single View
Comcast
- Network performance monitoring for millions of devices for QoS, analyzing billions of metrics
- HP Vertica data mart now larger than corporate EDW
- Follow-on projects to analyze CDR, IPDR, and VoD data
- Extensive use of HP Vertica's time series analytics
"Vertica opened doors to analyses that otherwise were either too time-intensive or impossible. A larger team of business managers now have faster, easier access to more information. That knowledge is invaluable in an aggressively competitive market like ours." - Brian Harvell, Network Ops
Improved Customer Loyalty for Banks
Cardlytics
- As volume of data mushroomed, company sought scalable architecture to replace their legacy platform
- 40-80x faster and 5% of the overall cost of legacy platform
- 90% reduction in operational overhead
- ROI in three months
- Increased revenue, resulting from more merchants participating in rewards programs
"HP Vertica enables us to fine-tune and personalise offers, and we are providing this service at hyper-speed." - Scott Grimes, CEO
Improving Mobile Services
KDDI Corporation
- Speeds up analysis of increasing volumes of call data to help maintain and improve customer service
- Queries went from three minutes to ten seconds for much faster resolution of service problems
- Rapid identification and resolution of service problems helps improve customer service and increases competitiveness
"Our mission is to make it possible for customers to use a variety of applications and content on networks and devices that are easy to connect and enjoyable to use. We also aim to continue providing high-quality customer services and we anticipate that HP Vertica will continue to support our mission." - Takahiro Yasunaga, Manager, Head of OSS Development Section, Core Network Development Depart., Network Technical Development Div., KDDI Corporation
Accelerating New Drug Discovery
Novartis Institute for Biomedical Research (NIBR)
Challenge at Novartis
- 1,000s of scientists perform analysis on tissue sample data for new drug research
- Queries with existing database taking four to five hours to complete
- Concurrency issues with existing system prevented many scientists from performing analyses
- Hardware and licensing costs with existing DB vendor prohibitive and restrictive
HP Vertica Solution
- Queries run in under five minutes; 48x-60x faster
- Speed and concurrency allow more NIBR scientists to run tests = more drug discovery
- Use of commodity hardware cut operational costs by 30%
- Currently 20 TB of data in 37 billion rows comprising 60,000 samples and 6 million measurements under HP Vertica management
- Plan to add at least 10 TB of additional capacity soon
Real-time Data Analytics
Empirix, Inc.
- HP Vertica helps power Empirix IntelliSight, a Big Data Platform for mobile and telecommunications business analytics
- Analyze billions of rows of subscriber data and return results in seconds, a task that was not possible in the past
- Boost subscriber service quality, reduce customer churn, identify and target customers with revenue-generating offers
- Predictive analytics improves customers' ROI
Machine Data for Warranty Impact
HP IT
Challenge at HP IT
- Current data warehouses could not provide iterative or predictive analytics in a timely, cost-effective manner
- The HP Gen8 servers' call-out feature provides telemetry data 24x7, generating massive amounts of data
- HP would like to be proactive about notifying server customers before a warranty event, increasing satisfaction and reducing support costs
HP Vertica Solution
- Speed and compression allow HP to load and query the massive amounts of telemetry data provided by Gen8 servers
- Discovered that a memory change followed quickly by a processor change leads to a warranty event 24.5% of the time
- HP was proactive with customers to fix issues before they became problems
- Determined that every 1% improvement in server quality provides $4.5 million in potential benefit for HP
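A finding such as "a memory change followed quickly by a processor change leads to a warranty event 24.5% of the time" is a conditional frequency over event sequences. The deck does not describe how HP IT computed it, so the sketch below is only one plausible way to express the idea. The event names, the time window, and the telemetry log are all made up for illustration.

```python
def pattern_to_outcome_rate(events_by_server, first, second, outcome, window):
    """Share of first-then-second patterns (within window) later followed by outcome."""
    matches = 0
    followed = 0
    for events in events_by_server.values():
        ordered = sorted(events)  # (time, kind) tuples, ordered by time
        for i, (t1, kind) in enumerate(ordered):
            if kind != first:
                continue
            later = ordered[i + 1:]
            # Time of the earliest qualifying second event, if there is one.
            second_time = next(
                (t for t, k in later if k == second and t - t1 <= window), None
            )
            if second_time is None:
                continue
            matches += 1
            if any(k == outcome and t > second_time for t, k in later):
                followed += 1
    return followed / matches if matches else 0.0


# Hypothetical telemetry: times are in days, server names are invented.
telemetry = {
    "srv-a": [(1, "memory_change"), (2, "processor_change"), (9, "warranty_event")],
    "srv-b": [(1, "memory_change"), (3, "processor_change")],
    "srv-c": [(1, "memory_change"), (50, "processor_change"), (60, "warranty_event")],
    "srv-d": [(5, "processor_change"), (6, "warranty_event")],
}
rate = pattern_to_outcome_rate(
    telemetry, "memory_change", "processor_change", "warranty_event", window=7
)
```

In this toy log only srv-a and srv-b show the pattern within seven days, and one of them goes on to a warranty event, so `rate` is 0.5. At the scale described on the slide, this kind of sequence analysis would run as queries inside the database rather than in a Python loop.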
Analyzing Billions of Clicks
www.hp.com
Challenge at HP.com
- Millions of visitors generate 11-12 billion clicks per month
- Must store 5 years' worth of data to get full value of year-over-year clickstream analysis
- Oracle database had sluggish performance: queries took 48 hours after each day's transactions
- Extremely complex website: many pages are generated dynamically, creating complex clickstream trails
HP Vertica Solution
- Queries run in hours or even minutes; 48x-100x faster
- Industry-standard SQL accelerated acceptance and proficiency
- Speed of HP Vertica allows iterative and recursive analysis for deeper dives
- HP can build functionality tailored to individual interactions based on nuanced understanding of user behavior at an individual level
Realise the value
Next steps
- Free trial of Vertica
- A paper: Derive maximum value from my enterprise data warehouse
- Information on Customer Analytics with HP Vertica Analytics platform
Enter to win!
- ElitePad 900 valued at $799.00 incl GST
- Fill in your feedback form
- Draw at 5.45pm
- Be here to win and collect your prize!
- Submit forms by 5.30pm (online at www.hp.com.au/formb or hardcopy)
Thank you