The Vertica Database: simply fast!




The Vertica Database: simply fast!
Mastering Big Data with HP Software
Lior Tzabari, Regional Sales Manager
Moshe Goldberg, Vertica System Engineer
Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Welcome to the World of Big Data
There is strategic value in big data: with real-time analytics, organizations can maximize business value and efficiency.
Compliance: Sarbanes-Oxley, HIPAA, Basel II
Geophysical exploration
Healthcare: electronic patient records, gene sequencing, medical imaging
Enterprise: ERP, CRM (products, customers, suppliers, partners)
Technology: sensors, LOBs, XML
Mobility
Social media
Financial services: high-frequency trading, algorithmic trading
Communications: call detail records

HP Customers' Big Data Concerns
HP survey responses from senior business and technology executives:
50% do not have an effective information strategy in place.
98% cannot deliver the right information at the right time to support enterprise outcomes all of the time.
34% say that half of their information is unconnected, undiscovered and unused.
35% are not effective at accessing enterprise information as and when needed for compliance or operational needs.
Source: Coleman Parkes

How Big is Big Data?
Storage capacity is growing 23% per annum; computing capacity is growing 54% per annum.
60% of the world's population used mobile phones in 2010.
30 billion pieces of content are shared every month on Facebook.
30 million network sensor nodes in 2010, with an annual growth rate above 30%.
40% projected annual growth in global data generated, versus 5% growth in global IT spending.
Source: Big Data: The Next Frontier for Innovation, Competition and Productivity, McKinsey Global Institute
Figure 1: The Digital Universe 2009-2020. 2009: 0.8 ZB; 2020: 35 ZB, growing by a factor of 44. (1 zettabyte = 1 trillion gigabytes.)
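
The growth figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes the 0.8 ZB and 35 ZB values refer to 2009 and 2020 respectively, and derives the overall growth factor and the compound annual rate that this would imply.

```python
# Figures as quoted on the slide (assumed: 0.8 ZB in 2009, 35 ZB in 2020).
start_zb, end_zb = 0.8, 35.0
years = 2020 - 2009  # 11 yearly compounding steps

growth_factor = end_zb / start_zb               # total multiple over the period
annual_rate = growth_factor ** (1 / years) - 1  # implied compound annual growth

print(f"growth factor: {growth_factor:.2f}")
print(f"implied annual growth: {annual_rate:.1%}")
```

The factor comes out at 43.75, which the slide rounds to 44, and the implied annual rate is roughly 41%, in line with the 40% yearly growth quoted above.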

What is Big Data?
Extreme information: volume, velocity, variety and complexity.
Big data: datasets whose velocity and/or volume is beyond the capability of typical database tools to collect, store, manage and analyze.
Diagram labels: Social Media, Video, Audio, Email, Texts, Mobile, Transactional Data, Targeted Engagement, Pattern-Based Analytics, IT/OT, Docs, Search Engine, Contextual Relevance, Images.

Why Should You Care About Big Data? It Can Be Monetized!
Business value examples: $300B in annual value for U.S. healthcare; retailers can increase operating margin by 60% using big data; governments could save more than $149B annually (in Europe alone) through improved operational efficiency; new companies formed on big data.
IT value: big data and analytics projects offer higher ROI than any other IT projects; an opportunity for IT, analysts, and business users to come together (Moneyball!); leverage previous skills and investments in IT projects that collect and store information.

The Big Data Paradox: data volumes are growing faster than people, skills, disk, plant and power.
Outdated technology: traditional DBMSs were never designed for today's volume, velocity and complexity; ad hoc questions come from all users, even from customers directly; detailed data is where the interesting things happen.
Shortage of people: the U.S. alone faces a shortage of 150,000+ people with deep analytic skills; the U.S. is missing 1.5M managers and analysts to analyze data and make decisions.

Vertica Analytics Platform
Diagram labels: Real Time, Big Data, Cloud, Mobile, Monetize, Better Decisions, Analysis, Real Time Statistics, Services, Individual, Sensor.
Software-based real-time analytics platform
SQL and NoSQL analytics capabilities
Industry-leading load and query performance
Simple installation and use, with automatic setup and tuning
Highly scalable and elastic, with full MPP parallelism
Monetize 100% of your data

750+ customers
Financial Services, Retail, Communications, Consumer Marketing, Healthcare, Online Web and Gaming

A Platform Designed for Big Data
Next-generation administration and design tools
Columnar compression
Concurrent load and query
Elastic cluster
SQL analytics
User-defined analytics
Optimized connectors
Standard interface
True column store RDBMS, native and performance optimized
High availability
Real time
Massively parallel processing

Graphing with Vertica
It's not just social! Visualize the power of relationships.
Scale, performance, and elasticity are core attributes.
Relationships can be people, products, markets, compounds, etc.

Big Data Analytics: Not Only SQL and Structured
Structured, unstructured and semi-structured data
Monetize 100% of your data: all data sources, internal and external; more data points = greater insight.
Common platform, uncommon results: real-time analytics with both SQL and NoSQL; dynamically add or change sources; scale, elasticity, and simplicity, all with predictable performance.

Understand the Past, Predict the Future
How HP/Vertica predicted the Oscars from Twitter sentiment:
Loaded raw tweets from Twitter into Vertica prior to the Oscars.
Performed text parsing and sentiment analysis in Vertica.
Scored each film category based on positive and negative mentions.
Accurately predicted winners in nearly every category!
How much is knowing the future worth?

Vertica Analytics Platform - Monetizing Big Data
Diagram labels: Monetize, Real Time Statistics, Analysis, Better Decisions.
Make smarter decisions in real time.

Telecommunications
7 of the top 10 global telecommunications firms run their business on Vertica.
Revenue and service assurance, and fraud detection
Sensor and device management and performance monitoring
Subscriber insights, targeted marketing and advertising
"Vertica opened doors to analyses that otherwise were too time-intensive or impossible. A larger team of business managers now have faster, easier access to more information. That knowledge is invaluable in an aggressively competitive market like ours." - Brian Harvell, Executive Director, Comcast Network Operations

Internet Gaming / Web 2.0
Predictive and targeted engagement for every individual
Pattern recognition, sentiment, and social media
Capture, analyze, and store petabytes of data, with no pruning
Real-time analysis for actionable insights, now!
"...being able to run social graph analysis on tables with tens of billions of rows with a fast turn around is amazing..." - Dan McCaffrey, Director of Analytics, Zynga

Financial Services
Revolutionize catastrophe and risk management
Real-time measurement and management to maximize asset performance
Integrated offerings for financial services: institutional, retail, liquidity, risk, etc.
Comprehensive structured and unstructured data capabilities
"...with 100s of clients and 1000s of analyses, understanding our portfolio used to take 3 months; with Vertica it doesn't even take an hour. We've not only saved millions, but made even more..." - RMS Client

Healthcare
Re-think health care in its entirety: payer, provider, and PMP
$300BN annual value creation opportunity, two thirds of it in the form of reductions to national health care expenditure
Emergence of new business models powered by big data (e.g. Blue Health Intelligence)
Four distinct health care data silos: pharmaceutical R&D; clinical; activity (claims) and cost; patient behavior and sentiment
Patient safety, protocol effectiveness, fraud detection and cost reduction are all big data opportunities
"...we went from waiting days to waiting seconds; the impact on every aspect of our business has been transformational..." - Doug Porter, CIO, Blue Cross Blue Shield Association

Built from the Ground Up: The Four C's of Vertica
Columnar storage and execution: achieve the best data query performance with the unique Vertica column store.
Clustering: linear scaling by adding more resources on the fly.
Capacity optimization: store more data, provide more views, use less hardware.
Continuous performance: query and load 24x7 with zero administration.
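
To make the link between columnar storage and capacity concrete, here is a minimal sketch of why storing values column by column tends to compress well. It is an illustration only, not Vertica's storage engine: the sample rows and the helper names `to_columns` and `run_length_encode` are assumptions made for this example.

```python
from itertools import groupby

# Hypothetical call records (region, status, duration), sorted by region.
rows = [
    ("EU", "ok", 12),
    ("EU", "ok", 7),
    ("EU", "fail", 3),
    ("US", "ok", 9),
    ("US", "ok", 4),
    ("US", "ok", 11),
]

def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]

def run_length_encode(column):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

region, status, duration = to_columns(rows)
print(run_length_encode(region))  # [('EU', 3), ('US', 3)]
print(run_length_encode(status))  # [('ok', 2), ('fail', 1), ('ok', 3)]
```

Low-cardinality columns such as `region` shrink to a handful of pairs once the values sit next to each other, which is much harder to achieve when each row is stored as a unit.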

Ecosystem Integration: Hadoop / MapReduce + Vertica
Approach: support and leverage the Hadoop ecosystem rather than reinventing the MapReduce wheel.
Technology: Hadoop connector; Squeal optimizing compiler for Pig programs.
Use cases: Hadoop for exploratory analysis and for existing MapReduce and Pig scripts; Vertica for stylized, interactive analysis.
Where features are shared, Vertica is often faster than Hadoop with a fraction of the hardware.

Automated / Unified Platform Management
Visualize: analytic resources; health and status.
Provision: dynamically deploy; distribute resources.
Manage: unlimited cluster sizes; geographically distributed enterprises.
Environments shown: Hadoop, cloud, virtualized, on premise.

SQL Analytics+: Built for Big Data
Features: time series gap filling and interpolation; event window functions and sessionization; social graphing; pattern matching; event series join; statistical functions; geospatial functions.
Benefits: high performance (keep data close to the CPU); low cost (industry-standard building blocks); ease of use (automated and available).
Use cases: tickstore data cleanup; CDR/VOD data analysis; clickstream sessionization; data aggregation and compression; Monte Carlo simulation; graph algorithms.
Also shown: Sensor Data, Process Control, Time Series, SmartGrid.
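
As a rough illustration of what time series gap filling with interpolation means, the sketch below fills missing timestamps between readings using linear interpolation. It is plain Python rather than Vertica SQL; the function name `gap_fill` and the sample ticks are hypothetical, and it assumes integer timestamps that already fall on the output grid.

```python
def gap_fill(samples, step):
    """Return one (t, value) pair per `step` between the first and last sample.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    Timestamps with no reading get a linearly interpolated value.
    """
    if not samples:
        return []
    filled = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        t = t0
        while t < t1:
            frac = (t - t0) / (t1 - t0)  # position between the two readings
            filled.append((t, v0 + frac * (v1 - v0)))
            t += step
    filled.append(samples[-1])
    return filled

# Hypothetical tick data with readings missing at t = 1, 2 and 3.
ticks = [(0, 10.0), (4, 18.0), (5, 20.0)]
print(gap_fill(ticks, step=1))
# [(0, 10.0), (1, 12.0), (2, 14.0), (3, 16.0), (4, 18.0), (5, 20.0)]
```

A constant fill (carrying the last known value forward) is a common alternative when interpolating between readings would be misleading, for example with discrete status values.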

Geospatial Analytics
Store and query using SQL:
Locations as points of interest
Networks (e.g. roads, utilities) as line segments
Regions (e.g. sales territories, high-risk zones)
Use cases: mobile check-in and gaming services (e.g. Foursquare, SCVNGR); asset management and insurance; public sector and intelligence.
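
A proximity query over points of interest, such as a mobile check-in lookup, comes down to a distance test. The following sketch uses the standard haversine formula in plain Python; it does not use Vertica's geospatial functions, and the store names and coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_radius(points, center, radius_km):
    """Names of the points lying within `radius_km` of `center`."""
    return [name for name, (lat, lon) in points.items()
            if haversine_km(lat, lon, *center) <= radius_km]

# Hypothetical points of interest as name -> (latitude, longitude).
stores = {"A": (0.0, 0.0), "B": (0.0, 0.5), "C": (1.0, 1.0)}
print(within_radius(stores, center=(0.0, 0.0), radius_km=100))  # ['A', 'B']
```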

Statistical Modeling Extensions
Use cases: loan default prediction; customer labeling based on purchasing behavior.
Technology: classification with logistic regression and decision trees.
The native Vertica implementation is MPP and high performance.
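
For readers unfamiliar with logistic regression, here is a minimal single-feature version trained by batch gradient descent on the loan default use case. It is a teaching sketch and not the Vertica implementation: the debt ratios, labels, learning rate and epoch count are all invented for this example.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of the log-loss w.r.t. the score
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Hypothetical training data: debt-to-income ratio and whether the loan defaulted.
debt_ratio = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(debt_ratio, defaulted)
for ratio in (0.1, 0.9):
    print(f"debt ratio {ratio}: default probability {sigmoid(w * ratio + b):.2f}")
```

On this toy data the fitted model assigns a low default probability to a ratio of 0.1 and a high one to 0.9. An MPP implementation would compute the per-row gradient terms in parallel on each node and combine the partial sums.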

Vertica Analytics Platform SDK
A framework for open source and third-party plug-in analytics
Simple: concise APIs and examples accelerate deployment.
Flexible: operate on structured and unstructured data sets.
Efficient: in-process and fully parallel; fully leverage investments in CPUs, disks and memory.
More than 2,000 developers globally

SDK EXAMPLES

OLAP Rollup and Cube Calculations
Present data in business-friendly OLAP form: transform data in-database for maximum efficiency and scale; present it in a form readily consumable by business users and their favorite business intelligence tools.
Fast and efficient: eliminate latency and the storage of multiple copies; MPP tackles data sets at a scale that is impractical or impossible on a workstation; visualize insights within a timeframe that empowers decisions.
Servers belong in a data room: use a mobile device and retire those noisy workstations.
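
The idea behind a rollup can be shown in a few lines: aggregate a measure at every prefix of an ordered list of dimensions, from the most detailed level up to the grand total. The sketch below is plain Python with made-up sales rows; the function name `rollup` is an assumption and is unrelated to the SDK package described on the slide.

```python
from collections import defaultdict

def rollup(rows, dims, measure):
    """Sum `measure` for each prefix of `dims`, down to the grand total.

    Dimensions that have been rolled up are reported as None, mirroring the
    NULLs that SQL's GROUP BY ROLLUP produces.
    """
    totals = defaultdict(float)
    for row in rows:
        for depth in range(len(dims), -1, -1):
            key = tuple(row[d] for d in dims[:depth]) + (None,) * (len(dims) - depth)
            totals[key] += row[measure]
    return dict(totals)

# Hypothetical sales facts.
sales = [
    {"region": "EU", "product": "A", "revenue": 100.0},
    {"region": "EU", "product": "B", "revenue": 50.0},
    {"region": "US", "product": "A", "revenue": 70.0},
]

for key, total in rollup(sales, ["region", "product"], "revenue").items():
    print(key, total)
```

A cube differs only in that it aggregates over every subset of the dimensions rather than just the prefixes, so it would also report per-product totals across all regions.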

AES Encryption
Secure sensitive data, even from DBAs.
Secure: applies standard AES libraries; protects without impacting manageability; encrypts entire columns or individual cells.
Fast and efficient: executes in parallel, in process, on multiple nodes; little to no net increase in storage requirements.

In-Database Location Geocoding
Understand the position of any address or place name: flatten arbitrary address formats to simple latitude and longitude; segment by boundary or proximity with Vertica's built-in geospatial library.
Simple lookups or complex analytics, in-database: identify valuable regional or social activity trends; segment, tag, or group by location, e.g. by postal code or near a place name.

Web Server Log and Click Stream Analysis
Scalable library functions for IIS and W3C log formats
Extracts all fields from each web server log format
Executes in parallel on multiple nodes and cores
Bolsters Vertica's optimized in-database sessionization, pattern matching, and event series join capabilities
Implemented as extensions to familiar SQL analytic syntax
High-performance in-database page rank and user activity segmentation
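
To show what field extraction and sessionization involve, the sketch below parses a small log in the W3C extended format, where a `#Fields:` directive names the columns, and then numbers each client's sessions using an inactivity timeout. It is a plain-Python illustration, not the Vertica package; the log lines, the 30-minute timeout and the function names are assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical log in W3C extended format; '#' lines are directives.
LOG = """\
#Version: 1.0
#Fields: date time c-ip cs-uri-stem sc-status
2013-05-01 10:00:00 10.0.0.1 /home 200
2013-05-01 10:05:00 10.0.0.1 /cart 200
2013-05-01 11:00:00 10.0.0.1 /home 200
2013-05-01 10:01:00 10.0.0.2 /home 404
"""

def parse_w3c(text):
    """Turn each entry into a dict keyed by the names in the #Fields directive."""
    fields, records = [], []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]
        elif line and not line.startswith("#"):
            records.append(dict(zip(fields, line.split())))
    return records

def timestamp(record):
    return datetime.strptime(record["date"] + " " + record["time"], "%Y-%m-%d %H:%M:%S")

def sessionize(records, timeout=timedelta(minutes=30)):
    """Number each client's sessions; a gap longer than `timeout` starts a new one."""
    last_seen, session_no, out = {}, {}, []
    for rec in sorted(records, key=timestamp):
        ip, ts = rec["c-ip"], timestamp(rec)
        if ip not in last_seen or ts - last_seen[ip] > timeout:
            session_no[ip] = session_no.get(ip, 0) + 1
        last_seen[ip] = ts
        out.append((ip, rec["cs-uri-stem"], session_no[ip]))
    return out

for row in sessionize(parse_w3c(LOG)):
    print(row)
```

The third request from 10.0.0.1 arrives 55 minutes after the previous one, so it opens that client's second session.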

Sentiment Analysis Package
Mine customer interactions and online comments: scoring (negative / neutral / positive) on any text string; score customer service case notes and transcripts; score tweets and blogs mentioning your brand or products (or your competitor's).
Manage a complete business communication strategy: stay informed of customer sentiment from all internal and external sources.
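
The slide does not say how the package scores text. One simple way to obtain a negative / neutral / positive label is a word-list (lexicon) approach, sketched below as a stand-in; the word lists and the function name `score_text` are invented for illustration.

```python
import re

# Tiny hypothetical lexicons; a real scorer would use far larger lists.
POSITIVE = {"great", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "hate", "terrible", "rude"}

def score_text(text):
    """Label text by the balance of positive and negative words it contains."""
    words = re.findall(r"[a-z']+", text.lower())
    balance = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"

print(score_text("Love the new release, support was helpful"))  # positive
print(score_text("App is slow and the update is broken"))       # negative
print(score_text("Order arrived on Tuesday"))                   # neutral
```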

XML Parsing and Transformation
XML within Vertica: store and transform XML documents in-database; generate XML documents from queries; query external web services directly from Vertica; MPP scale, so more documents are parsed at lower latency.
Avoid complexity: in-database processes are more maintainable; inherently high availability, with no investment in redundant external transformation software or gateway servers.
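
Transforming XML for analysis usually means flattening a document into rows. The sketch below does this with Python's standard `xml.etree.ElementTree` module; it illustrates the idea only, and the `<orders>` document and the `orders_to_rows` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are made up.
DOC = """<orders>
  <order id="1"><customer>Acme</customer><total>120.50</total></order>
  <order id="2"><customer>Globex</customer><total>75.00</total></order>
</orders>"""

def orders_to_rows(xml_text):
    """Flatten each <order> element into an (id, customer, total) tuple."""
    root = ET.fromstring(xml_text)
    return [
        (int(order.get("id")), order.findtext("customer"), float(order.findtext("total")))
        for order in root.iter("order")
    ]

print(orders_to_rows(DOC))  # [(1, 'Acme', 120.5), (2, 'Globex', 75.0)]
```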

Google Analytics and Twitter Access Libraries
Acquire data on demand from within Vertica: no external infrastructure to maintain; low-latency access to critical information.
Twitter access API: highly maintainable, with keywords stored in the database for visibility and easy maintenance.
Google Analytics: connection, query, and record extraction; detailed data on demand for real-time analysis and value.

Document Relevance Comparison
Cluster and tag documents for search and comparison: quickly isolate the collection of documents surrounding a topic of interest.
Compute relevance vectors with scalable performance: scores the relevance of a word or sentence against another; runs in parallel on multiple nodes and multiple cores.
Includes a tag cloud example: generates HTML with the most relevant words surrounding a topic, sized by score.
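
A common way to score the relevance of one text against another is cosine similarity between term-frequency vectors. The slide does not state which measure the package uses, so treat the following as a generic sketch; the sample documents and helper names are assumptions.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Count lowercase word occurrences in `text`."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors, from 0.0 to 1.0."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical query and documents.
query = term_vector("column store compression")
docs = {
    "db-notes": "Column store databases use compression heavily",
    "agenda": "Quarterly sales meeting agenda",
}
ranked = sorted(docs, key=lambda name: cosine(query, term_vector(docs[name])), reverse=True)
print(ranked)  # ['db-notes', 'agenda']
```

The same per-term scores could drive a tag cloud, with each word sized by its weight.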

Natural Language Processing Functions
Common generalized functions for machine processing of natural language
Optimized for performance and scale
Used in many common search algorithms
Suitable for low-latency, high-volume text streams in a variety of languages
Used across multiple industries: online gaming, telco, security, insurance (to name a few)

Send SMS Messages from Vertica
Invoke SMS messages from ordinary SQL: run direct marketing as the result of a SQL query; notify end users of important information in real time.
Automate administrative alerts: notify users of batch completion; notify administrators of maintenance conditions.

Shell Command Framework
Secure: accessible only where privileges are specifically granted; leverages Vertica's role-based security model.
Powerful and flexible: invoke shell commands as SQL functions; results are captured and transformed for use in queries; easily automate administrative tasks; easily execute on all nodes or a subset.

Thank You!