IBM PureData Systems. Robert Božič robert.bozic@si.ibm.com. 2013 IBM Corporation



Similar documents
Ubrzajte svoj Data Warehouse 100 puta i više

Netezza and Business Analytics Synergy

Einsatzfelder von IBM PureData Systems und Ihre Vorteile.

PureSystems: Changing The Economics And Experience Of IT

IBM Netezza High Capacity Appliance

IBM Netezza Analytics

IBM Netezza High-performance business intelligence and advanced analytics for the enterprise. The analytics conundrum

IBM Netezza Analytics

IBM Data Warehousing and Analytics Portfolio Summary

Poslovni slučajevi upotrebe IBM Netezze

Evolving Solutions Disruptive Technology Series Modern Data Warehouse

Next Generation Data Warehousing Appliances

Introduction to the PureData for Analytics System (PDA) + Details on the N3001 Family

Harnessing the power of advanced analytics with IBM Netezza

2015 Ironside Group, Inc. 2

Main Memory Data Warehouses

Supercomputing and Big Data: Where are the Real Boundaries and Opportunities for Synergy?

HIGH PERFORMANCE ANALYTICS FOR TERADATA

Scaling Your Data to the Cloud

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

WHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics

A Data Warehouse Approach to Analyzing All the Data All the Time. Bill Blake Netezza Corporation April 2006

QlikView Business Discovery Platform. Algol Consulting Srl

In-memory computing with SAP HANA

Advanced In-Database Analytics

Salesforce.com and MicroStrategy. A functional overview and recommendation for analysis and application development

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.

White Paper - GPU-Based SQL Database. SQream Technologies. SQream DB GPU-Based SQL Database Technical Overview White Paper

IBM Netezza Taking advantage of the new wealth of information to make more intelligent decisions at the time of impact

Microsoft Analytics Platform System. Solution Brief

Greenplum Database. Getting Started with Big Data Analytics. Ofir Manor Pre Sales Technical Architect, EMC Greenplum

Introducing Oracle Exalytics In-Memory Machine

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

HP Vertica. Echtzeit-Analyse extremer Datenmengen und Einbindung von Hadoop. Helmut Schmitt Sales Manager DACH

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Front cover IBM PureData System for Analytics Architecture: A Platform for High Performance Data Warehousing and Analytics

IBM Smart Analytics Systems

IBM BigInsights for Apache Hadoop

Accelerate Data Loading for Big Data Analytics Attunity Click-2-Load for HP Vertica

How To Use Hp Vertica Ondemand

Advanced Analytics for Financial Institutions

The Netezza FAST Engines Framework

EMC GREENPLUM DATABASE

SOLUTION BRIEF. Advanced ODBC and JDBC Access to Salesforce Data.

High Performance Data Management Use of Standards in Commercial Product Development

White Paper. Unified Data Integration Across Big Data Platforms

Unified Data Integration Across Big Data Platforms

Bringing Big Data into the Enterprise

IBM Cognos 10: Enhancing query processing performance for IBM Netezza appliances

EMC/Greenplum Driving the Future of Data Warehousing and Analytics

IBM PureData System for Operational Analytics

From Spark to Ignition:

Infrastructure Matters: POWER8 vs. Xeon x86

An Oracle White Paper October Oracle: Big Data for the Enterprise

Real Life Performance of In-Memory Database Systems for BI

Big Data and Its Impact on the Data Warehousing Architecture

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Key Attributes for Analytics in an IBM i environment

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

SAP Real-time Data Platform. April 2013

ANALYTICS IN BIG DATA ERA

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

Welcome to The Future of Analytics In Action IBM Corporation

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

A Breakthrough Platform for Next-Generation Data Warehousing and Big Data Solutions

A financial software company

SAS and Oracle: Big Data and Cloud Partnering Innovation Targets the Third Platform

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse

A HIGH-PERFORMANCE, SCALABLE BIG DATA APPLIANCE LAURA CHU-VIAL, SENIOR PRODUCT MARKETING MANAGER JOACHIM RAHMFELD, VP FIELD ALLIANCES OF SAP

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

2009 Oracle Corporation 1

IBM Big Data Platform

Advanced Big Data Analytics with R and Hadoop

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management

IBM System x reference architecture solutions for big data

Cost-Effective Business Intelligence with Red Hat and Open Source

<Insert Picture Here> Oracle Database Directions Fred Louis Principal Sales Consultant Ohio Valley Region

Big Data Are You Ready? Thomas Kyte

Customer Insight Appliance. Enabling retailers to understand and serve their customer

Exadata Database Machine

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

The HP Neoview data warehousing platform for business intelligence

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

ANALYTICS CENTER LEARNING PROGRAM

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

Solutions for Communications with IBM Netezza Network Analytics Accelerator

Transcription:

IBM PureData Systems Robert Božič robert.bozic@si.ibm.com

IBM PureData System Meeting Big Data Challenges Fast and Easy! System for Hadoop For Exploratory Analysis & Queryable Archive Hadoop data services optimized for big data analytics and online archive with appliance simplicity System for Analytics For apps like Customer Analysis Data warehouse services optimized for high-speed, peta-scale analytics and simplicity System for Operational Analytics System for Transactions For apps like Real-time Fraud Detection Operational data warehouse services optimized to balance high performance analytics and real-time operational throughput For apps like E-commerce Database cluster services optimized for transactional throughput and scalability

Announcing a New Model! PureData System for Analytics N200X ( Striper ) February 1 Generally Available February 5 Public Announce PureData for Analytics now has 3 models N1001 high performance and scalability N2002-002 Announced October 8th 1,6x faster performance N2001 highest performance appliance to-date 3X faster performance 3

The IBM Netezza Appliance: Revolutionizing Analytics What is Netezza?

Appliances make it simple, completely transforming the user experience. Dedicated device Optimized for purpose Complete solution Fast installation Very easy operation Standard interfaces Low cost 5

The IBM Netezza Appliance: Revolutionizing Analytics Purpose-built analytics engine Integrated database, server & storage Standard interfaces Low total cost of ownership Speed: 10-100x faster than traditional systems Simplicity: Minimal administration Scalability: Peta-scale user data capacity Smart: High-performance advanced analytics

Slovenian customers Zavarovalnica Maribor Zavarovalnca Triglav Telekom Slovenije Tuš Mobil Petrol Informatika NLB MKZ Fabrika Duvana Sarajevo Iskratel

Good prospects for Netezza Large Data volumes, minimally over 1TB ideally 5TB - 20TB of user data Transformational projects in highly competitive industries where best-use of data and analytics separates competition. Company views data as a corporate asset (competitive advantage) Must be doing complex analysis Need to bring an application online fast Hair on fire type of scenario not meeting SLAs

Smart Predicts what shoppers are likely to buy in future visits Coupon redemption rates as high as 25% Because of (Netezza s) in-database technology, we believe we'll be able to do 600 predictive models per year (10X as many as before) with the same staff." Eric Williams, CIO and executive VP

Appliance Simplicity

Managing The Netezza Appliance No software installation No storage administration No database tuning

The Netezza Appliance Loading Data Integration IBM Information Server Ab Initio Business Objects/SAP Composite Software Expressor Software GoldenGate Software (Oracle) Informatica Sunopsis (Oracle) WisdomForce Data In SQL ODBC JDBC OLE-DB

The Netezza Appliance Querying Reporting & Analysis Cognos (IBM) SPSS (IBM) Unica (IBM) Actuate Business Objects/SAP Information Builders Kalido KXEN MicroStrategy Oracle OBIEE QlikTech Quest Software SAS Data Out SQL ODBC JDBC OLE-DB

Simple to Deploy and Operate Operations Simply load and go. it s an appliance Installation to Business Value in ~2 days Ease of Evaluation and Perform As Advertised BI Developers Data model agnostic No configuration or physical modeling No indexes or tuning out of the box performance Focus on business value, not physical design ETL Developers Faster load and transformation times No aggregate tables needed simpler ETL logic In-database transformation ELT Business Analysts On-Stream processing by 100 s of nodes Train of thought analysis 10 to 100x faster True ad hoc queries Lower latency load & query simultaneously 14

Appliance Architecture

IBM Netezza True Appliance Architecture SOLARIS AIX Client TRU64 HP-UX WINDOWS LINUX Database Server Storage DATA SQL ETL Server Source Systems DBA CLI 3rd Party Apps I/O CACHE CACHE I/O CACHE SQL High Performance Loader Data

IBM Netezza True Appliance Architecture SOLARIS AIX Client TRU64 HP-UX Database WINDOWS LINUX Server Storage ODBC 3.X JDBC Type 4 SQL-92 SQL-99 Analytics ETL Server Source Systems DBA CLI CACHE Database, CACHE I/O Server, I/O Storage - in one 3rd Party Apps CACHE High Performance Loader

Information Management IBM Netezza True Appliance Architecture Optimized Hardware+Software Streaming Data Purpose-built for high performance analytics; requires no tuning Hardware-based query acceleration for blistering fast results True MPP Deep Analytics All processors fully utilized for maximum speed and efficiency Complex analytics executed in-database for deeper insights 18

IBM Netezza True Appliance Massively Parallel Processing SOLARIS AIX Client TRU64 HP-UX WINDOWS LINUX ODBC 3.X JDBC Type 4 OLE-DB SQL/92 SQL Compiler Query Plan Execution Engine 1 2 3 S-Blade Processor & streaming DB logic S-Blade Processor & streaming DB logic S-Blade Processor & streaming DB logic Source Systems ETL Server DBA CLI 3rd Party Apps High-Speed Loader/Unloader Optimize Admin Front End DBOS SMP Host Network Fabric 960 High-Performance Database Engine Streaming joins, aggregations, sorts S-Blade Processor & streaming DB logic Massively Parallel Intelligent Storage High Performance Loader

IBM Netezza True Appliance Massively Parallel Processing SOLARIS AIX Client TRU64 HP-UX WINDOWS LINUX SQL SQL Compiler Query Plan Snippets 1 2 3 Execution Engine 1 2 3 S-Blade S-Blade S-Blade 2 3 Processor & streaming DB logic 2 3 Processor & streaming DB logic 2 3 Processor & streaming DB logic Source Systems ETL Server DBA CLI 3rd Party Apps High-Speed Loader/Unloader Optimize Admin SQL Front End DBOS SMP Host Network Fabric 960 High-Performance Database Engine Streaming joins, aggregations, sorts S-Blade 2 3 Processor & streaming DB logic Massively Parallel Intelligent Storage High Performance Loader

IBM Netezza True Appliance Massively Parallel Processing SOLARIS AIX Client TRU64 HP-UX WINDOWS LINUX Consolidate 1 S-Blade 2 3 Processor & streaming DB logic SQL Compiler Query Plan Execution Engine 2 3 S-Blade S-Blade 2 3 Processor & streaming DB logic 2 3 Processor & streaming DB logic Source Systems ETL Server DBA CLI 3rd Party Apps High-Speed Loader/Unloader Optimize Admin Front End DBOS SMP Host Network Fabric 960 High-Performance Database Engine Streaming joins, aggregations, sorts S-Blade 2 3 Processor & streaming DB logic Massively Parallel Intelligent Storage High Performance Loader

Our Secret Sauce select DISTRICT, PRODUCTGRP, sum(nrx) from MTHLY_RX_TERR_DATA where MONTH = '20091201' FPGA Core CPU Core and MARKET = 509123 and SPECIALTY = 'GASTRO' Uncompress Project Restrict, Visibility Complex Joins, Aggs, etc. Slice of table MTHLY_RX_TERR_DATA (compressed) select DISTRICT, where MONTH = '20091201' sum(nrx) PRODUCTGRP, and MARKET = 509123 sum(nrx) and SPECIALTY = 'GASTRO'

How We Did It 325 MB/sec (2.5 drives / core) PureData System for Analytics N1001 N200X 1000 500MB/sec 800 1000 MB/sec + 130 MB/sec 120MB/sec FPGA Core CPU Core 130 MB/sec 1300 480 MB/sec MB/sec 65 MB/sec Decompress Project Restrict Visibility SQL & Advanced Analytics From Select Where Group by

N200X

Performance & Capacity comparison N2002-002 N2002-002

Advanced Analytics

i-class: Analytics Without Constraints Big Data Big Math Analyze wider and deeper data > Additional dimensions > Richer history Increase computational intensity > More complex models > Faster execution for results

Advanced Analytics the Traditional Netezza Way Way SAS, SPSS Data Warehouse Analytics Grid Data ETL Demand Forecastin g SQL ETL R, S+ ETL SQL Fraud Detection SQL C/C++, Java, Python, Fortran,

Advanced Analytics the Netezza Way SAS, SPSS Data Analytics Grid Demand Forecastin g ETL R, S+ SQL C/C++, Java, Python, Fortran, Fraud Detection

Advanced Analytics the Netezza Way SAS, SPSS complex analytics SAS, SPSS, R, Java, etc implicit parallelism petabyte scalability appliance simplicity SQL Demand Forecastin g R, S+ Fraud Detection SQL

Pre-Built In-Database Analytics Statistics Transformations Time Series Mathematical Descriptive Statistics+ Distance Measures* Hypothesis Testing* Chi-Square & Contingency Tables* Univariate & Multivariate Distributions+ Monte Carlo Simulation* Data Profiling / Descriptive Statistics+ General Diagnostics Statistics+ Sampling Data prep Autoregressive+ Forecasting* Basic Math* Permutation and Combination* Greatest Common Divisor and Least Common Multiple* Conversion of Values* Exponential and Logarithm* Gamma and Beta Functions Matrix Algebra+ Area Under Curve* Interpolation Methods* Data Mining Predictive Geospatial Association Rules+ Clustering+ Linear Regression+ Logistic Regression+ Geospatial Data Type Geometric Functions * Fuzzy Logix DB Lytix capabilities Feature Extraction+ Discriminant Analysis* Classification Bayesian Sampling Geometric Analysis + Netezza Analytics and Fuzzy Logix DB Lytix capabilities Model Testing