PureSystems: Changing The Economics And Experience Of IT



Similar documents
Einsatzfelder von IBM PureData Systems und Ihre Vorteile.

Netezza and Business Analytics Synergy

IBM PureData Systems. Robert Božič 2013 IBM Corporation

Main Memory Data Warehouses

Evolving Solutions Disruptive Technology Series Modern Data Warehouse

IBM Netezza High Capacity Appliance

IBM Netezza High-performance business intelligence and advanced analytics for the enterprise. The analytics conundrum

IBM Big Data HW Platform

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.

IBM PureData System for Transactions. Technical Deep Dive. Jonathan Rossi, PureSystems Specialist

2009 Oracle Corporation 1

Next Generation Data Warehousing Appliances

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

IBM Cognos 10: Enhancing query processing performance for IBM Netezza appliances

James Serra Sr BI Architect

IBM PureApplication System for IBM WebSphere Application Server workloads

Maximum performance, minimal risk for data warehousing

An Oracle White Paper May Exadata Smart Flash Cache and the Oracle Exadata Database Machine

Introducing Oracle Exalytics In-Memory Machine

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

QlikView Business Discovery Platform. Algol Consulting Srl

Overview: X5 Generation Database Machines

SQL Server Business Intelligence on HP ProLiant DL785 Server

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

Bringing Big Data into the Enterprise

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

IBM Data Warehousing and Analytics Portfolio Summary

SAP HANA. SAP HANA Performance Efficient Speed and Scale-Out for Real-Time Business Intelligence

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

IOmark- VDI. Nimbus Data Gemini Test Report: VDI a Test Report Date: 6, September

DEPLOYING IBM DB2 FOR LINUX, UNIX, AND WINDOWS DATA WAREHOUSES ON EMC STORAGE ARRAYS

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Scaling Your Data to the Cloud

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

SLIDE 1 Previous Next Exit

Cost-Effective Business Intelligence with Red Hat and Open Source

Advanced In-Database Analytics

Key Attributes for Analytics in an IBM i environment

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

INVESTOR PRESENTATION. First Quarter 2014

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Oracle Database 11g Comparison Chart

Capacity Management for Oracle Database Machine Exadata v2

The Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays

A Data Warehouse Approach to Analyzing All the Data All the Time. Bill Blake Netezza Corporation April 2006

Oracle Database In-Memory The Next Big Thing

Oracle Enterprise Manager 12c New Capabilities for the DBA. Charlie Garry, Director, Product Management Oracle Server Technologies

White Paper February IBM InfoSphere DataStage Performance and Scalability Benchmark Whitepaper Data Warehousing Scenario

Instant-On Enterprise

IBM Storwize V7000 Unified and Storwize V7000 storage systems

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Oracle Exadata: The World s Fastest Database Machine Exadata Database Machine Architecture

Virtual Data Warehouse Appliances

Understanding the Value of In-Memory in the IT Landscape

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

IBM System x reference architecture solutions for big data

Inge Os Sales Consulting Manager Oracle Norway

Big Data and Its Impact on the Data Warehousing Architecture

TUT NoSQL Seminar (Oracle) Big Data

Oracle Database 12c Built for Data Warehousing O R A C L E W H I T E P A P E R F E B R U A R Y

SAS and Oracle: Big Data and Cloud Partnering Innovation Targets the Third Platform

White Paper - GPU-Based SQL Database. SQream Technologies. SQream DB GPU-Based SQL Database Technical Overview White Paper

Driving Peak Performance IBM Corporation

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation

Introduction to the PureData for Analytics System (PDA) + Details on the N3001 Family

How To Build An Exadata Database Machine X2-8 Full Rack For A Large Database Server

Oracle Exadata Database Machine for SAP Systems - Innovation Provided by SAP and Oracle for Joint Customers

SQL Server What s New? Christopher Speer. Technology Solution Specialist (SQL Server, BizTalk Server, Power BI, Azure) v-cspeer@microsoft.

IBM System Storage DS5020 Express

Constructing a Data Lake: Hadoop and Oracle Database United!

Oracle Architecture, Concepts & Facilities

Exadata for Oracle DBAs. Longtime Oracle DBA

<Insert Picture Here> Oracle Exadata Database Machine Overview

How To Use Hp Vertica Ondemand

Using Hadoop to Expand Data Warehousing

FIFTH EDITION. Oracle Essentials. Rick Greenwald, Robert Stackowiak, and. Jonathan Stern O'REILLY" Tokyo. Koln Sebastopol. Cambridge Farnham.


IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse

SQL Server 2012 Parallel Data Warehouse. Solution Brief

How To Use Exadata

In-memory computing with SAP HANA

Enterprise Performance Tuning: Best Practices with SQL Server 2008 Analysis Services. By Ajay Goyal Consultant Scalability Experts, Inc.

Cost/Benefit Case for IBM PureData System for Analytics

Microsoft Analytics Platform System. Solution Brief

WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE

Using Attunity Replicate with Greenplum Database Using Attunity Replicate for data migration and Change Data Capture to the Greenplum Database

Cost/Benefit Case for IBM PureData System for Analytics

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010

Blueprints for Scalable IBM Spectrum Protect (TSM) Disk-based Backup Solutions

Top 10 Performance Tips for OBI-EE

Transcription:

PureSystems: Changing The Economics And Experience Of IT Accelerating Analytics Faster Insight From Data Warehouses That Scale And Cost Less Copies: http://www.ibm.com/ibm/puresystems/events/assets/index.html

Businesses Benefit By Using An Analytic Approach Over Intuition 600% increase in cross-sell campaign $200 Million increase in cash flow $13.8 Million in cost savings 80% decrease in reporting time on top of Oracle e-business suite 40% decline in homicide rates The more analytics a business uses, the better it performs 2

Analytics Have Different Characteristics Deep Analytics Different characteristics require different Workload Optimized Systems Fewer users, more complex reports Emphasis on response time Efficiently applies all resources to a single task Dedicated Static Data Marts Operational Analytics Many users, all types of reports Emphasis on throughput Effectively balances resources across all tasks Real Time Data Flow & Enterprise Data Warehouse Data Warehouse 3

IBM PureData Optimized For Data Workloads Optimizing both data processing and data access techniques 1. Transaction Processing 2. Deep Analytics 3. Operational Analytics PureData System For Transactions PureData System For Analytics PureData System For Operational Analytics Many data transactions running in parallel for high throughput Operational Data Very complex data analysis simplified and accelerated through massively parallel processing and SMP and hardware acceleration Dedicated Static Data Marts A mix of data analysis accelerated through massively parallel processing and many operational interactions running in parallel for high throughput Live Feeds and Data Warehouse 4

Built In Expertise Makes Deep Analytics As Simple As An Appliance PureData System for Analytics Is An Efficient Dedicated Appliance Simple Fast installation No DB administrator tasks Very easy operation Speed Optimized for purpose Smart 5

PureData System for Analytics Is Workload Optimized For High Performance Analytics Workload Optimized Massively parallel processing architecture Custom hardware acceleration Purpose-built for high performance analytics Disk Enclosures User data, mirror, swap partitions High speed data streaming SMP Hosts SQL Compiler Query Planner Optimizer Administration Snippet Blades Hardware-based query acceleration with Field Programmable Gate Arrays Complex analytics executed as the data streams from disk 6

Innovative Streaming Technology Speeds Up Complex Queries select DISTRICT, PRODUCTGRP, sum(nrx) from MTHLY_RX_TERR_DATA where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO' Field Programmable Gate Array Core CPU Core Slice of table MTHLY_RX_TERR_DATA (compressed) Uncompress Select Columns Filter Rows Complex Joins, Aggs, etc. SMP Hosts sum(nrx) select DISTRICT, PRODUCTGRP, sum(nrx) where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO' 7

PureData System For Analytics Provides Innovative Streaming Technology Breakthrough Simplicity No software installation Spend Less Time Managing and More Time Innovating No indexes and tuning MONTHS No storage administration No dbspace / tablespace sizing and configuration No redo/physical/logical log sizing and configuration No page/block sizing and configuration for tables No extent sizing and configuration for tables No Temp space allocation and monitoring No RAID level decisions for dbspaces No logical volume creations of files No integration of OS kernel recommendations No maintenance of OS recommended patch levels No JAD sessions to configure host/network/storage DAYS WEEKS UP and Running: PureData for Analytics 24 hours! vs. Exadata 2.5 weeks 8

PureData System For Analytics Is Simple To Set Up Create Database (nzsql commands) Create Tables (nzsql commands) CREATE TABLE SALES_FACT ( ) distribute on random organize on (order_day_key, retailer_site_key, product_key) Load Table (nzload) Groom tables (nzsql) Gather statistics (nzsql) Run test Done in 2.5 days which includes 1 day of learning! 9

Exadata Requires Extensive Knowledge, Complex Configuration, And Time Consuming Tuning Create database using X3 data warehouse database template (dbca wizard) Create and configure multiple tablespaces and datafiles for non-partitioned and partitioned tables, DBFS staging filesystem, temp tables Configure ASM storage for datafile performance (Hot/Cold disk region settings) Create Tables CREATE TABLE SALES_FACT ( ) PARTITION BY RANGE(ORDER_DAY_KEY) SUBPARTITION BY HASH(SALES_ORDER_KEY,ORDER_DAY_KEY) SUBPARTITIONS 32 STORE IN (BIDAYTS, BIDAYTS_1, BIDAYTS_2, BIDAYTS_3) (PARTITION ORDERS_P1 VALUES LESS THAN (20060501), PARTITION ORDERS_P2 VALUES LESS THAN (20060601), PARTITION ORDERS_P3 VALUES LESS THAN (20060701), PARTITION ORDERS_P4 VALUES LESS THAN (20060801), PARTITION ORDERS_P5 VALUES LESS THAN (20060901), PARTITION ORDERS_P6 VALUES LESS THAN (20061001), PARTITION ORDERS_P7 VALUES LESS THAN (20061101),PARTITION ORDERS_P8 VALUES LESS THAN (20061201), PARTITION ORDERS_P9 VALUES LESS THAN (MAXVALUE)) COMPRESS FOR QUERY HIGH STORAGE (CELL_FLASH_CACHE KEEP) STORAGE (INITIAL 8M NEXT 8M) PARALLEL; Prior to loading, run all queries on empty tables to populate column usage statistics Create DBFS filesystem, copy raw load files into DBFS, Load tables Gather statistics on tables, indexes (dimension PKs), fixed objects, dictionary, Configure SGA/PGA memory sizes (Try to maximize without starving O/S) Run tests, performance degraded for some queries compared to V2 Access path (plan) changes in 11.2.0.3 causing less efficient plans to be used Modify optimizer OFE from 11.2.0.3.1 to 11.2.0.2 to obtain better results Done in 2.5 Weeks by an experienced Exadata administrator 10

BI Day Deep Analytics Test Measures Concurrent Multi-User Performance 4 Users doing complex reports 20 Users doing intermediate s 2 Users 2 Users 6 Users 8 Users 6 Users Complex 1 Complex 3 Intermediate 9 Intermediate 10 Intermediate 11 Each report executes one or more queries 4 Connections Data Server 20 Connections 24 concurrent users Mix of complex and intermediate reports Focus on throughput Note: Distribution of complex, intermediate, and simple workloads based on Forrester Research, Profiling the Analytic End User for Business Intelligence, 2004 11

PureData For Analytics Delivers 10x Better Price Performance On Deep Analytics Workload IBM PureData for Analytics Full Rack (N2001) System Cost 3 year TCA $2.91Million 1298 Intermediate s/hour 19 Complex s/hour 10x better price/performance 9x more intermediate reports 8x more complex reports Based on IBM internal tests comparing PureData System For Analytics with a comparably priced, comparably tuned competitor configuration (version available as of 01/01/2013) executing an identical analytics workload in a controlled laboratory environment. Tests measure report throughput (reports per hour) over a 4 hour test period with 24 connected users concurrently executing mixed SQL query workload. More reports per hour indicates higher throughput. 3YR TCA means Total Cost of Acquisition for 3 years, based on list prices, including hardware, software, and maintenance. Competitor configuration: ¼ Unit (usable uncompressed capacity = 9.5TB) including competitor recommended software options and features. IBM configuration: PureData System for Analytics N2001-10 (usable uncompressed capacity = 48TB). Price performance calculated as 3YR TCA divided by test throughput. Results may not be typical and will vary based on actual workload, configuration, applications, queries and other variables in a production environment. Users of this document should verify the applicable data for their specific environment. Contact IBM and see what we can do for you. 12

Cores, SSDs And Infiniband Cannot Overcome Competitor s Weaknesses Competitor database allocates resources on request arrival If workload is light, lots of resource allocated Request executes quickly If workload is heavy, few resources allocated Request executes very slowly Result: Unpredictable performance Competitor cannot reallocate resources as workload changes If workload lightens and resources become available In-flight requests continue with resources originally assigned Available resources sit idle In-flight requests execute very slowly Result: Wasted resource and needless delay 13

Netezza Outperforms Competitor Running Analytic Queries Same queries, same clients, same data Different Results See more like this on YouTube: http://www.youtube.com/watch?v=t3o6yj_hduu 14

Many Customers Have Had Similar Results With IBM PureData System For Analytics Digital Media Financial Services Government Health & Life Sciences Retail / Consumer Products Telecom Other 05 PureData03 for 05 - Accelerating Analytics PureData & Operational Analytics Analytics 15

IBM PureData Optimized For Data Workloads Optimizing both data processing and data access techniques 1. Transaction Processing 2. Deep Analytics 3. Operational Analytics PureData System For Transactions PureData System For Analytics PureData System For Operational Analytics Many data transactions running in parallel for high throughput Operational Data Very complex data analysis simplified and accelerated through massively parallel processing and SMP and hardware acceleration Dedicated Static Data Marts A mix of data analysis accelerated through massively parallel processing and many operational interactions running in parallel for high throughput Live Feeds and Data Warehouse 16

Operational Analytics Mixes Analytic Queries With Operational Transactions Operational Analytic workload characteristics Mix of deep analytic queries and short operational transactions Continuous updates Multiple ingest streams Concurrency Thousands of users May incorporate normalized schemas Balance long running analytics with real time operational needs Receive data from a variety of sources Work with a wide variety of analytical tools Real Time Fraud Detection When your credit card company tells you that it denied a transaction with your card today on the other side of the world you probably are not thinking about all the transactions checked in just that second But the credit provider s operational data warehouse is Operational Analytics Random and sequential reads & data loads + continuous ingest Analytics split into many parts and narrow scope operations, all running in parallel Partitioned data access 17

PureData System For Operational Analytics Can Collect And Analyze All Your Data Using The Tools That Fit Your Needs ing and Analysis Ab Initio Cloudera Composite Software IBM Big Insights IBM Information Server IBM InfoSphere Streams IBM InfoSphere Change Data Capture Informatica Oracle Data Integrator Oracle GoldenGate SAP Business Objects SQL ODBC JDBC OLE-DB Data In SQL ODBC JDBC OLE-DB Data Out IBM Cognos IBM SPSS IBM Unica Information Builders Kalido KXEN Microsoft Excel MicroStrategy Oracle OBIEE SAP Business Objects SAS Actuate 18

PureData System For Operational Analytics Complete And Ready To Run Hardware Power Systems servers AIX v7.1 Storwize V7000 storage EXP30 Ultra SSD Software DB2 InfoSphere Warehouse Tivoli Automation* Optim Performance Manager Analytics Cognos 10.1.1 IBM POWER7 P740 & P730 16 Core servers @ 3.55GHz IBM Storwize V7000 with 900GB drives Ultra SSD I/O Drawers, each with six 387GB SSD Blade Network Technologies 10G and 1G Ethernet switches Brocade SAN switches (SAN48B-5) * For Failover Orchestration 19

A Size For Every Business IBM PureData System for Operational Analytics configurations* Extra Small Foundation rack 1 foundation module Small Foundation + 1/3 rack 1 foundation module 1 data module Medium Foundation + 2/3 rack 1 foundation module 2 data modules Cores 32 64 80 96 Memory 256 GB 512 GB 640 GB 768 GB SSD Storage 4.8 TB 9.6TB 12 TB 14.4 TB Large Foundation + full rack 1 foundation module 3 data modules HDD Capacity (raw unformatted) User Data Capacity (uncompressed) 64.8 TB 151.2 TB 237.6 TB 234 TB 29.7 TB 69.3 TB 108.9 TB 148.5 TB *Scalable to 1 PB or more of user data 20

PureData System For Operational Analytics Has A Simple Integrated Single Management Console Single, integrated graphical console to manage all resources and work running on the system Role-based security and tasks management monitoring maintenance Easy integration with broader enterprise monitoring tools and processes Same user interface as IBM PureApplication System for consistency across systems 21 21

BI Day Workload Measures High Levels Of Concurrently Executing Workloads 4 Users doing complex reports 20 Users doing intermediate s 56 Users doing simple reports 2 Users 2 Users 6 Users 8 Users 6 Users 14 Users 14 Users 14 Users 14 Users Complex 1 Complex 3 Intermediate 9 Intermediate 10 Intermediate 11 Simple 2 Simple 4 Simple 5 Simple 6 Each report executes one or more queries 4 Connections Data Server 20 Connections 56 Connections 80 simultaneous users connected Measure concurrent throughput Note: Distribution of complex, intermediate, and simple workloads based on Forrester Research, Profiling the Analytic End User for Business Intelligence, 2004 22

PureData System For Operational Analytics Delivers Better Price Performance On Operational Analytics Workload IBM PureData System for Operational Analytics (Small Rack) 1 Foundation Node 1 Data Node System Cost 3 year TCA $2.98 Million 580,248 Simple s/hour 532 Intermediate s/hour 4.85 Complex s/hour 17% lower system cost for more reports! 3.8x more intermediate reports 1.8x more complex reports 1.2x more simple reports Based on IBM internal tests comparing PureData System For Operational Analytics with a comparably priced, comparably tuned competitor configuration (version available as of 01/01/2013) executing an identical analytics workload in a controlled laboratory environment. Tests measure report throughput (reports per hour) over a 4 hour test period with 80 connected users concurrently executing mixed SQL query workload. More reports per hour indicates higher throughput. 3YR TCA means Total Cost of Acquisition for 3 years, based on list prices, including hardware, software, and maintenance. Competitor configuration: ¼ Unit (usable uncompressed capacity = 9.5TB) including competitor recommended software options and features. IBM configuration: PureData System for Operational Analytics A1791-02 (usable uncompressed capacity = 18.6TB). Price performance calculated as 3YR TCA divided by test throughput. Results may not be typical and will vary based on actual workload, configuration, applications, queries and other variables in a production environment. Users of this document should verify the applicable data for their specific environment. Contact IBM and see what we can do for you. 23

Customers Choose IBM For Operational Analytics Bombay Stock Exchange Healthcare Retail Financial Markets Travel & Transportation Manufacturing/Industrial Products Computer Services Telecommunications Media & Entertainment Government Utilities & Energy 05 PureData03 for 05 - Accelerating Analytics PureData & Operational Analytics Analytics 24

Summary - IBM Delivers Insight Faster And At A Lower Cost For both Deep and Operational Analytics Faster time to solution Faster performance Lower costs A broader spectrum of systems to fit every budget : Contrasting IBM s Approach to Workload Optimization to Oracle s Exadata Single Server Approach (March 2012) http://www.clabbyanalytics.com/free_s_critiques.html 25