Ralph Behrens, Client Technical Professional Big Data, Certified Netezza Specialist, IBM Software Group Deutschland. IBM Big Data Platform



Transcription:

"Data is the new oil." Data is just like crude: it's valuable, but if unrefined it cannot really be used. (Clive Humby, dunnhumby)

Understanding the data is crucial
Discover: simple navigation and visualization of all internal and external data, as an entry point into the big data world.
Analyze: compare and analyze the information content of all relevant structured or unstructured data.
Understand: uncover correlations and combinations of information to make better decisions.

IBM Big Data & Analytics Reference Architecture (diagram)
All Data Sources: streaming data, text data, applications data, time series, geospatial, video and image, relational, social network.
Big Data Platform Capabilities: information platform (real-time analytics; warehouse and data marts; analytic appliances; EDW; information integration; landing zone; data exploration; archive), with Information Governance, Security and Business Continuity across all of it. Open architecture with multiple product entry points.
Advanced Analytics / New Insights: Cognitive (Learn dynamically?), Prescriptive (Best outcomes?), Predictive (What could happen?), Descriptive (What has happened?), Exploration and Discovery (What do you have?).
New / Enhanced Applications: Watson, alerts, fraud, automated process, case management, analytic applications, cloud services, ISV solutions.

IBM Big Data Platform: PureData Systems
Trend #1: Appliances
PureData Systems: expert integrated systems to make deep and operational analytics faster and simpler.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Data Warehouse highlighted; Big Data Infrastructure.)

IBM PureData Systems overview: Meeting Big Data Challenges, Fast and Easy!
System for Transactions (DB2 pureScale, powered by System-P or System-X). For apps like e-commerce: database cluster services optimized for transactional throughput and scalability.
System for Analytics (powered by Netezza technology). For apps like customer analysis: data warehouse services optimized for high-speed, peta-scale analytics and simplicity.
System for Operational Analytics (DB2, powered by System-X). For apps like real-time fraud detection: operational data warehouse services optimized to balance high-performance analytics and real-time operational throughput.

PureData for Analytics - Model N2001
Storage: 12 IBM EXP3000 disk enclosures; 288 x 600 GB SAS2 drives (240 for user data, 14 for S-Blades, 34 spare); RAID 1 mirroring.
Hosts: 2 IBM x3650-m3 hosts (2x 6-core Intel 3.46 GHz CPUs), active-passive mode.
S-Blades: 7 IBM HX5 S-Blades (2x Intel 8-core 2+ GHz CPUs); new Netezza BPE4 side car with 2x 8-engine Xilinx Virtex-6 FPGAs; 128 GB RAM + 8 GB slice buffer; Linux 64-bit kernel.
Redundancy: all components are fully redundant and able to have their workload redistributed to a set of alternate components. Loss of a blade, any storage component, or even the host system that serves as the primary interface will not prevent the system from functioning.
User data capacity: 192 TB*
Data scan speed: 450 TB/hr*
Load speed (per system): 5+ TB/hr
* Assuming 4x compression
Power requirements: 7.5 kW
Cooling requirements: 27,000 BTU/hr
Footprint: 65 x 110 x 222 cm, 1282 kg
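As a rough worked example of what these headline figures imply, the sketch below derives the time for one full scan of the stated user capacity and the on-disk equivalents of the two starred values. Only the numbers come from the slide; the function and variable names are illustrative, and the calculation assumes that the quoted scan speed is sustained and that both starred figures use the same 4x compression ratio.

```python
# Figures quoted on the N2001 slide (starred values assume 4x compression).
USER_CAPACITY_TB = 192.0
SCAN_SPEED_TB_PER_HR = 450.0
COMPRESSION_RATIO = 4.0
DRIVES = {"user_data": 240, "s_blades": 14, "spare": 34}


def full_scan_minutes(capacity_tb, scan_tb_per_hr):
    """Minutes needed to scan capacity_tb at a sustained rate of scan_tb_per_hr."""
    return capacity_tb / scan_tb_per_hr * 60.0


def on_disk_equivalent(compressed_figure, compression_ratio):
    """Divide a figure quoted for compressed data by the ratio to get the on-disk value."""
    return compressed_figure / compression_ratio


if __name__ == "__main__":
    print("drives in total:", sum(DRIVES.values()))
    print("full scan (minutes):", round(full_scan_minutes(USER_CAPACITY_TB, SCAN_SPEED_TB_PER_HR), 1))
    print("on-disk capacity (TB):", on_disk_equivalent(USER_CAPACITY_TB, COMPRESSION_RATIO))
    print("on-disk scan rate (TB/hr):", on_disk_equivalent(SCAN_SPEED_TB_PER_HR, COMPRESSION_RATIO))
```

Under these assumptions a complete pass over the full user capacity takes a little under half an hour, and the drive breakdown adds up to the 288 drives listed.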

PureData for Analytics - Model N2001
Appliance = Increase Data Center Efficiency With Faster, More Efficient Systems
PureData uses less power than other systems (1).
PureData has more capacity than other systems (2, 3).
PureData has faster out-of-the-box scan rates than other systems.
* Unofficial customer test. ** Exadata with/without SSD.

IBM Platform for Big Data: BigInsights
Trend #2: Analytical intelligence on cheap standard hardware
InfoSphere BigInsights: enterprise-grade Hadoop system enhanced with advanced text analytics, data visualization, tools, and performance features for analyzing massive volumes of structured and unstructured data.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Hadoop System and Data Warehouse; Big Data Infrastructure.)

IBM Enriches Hadoop
What Hadoop provides:
Scalable: new nodes can be added on the fly.
Affordable: massively parallel computing on commodity servers.
Flexible: Hadoop is schema-less and can absorb any type of data.
Fault tolerant: through the MapReduce software framework.
What IBM adds (enterprise hardening of Hadoop):
Performance & reliability: Adaptive MapReduce, compression, indexing, flexible scheduler, ...
Productivity accelerators: web-based UIs and tools, end-user visualization, analytic accelerators, ...
Enterprise integration: to extend and enrich your information supply chain; SQL interface.
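To make the MapReduce model mentioned on this slide more concrete, here is a minimal in-process sketch of its three phases (map, shuffle, reduce) using a word count. It is a toy illustration in plain Python, not the Hadoop or BigInsights API; all function names are hypothetical. The point it shows is that each map call depends only on its own input record, which is what allows a real framework to spread the work across commodity servers and re-run a failed task elsewhere.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in sequence over an iterable of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big insights"]))
```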

Key Features and Specifications
Key features:
Hadoop distribution: InfoSphere BigInsights V2.1
Built-in analytics/accelerators: IBM BigSheets; IBM Accelerator for Text Analytics; IBM Accelerator for Social Data; IBM Accelerator for Machine Data; IBM Big SQL
Development / administration: Eclipse-based development environment; exposed node management
Enterprise readiness: security; high availability (SW & HW); hardware management and monitoring
Data warehouse integration: enterprise data warehouse connectors; archival capabilities
Specifications (full rack):
Management nodes: 1 primary, 1 standby (x3550 M4)
Data nodes: 18 (x3630 M4)
CPU cores: 216
Memory: 96 GB per node, 1728 GB total
Raw storage: 216 drives, 3 TB each, 648 TB total
User space: 216 TB
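The full-rack figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers from the slide; the variable names are illustrative. The slide does not say how the user space is derived from raw storage. The ratio of exactly three computed here is consistent with HDFS's default replication factor of 3, but that interpretation is an assumption.

```python
# Full-rack figures quoted on the slide.
DATA_NODES = 18
CPU_CORES_TOTAL = 216
MEMORY_PER_NODE_GB = 96
DRIVES_TOTAL = 216
DRIVE_SIZE_TB = 3
USER_SPACE_TB = 216

# Per-node figures implied by the totals.
cores_per_node = CPU_CORES_TOTAL // DATA_NODES
drives_per_node = DRIVES_TOTAL // DATA_NODES

# Totals recomputed from the per-node and per-drive figures.
memory_total_gb = DATA_NODES * MEMORY_PER_NODE_GB
raw_storage_tb = DRIVES_TOTAL * DRIVE_SIZE_TB

# Raw-to-usable ratio (assumption: this reflects three stored copies of each block).
raw_to_user_ratio = raw_storage_tb / USER_SPACE_TB

if __name__ == "__main__":
    print("cores per node:", cores_per_node)
    print("drives per node:", drives_per_node)
    print("memory total (GB):", memory_total_gb)
    print("raw storage (TB):", raw_storage_tb)
    print("raw / user space:", raw_to_user_ratio)
```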

Benefits of IBM PureData System for Hadoop
Accelerate big data time to value:
Deploy 8x faster than custom-built solutions (1).
Built-in visualization to accelerate insight.
Built-in analytic accelerators, unlike big data appliances on the market (2).
Simplify big data adoption and consumption:
Single system console for full system administration.
Rapid maintenance updates with automation.
No assembly required: data load ready in hours.
Implement enterprise-class big data:
Only integrated Hadoop system with built-in archiving tools (2).
Delivered with more robust security than open source software.
Architected for high availability.
(1) Based on IBM internal testing and customer feedback. "Custom built clusters" refer to clusters that are not professionally pre-built, pretested and optimized. Individual results may vary.
(2) Based on current commercially available Big Data appliance product data sheets from large vendors. US ONLY CLAIM.

New approaches for the data warehouse
Use case - queryable archive: immediate storage alternative for cold data; cost savings for cold data; compliance requirements.
Use case - do more: use unstructured data; explore new data; super ETL landing zone; analyze the data synchronously.
Systems: PureData System for Hadoop; PureData System for Analytics (reporting, prediction, ...).

IBM Platform for Big Data: Streams
Trend #3: Processing of (machine) data in real time
InfoSphere Streams: software enabling continuous analysis of massive volumes of streaming data with sub-millisecond response times.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Stream Computing, Hadoop System and Data Warehouse; Big Data Infrastructure.)

Stream Computing: A Paradigm Shift
Traditional DWH computing:
Search for historic facts.
Find and analyze stored information.
Batch paradigm, pull model.
Query-driven: queries are placed on static data.
Stream computing (real-time analytics):
Search for recent facts.
Analysis of the data while it is moving, before storage.
Real-time paradigm, push model.
Data-driven: data is brought to the analysis.
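The contrast between the two models can be sketched in a few lines. This is a plain Python illustration, not InfoSphere Streams code; the class and function names, the sliding-average analysis, and the threshold are all hypothetical. In the pull model a query runs over data that is already stored; in the push model a standing operator is updated once per arriving tuple and can raise an alert before anything is written to storage.

```python
from collections import deque


def pull_query(stored_values, threshold):
    """Pull model: a query is placed on static, already stored data."""
    return [value for value in stored_values if value > threshold]


class SlidingAverage:
    """Push model: a standing operator that is updated as each tuple arrives."""

    def __init__(self, size):
        # deque with maxlen drops the oldest value automatically
        self.window = deque(maxlen=size)

    def on_tuple(self, value):
        """Consume one tuple and return the average over the current window."""
        self.window.append(value)
        return sum(self.window) / len(self.window)


def run_stream(source, size, threshold):
    """Feed tuples to the operator one at a time and collect (index, average) alerts."""
    operator = SlidingAverage(size)
    alerts = []
    for index, value in enumerate(source):
        average = operator.on_tuple(value)
        if average > threshold:
            alerts.append((index, average))
    return alerts


if __name__ == "__main__":
    readings = [1, 2, 9, 10]
    print(pull_query(readings, 5))
    print(run_stream(iter(readings), size=2, threshold=5))
```

Note that `run_stream` never holds more than `size` values at once, whereas `pull_query` needs the complete data set to exist before it can run.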

Streams Analyzes All Kinds of Data (diagram)
Analytics shown, with their origin as labeled on the slide: mining in microseconds (included with Streams); simple and advanced text (included with Streams); predictive (IBM Research); geospatial (IBM Research); acoustic (IBM Research, Open Source); image and video (Open Source); advanced mathematical models (IBM Research); statistics (included with Streams).
Text example shown: (listen, verb), (radio, noun).

IBM Platform for Big Data: DB2 10.5 BLU
Trend #4: In-memory databases
DB2 10.5 with In-Memory Acceleration: the latest-generation DB2 release, which lets conventional database technology move seamlessly to in-memory analysis.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Visualization & Discovery, Application Development, Systems Management, Stream Computing, Hadoop System, In-Memory Database and Data Warehouse; Big Data Infrastructure.)

DB2 10.5 with In-Memory Acceleration: Typical Results
Customer: speedup over DB2 10.1
Large Financial Services Company: 46.8x
Global ISV Mart Workload: 37.4x
Analytics Reporting Vendor: 13.0x
Global Retailer: 6.1x
Large European Bank: 5.6x
10x-25x improvement is common.
"It was amazing to see the faster query times compared to the performance results with our row-organized tables. The performance of four of our queries improved by over 100-fold! The best outcome was a query that finished 137x faster by using BLU Acceleration." - Kent Collins, Database Solutions Architect, BNSF Railway
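For a quick sense of how the five listed results relate to the "10x-25x is common" statement, the sketch below summarizes them with Python's standard `statistics` module. The speedup values are the ones on the slide; the dictionary name and the choice of summary statistics are illustrative only, and five data points are far too few for any firm conclusion.

```python
import statistics

# Speedups over DB2 10.1 as listed on the slide.
SPEEDUPS = {
    "Large Financial Services Company": 46.8,
    "Global ISV Mart Workload": 37.4,
    "Analytics Reporting Vendor": 13.0,
    "Global Retailer": 6.1,
    "Large European Bank": 5.6,
}

values = list(SPEEDUPS.values())
median_speedup = statistics.median(values)  # middle value of the five
mean_speedup = statistics.mean(values)      # arithmetic mean

if __name__ == "__main__":
    print("median speedup:", median_speedup)
    print("mean speedup:", round(mean_speedup, 2))
```

Both the median and the arithmetic mean of the listed values fall inside the 10x-25x band, even though most individual results lie outside it.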

IBM Platform for Big Data: Information Governance
Must-have: integration and security. Govern data quality and manage the information lifecycle.
InfoSphere Information Server: cleanses data, monitors quality, and integrates big data with existing systems.
InfoSphere Optim: manages business information throughout its lifecycle.
InfoSphere Master Data Management: manages and maintains trusted views of master and reference data.
InfoSphere Guardium: real-time database security and monitoring.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Visualization & Discovery, Application Development, Systems Management, Stream Computing, In-Memory Database, Hadoop System, Data Warehouse, and Information Integration & Governance; Big Data Infrastructure.)

IBM Platform for Big Data: Accelerators
Speed time to value with analytic and application accelerators.
Analytic accelerators: text analytics, geospatial, time series, data mining.
Application accelerators: financial services, machine data, social data, telco event data.
Industry models: comprehensive data models based on deep expertise and industry best practice.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Accelerators, Stream Computing, In-Memory Database, Hadoop System, Data Warehouse, and Information Integration & Governance; Big Data Infrastructure.)

Example Big Data Analytics Application: Social Media Analytics
Business drivers: competitive analysis, corporate reputation, customer care, campaign effectiveness, product insight.
Source areas: Facebook, blogs, discussion forums, Twitter, newsgroups; multilingual.
Capabilities:
Comprehensive analysis: ad hoc keyword searches; automatic detection of changes in consumer vocabulary.
Affinity analytics: relationship heat maps to understand affinity; quantify strength of affinity.
Predictive analysis: forward-looking detection of discussion topics; identify KPPs; predict impact of social interaction on business KPIs; predict ability to influence social interaction.
Sentiment: dimensional analysis and filtering; tunable sentiment rules.
Evolving topics: detect and predict emerging topics and viral posting patterns; discover associated themes.

IBM Platform for Big Data: Accelerators
Trend #5: Search and discover
Discover, understand, search, and navigate federated sources of big data.
InfoSphere Data Explorer: discovery and navigation software that provides real-time access and fusion of big data with rich and varied data from enterprise applications for greater insight.
(Platform diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform, with Visualization & Discovery, Application Development, Systems Management, Accelerators, Stream Computing, In-Memory Database, Hadoop System, Data Warehouse, and Information Integration & Governance; Big Data Infrastructure.)

Leverage the full power of IBM's Big Data Platform
Data access & integration: index structured and unstructured data in place; support existing security; federate to external sources; leverage MDM, governance, and taxonomies.
Discovery & navigation: clustering and categorization; contextual intelligence; easy-to-deploy applications.
All at the scale required for today's big data challenges.
(Architecture diagram: UI / User on top of IBM Data Explorer & App Builder and its Connector Framework; connected platform components are Streams, BigInsights, the Warehouse, and Integration & Governance; connected sources are CM, RM, DM, RDBMS, feeds, Web 2.0, email, web, CRM, ERP, and file systems.)

Out-of-the-box functionality
Tabbed Search (1) for source-based search.
Alerts (2) to flag changes in the content.
Expertise Location (3) to quickly find the right experts.
Enrich search results with ratings (4), tagging (5), or free text.
Save (6) and bookmark search results.
Find things quickly and easily with text clustering (7).
Structured navigation (8), filtering, distribution of information, and collaboration.
Graphical navigation (9) by date ranges or frequencies.
Query expansion (10): integration of thesauri or search suggestions.

Data Explorer + Analytics = Complete Picture
Data Explorer surfaces insights from unstructured information in context with the analytics.
Data Explorer handles the qualitative side, on unstructured information. Analytics handles the quantitative side, on structured information.
Significant data cleansing occurs on collected data before it is run through systems like Cognos.
World's total data: 80% unstructured, 20% structured.
Unstructured data (does not have any structure): web, RSS feeds, social media, content management systems, enterprise unstructured sources.
Structured data (each system has its own, different structure): databases, data warehouses, SCM, SOA, ESB, web services, enterprise systems and content stores.

IBM End-to-End Big Data & Analytics Portfolio
+ Ensures the ability to address broader requirements that may be needed now or in the future
+ Apply data security to Big Data (Guardium)
+ Enable a 360° view of all customer-related Big Data (MDM)
+ Provide full information integration capabilities for Big Data (Information Server)
+ Integration enables use of existing tools and skills to start leveraging Big Data more quickly
Data sources: streaming, sensor, geospatial, time series, structured, operational, unstructured, external, social.
Information & Insight:
Real-Time Analytics: InfoSphere Streams
Landing, Exploration & Archive: InfoSphere BigInsights
Enterprise Warehouse: PureData for Operational Analytics
Analytic Appliances: PureData for Analytics
Data Marts: DB2 BLU, PureData for Analytics
Predictive Analytics & Modeling: SPSS
BI & Performance Management: Cognos
Information Movement, Matching & Transformation: InfoSphere Data Click, Information Server, MDM, G2
Security, Governance and Business Continuity: Guardium, Optim
Exploration & Discovery: InfoSphere Data Explorer

Big Data Use Cases
Big Data Exploration
Enhanced 360° View of the Customer
Security/Intelligence Extension
Operations Analysis
Data Warehouse Augmentation

Ralph Behrens, Client Technical Professional, IBM Big Data
IBM Deutschland GmbH, Wilhelm-Fay-Straße 30-34, 65936 Frankfurt
Phone +49 (0) 7034 / 6430680, Mobile +49 (0)172 / 6511333
ralph.behrens@de.ibm.com

Client Reference Base: Digital Media; Financial Services; Health & Life Sciences; Retail / Consumer Products; Telecom; Other