Big Data Strategies with IMS




Big Data Strategies with IMS (Session #16103)
Richard Tran, IMS Development, richtran@us.ibm.com

Agenda
Big Data in an information-driven economy
Why start with System z
IMS strategies for big data
Summary / Call to action

On a Smarter Planet, unprecedented changes are occurring
Zettabytes of transaction data
Business models under constant pressure
Customers are more demanding and connected
Great relationships trump great products
Cloud, Mobile, Social, Internet of Things

And leaders are responding by:
Providing a great experience
Offering value in every interaction
Innovating across the ecosystem

But what is Big Data?
Google can give you nearly 2 billion options
Vendors have even more definitions
Here is how Gartner defines Big Data: "Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative information processing for enhanced insight and decision making."
Source: Gartner research note, "Survey Analysis - Big Data Adoption in 2013 Shows Substance Behind the Hype", Sept 12 2013. Analyst(s): Lisa Kart, Nick Heudecker, Frank Buytendijk

We've moved into a new era of computing: the four Vs
Volume: 12 terabytes of Tweets created daily
Variety: 100s of video feeds from surveillance cameras
Velocity: 5 million trade events per second
Veracity: only 1 in 3 decision makers trust their information
Information from everywhere; radical flexibility; extreme scalability
"We have for the first time an economy based on a key resource [information] that is not only renewable, but self-generating. Running out of it is not a problem, but drowning in it is." (John Naisbitt)

Agenda
Big Data in an information-driven economy
Why start with System z
IMS strategies for big data
Summary / Call to action

The Big Data starting point: types of data analyzed
Transactional sources are the dominant data types analyzed in big data initiatives
Source: Gartner research note, "Survey Analysis - Big Data Adoption in 2013 Shows Substance Behind the Hype", Sept 12 2013. Analyst(s): Lisa Kart, Nick Heudecker, Frank Buytendijk

The Big Data starting point: types of big data analyzed, by industry
Transactional sources are the dominant data types analyzed in big data initiatives
Source: Gartner research note, "Survey Analysis - Big Data Adoption in 2013 Shows Substance Behind the Hype", Sept 12 2013. Analyst(s): Lisa Kart, Nick Heudecker, Frank Buytendijk

The role of zEnterprise in Big Data analytics
A large percentage of the data accessed for analytics originates or resides on IBM zEnterprise:
2/3 of business transactions for U.S. retail banks
80% of the world's corporate data
Businesses that run on zEnterprise:
66 of the top 66 worldwide banks
24 of the top 25 U.S. retailers
10 of the top 10 global life/health insurance providers
1,300+ ISVs run zEnterprise today, more than 275 of them selling over 800 applications on Linux
The downtime of an application running on System z equates to approximately 5 minutes per year
The System z mainframe can run over a thousand virtual Linux images on a single frame the size of a refrigerator
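To put the "approximately 5 minutes per year" downtime figure in more familiar terms, the following minimal Python sketch converts yearly downtime into an availability percentage. The function name and the non-leap-year assumption are illustrative choices, not part of the original slides.

```python
# Minutes in a non-leap year (assumption: 365 days).
MINUTES_PER_YEAR = 365 * 24 * 60


def availability_percent(downtime_minutes_per_year: float) -> float:
    """Return the percentage of the year a system is up, given yearly downtime."""
    return 100.0 * (1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR)


# The slide's figure of roughly 5 minutes of downtime per year.
five_minute_availability = availability_percent(5)
print(f"{five_minute_availability:.4f}%")
```

Under these assumptions, 5 minutes per year works out to roughly "five nines" of availability.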

The majority of today's analytics is based on relational / structured data
Analytics and decision engines reside where the DWH / transaction data is
Noise (veracity) surrounds the core business data: social media, emails, docs, telemetry, voice, video, content
What data are you prepared to TRUST? Where do you put your trusted data?
Circle of trust: Data Warehouse, Integration, Business Analytics, DB2 for z/OS, IMS, Information Governance

Demand for differently structured data to be seamlessly integrated, to augment analytics / decisions
Analytics and decision engines reside where the DWH / transaction data is
Noise (veracity) surrounds the core business data: social media, emails, docs, telemetry, voice, video, content
Expanding our insights, getting closer to the truth: lower risk and cost, increased profitability
Circle of trust widens: Data Warehouse, Integration, Business Analytics, DB2 for z/OS, IMS, Information Governance

Forward-thinking organizations are creating value from Big Data
The power of data (Volume, Variety, Velocity, Veracity) coming together with the power of technology to deliver improved business outcomes
1. Enrich your information base with Big Data Exploration
2. Improve customer interaction with Enhanced 360º View of the Customer
3. Optimize operations with Operations Analysis
4. Gain IT efficiency and scale with Data Warehouse Augmentation
5. Prevent crime with Security and Intelligence Extension

Fraud Detection
Work status: "Claiming disability allowance. Unable to work"
Facebook post: "Dude awesome vacation"
Decision: make payment or investigate
zEnterprise: deterrent for fraudsters, cost savings for the business
Refined search parameters from the OLTP environment
Data from social media sites analyzed with text analytics (Hadoop or agency)
Result set for further processing: data warehouse + modeling applications
Result set uploaded or directly imported into OLTAP DBMS
Diagram labels: Data Warehouse, Integration, Business Analytics, DB2 for z/OS, IMS, Information Governance

Enterprise integration and governance: the key to success in incorporating Big Data
Information Integration: insights from big data must be incorporated into the warehouse and analytics / decision engines
Information Governance: companies need to govern what comes in, and the insights that come out
Diagram labels: Data Warehouse, Enterprise Integration, Traditional Sources, Big Data Platform, New Sources

IBM Big Data Platform
Logical platform with many physical deployment options
Watson Explorer: find, navigate, visualize all data
Accelerators: speed time to value with analytic and application accelerators
InfoSphere BigInsights: bringing Hadoop to the enterprise
InfoSphere Streams: analytics for data in-motion exploration
InfoSphere Data Warehouse: delivers deep insight with advanced database analytics & operational analytics
Information Integration and Governance: governs data quality and manages the information lifecycle
Platform layers: Systems Management, Application Development, Discovery, Accelerators, Hadoop System, Stream Computing, Data Warehouse, Information Integration & Governance

Core data management solutions for the 21st century
DB2 11 for z/OS: unmatched availability, reliability, and security for business-critical information
Up to 40% CPU reductions and performance improvements for OLTP, batch, and business analytics
Improved data sharing performance and efficiency
Integration with InfoSphere BigInsights / Hadoop and NoSQL support
Improved utility performance and additional zIIP-eligible workload
IMS 13: delivering the highest levels of performance, availability, security, and scalability in the industry
Breaking through 100k TPS, 800% greater than IMS 12
CPU reductions up to 62% for Java apps
SQL access to IMS data from both .NET and COBOL applications
Greater flexibility and faster deployment for new applications with database versioning
Big Data ready: exploitation of Hadoop / BigInsights, MDA, Watson Explorer

IBM PureData System for Hadoop
Accelerate Hadoop analytics with appliance simplicity
Accelerate Big Data projects with built-in expertise
Explore new ways to use all data
Unlock new insights from unstructured data
Establish a cost-efficient online data archive
Simplify with integrated system management
Ensure production-grade security and governance
Easily integrate with other systems in the IBM big data platform
Components: InfoSphere BigInsights software; compute and storage hardware
Simplify Big Data for the enterprise!
CIO Insight, Apr 29 2013: "Issues surrounding how long it takes to get a Hadoop application into production coupled with a lack of real-time capabilities are proving to be important barriers to deployment. As a result, the respondents are reporting that both the number of Hadoop applications and the size of the overall Hadoop environment remain relatively small."

System z Approaches
In all three approaches the log files reside in z/OS.
1. Processing done outside z (extract and move data). Data is moved over the network. Cost: $$$$. Additional infrastructure; challenges with scale, governance, and ingestion.
2. Processing in an appliance (z remains the master). Diagram labels: MR jobs, result, DB2. Cost: $$. Appliance approach with PureData System for Hadoop; high-speed load; z is the control point.
3. Processing done on System z (MR cluster on zLinux, under z/VM). Hadoop for Linux on System z: Apache, Veristorm zDoop. Cost: $$$. Provision new nodes quickly; near-linear scale; high-speed load; z is the control point.

Data Explorer: visualization & discovery across all your data sources ("integration at the glass")
InfoSphere Data Explorer: securely connect to and leverage data stored in DB2 for z/OS & IMS
Providing unified, real-time access and fusion of big data unlocks greater insight and ROI
Help prioritize your System z big data integration and analytics projects
Improve customer service & reduce call times
Create a unified view of ALL information for real-time monitoring
Increase productivity & leverage past work, increasing speed to market
Analyze customer information & data to unlock true customer value
Identify areas of information risk & ensure data compliance

Agenda
Big Data in an information-driven economy
Why start with System z
IMS strategies for big data
Summary / Call to action

What is Hadoop?
An open source software framework that supports data-intensive distributed applications
High-throughput batch processing that runs on large clusters of commodity hardware
Yahoo ran a 4,000-node Hadoop cluster in 2008
Two main components:
Hadoop Distributed File System (HDFS): self-healing, high-bandwidth clustered storage
MapReduce engine
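The MapReduce programming model mentioned above can be illustrated with a small, single-process Python sketch. This is not Hadoop code and does not use any Hadoop API; the function names and sample input are hypothetical, and the example only shows the map, shuffle, and reduce steps that a real cluster would distribute across nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical input lines, for illustration only.
counts = run_job(["IMS data on z", "big data and IMS"])
print(counts)
```

On a real cluster the map and reduce functions run in parallel on many nodes, with HDFS providing the input splits and storing the output.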

Big Data is not just Hadoop
Understand and navigate federated big data sources: Federated Discovery and Navigation
Manage & store huge volumes of any data: Hadoop File System, MapReduce
Structure and control data: Data Warehousing
Manage streaming data: Stream Computing
Analyze unstructured data: Text Analytics Engine
Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM

Enhancing IMS analytics on System z with Big Data
Much of the world's operational data resides on z/OS
Unstructured data sources are growing fast
There is a need to merge this data with trusted OLTP data from System z data sources
IMS provides the connectors and the DB capability to allow BigInsights v2.1.2.0 to easily and efficiently access the IMS data source
BigInsights v2.1.2.0 is available on 3/13/2014

Enhancing IMS analytics on System z with Big Data
Observation points lead to new business opportunities
Observation points are gleaned from both archived data and live data
Score business events, track claims evolution, and more
Make the data available to people who can do something meaningful with it

High-level overview
Diagram labels: Analysis, BigInsights Platform, JDBC, HDFS, Discovery

Import options
Diagram labels: BigInsights Platform, JDBC, HDFS, Sqoop Import, POJO Import

Sqoop Import
Command-line interface application for transferring data between an RDBMS and HDFS
Import into Hive and HBase
Export from HDFS back into an RDBMS
Import:
Divides the table into ranges using the primary key max/min (can use the split-by parameter)
Creates mappers for each range
Mappers write to multiple HDFS nodes
Creates text or sequence files
Export:
Reads files in an HDFS directory via MapReduce
Bulk parallel insert into the database table
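The range-splitting idea described for the import step can be sketched in a few lines of Python. This is a simplified illustration of dividing a MIN/MAX interval among mappers, not Sqoop's actual splitter implementation; the function names and the EDLEVEL bounds (12 and 20) are assumptions chosen for the example.

```python
from typing import List, Tuple


def split_ranges(lo: int, hi: int, num_mappers: int) -> List[Tuple[int, int]]:
    """Divide the inclusive interval [lo, hi] into contiguous inclusive ranges,
    one per mapper (fewer if the interval has fewer values than mappers)."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    total = hi - lo + 1
    parts = min(num_mappers, total)
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` ranges take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_queries(table: str, column: str, lo: int, hi: int, num_mappers: int) -> List[str]:
    """Build one illustrative bounded SELECT per mapper."""
    return [
        f"SELECT * FROM {table} WHERE {column} >= {a} AND {column} <= {b}"
        for a, b in split_ranges(lo, hi, num_mappers)
    ]


# Hypothetical bounds for the split-by column, with three mappers.
queries = range_queries("EMPLOYEE", "EDLEVEL", 12, 20, 3)
for q in queries:
    print(q)
```

Each mapper would then run its own bounded query and write its rows to HDFS independently, which is why a well-distributed split-by column matters for balanced work.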

Sqoop Import
Import into HDFS using the command below:
    ./sqoop import --connect jdbc:ims://ecwas09.vmec.svl.ibm.com:5555/bigdatp1 --driver com.ibm.ims.jdbc.IMSDriver --table EMPLOYEE -m 3 --split-by EDLEVEL --username 'OMVSADM' -P
Log output:
    13/06/19 17:50:27 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(EDLEVEL), MAX(EDLEVEL) FROM EMPLOYEE
    13/06/19 17:50:46 INFO mapreduce.ImportJobBase: Transferred 5.123 KB in 20.3762 seconds (257.4572 bytes/sec)
    13/06/19 17:50:46 INFO mapreduce.ImportJobBase: Retrieved 43 records.
HDFS output (screenshot on the original slide)
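As a quick sanity check on an import like the one logged above, the summary lines can be parsed to confirm the record count and throughput. The following Python sketch is an assumption-laden illustration: the regular expressions match only the message text quoted on the slide, and the function name `summarize_import` is hypothetical.

```python
import re
from typing import Dict

# The two summary messages quoted on the slide (timestamps and class names omitted).
SAMPLE_LOG = """Transferred 5.123 KB in 20.3762 seconds (257.4572 bytes/sec)
Retrieved 43 records."""


def summarize_import(log_text: str) -> Dict[str, float]:
    """Extract size, duration, and record count from Sqoop-style summary lines."""
    transferred = re.search(r"Transferred ([\d.]+) KB in ([\d.]+) seconds", log_text)
    retrieved = re.search(r"Retrieved (\d+) records", log_text)
    if transferred is None or retrieved is None:
        raise ValueError("summary lines not found in log text")
    kilobytes = float(transferred.group(1))
    seconds = float(transferred.group(2))
    records = int(retrieved.group(1))
    total_bytes = kilobytes * 1024
    return {
        "records": records,
        "seconds": seconds,
        "bytes_per_second": total_bytes / seconds,
        "bytes_per_record": total_bytes / records,
    }


summary = summarize_import(SAMPLE_LOG)
print(summary)
```

The recomputed throughput agrees with the logged figure to within rounding of the kilobyte value.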

BigInsights Database Import Application
Utilize the built-in Database Import Application by providing the database connection parameters.

BigInsights Database Import Application
Once the run is completed, view the data in HDFS.

BigInsights BigSheets
This data can be saved as a BigSheets workbook for further analytics.

Elevated demand for business analytics drives new requirements and focus
More aggressive requirements:
Enterprise-level scale & performance
Mission-critical availability
Faster access to operational data
Rapid, cost-effective deployment & expansion
More integrated view of data across the environment
Driving new focuses:
Modernization
Standardization & Consolidation
Operational BI
Data Governance
Cloud Computing

Watson Explorer
Watson Explorer is the visualization & discovery capability for IBM's comprehensive big data platform
Watson Explorer is a key component of all the big data use cases, with the greatest impact in Big Data Exploration & Enhanced 360º View of the Customer

Watson Explorer: visualization & discovery across all your data sources ("integration at the glass")
Watson Explorer: securely connect to and leverage data stored in DB2 for z/OS & IMS
Providing unified, real-time access and fusion of big data unlocks greater insight and ROI
Help prioritize your System z big data integration and analytics projects
Improve customer service & reduce call times
Create a unified view of ALL information for real-time monitoring
Increase productivity & leverage past work, increasing speed to market
Analyze customer information & data to unlock true customer value
Identify areas of information risk & ensure data compliance

Watson Explorer product architecture and differentiators
Big Data applications sit on top of the application framework
Application framework: user profiles, authentication/authorization, business rules, personalization, display, query routing, subscriptions, feeds, web results
Engine: text analytics, indexing and search engine, metadata extraction
Integration zone (connectors & APIs): CM, RM, DM; RDBMS; feeds; Web 2.0; email; web; CRM, ERP; file systems; BigInsights; Streams
Differentiators: federated discovery and navigation; scalable architecture; accurate results; secure connectivity; powerful development tools; unique application framework; fast time to value

Seamless IMS integration
Big Data applications sit on top of the application framework
Application framework: user profiles, authentication/authorization, business rules, personalization, display, query routing, subscriptions, feeds, web results
Engine: indexing and search engine, metadata extraction
JDBC connector for IMS:
The JDBC API is used to flow SQL to the target IMS database(s) to retrieve all of the information of interest
JDBC metadata discovery APIs are used to define the IMS database resource(s) of interest for indexing
Data sources: IMS DBs, IMS catalog

IMS + Watson Explorer: configuring the IMS source
After deploying the IMS JDBC driver, create a new Database seed.

IMS + Watson Explorer: setting up the data transformation
After creating a new seed, a converter needs to be configured using standard XPath.

Original IMS hierarchy for the hospital database
Hierarchy: HOSPITAL -> WARD -> PATNAME
Goal: get a patient-centric view

Why use Watson Explorer?
Previously, changing the schema so that the PATIENT information is at the top required creating a logical database
This requires a DBA to be involved and a time window when the new database resources can be brought online
Watson Explorer allows indexes to be created dynamically, enabling better searching that is not restricted to IMS Segment Search Arguments (SSAs)
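The idea of building a patient-centric index over a HOSPITAL -> WARD -> PATNAME hierarchy, without restructuring the source database, can be illustrated with a small Python sketch. This is not Watson Explorer code and says nothing about how its indexer works internally; the data layout, function names, patient names, and the second hospital are all hypothetical.

```python
from typing import Dict, List

# Hypothetical sample data mirroring the HOSPITAL -> WARD -> PATNAME hierarchy.
HOSPITALS = [
    {"hospital": "Alexandria", "wards": [
        {"ward": "Dermatology", "patients": ["Patient A", "Patient B"]},
        {"ward": "Cardiology", "patients": ["Patient C"]},
    ]},
    {"hospital": "Hospital B", "wards": [
        {"ward": "Dermatology", "patients": ["Patient D"]},
    ]},
]


def build_patient_index(hospitals: List[dict]) -> Dict[str, Dict[str, str]]:
    """Flatten the hierarchy into one record per patient, keyed by patient name."""
    index: Dict[str, Dict[str, str]] = {}
    for hospital in hospitals:
        for ward in hospital["wards"]:
            for patient in ward["patients"]:
                index[patient] = {
                    "hospital": hospital["hospital"],
                    "ward": ward["ward"],
                }
    return index


def find_patients(index: Dict[str, Dict[str, str]], **criteria: str) -> List[str]:
    """Return sorted patient names whose record matches every given field."""
    return sorted(
        name for name, record in index.items()
        if all(record.get(field) == value for field, value in criteria.items())
    )


index = build_patient_index(HOSPITALS)
in_alexandria = find_patients(index, hospital="Alexandria")
in_dermatology = find_patients(index, ward="Dermatology")
print(in_alexandria, in_dermatology)
```

Because the index is derived from the data rather than defined in the database, a query by ward or by hospital needs no new logical database and no change to the original hierarchy.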

Searching the IMS database with Watson Explorer
Query: Who are the patients in the Alexandria hospital?

Searching the IMS database with Watson Explorer
Query: Who are the patients currently in dermatology?

Machine Data Analytics Accelerator
Custom applications and shrink-wrap solutions: Health Care, Networking, Insurance, Telco, x2020, Unity
Client-specific customizations and visualization tools (zInsights)
IT use cases: server, performance, troubleshooting
Business use cases: click stream and transaction analysis; optimize production, advance planning
Specific domains: Telco, Financial services, Retail, Healthcare
MDA Accelerator tools: generic parsers and extractors (applications, services, servers and devices); federated discovery, pattern discovery, search, and visualization tools for root cause analysis
IBM Big Data Platform: Visualization & Discovery, Application Development, Accelerators, Systems Management, Hadoop System, Stream Computing, Data Warehouse, Information Integration & Governance

IMS and IBM Accelerator for Machine Data Analytics
Consume log data produced by Transaction Analysis Workbench
Index and link transactions together across products (IMS, DB2, MQ, CICS, WebSphere)
Make large amounts of IMS transactional log data available to the suite of BigInsights tools

IMS log and MDA overview
Diagram labels: Transaction Analysis Workbench; conversion to ASCII in CSV format; SFTP; BigInsights Platform (HDFS); MDA index, extraction; BigSheets; Data Explorer
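Once log records have been converted to CSV, linking the events of one transaction across products amounts to grouping rows by a shared identifier. The Python sketch below illustrates that step only; the column names (`tran_id`, `product`, `timestamp`, `event`) and the sample rows are hypothetical and do not reflect the actual Transaction Analysis Workbench output layout.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Hypothetical CSV extract; real column names and values will differ.
SAMPLE_CSV = """tran_id,product,timestamp,event
T001,IMS,2014-03-13T10:00:00.100,transaction start
T002,IMS,2014-03-13T10:00:00.150,transaction start
T001,MQ,2014-03-13T10:00:00.400,message put
T001,DB2,2014-03-13T10:00:00.250,SQL call
T002,CICS,2014-03-13T10:00:00.300,program link
"""


def link_transactions(csv_text: str) -> Dict[str, List[dict]]:
    """Group CSV rows by transaction id and order each group by timestamp."""
    linked: Dict[str, List[dict]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        linked[row["tran_id"]].append(row)
    for events in linked.values():
        # ISO-8601 timestamps of equal length sort correctly as strings.
        events.sort(key=lambda r: r["timestamp"])
    return dict(linked)


linked = link_transactions(SAMPLE_CSV)
t001_products = [event["product"] for event in linked["T001"]]
print(t001_products)
```

The same grouping could be expressed in BigSheets or a MapReduce job once the CSV files are in HDFS; the sketch only shows the shape of the result.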

Transaction Analysis Workbench: log conversion

Machine Data Analytics Accelerator: Watson Explorer search

Machine Data Analytics Accelerator: data analytics using BigSheets

Agenda
Big Data in an information-driven economy
Why start with System z
IMS strategies for big data
Summary / Call to action

Conclusion / Call to action
For additional information, including whitepapers and demos, please visit: IBM big data for z web site; Smarter Computing; Information Management; System z
Education: free online education at bigdatauniversity.com (145,000+ registered students)
Further developments: future webcasts and announcements; World of DB2 for z/OS
Take action now!
Want to experiment on a big data integration project? Partner with IBM Silicon Valley Laboratory (richtran@us.ibm.com).
Develop your own big data strategy: contact your local IBM sales representative to get started.

Thank You! Your feedback is important to us