VIEWPOINT: High Performance Analytics




Industry Context and Trends

In the digital age of social media and connected devices, enterprises have a wealth of data to mine. By discovering hidden correlations and performing deeper real-time analytics, they can deliver a better customer experience, improve operational efficiency and monetize their data assets to uncover new revenue streams. In addition to answering the traditional questions around sales, performance and profitability, enterprises are now looking to discover correlations between seemingly unrelated streams of data to gain a competitive advantage.

Figure 1: Analytics Industry Use Cases

Utilities: Smart grid analytics; Preventive maintenance; Distribution load forecasting and scheduling; Create targeted customer offerings; Condition-based maintenance
Telecommunications: Network data monetization; Revenue assurance; CDR analytics; Location based services; Smarter campaigns
Oil & Gas: Production loss minimization; Advanced condition monitoring; Enable customer energy management; Production surveillance & optimisation
Banking & Financial Sector: Fraud detection; Risk modelling & management; Insurance claim analysis; Contact centre efficiency and problem management; Counterparty credit risk management
Government: Threat prediction and prevention; Social program fraud, waste and errors; Tax compliance (fraud and abuse); Crime prediction and prevention
Retailers: Online personalization; Recommendation engines; Marketing spend optimization; Actionable customer insight
Manufacturing & Automotive: Proactive equipment maintenance; Supply chain management; Predictive asset optimisation; Connected vehicle; Actionable customer insight

Discovery of such hidden correlations and insights requires platforms that:
- Ingest, store and process large volumes of high-velocity data in real time from a variety of data sources
- Generate rapid insights by correlating multiple structured and unstructured data streams
- Enable data scientists to discover correlations and model algorithms that can deliver insights
- Self-learn based on new data patterns and improve the accuracy of insights

With the advent of new types of data emerging from the web and connected devices, the challenges in delivering these insights have increased manifold and put tremendous pressure on traditional Business Intelligence platforms. Structured data grew by more than 40% per year. Traditional content types, including unstructured data, are growing by up to 80% per year. An estimated 2.8 ZB of data in 2012 is expected to grow to 40 ZB by 2020.
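To put the 2.8 ZB to 40 ZB projection on the same per-year footing as the other growth figures, the short Python sketch below computes the compound annual growth rate implied by the two endpoints. The function name `implied_annual_growth` is hypothetical, and the calculation assumes smooth compounding over the eight years from 2012 to 2020, which the source does not state. Under that assumption the implied rate comes out at a little over 39% per year.

```python
def implied_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate implied by two endpoint values.

    Assumes smooth year-over-year compounding (an illustrative assumption,
    not something the source figures state).
    """
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("start_value, end_value and years must all be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints quoted in the text: 2.8 ZB in 2012, 40 ZB expected by 2020.
rate = implied_annual_growth(2.8, 40.0, 2020 - 2012)
print(f"Implied annual growth: {rate:.1%}")
```

The same function can be reused for any other pair of endpoint figures, provided both values use the same unit.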
85% of this data growth is expected to come from new types of data, with machine-generated data projected to increase 15x by 2020 (Source: IDC).

This new form of data poses unique challenges for business and IT, such as:
- Effectively storing large multi-structured data
- Capturing high-speed data and processing it at the right time
- Creating flexible yet high-performance data structures to answer new business questions
- Creating a platform that provides integrated and unified access to all the information

Market Offerings and Gaps

To address the challenges mentioned above, the industry is now looking at platforms to support the Volume, Velocity, Variety and Value needs of Next-Gen Business Intelligence. These platforms are typically a combination of Data and Discovery/Analytical Tools. Current offerings available in the market can be classified as follows.

Data, which offer open source platforms for building and managing Data Lakes and Big Data Warehouses. These platforms leverage components from the open source Hadoop stack and provide platform administration, governance, security and basic discovery capabilities on top of the core stack. Some of the leading vendors in this space include Cloudera, Hortonworks and MapR.

External Document © 2015 Infosys Limited


Analytics, which offer point analytical solutions to industry use cases such as Customer Analytics and Network Analytics. These offerings provide pre-built analytics on top of Hadoop ecosystems to serve specific business needs. Some of the leading vendors in this space include Datameer, Platfora and Guavus.

Augmented Appliances, which offer high-speed appliances on top of Hadoop storage to speed up data access. Teradata Aster, EMC Greenplum, SAP HANA, HP Vertica and IBM BigInsights are some of the leading vendors in this category.

Data Discovery tools, which provide exploratory capabilities on top of Big Data storage platforms. They help ingest data from a variety of data sources, model it and create consumable data sets out of the underlying data. Tableau, Qlik Sense and TIBCO Spotfire are some of the leading vendors in this space.

While these offerings help in addressing some of the challenges posed, none of them is a silver bullet that solves all of the Big Data challenges.

Data
  Benefits: Low cost; simplify the platform management functions and allow organizations to focus on data
  Gaps: No pre-built analytics; minimal discovery capabilities

Analytics
  Benefits: Point solutions that come with pre-built analytics; industry standard algorithms
  Gaps: High cost; point solution that caters to specific use cases and is not meant for integrated analytics

Augmented Appliances
  Benefits: MPP and in-memory capabilities offer fast response time for on-demand analytics
  Gaps: High cost; no pre-built analytics; minimal discovery capabilities

Data Discovery Tools
  Benefits: Enable data scientists to access raw data and discover insights; cut down on data preparation time significantly
  Gaps: No data platform; no pre-built analytics

Infosys PoV

Infosys believes that the challenges posed by Big Data need a High Performance Analytics (HPA) platform that provides a comprehensive set of building blocks to provision data, define storage structures, create data sets for consumption, enable exploration and run analytical models against the data. The solution should offer enough flexibility to extend the available analytical models to suit enterprise-specific needs. The core of any HPA platform is a data management platform that can store practically unlimited amounts of data of any format, schema and type, and that is relatively inexpensive and massively scalable. Data Lakes are designed to offer this capability.

Figure 2: Logical Architecture of Data Lake

Source data (100% of source data): transaction data, master data, machine data, web data, reference data, lookup data, public/3rd party data
Ingestion modes: real-time, micro-batch, batch
Components shown: User Access; In-Memory Performance Layer; Enterprise ETL Framework; Data Factory; Business Data Lake; Data Pools; Raw Data Reservoir; Harmonized Data Zone; Transformed Data Zone; Enriched Data Zone; Discovery Lab / Analytics Sandbox; in-motion processing; near-real-time batch processing; data pipeline management; Integrated Data Management & Governance; Enterprise ETL integration
Outputs: Actionable Information; Actionable Insights
Callouts: Making Hadoop the primary component of the DW is a game-changing trend. Leveraging the source-once-and-reuse approach improves efficiency, reduces data silos, latency and time to value, massively improves analytics and discovery, and greatly reduces cost.

Data Lakes help in two ways:
- Information Discovery / Agility in Analytics: they simplify data acquisition so that discovery can start on raw data, which is exposed to business users through discovery tools.
- Data Warehouse Expansion: they help expand the data warehouse to capture data at a lower grain and higher diversity, which is then fed into upstream systems.

Unlike traditional relational databases, a Data Lake can store data in its raw format. Analysts and developers then apply a structure that suits the needs of their applications at the time they access the data, using Schema on Read instead of Schema on Write.

Intelligent Data Discovery tools enable data scientists to build data models and views that can be used for the analysis of structured and unstructured data. They offer capabilities to search on metadata and create data sets for running analytics. They eliminate the need for IT involvement and reduce the time involved in data preparation.

Analytical modules built on statistical tools enable data scientists to build algorithms and models that can deliver predictive and prescriptive insights. While the analytical models will differ from enterprise to enterprise, a best-in-class analytics platform should have basic building blocks such as Sentiment Analysis, Text Mining, Fault Prediction, Fraud Detection and Risk Analytics, which can be extended based on the enterprise's specific needs.

Data Lakes combined with Data Discovery tools and Analytical Algorithms form the core of a High Performance Analytics Platform.
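The Schema on Read idea described above can be illustrated with a minimal Python sketch. It is not taken from any particular Data Lake product: the sample records, the field names and the helper `read_with_schema` are all hypothetical. Raw records are stored untouched as text lines, and each consumer supplies its own schema (field name mapped to a converter) only at the moment it reads the data.

```python
import json

# Raw records kept exactly as they arrived (hypothetical sample data).
RAW_EVENTS = [
    '{"device_id": "m-17", "reading": "21.5", "ts": "2015-03-01T10:00:00"}',
    '{"device_id": "m-42", "reading": "19.0", "ts": "2015-03-01T10:05:00", "firmware": "2.1"}',
    'not valid json',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) while reading raw lines.

    Records that cannot be parsed or that lack a required field are skipped;
    a real pipeline would more likely route them to a reject area.
    """
    rows = []
    for line in raw_lines:
        try:
            record = json.loads(line)
            rows.append({field: convert(record[field]) for field, convert in schema.items()})
        except (ValueError, KeyError, TypeError):
            continue
    return rows


# Two consumers read the same raw data with different structures.
BILLING_VIEW = {"device_id": str, "reading": float}
FIRMWARE_VIEW = {"device_id": str, "firmware": str}

print(read_with_schema(RAW_EVENTS, BILLING_VIEW))
print(read_with_schema(RAW_EVENTS, FIRMWARE_VIEW))
```

With Schema on Write, the structure would have to be fixed before loading, and the extra `firmware` field in the second record would either be dropped or force a schema change. Here nothing is lost at load time, and a new view can be added later without reloading the data.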
Figure 3: High Performance Analytics Platform Components (Discovery; Data Lakes / Big Data DW; Analytical Algorithms)

Figure 4: High Performance Analytics Platform Implementation Phases (by maturity phase)

Build Platform: Build Business Data Lake platform; add basic capabilities like Data Ingestion
Onboarding Data Sources: Add data sources; build metadata capabilities; build basic exploration capabilities
Standardization: Bring in more data sources; build data governance capabilities; enrich data sets with reference data; build a semantic layer
Curative Layer: Enable data hub layer for important operational reports; enable dimensional layer for analysis; sand pits for data discovery; build data services to/from existing data marts
Advanced Analytics: Create an analytics CoE; scale the analytics process by business area; build the pool of data analysts, scientists & domain experts

The High Performance Analytics Platform addresses the challenges posed by Big Data by providing:
- Data Lakes built on commodity hardware that are cost-effective for storing large volumes of data
- A distributed processing architecture that efficiently processes large volumes of structured, semi-structured and unstructured data
- Horizontal scalability that can support future needs
- In-memory processing engines to deliver rapid insights
- Pre-built analytical models that can be extended to enterprise-specific needs, reducing the time-to-insights
- Data discovery and deep analytics capability that can uncover hidden correlations and deliver deep insights based on all available data assets

Success Stories

- Implemented a Business Data Lake for an Australian telecom major to provide insights into various lines of business like Revenue Assurance and Marketing, and improved the ability to enable 11.2% of the total USD 26.8 billion revenue persistently, with coverage accelerated to 27%
- Implemented a Data Lake for a leading financial major in the US, which gave a 360° view of the wholesale customer that improved prospecting effectiveness, market segmentation and positioning
- Increased ARPU, reduced customer churn and identified new revenue streams by selling anonymized data to advertisers and retailers for a Singapore telco, by building a Big Data enabled customer analytics platform
- Created a model to predict ATM failure with an 80% confidence level for over 8500 ATMs, which resulted in a 14% increase in call center efficiency and an 18% cost reduction

For more information, contact askus@infosys.com

© 2015 Infosys Limited, Bangalore, India. All Rights Reserved. Infosys believes the information in this document is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of other companies to the trademarks, product names and such other intellectual property rights mentioned in this document. Except as expressly permitted, neither this documentation nor any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the prior permission of Infosys Limited and/or any named intellectual property rights holders under this document.