Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services





Agenda
- What is Hadoop?
- Why Hadoop? The Net Generation is here
- Sizing the Hadoop
- Gartner Hadoop Hype Cycle
- TCS view point
- Hadoop Eco System Landscape
- Examples of uses of Hadoop
- Transformational Platform
- Ad Hoc Analysis
- Analytics with Hadoop
- Applications of Hadoop Analytics
- Near Real Time Analysis
- What is the market?
- Thank You

What is Hadoop?
Hadoop is the name of a toy elephant, given to:
- A scale-out computing platform that processes Internet-size data
- A parallel file system modeled after the Google File System
- A parallel programming environment modeled after Google Map/Reduce
- Open source software
- Commodity hardware
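The Map/Reduce programming model named on this slide can be illustrated with a minimal single-process sketch. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the grouping step that a real cluster performs across many nodes is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework on a cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of text records."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["big data big cluster", "data on commodity hardware"]
    print(word_count(docs))
```

Because each call to `map_phase` sees only one record and each call to `reduce_phase` sees only one key, both phases can in principle run in parallel on different machines, which is the property the slide's "scale-out" claim rests on.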

Why Hadoop? The Net Generation is here
The Net Generation is inter-connected on a variety of Web-based and digital channels.
Big Data: Web Scale
- 50 billion web pages
- 800 million Facebook users
- 1000 million Facebook pages
- 200 million Twitter accounts
- 100 million tweets per day
- 5 billion Google queries per day
- Millions of servers, petabytes of data
Varieties of Data
- Video / Audio
- Images / Pictures
- Diverse internal and external data
Sources of Data
- News / Feeds / Blogs / Forums
- Groups / Polls / Chats / Wiki
Information is exploding all around, but the challenge is to understand it.

Sizing the Hadoop
(Figure. Source: Pawyi Lee)

Hadoop Hype Cycle Starts
(Figure: Gartner Hype Cycle 2012)

TCS View Point: Hadoop Technology is here now
Big Data technology handles data at extreme scale and is characterized by:
- Massively parallel computing to divide and conquer workloads
- Extreme flexibility, allowing unlimited data manipulation and transformation
- Massive scalability in terms of both technology and cost
Hadoop:
- Massively parallel processing capability, running on commodity hardware
- HBase and Hadoop/HDFS are designed to store and manage massive amounts of data
- Hive, Mahout and R enable query, analysis, and running in-memory, compute-intensive applications
The ecosystem of Hadoop technology is affordable and within the reach of companies.

Hadoop Eco System Landscape
(Figure; category labels, top to bottom)
- Analytics / Visualization
- Search
- NoSQL
- Query Oriented Data Warehouse
- Data Integration
- CEP
- Languages / Libraries
- Tools
- Hadoop Distributions
- Appliance / MR Rewrite
- Cloud Distributions
- Map Reduce
- Distributed File System

Examples of Uses of Hadoop
Hi Tech
- Process control for microchip fabrication
- Network management
- Supply chain management and analysis
- New product development
- Content management solutions
Travel, Transportation & Hospitality
- Better travel searches
- Geo fencing
- Cross selling and up selling
- Intelligent traffic management
Energy, Resources & Utilities
- Weather impact analysis on power generation
- Oil rig data monitoring
- Smart meter data analysis
- Terrain data analysis for wind energy
Insurance
- Claims analysis & premium forecasting
- Claims fraud detection & revenue comparison
- Overall risk analysis & reinsurance risk assessment
- Policy pricing & customer retention
Smart Grids
Government
- Fraud detection and cyber security
- Compliance and regulatory analysis
- Energy consumption and carbon footprint management
- Disaster management

Hadoop as Transformation Platform in ETL
(Diagram labels: Transactional Systems; Hadoop Cluster running MapReduce / Hive / Pig on HDFS; Data Warehouse)
- Within the Hadoop ecosystem, MapReduce, Hive, or Pig can be used to transform data within the distributed file system (HDFS).
- Hadoop cluster: fewer, higher-end nodes.
- Tools like Sqoop can be used to load data into and out of HDFS.
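A transformation step of the kind this slide describes could be sketched as a mapper/reducer pair that exchanges tab-separated `key<TAB>value` lines, with the reducer receiving its input sorted by key, which is the convention Hadoop Streaming uses. The record layout (`customer_id,amount`), the function names, and the `run_locally` helper are assumptions made for illustration; on a cluster the sort between the two phases is done by the framework rather than by `sorted`.

```python
from itertools import groupby


def mapper(lines):
    """Parse raw 'customer_id,amount' records (hypothetical layout) into key<TAB>value lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank records
        customer_id, amount = line.split(",")
        yield f"{customer_id}\t{amount}"


def reducer(sorted_lines):
    """Sum the amounts for each key; the input must already be sorted by key."""
    parsed = (line.split("\t") for line in sorted_lines)
    for customer_id, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(float(amount) for _, amount in group)
        yield f"{customer_id}\t{total:.2f}"


def run_locally(raw_lines):
    """Local stand-in for the framework: map, sort by key, then reduce."""
    mapped = sorted(mapper(raw_lines), key=lambda line: line.split("\t")[0])
    return list(reducer(mapped))


if __name__ == "__main__":
    for row in run_locally(["c2,5.00", "c1,10.50", "c1,4.50"]):
        print(row)
```

The reducer output here is one total per customer, which is the kind of pre-aggregated result that would then be loaded into the data warehouse.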

Hadoop as an ad-hoc analysis platform
(Diagram labels: Transactional Systems; Data Warehouse; Hadoop Cluster running MapReduce / Hive / Pig on HDFS)
- MapReduce, Hive, or Pig can be used to transform data within the distributed file system (HDFS). This can give the business analytics team a platform for innovation.
- HDFS holds data at the lowest grain.
- Hadoop cluster: a higher number of nodes for larger storage.
- Tools like Sqoop can be used to load data into and out of HDFS.

Analytics With Hadoop
- Prescriptive (What should happen?)
- Predictive (What will happen?)
- Descriptive (What has happened?)
(Figure labels: Optimization; Simulation; Optimizing outcomes; Identifying possible outcomes; Domain Expertise; Text Analytics; Data Mining; Knowledge; Predictive Modeling; Statistical Analysis; Visual Analytics; Forecasting; Describing and analyzing outcomes; Analysis, Drill Down, Ad Hoc Reporting; Dashboards and Scorecards; Visual Analytics)

Applications for Hadoop Analytics
- Smarter Healthcare
- Multi-channel sales
- Finance
- Log Analysis
- Homeland Security
- Traffic Control
- Telecom
- Search Quality
- Manufacturing
- Trading Analytics
- Fraud and Risk
- Retail: Churn, NBO

Hadoop Near Real Time Analytics
Inputs: External Inputs (incl. Social Media); Transactional Systems
Modes: Offline; Online; Real Time
Complex Event Processing (Rule Application)
- Rule / pattern matching on streams.
- Dist processing: processing is distributed on a set of nodes, and not the data.
- Fraud Detection; Online Price Mgmt; Yield Management
Batch Map-Reduce Processing (Rule Discovery)
- Rule / pattern discovery [on time series].
- Dist processing: Map-Reduce or scalable time-series pattern mining.
- [Time series] mining and rule discovery
- Learn Fraud Patterns; Demand Signal Refinement
Distributed Stream Processing [using MR] (Self Learning Systems)
- Rule / pattern discovery on streams.
- Dist processing: both processing and data are distributed on a set of nodes, e.g. C-MR (academic project).
- Complex / dynamic pattern matching, e.g. Trading Patterns, Mining Current Influencers
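The "rule / pattern matching on streams" idea for fraud detection might look like the following single-node sketch. The rule itself (flag an account that produces more than `max_events` events inside a sliding time window), the class name `VelocityRule`, and the `scan` helper are all hypothetical and are not taken from the slide. The sketch also assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Hypothetical rule: fire when an account exceeds max_events within window_seconds."""

    def __init__(self, max_events=3, window_seconds=60):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # account -> timestamps still inside the window

    def observe(self, account, timestamp):
        """Process one event and return True if the rule fires for it."""
        window = self._recent[account]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_events


def scan(events, rule):
    """Apply the rule to a stream of (account, timestamp) events; return the flagged ones."""
    return [(account, ts) for account, ts in events if rule.observe(account, ts)]


if __name__ == "__main__":
    rule = VelocityRule(max_events=2, window_seconds=10)
    stream = [("a", 0), ("a", 3), ("b", 4), ("a", 5), ("a", 30)]
    print(scan(stream, rule))
```

Only a small per-account window is kept in memory, so each event is judged as it arrives. Discovering which thresholds are worth applying is the batch "rule discovery" side of the slide and is not shown here.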

What is the Market?

Thank You
5 December, 2012