Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies



Similar documents
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Big Data, Platforms, Tools, and Advanced Analytics Applications and Capabilities What do you Need

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Big Data Are You Ready? Thomas Kyte

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

Introducing Oracle Exalytics In-Memory Machine

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

An Oracle White Paper June Oracle: Big Data for the Enterprise

Constructing a Data Lake: Hadoop and Oracle Database United!

An Oracle White Paper October Oracle: Big Data for the Enterprise

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Apache Hadoop: Past, Present, and Future

Big Data: Are You Ready? Kevin Lancaster

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

Modernizing Your Data Warehouse for Hadoop

BIG DATA TRENDS AND TECHNOLOGIES

The Future of Data Management

Architecting for the Internet of Things & Big Data

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

How To Create A Business Intelligence (Bi)

Safe Harbor Statement

An Oracle White Paper September Oracle: Big Data for the Enterprise

<Insert Picture Here> Big Data

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

Oracle Big Data Essentials

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Are You Ready for Big Data?

The Future of Data Management with Hadoop and the Enterprise Data Hub

Trafodion Operational SQL-on-Hadoop

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Oracle Big Data SQL Technical Update

Geospatial Technology Innovations and Convergence

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Oracle Database 12c Plug In. Switch On. Get SMART.

So What s the Big Deal?

Are You Ready for Big Data?

1 Performance Moves to the Forefront for Data Warehouse Initiatives. 2 Real-Time Data Gets Real

HDP Hadoop From concept to deployment.

Big Data and Your Data Warehouse Philip Russom

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture

IBM Big Data Platform

Where is... How do I get to...

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Luncheon Webinar Series May 13, 2013

Big Data Big Data/Data Analytics & Software Development

Big Data and Analytics in Government

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Oracle Big Data Strategy Simplified Infrastrcuture

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

TUT NoSQL Seminar (Oracle) Big Data

Big Data on Microsoft Platform

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

Workshop on Hadoop with Big Data

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

Business Intelligence for Big Data

#TalendSandbox for Big Data

Hadoop Ecosystem B Y R A H I M A.

Hadoop and Map-Reduce. Swati Gore

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Big Data for Official Statistics Processing Big and Fast Data Optimizing Results with a Multi-Model Database

Transforming the Telecoms Business using Big Data and Analytics

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

HDP Enabling the Modern Data Architecture

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

Using OBIEE for Location-Aware Predictive Analytics

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Oracle Big Data Building A Big Data Management System

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc.

BIG DATA What it is and how to use?

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

Implement Hadoop jobs to extract business value from large and varied data sets

Real Time Big Data Processing

High Performance IT Insights. Building the Foundation for Big Data

BIG DATA TECHNOLOGY. Hadoop Ecosystem

Transcription:

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights

Big Data, Advanced Analytics: WHAT, WHY, HOW WHAT is Big Data? WHY: The Value of Big Data Technologies in Threat Identification, Situational Awareness, and Geospatially Enabled Applications HOW: Integrating Big Data Analytics with Spatial, Semantic and Traditional Business Intelligence Technologies HOW: Engineered Systems Simplify the Creation and Configuration, and lower the Cost of Big Data Environments 2 Copyright 2012, Oracle and/or its affiliates. All rights

Global Digital Data Growth Growing leaps and bounds by 40+% YoY! 2009 =.8 Zetabytes =.08 ZB Structured Data =.72 ZB Unstructured Data 2020 = 35 Zetabytes LEGEND Structured Data Unstructured Data = 3.5 ZB Structured Data = 31.5 ZB Unstructured Data* (1 Zetabyte = 1 Trillion Gigabytes) Chart conservatively assumes a constant 9:1 ratio of unstructured data vs. structured data (based upon IDC s estimate that 90% of all digital data is unstructured). Chart does not reflect IDC s projection that unstructured data is currently growing twice as fast as structured data at the rate of 63.7% vs. 32.3% CAGR. Source: IDC Digital Universe Study, A Digital Universe Decade Are Your Ready?, 2010 3 Copyright 2012, Oracle and/or its affiliates. All rights

Big Data - Diverse types of data Sensor data Clickstream data (weblogs) Social network data and logs Imagery Video surveillance feeds 4 Copyright 2012, Oracle and/or its affiliates. All rights

Big Data - Diverse Types Big Data: Decisions based on all your data Social Data Semantic Graphs Connect the Dots Documents Machine-Generated Data Geospatial Where is it Happening? In What Context? Video 5 Copyright 2012, Oracle and/or its affiliates. All rights

Definition: What is Big Data Anyway? The 4 Vs of Big Data? Volume It s about 100s TeraBytes and Petabytes... BUT could be smaller volumes size is in the eye of beholder Variety Highlights the new types of data from the internet of things that can be stored and analysed Velocity Relates to streams of high frequency data arriving e.g. From a sensor, network, Imagery, UAV, or high volume event Value At volume low value data can highlight high value patterns, trends and insights of significant commercial value 6 Copyright 2012, Oracle and/or its affiliates. All rights

Big Data in Public Sector Examples of Different Program Areas Use Cases 1. Fraud Prevention 2. Maintenance & Utilities 3. Constituent Sentiment 4. Threat Identification 5. Economic Analysis 6. Healthcare 7. Regulatory Compliance, Licensing & Enforcement 8. Open Government 9. Tax Collections 7 Copyright 2012, Oracle and/or its affiliates. All rights 7

Big Data Poses Big Challenge For Military Intelligence The amount of data generated is reaching epic proportions Number of sensors deployed on the ground or aloft on unmanned aerial vehicles (UAVs) and satellites continues to soar, as does the ability to keep this asset in place for longer periods Volume of data being gathered for intelligence, surveillance and reconnaissance (ISR) is already huge. In Afghanistan, ISR acquisition systems gather more than 53TB of data every day. There s a gap between our growing ability to collect data and our limited ability to process that data. We are collecting 1,500% more data than we did 5 years ago. At the same time, our head count has barely risen. Gen. Robert Kehler, commander of the U.S. Strategic Command, late 2011 Throughout the defense industry, there s a concerted effort to deal with socalled big data. There is a recognition that it is simpler to resolve the problems with technology than to boost the staff of skilled analysts. Source: Terry Coslow, Defense Systems, 29 March 2012 8 Copyright 2012, Oracle and/or its affiliates. All rights

What Can Federal Agencies Do With Big Data? Example: Weapons of Mass Destruction Tracking Combining Information on: Goods Materials People Money Transportation To give analysts awareness never before possible 9 Copyright 2012, Oracle and/or its affiliates. All rights

Secur ed Example: DHS Risk Analysis External Data Sources Transactional & Operational Systems Contents Repository Databases Web resources Blogs, Mails, news Real-time Data Streams Search, Presentation, Report, Visualization, Query Automatic Responses and Publishing SMS Console Alerts EV Grid Management Enterprise Data Management Infrastructure GeoSpatial Documents Historical Records POIs Demographics Customer Data Call Records Workflow Initiation Real-time Dashboards 10 Copyright 2011, Oracle and/or its affiliates. All rights

Maintenance and Utilities Structured, but massive data sets Condition Based Maintenance (CBM+) relies on sensors to collect information about the vehicle condition The condition data is combined with other data such as operational use and cost data Fleet-wide data sets need to be analyzed for trends that lead to increased costs, but analysts do not always know what they are looking for Big Data can help direct the analysts to the areas of most bang-for-the-buck Utilities can use these to spot trends in the massive time series data from millions of smart meter readings and lower costs or prevent blackouts 11 Copyright 2011, Oracle and/or its affiliates. All rights 11

Non-Defense & Security Uses Use Cases National Institutes of Health Disease outbreak detection & tracking USDA Food Stamp Fraud detection/tracking Housing and Urban Development Development and use of affordable housing 12 Copyright 2012, Oracle and/or its affiliates. All rights

Big Data Now Lets Go into HOW Given you know what it is Why You want to use it HOW can you identify all the technologies you need HOW to get all those technologies in an Engineered Solution for you 13 Copyright 2012, Oracle and/or its affiliates. All rights

Hadoop, MapReduce and Related Technologies Hadoop, HDFS, Hive, MapReduce, HBase Mahout, Sqoop, Zookeeper, Flume, Oozie, Pig, Whirr NoSQL, Key Value Stores What are these? When and why are they important? For what Use Cases? What Problems are YOU trying to solve? And Oracle really supports all of these Open Source Technologies? 14 Copyright 2012, Oracle and/or its affiliates. All rights

Hadoop Architecture Management/Monitoring Distributed file system MapReduce Map/Reduce programming paradigm Highly scalable data processing Hadoop Distributed File System (HDFS) 15 Copyright 2012, Oracle and/or its affiliates. All rights

In-Database Analytics Oracle Integrated Solution Stack for Big Data HDFS Hadoop (MapReduce) Oracle NoSQL Database Oracle Big Data Connectors Data Warehouse Analytic Applications Enterprise Applications Oracle Data Integrator ACQUIRE ORGANIZE ANALYZE DECIDE 16 Copyright 2012, Oracle and/or its affiliates. All rights

Big Data Processing & Query Characteristics Batch-Oriented Process data to use Real-Time Deliver a service Bulk storage Fast access to specific record Write once, read all Read, write, delete update 17 Copyright 2011, Oracle and/or its affiliates. All rights

Big Data Storage Options: Batch vs. Real-Time Hadoop Distributed File System (HDFS) File System Parallel scanning No inherent structure High volume writes Batch Oriented NoSQL Database (Oracle ) Database Indexed storage Simple data structure High volume random reads and writes Real-Time 18 Copyright 2011, Oracle and/or its affiliates. All rights

RDBMS & NoSQL Used Alongside Each Other RDBMS High value, high density, complex data NoSQL architectures Low value, low density, simple data Complex data relationships Schema-centric Designed to scale up & out Lots of general purpose features/functionality High overhead ($ per operation) Very simple relationships Schema-free, unstructured or semistructured data Distributed storage and processing Stripped down, special purpose data store Lower overhead ($ per operation) 19 Copyright 2012, Oracle and/or its affiliates. All rights

Oracle NoSQL Database A distributed, scalable key-value database Simple Data Model Key-value pair with major+sub-key paradigm Read/insert/update/delete operations Scalability Dynamic data partitioning and distribution Optimized data access via intelligent driver High availability One or more replicas Disaster recovery through location of replicas Resilient to partition master failures No single point of failure Transparent load balancing Reads from master or replicas Driver is network topology & latency aware Application NoSQLDB Driver Storage Nodes Data Center A Application NoSQLDB Driver Storage Nodes Data Center B 20 Copyright 2012, Oracle and/or its affiliates. All rights

Big Data Apparent Disconnect Two Separate Developer Worlds Unstructured Data Hadoop, HDFS, Hive, MapReduce, HBase Mahout, Sqoop, Zookeeper, Flume, Oozie, Pig, Whirr Distributed File Systems Transaction (Key-Value Stores) Acquire Big Data Environments Organize MapReduce Solutions Traditional Environments Analyze NoSQL Flexible Specialized Developer Centric Structured Data DBMS (OLTP) Extract Transform and Load (ETL) DBMS (DW) Advanced Analytics SQL Trusted Secure Administered Acquire Organize Analyze 21 Copyright 2012, Oracle and/or its affiliates. All rights

Oracle Data Integration for Big Data Improving Productivity and Efficiency for Big Data Transforms Via MapReduce(HIVE) Oracle Data Integrator Oracle Loader for Hadoop Activates Loads Benefits Consistent tooling across BI/DW, SOA, Integration and Big Data Reduce complexities of processing Hadoop through graphical tooling Improves productivity when processing Big Data (Structured + Unstructured) Oracle Big Data Appliance Hadoop, HDFS, Hive, MapReduce, HBase Mahout, Sqoop, Zookeeper, Flume,Oozie, Pig, Whirr Oracle Exadata 22 Copyright 2012, Oracle and/or its affiliates. All rights

Hadoop, HDFS, Hive, MapReduce, HBase Mahout, Sqoop, Zookeeper, Flume, Oozie, Pig, Whirr Unstructured Data Structured Data Oracle s Approach To Big Data Provide all Technologies, Engineered Together Distributed File Systems Oracle Big Data Appliance Transaction (Key-Value Stores) Acquire DBMS (OLTP) Acquire Big Data Environments Organize MapReduce Solutions Traditional Environments Extract Oracle Transform Exadata and Load Database (ETL) Machine Organize DBMS (DW) Analyze Advanced Analytics Analyze Oracle Exalytics In-Memory Machine NoSQL Flexible Specialized Developer Centric SQL Trusted Secure Administered 23 Copyright 2012, Oracle and/or its affiliates. All rights

Cloud Computing Oracle Engineered Systems 24 Copyright 2011, Oracle and/or its affiliates. All rights Insert Information Protection Policy Classification from Slide 8

Big Data and Cloud Computing Next Generation Enterprise Data Platform CIO Big Data Private Cloud Drive Business Value Deliver Top Line Growth Drive Business Efficiency and Agility Deliver Bottom Line Savings Business Value Secure 25 Copyright 2011, Oracle and/or its affiliates. All rights

Unstructured / Structured Metadata? How are we solving this? Leverage Hadoop to undertake alignment of structured and unstructured content to defined domain vocabularies Reconcile bottom-up differences in heterogeneous data types and domain semantics, with top-down business analysis and workflows definitions Leverage combined Big Data and RDBMS computing platforms to: Deliver massive pre-processing requirements Provide rich BI analysis, search and discovery 26 Copyright 2011, Oracle and/or its affiliates. All rights

SEMANTICS: RDF Graph Metadata Addresses metadata alignment challenge Apps RDF metadata (data & rules) Instance Data RDF introduces an enterprise metadata framework. The metadata graph associates underlying data, process and domain models based on their semantics. Linking of resources (via metadata registries) enables interoperability between apps that exchange data. 27 Copyright 2012, Oracle and/or its affiliates. All rights

Analysis with R R workspace console Oracle Database 11g statistics engine OBIEE, Web Services Function push-down data transformation & statistics Transparently leverage Hadoop for High Performance Analytics to Oracle Big Data Appliance (Oracle R Connector for Hadoop) 28 Copyright 2012, Oracle and/or its affiliates. All rights

Predictive Analytics with R, BI, Data Mining R workspace console Oracle statistics engine Function push-down data transformation & statistics OBIEE, Web Services No changes to the user experience Scale to large data sets Embed in operational systems Development Production Consumption 29 Copyright 2012, Oracle and/or its affiliates. All rights

Oracle In-Database Analytics Platform Oracle R Enterprise Oracle Data Mining Spatial Analytics Text and Search RDF Graph Analytics SQL Analytics Parallel Processing Engine XML Relational OLAP Spatial Data Layer RDF Media 30 Copyright 2012, Oracle and/or its affiliates. All rights

In-Database Analytics Oracle Complete Big Data Platform Big Data Exadata Exalytics Appliance Exadata Exalytics Big Data Appliance Cloudera s Distribution including Apache Hadoop Oracle NoSQL Database Open Source R Applications Oracle Big Data Connectors InfiniBand Oracle Data Integrator Oracle Advanced Analytics Data Warehouse Oracle Database InfiniBand Analytic Applications Alerts, Dashboards, MD- Analysis, Reports, Query Web Services BI Abstraction ACQUIRE ORGANIZE ANALYZE DECIDE 31 Copyright 2012, Oracle and/or its affiliates. All rights

Why Build A Hadoop Appliance? Time to Build Optimizations Maintenance 32 Copyright 2011, Oracle and/or its affiliates. All rights

Required Skills for MapReduce Development Java Hadoop Framework Parallel Algorithms & Manage Large Clusters of Machines 33 Copyright 2011, Oracle and/or its affiliates. All rights

Batch-Oriented Processing on Hadoop INPUT 1 OUTPUT 1 REDUCE REDUCE REDUCE REDUCE REDUCE SHUFFLE /SORT REDUCE REDUCE SHUFFLE /SORT INPUT 2 SHUFFLE /SORT REDUCE REDUCE SHUFFLE /SORT REDUCE SHUFFLE /SORT REDUCE REDUCE REDUCE OUTPUT 2 34 Copyright 2011, Oracle and/or its affiliates. All rights

Oracle Big Data Appliance Optimized and Complete All you need to store and integrate big data Integrated with Oracle Exadata Analyze all your data Easy to Deploy Risk Free, Quick Installation and Setup Single Vendor Support Full Oracle support for all hardware and software 35 Copyright 2012, Oracle and/or its affiliates. All rights

Oracle Exadata Database Machine Data Warehousing, Transaction Processing, Consolidation Fastest Data Warehouse & OLTP Best Cost/Performance Data Warehouse & OLTP Optimized Hardware (per rack) Processor: up to128 Intel Cores and 2 TB DRAM Network: 880 GB/Sec Throughput Storage: 5 TB Flash and up to 336 TB Disk Software Breakthroughs Exadata Smart Storage Grid Smart Flash Cache Hybrid Columnar Compression Parallel Scale-Out Database and Storage Scales from ¼ Rack to 8 Full Racks 36 Copyright 2012, Oracle and/or its affiliates. All rights

Oracle Exalytics In-Memory Machine First engineered system for analytics Visual Analysis without limits Smarter analytic applications 37 Copyright 2011, Oracle and/or its affiliates. All rights

Cloud Computing Oracle Engineered Systems 38 Copyright 2011, Oracle and/or its affiliates. All rights Insert Information Protection Policy Classification from Slide 8