Oracle Big Data for Dummies




Oracle Big Data for Dummies
Sai Janakiram Penumuru, WW Product Expert, Cloud Platforms, Hewlett-Packard, India

The Father of Microbiology: Antonie Philips van Leeuwenhoek, the first microbiologist

Sai Janakiram Penumuru
- Twelve years as Oracle DBA / Oracle Apps DBA / Cloud Architect
- Current position: WW Product Expert, Cloud Platform - Oracle, at HP
- Co-Founder & Director of Finance, All India Oracle Users Group (AIOUG)
- Oracle Database 12c Beta Tester
- Oracle VM SIG Leader (www.oraclevmsig.org)
- Blog: www.oadba.com; www.oracle12c.info

Agenda
- A New Style of IT
- Defining Big Data: Market Drivers and Trends
- Why Big Data Now?
- Overview of Oracle Big Data Appliance
- Oracle Big Data Implementation Steps
- Oracle Integrated Software Solution
- Oracle Big Data Best Practices
- Demo
- The Call to Action (Next Steps)

A new era of accelerated innovation
Forever changing how consumers and businesses interact, enabling new opportunities.
Every 60 seconds (2013):
- 98,000+ tweets
- 695,000 status updates
- 11 million instant messages
- 698,445 Google searches
- 168 million+ emails sent
- 1,820 TB of data created
- 217 new mobile web users
Growing Internet of Things (IoT), by 2020:
- Devices: 30 billion (1)
- Data: 40 trillion GB (2)
- Mobile apps: 10 million (3)
- For 8 billion people (4)
Pervasive connectivity, smart device expansion, and an explosion of information: a new style of IT is required for IoT solutions.
Sources: (1) IDC Directions 2013: Why the Datacenter of the Future Will Leverage a Converged Infrastructure, March 2013, Matt Eastwood; (2) & (3) IDC Predictions 2012: Competing for 2020, Document 231720, December 2011, Frank Gens; (4) http://en.wikipedia.org

Defining Big Data: Market Drivers and Trends

Big Data defined
Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. (1)
- Dimensions: Volume, Velocity, Variety, Value
- Information sources: CRM, SCM, ERP; video; IT ops; email; transactional data; mobile; audio; texts; social media; search; images
Big Data is no longer just a buzzword.
(1) Source: Gartner, The Importance of 'Big Data': A Definition, June 2012

Information from the Internet of Things: we have gone beyond the decimal system
Today, data scientists use yottabytes to describe how much government data the NSA or FBI have on people altogether. In the near future, the brontobyte will be the measurement used to describe the type of sensor data that will be generated by the IoT.
- Megabyte (10^6 bytes)
- Gigabyte (10^9 bytes)
- Terabyte (10^12 bytes): 500 TB of new data per day are ingested into Facebook databases
- Petabyte (10^15 bytes): the CERN Large Hadron Collider generates 1 PB per second
- Exabyte (10^18 bytes): 1 EB of data is created on the internet each day, equal to 250 million DVDs' worth of information; the proposed Square Kilometer Array telescope will generate an EB of data per day
- Zettabyte (10^21 bytes): 1.3 ZB of network traffic by 2016
- Yottabyte (10^24 bytes): our digital universe today, equal to 250 trillion DVDs
- Brontobyte (10^27 bytes): our digital universe tomorrow
- Geopbyte (10^30 bytes): this will take us beyond our decimal system
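
To make the scale concrete, the following minimal Python sketch converts a raw byte count into the unit names used on this slide. The function name describe and the UNITS table are illustrative assumptions; note that "Brontobyte" and "Geopbyte" are informal names taken from the slide, not official SI prefixes.

```python
# Decimal unit names and exponents as listed on the slide.
# "Brontobyte" and "Geopbyte" are informal names, not SI prefixes.
UNITS = [
    ("Megabyte", 6), ("Gigabyte", 9), ("Terabyte", 12), ("Petabyte", 15),
    ("Exabyte", 18), ("Zettabyte", 21), ("Yottabyte", 24),
    ("Brontobyte", 27), ("Geopbyte", 30),
]


def describe(num_bytes):
    """Return (value, unit name) using the largest unit that fits num_bytes.

    Counts below one megabyte are still reported in megabytes.
    """
    name, exponent = UNITS[0]
    for unit_name, unit_exponent in UNITS:
        if num_bytes >= 10 ** unit_exponent:
            name, exponent = unit_name, unit_exponent
    return num_bytes / 10 ** exponent, name


# Example: the 500 TB per day quoted for Facebook on the slide.
value, unit = describe(500 * 10 ** 12)
print(value, unit)
```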

Second half of the chessboard: the wheat and chessboard problem
- On the entire chessboard there would be 2^64 - 1 = 18,446,744,073,709,551,615 grains of rice
- Weighing 461,168,602,000 metric tons
- A heap of rice larger than Mount Everest
- Around 1,000 times the global production of rice in 2010 (464,000,000 metric tons)
Source: http://en.wikipedia.org/wiki/wheat_and_chessboard_problem
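
The arithmetic behind the chessboard figures can be checked with a short Python sketch. The tonnage and production numbers are the ones quoted on the slide; the variable names and the implied per-grain mass are derived here for illustration only.

```python
# Grains on a 64-square board when each square doubles the previous one:
# 1 + 2 + 4 + ... + 2**63, which equals 2**64 - 1.
total_grains = sum(2 ** square for square in range(64))

# Figures quoted on the slide, in metric tons.
heap_weight_t = 461_168_602_000
rice_production_2010_t = 464_000_000

# How many years of 2010 production the heap represents.
ratio = heap_weight_t / rice_production_2010_t

# Per-grain mass implied by the slide's tonnage (1 metric ton = 1e9 mg).
implied_grain_mg = heap_weight_t * 1e9 / total_grains

print(total_grains)
print(round(ratio))
print(round(implied_grain_mg, 1))
```

The ratio comes out just under 1,000, which matches the "around 1,000 times" claim on the slide.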

A real-world example: sensor data collected from US commercial jet engines during one year
20 TB x 2 x 2.5 x 28,537 x 365 = 1,041,600,500 TB
- 20 TB: information generated per engine every hour
- 2: engines on a twin-engine Boeing 737
- 2.5: average duration of US flights, in hours
- 28,537: commercial flights in the sky in the United States on any given day
- 365: days in a year
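
The estimate is a straight product of the five factors on the slide. A small Python sketch reproduces it and, as an added convenience not found on the slide, expresses the result in zettabytes; the variable names are illustrative.

```python
# Factors as given on the slide.
tb_per_engine_hour = 20      # terabytes per engine per hour
engines_per_aircraft = 2     # twin-engine Boeing 737
avg_flight_hours = 2.5       # average US flight duration
flights_per_day = 28_537     # commercial flights per day in the US
days_per_year = 365

total_tb = (tb_per_engine_hour * engines_per_aircraft * avg_flight_hours
            * flights_per_day * days_per_year)

# 1 TB = 1e12 bytes and 1 ZB = 1e21 bytes, so 1 ZB = 1e9 TB.
total_zb = total_tb / 1e9

print(f"{total_tb:,.0f} TB per year, about {total_zb:.2f} ZB")
```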

Big data is derived from a variety of sources
Dimensions: Variety, Velocity, Volume

A day in the life of Big Data
An intelligent end-to-end approach delivers the right information to the right person at the right time.
- Outputs: executive dashboards, enterprise search, customer interaction, predictive analytics, web engagement
- Dimensions: variety, velocity, volume
- Levels: transactional, operational, strategic
- Sources: social media, video, audio, email, texts, transactional data, CRM (sales), web, ERP (procurement), supply chain (ops), Word and Excel documents, logs, clickstream data, HR, images, machine-generated data (MGD)

Why Big Data Now?

Command of information drives increased business performance
Right information. Right person. Right time.
- Hindsight: What happened?
- Insight: What is happening now?
- Foresight: What will happen?
- Information states: active, in-flight, inactive
- Result: better, right-time decisions

Big Data opportunities across industries and use cases
Innovative analytic use cases are cutting across structured, unstructured and semi-structured data.
- Industries: Finance, Government, Telecom, Manufacturing, Energy, Healthcare
- Industry use cases: fraud detection, anti-money laundering, risk management, law enforcement, counter terrorism, traffic flow optimization, broadcast monitoring, churn prevention, advertising optimization, supply chain optimization, defect tracking, RFID correlation, warranty management, weather forecasting, natural resource exploration, drug development, scientific research, evidence-based medicine, healthcare outcomes analysis, sentiment analysis, social CRM / network analysis, churn mitigation, brand monitoring, cross-sell and up-sell, loyalty & promotion analysis, web application optimization
- Horizontal use cases: marketing campaign optimization, brand management, social media analytics, pricing optimization, internal risk assessment, customer behavior analysis, revenue assurance, logistics optimization, clickstream analysis, influencer analysis, IT infrastructure analysis, legal discovery, equipment monitoring, enterprise search
Sources: IDC (2012), Worldwide Big Data Technology and Services Forecast: 2011-2015; Gartner (2012), Big Data Drives Rapid Changes in Infrastructure and $232 Billion in IT Spending Through 2016

Which Big Data outcomes are your peers focusing on?
Solutions which improve operational efficiency and/or enhance business effectiveness:
- Improve service levels
- Track new and/or changing trends
- Maximize margins
- Identify cross-sell and upsell opportunities
- Support corporate strategy development
- Analyze/form credit risk assessment
- Perform product line management

Overview of Oracle Big Data Appliance

Oracle Big Data Appliance

Cloudera's Distribution for Hadoop
- File system mount: FUSE-DFS
- UI framework: HUE
- SDK: HUE SDK
- Workflow: Apache Oozie
- Scheduling: Apache Oozie
- Metadata: Apache Hive
- Data integration: Apache Flume, Sqoop
- Languages / compilers: Apache Pig, Apache Hive, Apache Mahout
- Fast read/write access: Apache HBase
- Storage and processing: HDFS, MapReduce
- Coordination: Apache ZooKeeper

BDA Full Rack: Hadoop + Oracle NoSQL Database
- Node1: First NameNode, DataNode, ZooKeeper, failover controller, Balancer, Puppet master & agent, NoSQL
- Node2: Second NameNode, DataNode, ZooKeeper, failover controller, MySQL backup server, Puppet agent, NoSQL Admin
- Node3: JobTracker, DataNode, ZooKeeper, Cloudera Manager server, ODI agent, MySQL primary server, Hue, Hive, Beeswax, Puppet agent, NoSQL
- Node4 to Node18 (each): DataNode, TaskTracker, Cloudera Manager agents, Puppet agents, NoSQL
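
As a rough illustration of the layout above, the rack can be modelled as a mapping from node number to service roles. This is only a sketch for reasoning about the slide: the role strings are copied from it, while the dictionary structure and the helper nodes_with are assumptions, not an Oracle or Cloudera API.

```python
# Roles shared by the fifteen worker nodes (Node4 to Node18).
WORKER_ROLES = ["DataNode", "TaskTracker", "Cloudera Manager agent",
                "Puppet agent", "NoSQL"]

rack = {
    1: ["First NameNode", "DataNode", "ZooKeeper", "Failover controller",
        "Balancer", "Puppet master", "Puppet agent", "NoSQL"],
    2: ["Second NameNode", "DataNode", "ZooKeeper", "Failover controller",
        "MySQL backup server", "Puppet agent", "NoSQL Admin"],
    3: ["JobTracker", "DataNode", "ZooKeeper", "Cloudera Manager server",
        "ODI agent", "MySQL primary server", "Hue", "Hive", "Beeswax",
        "Puppet agent", "NoSQL"],
}
for node in range(4, 19):
    rack[node] = list(WORKER_ROLES)


def nodes_with(role):
    """Return the sorted node numbers that run the given role."""
    return [node for node, roles in sorted(rack.items()) if role in roles]


print(len(nodes_with("DataNode")), "DataNodes")
print("ZooKeeper on nodes", nodes_with("ZooKeeper"))
```

Laid out this way, it is easy to see that every node stores HDFS data while the coordination and master services are concentrated on the first three nodes.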

Oracle Big Data Implementation Steps

Implementation Step 1: Acquire
(Stages: Acquire, Organize, Analyze, Visualize)
- Collect tweets into HDFS using the Twitter API
- Specify search terms
- Terms may be categorized
- Bulk collect or stream
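
The idea of categorized search terms can be sketched in a few lines of Python. Everything here is hypothetical: the categories, the terms, and the function name categorize are invented for illustration, and the sketch deliberately leaves out the Twitter API call and the HDFS write mentioned on the slide.

```python
# Hypothetical search terms grouped by category (not from the slide).
SEARCH_TERMS = {
    "baggage": ["lost bag", "luggage"],
    "delay": ["delayed", "late"],
    "service": ["crew", "rude"],
}


def categorize(text):
    """Return the sorted categories whose terms occur in the tweet text.

    Uses simple substring matching, which is crude but enough for a sketch.
    """
    lowered = text.lower()
    return sorted(
        category
        for category, terms in SEARCH_TERMS.items()
        if any(term in lowered for term in terms)
    )


print(categorize("Flight delayed and my luggage is gone"))
```

In a real collection job the same matching would run on each tweet as it arrives (stream) or over a stored batch (bulk collect) before the records land in HDFS.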

Implementation Step 2: Organize
(Stages: Acquire, Organize, Analyze, Visualize)
- MapReduce jobs compute sentiment
- Summarize by airline, category, time and user
- Load into Oracle Database using Oracle SQL Connector for HDFS
- Map Twitter ID to customer
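
The map and reduce phases of such a sentiment job can be imitated in plain Python. This is a single-process sketch under stated assumptions, not Hadoop code: the word lists, the tweet fields (airline, category, text) and the function names are all hypothetical, and the score is a naive count of positive minus negative words.

```python
from collections import defaultdict

# Hypothetical sentiment lexicon (not from the slide).
POSITIVE = {"great", "love", "thanks"}
NEGATIVE = {"delayed", "lost", "rude"}


def map_tweet(tweet):
    """Map phase: emit ((airline, category), score) for one tweet."""
    words = tweet["text"].lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    yield (tweet["airline"], tweet["category"]), score


def reduce_scores(key, scores):
    """Reduce phase: average all scores that share a key."""
    scores = list(scores)
    return key, sum(scores) / len(scores)


def run_job(tweets):
    """Group mapper output by key (the shuffle step), then reduce each group."""
    grouped = defaultdict(list)
    for tweet in tweets:
        for key, score in map_tweet(tweet):
            grouped[key].append(score)
    return dict(reduce_scores(key, scores) for key, scores in grouped.items())


sample = [
    {"airline": "AirA", "category": "delay", "text": "flight delayed again"},
    {"airline": "AirA", "category": "delay", "text": "great crew thanks"},
]
print(run_job(sample))
```

Adding time and user to the key tuple would give the other summaries listed on the slide.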

Implementation Step 3: Analyze
(Stages: Acquire, Organize, Analyze, Visualize)
Combine other customer data sets to provide a deeper level of analysis, such as social and economic importance.
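
Combining data sets amounts to a join between the sentiment results and existing customer records. The Python sketch below shows the shape of that join; in the deck this work happens inside Oracle Database, so the function enrich, the field names (followers, annual_spend) and the sample values are purely illustrative assumptions.

```python
def enrich(sentiment_by_twitter_id, twitter_to_customer, customers):
    """Join sentiment scores to customer profiles via the Twitter ID mapping.

    Twitter IDs with no known customer are skipped.
    """
    rows = []
    for twitter_id, sentiment in sentiment_by_twitter_id.items():
        customer_id = twitter_to_customer.get(twitter_id)
        if customer_id is None:
            continue
        profile = customers[customer_id]
        rows.append({"customer_id": customer_id, "sentiment": sentiment, **profile})
    return rows


# Hypothetical sample data.
sentiment = {"@flyer1": -1.0, "@unknown": 0.5}
id_map = {"@flyer1": 101}
profiles = {101: {"followers": 25000, "annual_spend": 8400}}

for row in enrich(sentiment, id_map, profiles):
    print(row)
```

Here follower count stands in for social importance and annual spend for economic importance, so an unhappy customer who scores high on both can be prioritized.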

Implementation Step 4: Visualize
(Stages: Acquire, Organize, Analyze, Visualize)
Visualize social and economic importance by using Oracle Database analytics and OBIEE dashboards to drive decision making.

Oracle Integrated Software Solution

Oracle Big Data Appliance Software Overview
Shows the relationships among the tools and identifies the tasks that they perform.

Oracle Engineered Systems
[Figure: data variety (unstructured, schema-less, schema) plotted against information density, across the Acquire, Organize and Analyze stages; systems shown: Big Data Appliance, Exadata, Oracle BI]

Oracle Big Data Appliance Usage Model
[Figure: Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics connected by InfiniBand; stages: Stream, Acquire, Organize, Analyze & Visualize]

Oracle Big Data Best Practices

Big Data Best Practices
A few general guidelines for building a successful big data architecture foundation:
- Align Big Data with specific business goals
- Ease the skills shortage with standards and governance
- Optimize knowledge transfer with a Center of Excellence
- The top payoff is aligning unstructured with structured data
- Plan your sandbox for performance
- Align with the cloud operating model

Demo
- How to get the Oracle Big Data Lite VM
- Start/stop services
- Software walk-through
- Getting data from Twitter

Oracle Big Data Lite Virtual Machine, Version 4.0: prepare your host system
Technical requirements:
- Dedicate 2 cores, 5 GB memory and 30 GB disk space to the virtual machine
- Installation requires about 45 GB disk space, including temporary files
To get started:
- Download and install Oracle VM VirtualBox and 7-zip
- Download each of the 7-zip files
- Run the 7-zip extractor on the BigDataLite-4.0.001 file only; this creates the BigDataLite-4.0.ova VirtualBox appliance file
- In VirtualBox, import BigDataLite-4.0.ova
- Start BigDataLite-4.0
- Log in as oracle/welcome1

Download Oracle Big Data Lite Virtual Machine, Version 4.0
http://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html
Files:
- Deployment Guide: start here; it provides step-by-step instructions for download and deployment
- BigDataLite-4.0.001 through BigDataLite-4.0.012 (1048576000 bytes each)
- BigDataLite-4.0.013 (874757813 bytes)
- md5sum.txt (346 bytes)
- Cloudera JDBC Drivers: download and install these to enable Oracle SQL Developer and Data Modeler to connect to Hive
Included software:
- Oracle Enterprise Linux 6.4
- Oracle Database 12c Release 1 Enterprise Edition (12.1.0.2), including Oracle Big Data SQL-enabled external tables
- Cloudera Distribution including Apache Hadoop (CDH5.1.2)
- Cloudera Manager (5.1.2)
- Oracle Big Data Connectors 4.0: Oracle SQL Connector for HDFS 3.1.0, Oracle Loader for Hadoop 3.2.0, Oracle Data Integrator 12c, Oracle R Advanced Analytics for Hadoop 2.4.1, Oracle XQuery for Hadoop 4.0.1
- Oracle NoSQL Database Enterprise Edition 12cR1 (3.0.14)
- Oracle JDeveloper 12c (12.1.3)
- Oracle SQL Developer and Data Modeler 4.0.3
- Oracle Data Integrator 12cR1 (12.1.3)
- Oracle GoldenGate 12c
- Oracle R Distribution 3.1.1
See the Deployment Guide for details.
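
Since the download is split into thirteen parts with an accompanying md5sum.txt, it can be worth checking the files before extraction. The Python sketch below totals the part sizes listed above and verifies MD5 digests. It assumes md5sum.txt uses the common "digest, whitespace, file name" layout produced by the md5sum tool; that layout and the helper names are assumptions, not something stated in the deck.

```python
import hashlib
from pathlib import Path

# Part sizes as listed on the download page: twelve full parts plus one tail.
PART_SIZES = [1_048_576_000] * 12 + [874_757_813]
total_bytes = sum(PART_SIZES)


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(download_dir, checksum_file="md5sum.txt"):
    """Return {file name: True/False} for each entry in the checksum file.

    Assumes each line looks like '<hex digest>  <file name>'.
    Missing files are reported as False.
    """
    base = Path(download_dir)
    results = {}
    for line in (base / checksum_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")
        target = base / name
        results[name] = target.exists() and md5_of(target) == expected.lower()
    return results


print(f"Total download size: {total_bytes:,} bytes")
```

Calling verify on the download folder returns a dictionary that flags any part whose digest does not match, so only that part needs to be fetched again.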

Installation Steps (Version 4.0)
- Prepare your host system
- Run the 7-zip extractor on the BigDataLite-4.0.001 file only; this creates the BigDataLite-4.0.ova VirtualBox appliance file
- Start Oracle VM VirtualBox Manager and import the appliance

What's Next

Harness the power of Big Data: the time is now
- Requires a cultural shift, enterprise-wide
- Build an information-driven corporate culture
Using information effectively to understand and align with your business needs and your customers' preferences is the key to creating a competitive advantage in today's customer-empowered environment.

Sangam14: The Largest Independent Oracle Conference in India
6th Annual Oracle Users Conference, India. Friday 7th, Saturday 8th & Sunday 9th November 2014, Bangalore.
Tom Kyte, Maria Colgan
www.sangam14.info

Thank you oracle.com/bigdata