Hadoop Meets Exadata. Presented by: Kerry Osborne. DW Global Leaders Program Decemeber, 2012



Similar documents
Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

Oracle Big Data Essentials

Constructing a Data Lake: Hadoop and Oracle Database United!

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Hadoop Internals for Oracle Developers and DBAs Exploring the HDFS and MapReduce Data Flow

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Qsoft Inc

Deep Quick-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors Mark Rittman, CTO, Rittman Mead Oracle Openworld 2014, San Francisco

Oracle Big Data Handbook

Apache Hadoop: Past, Present, and Future

From Relational to Hadoop Part 1: Introduction to Hadoop. Gwen Shapira, Cloudera and Danil Zburivsky, Pythian

TUT NoSQL Seminar (Oracle) Big Data

Big Data Too Big To Ignore

Oracle Big Data Fundamentals Ed 1 NEW

Large scale processing using Hadoop. Ján Vaňo

Hadoop and Map-Reduce. Swati Gore

Hadoop Ecosystem B Y R A H I M A.

Hadoop implementation of MapReduce computational model. Ján Vaňo

MapReduce Job Processing

OLH: Oracle Loader for Hadoop OSCH: Oracle SQL Connector for Hadoop Distributed File System (HDFS)

Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software. 22 nd October :00 Sesión B - DB2 LUW

<Insert Picture Here> Big Data

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

Communicating with the Elephant in the Data Center

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Big Data: Are You Ready? Kevin Lancaster

Data Analyst Program- 0 to 100

How To Use A Data Center With A Data Farm On A Microsoft Server On A Linux Server On An Ipad Or Ipad (Ortero) On A Cheap Computer (Orropera) On An Uniden (Orran)

Prepared By : Manoj Kumar Joshi & Vikas Sawhney

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM

MapReduce, Hadoop and Amazon AWS

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Connecting Hadoop with Oracle Database

Using RDBMS, NoSQL or Hadoop?

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

A very short Intro to Hadoop

Cloudera Certified Developer for Apache Hadoop

Hadoop and its Usage at Facebook. Dhruba Borthakur June 22 rd, 2009

Big Data Introduction

<Insert Picture Here> Oracle and/or Hadoop And what you need to know

Hadoop Development & BI- 0 to 100

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Open source Google-style large scale data analysis with Hadoop

Hadoop Architecture. Part 1

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Apache HBase. Crazy dances on the elephant back

BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION

Oracle Big Data SQL Technical Update

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Fundamentals Curriculum HAWQ

ITG Software Engineering

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

Certified Big Data and Apache Hadoop Developer VS-1221

Tuning Exadata. But Why?

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Big Data Course Highlights

Hadoop. History and Introduction. Explained By Vaibhav Agarwal

Deploying Hadoop with Manager

Internals of Hadoop Application Framework and Distributed File System

An Oracle White Paper June Oracle: Big Data for the Enterprise

Architecting for the Internet of Things & Big Data

MapReduce with Apache Hadoop Analysing Big Data

Hadoop 101. Lars George. NoSQL- Ma4ers, Cologne April 26, 2013

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

HadoopRDF : A Scalable RDF Data Analysis System

Apache Hadoop. Alexandru Costan

Hadoop Job Oriented Training Agenda

The Top 10 7 Hadoop Patterns and Anti-patterns. Alex

This article is the second

Exadata for Oracle DBAs. Longtime Oracle DBA

Weekly Report. Hadoop Introduction. submitted By Anurag Sharma. Department of Computer Science and Engineering. Indian Institute of Technology Bombay

HOW TO LIVE WITH THE ELEPHANT IN THE SERVER ROOM APACHE HADOOP WORKSHOP

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February ISSN

Complete Java Classes Hadoop Syllabus Contact No:

Hadoop. for Oracle database professionals. Alex Gorbachev Calgary, AB September 2013

Distributed Computing and Big Data: Hadoop and MapReduce

Successfully Deploying Alternative Storage Architectures for Hadoop Gus Horn Iyer Venkatesan NetApp

Realtime Apache Hadoop at Facebook. Jonathan Gray & Dhruba Borthakur June 14, 2011 at SIGMOD, Athens

Structured data meets unstructured data in Azure and Hadoop

Performance Baseline of Oracle Exadata X2-2 HR HC. Part II: Server Performance. Benchware Performance Suite Release 8.4 (Build ) September 2013

An Oracle White Paper October Oracle: Big Data for the Enterprise

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science

Oracle Big Data for Dummies

Hadoop & its Usage at Facebook

Take An Internal Look at Hadoop. Hairong Kuang Grid Team, Yahoo! Inc

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

Oracle Big Data Spatial & Graph Social Network Analysis - Case Study

MySQL and Hadoop. Percona Live 2014 Chris Schneider

CURSO: ADMINISTRADOR PARA APACHE HADOOP

Mr. Apichon Witayangkurn Department of Civil Engineering The University of Tokyo

Transcription:

Hi Hadoop Meets Exadata Presented by: Kerry Osborne DW Global Leaders Program Decemeber, 2012

whoami Never Worked for Oracle Worked with Oracle DB Since 1982 (V2) Working with Exadata since early 2010 Work for Enkitec (www.enkitec.com) (Enkitec owns an Exadata Half Rack V2/X2) (Enkitec owns an Oracle Big Data Appliance) Exadata Book (recently translated to Chinese) Hadoop Aficionado Blog: kerryosborne.oracle-guy.com Twitter: @KerryOracleGuy 3

What s the Point? Data Volumes are Increasing Rapidly Cost of Processing / Storing is High Scalability is Big Concern And 4

Hadoop Is A Virus * Stolen from Orbitz 5

Google Trends 6

Google Trends 7

Google Trends 8

Disjointed Presentation??? Architectures Integration Approaches Oracle Products Exadoop Case Study 9

Traditional RDBMS Architecture w o r k DB Server Compute Plumbing Storage Storage 10

Traditional Oracle Architecture RAC w o r k workers Cache (SGA) dbwr lgwr etc Block Mapper (ASM) Storage 11

HDFS/Hadoop Architecture HA? w o r k Name Node Job Tracker datanode tasktracker datanode tasktracker workers workers Storage Storage 12

HDFS/Hadoop Architecture HA? w o r k Job Tracker Block Mapper (namenode) datanode tasktracker datanode tasktracker workers workers Storage Storage 13

Exadata Architecture RAC w o r k Block Mapper (ASM) workers Cache Storage Node Storage Node workers workers Storage Storage 14

HDFS/Hadoop Architecture HA? w o r k Job Tracker Block Mapper (namenode) datanode tasktracker datanode tasktracker workers workers Storage Storage 15

Oracle + Hadoop Integration 16

Obligatory Marketing Slide 17

Oracle Big Data Appliance Prebuilt Hadoop Stack in a Rack Engineered System Open Source Software Includes Cloudera Distribution 18

Oracle Big Data Appliance 19

BDA Software 20

Top Secret Feature of BDA 21

Integration Options Many Ways to Skin the Cat Fuse Sqoop Oracle Big Data Connectors 22

Fuse External Tables 23

Sqoop (SQL-to-Hadoop) Graduated from Incubator Status in March 2012 Slower (no direct path?) Quest has a plug-in (oraoop) Bi-Directional 24

Oracle Big Data Connectors Oracle Loader for Hadoop - OLH Oracle Direct Connector for HDFS - ODCH Oracle R Connector for Hadoop ORHC Oracle Data Integrator Application Adapter for Hadoop Note: All Connectors are One Way 25

Oracle Data Integrator Application Adapter for Hadoop ODIAAH? 26

Oracle R Connector for Hadoop (ORHC) Provides ability to pull data from Oracle RDBMS Provides ability to pull data from HDFS Provides access to local file system Not really a loader tool Most useful for analysts 27

Oracle Loader for Hadoop (OLH) Implemented as a MapReduce job (oraloader.jar) Saves CPU on DB Server Can convert to Oracle datatypes Can partition data and optionally sort it Online direct into Oracle tables Can load into Oracle via JDBC or OCI Direct Path Offline generate preprocessed files in HDFS (DP format) 28

Oracle Direct Connector for HDFS (ODCH) Uses External Tables Fastest - 12T per hour Can load DP files preprocessed by OLH Allows Oracle SQL to query HDFS data Doesn t require loading into Oracle Downside uses DB CPU s 29

Exadoop * Mad Scientist Project 30

Exadoop Unusual Situation! Exadata Half Rack with 4 Spare Storage Servers Company Playing with Big Data Technology Exadata Cells Very Similar to BDA Servers 4 Cells Mini BDA! (happy face) 31

Exadoop Layout Exa Half Rack Exadoop - 4 Compute Nodes - Exa Compute Nodes X X X X X X X X X X X X - 7 Storage Nodes (252TB) Big Fat Pipe - Exa Storage Nodes (108TB raw) - Hadoop Cluster (144TB raw) 32

Exadoop Applications Telecom Company Call Detail Records Dumped by Switches Loaded into HDFS via Flume 33

Exadoop Proposed Architecture SIP Server - Exa Compute Apex App Flume Agent Packet Sniffer - Exa Storage - Hadoop Cluster HDFS CDR Error Codes Java App Hbase 34

Exadoop Applications 35

Wrap Up Is Hadoop the right tool for the job? Maybe All the Cool Kids Are Doing It! 36

Questions? Contact Information : Kerry Osborne kerry.osborne@enkitec.com kerryosborne.oracle-guy.com www.enkitec.com 37