Oracle Big Data SQL Architectural Deep Dive

2 Oracle Big Data SQL: Architectural Deep Dive
   Dan McClary, Ph.D., Big Data Product Management, Oracle

3 Safe Harbor Statement
   The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
   Oracle Confidential Internal/Restricted/Highly Restricted

4 Agenda
   - The Data Analytics Challenge
   - Why Unified Query Matters
   - SQL on Hadoop and More: Unifying Metadata
   - Query Franchising: Smart Scan for Hadoop

5 Data Analytics Challenge: Separate silos of information to analyze

6 Data Analytics Challenge: Separate data access interfaces

7 SQL on Hadoop is Obvious
   Stinger

8 Data Analytics Challenge: No comprehensive SQL interface across Oracle, Hadoop and NoSQL

9 Oracle Big Data Management System: Rich, comprehensive SQL access to all enterprise data
   NoSQL

10 What Does Unified Query Mean for You? Data Science
   Before: PhD
   After: Anyone???

11 What Does Unified Query Mean for You? Before Application Development After

12 Use Rich Oracle SQL Dialect Over All Data
   Snapshot of Oracle SQL Analytic Functions
   - Ranking functions: rank, dense_rank, cume_dist, percent_rank, ntile
   - Window Aggregate functions (moving and cumulative): avg, sum, min, max, count, variance, stddev, first_value, last_value
   - LAG/LEAD functions: direct inter-row reference using offsets
   - Reporting Aggregate functions: sum, avg, min, max, variance, stddev, count, ratio_to_report
   - Statistical Aggregates: correlation, linear regression family, covariance
   - Linear regression: fitting of an ordinary-least-squares regression line to a set of number pairs; frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions
   - Descriptive Statistics: DBMS_STAT_FUNCS summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- n sigma values, top/bottom 5 values
   - Correlations: Pearson's correlation coefficients, Spearman's and Kendall's (both nonparametric)
   - Cross Tabs: enhanced with % statistics: chi squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
   - Hypothesis Testing: Student t-test, F-test, Binomial test, Wilcoxon Signed Ranks test, Chi-square, Mann Whitney test, Kolmogorov-Smirnov test, One-way ANOVA
   - Distribution Fitting: Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, Normal, Uniform, Weibull, Exponential

13 Pattern Matching With Oracle SQL
   Snapshot of Oracle SQL Analytic Functions
   Simplified, sophisticated, standards-based syntax
   Finding Patterns in Stock Market Data - Double Bottom (W)
   [Chart: ticker price over time, 10:00 to 10:25]
   SELECT first_x, last_z
   FROM ticker MATCH_RECOGNIZE (
     PARTITION BY name ORDER BY time
     MEASURES FIRST(x.time) AS first_x,
              LAST(z.time) AS last_z
     ONE ROW PER MATCH
     PATTERN (X+ Y+ W+ Z+)
     DEFINE X AS (price < PREV(price)),
            Y AS (price > PREV(price)),
            W AS (price < PREV(price)),
            Z AS (price > PREV(price) AND
                  z.time - FIRST(x.time) <= 7 ))
   250+ Lines of Java UDF vs. 12 Lines of SQL: 20x less code
   [Slide background: excerpt of the equivalent Java UDF source]
   Copyright 2014, Oracle and/or its affiliates. All rights reserved.
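
To make the pattern on slide 13 concrete, here is a minimal Python sketch of what PATTERN (X+ Y+ W+ Z+) looks for: a fall, a rise, a second fall, and a second rise that ends within 7 time units of the first falling row. The function name find_double_bottoms, the integer time values, and the sample prices are assumptions for illustration only. It is not the Java UDF from the slide and not how Oracle evaluates MATCH_RECOGNIZE; it only mirrors the logic for a single partition.

```python
from typing import List, Tuple

Tick = Tuple[int, float]  # (time, price); integer time units are an assumption


def find_double_bottoms(ticks: List[Tick], max_span: int = 7) -> List[Tuple[int, int]]:
    """Return (first_x_time, last_z_time) for each X+ Y+ W+ Z+ match."""
    n = len(ticks)

    def falls(i: int) -> bool:  # price < PREV(price)
        return ticks[i][1] < ticks[i - 1][1]

    def rises(i: int) -> bool:  # price > PREV(price)
        return ticks[i][1] > ticks[i - 1][1]

    def run(start: int, pred) -> int:
        """Greedily consume rows from `start` while pred holds; return the end index."""
        k = start
        while k < n and pred(k):
            k += 1
        return k

    matches = []
    i = 1  # row 0 has no previous row, so it cannot match any pattern variable
    while i < n:
        first_x = ticks[i][0]
        x_end = run(i, falls)
        y_end = run(x_end, rises)
        w_end = run(y_end, falls)
        z_end = run(w_end, lambda k: rises(k) and ticks[k][0] - first_x <= max_span)
        if i < x_end < y_end < w_end < z_end:  # every variable matched at least one row
            matches.append((first_x, ticks[z_end - 1][0]))
            i = z_end  # resume after the last matched row
        else:
            i += 1
    return matches


# Hypothetical prices forming one "W" shape.
sample = list(zip(range(1, 10), [10, 9, 8, 9, 10, 9, 8, 9, 10]))
print(find_double_bottoms(sample))  # [(2, 9)]
```

Greedy matching is enough here because adjacent pattern variables are mutually exclusive (a row cannot both fall and rise), so no backtracking within a run is needed.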

14 Oracle Big Data SQL: A New Architecture
   - Powerful, high-performance SQL on Hadoop
     - Full Oracle SQL capabilities on Hadoop
     - SQL query processing local to Hadoop nodes
   - Simple data integration of Hadoop and Oracle Database
     - Single SQL point-of-entry to access all data
     - Scalable joins between Hadoop and RDBMS data
   - Optimized hardware
     - Balanced Configurations
     - No bottlenecks

15 100% Want to know what this really means.

16 SQL on Hadoop and More: Unifying Metadata

17 Why Unify Metadata?
   - Query across sources
   - Integrate new metadata
   - No changes for users and applications
   - Seamlessly handle schema-on-read
   - Exploit remote data distribution
   - Holistically optimize queries
   [Diagram: a Catalog holding customers and sales. Statements shown: CREATE TABLE customers; SELECT name FROM customers; CREATE TABLE sales; SELECT customers.name, sales.amount. Data stores: SALES, CUSTOMERS]

18 How Data is Stored in Hadoop
   Example: 1TB File
   {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:07","recommended":null,"activity":8}
   {"custid": ,"movieid":1948,"genreid":9,"time":" :00:00:22","recommended":"n","activity":7}
   {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:26","recommended":null,"activity":9}
   {"custid": ,"movieid":11547,"genreid":44,"time":" :00:00:32","recommended":"y","activity":7}
   {"custid": ,"movieid":11547,"genreid":44,"time":" :00:00:42","recommended":"y","activity":6}
   {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:43","recommended":null,"activity":8}
   {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:50","recommended":null,"activity":9}
   {"custid": ,"movieid":608,"genreid":6,"time":" :00:01:03","recommended":"n","activity":7}
   {"custid": ,"movieid":null,"genreid":null,"time":" :00:01:07","recommended":null,"activity":9}
   {"custid": ,"movieid":27205,"genreid":9,"time":" :00:01:18","recommended":"y","activity":7}
   {"custid": ,"movieid":1124,"genreid":9,"time":" :00:01:26","recommended":"y","activity":7}
   {"custid": ,"movieid":16309,"genreid":9,"time":" :00:01:35","recommended":"n","activity":7}
   {"custid": ,"movieid":11547,"genreid":44,"time":" :00:01:39","recommended":"y","activity":7}
   {"custid": ,"movieid":424,"genreid":1,"time":" :00:05:02","recommended":"y","activity":4}
   Blocks shown: B1, B2, B3
   - 1 block = 256 MB
   - Example File = 4096 blocks
   - InputSplits = 4096
   - Potential scan parallelism
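
The block count on slide 18 follows from simple division, sketched below in Python. The helper name count_blocks is hypothetical, "1TB" is read as 2**40 bytes, and one InputSplit per block is an assumption that matches the figures on the slide.

```python
TIB = 1024 ** 4  # the slide's "1TB", read as 2**40 bytes (assumption)
MIB = 1024 ** 2


def count_blocks(file_bytes: int, block_bytes: int) -> int:
    """Number of HDFS blocks needed, rounding the last partial block up."""
    return -(-file_bytes // block_bytes)  # ceiling division


blocks = count_blocks(1 * TIB, 256 * MIB)
input_splits = blocks  # assuming one InputSplit per block, as on the slide
print(blocks, input_splits)  # 4096 4096
```

Each split can be scanned independently, which is where the "potential scan parallelism" on the slide comes from.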

19 How MapReduce and Hive Read Data
   - Scan and row creation need to be able to work on any data format
   - Data definitions and column deserializations are needed to provide a table
   Data flow: Data Node disk -> SCAN -> Create ROWS & COLUMNS -> Consumer
   - RecordReader => Scans data (keys and values)
   - InputFormat => Defines parallelism
   - SerDe => Makes columns
   - Metastore => Maps DDL to Java access classes
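
The division of labor on slide 19 can be sketched in Python as four small pieces. All names here (METASTORE, get_splits, read_records, deserialize, scan_table) and the sample custid values are hypothetical stand-ins; the real Hadoop components are Java classes with richer interfaces, so this only illustrates which role does what.

```python
import json
from typing import Dict, Iterator, List, Tuple

# Metastore role: table name -> column list (hypothetical entry).
METASTORE: Dict[str, List[str]] = {
    "movieapp_log_json": ["custid", "movieid", "recommended", "activity"],
}


def get_splits(data: bytes, split_size: int) -> List[Tuple[int, int]]:
    """InputFormat role: cut the byte range into (start, end) splits."""
    return [(s, min(s + split_size, len(data))) for s in range(0, len(data), split_size)]


def read_records(data: bytes, split: Tuple[int, int]) -> Iterator[bytes]:
    """RecordReader role: yield every newline-delimited record that starts in the split."""
    start, end = split
    pos = start
    if start > 0 and data[start - 1:start] != b"\n":
        # The split begins mid-record; that record belongs to the previous split.
        nxt = data.find(b"\n", start)
        if nxt == -1:
            return
        pos = nxt + 1
    while pos < end:
        nxt = data.find(b"\n", pos)
        stop = len(data) if nxt == -1 else nxt
        if stop > pos:
            yield data[pos:stop]
        pos = stop + 1


def deserialize(record: bytes, columns: List[str]) -> tuple:
    """SerDe role: turn one raw record into a row of columns."""
    doc = json.loads(record)
    return tuple(doc.get(c) for c in columns)


def scan_table(table: str, data: bytes, split_size: int) -> Iterator[tuple]:
    columns = METASTORE[table]
    for split in get_splits(data, split_size):
        for record in read_records(data, split):
            yield deserialize(record, columns)


# Illustrative records; the custid values are made up.
LOG = b"\n".join([
    b'{"custid":1001,"movieid":null,"recommended":null,"activity":8}',
    b'{"custid":1002,"movieid":1948,"recommended":"n","activity":7}',
    b'{"custid":1003,"movieid":11547,"recommended":"y","activity":7}',
]) + b"\n"

rows = list(scan_table("movieapp_log_json", LOG, split_size=50))
```

Because a record is owned by the split in which it starts, the same rows come back whatever the split size, even when a split boundary falls in the middle of a record.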

20 SQL-on-Hadoop Engines Share Metadata, not MapReduce
   - Engines sharing the Hive Metastore: Oracle Big Data SQL, SparkSQL, Hive, Impala
   - Hive Metastore table definitions: movieapp_log_json, Tweets, avro_log
   - Metastore maps DDL to Java access classes

21 Extend Oracle External Tables
   CREATE TABLE movielog (
     click VARCHAR2(4000))
   ORGANIZATION EXTERNAL (
     TYPE ORACLE_HIVE
     DEFAULT DIRECTORY DEFAULT_DIR
     ACCESS PARAMETERS (
       com.oracle.bigdata.tablename logs
       com.oracle.bigdata.cluster mycluster ))
   REJECT LIMIT UNLIMITED;
   - New types of external tables
     - ORACLE_HIVE (inherit metadata)
     - ORACLE_HDFS (specify metadata)
   - Access parameters for Big Data
     - Hadoop cluster
     - Remote Hive database/table
   - DBMS_HADOOP
     - Package for automatic import

22 Enhance Oracle External Tables
   CREATE TABLE ORDER (
     cust_num VARCHAR2(10),
     order_num VARCHAR2(20),
     order_total NUMBER(8,2))
   ORGANIZATION EXTERNAL (
     TYPE ORACLE_HIVE
     DEFAULT DIRECTORY DEFAULT_DIR )
   PARALLEL 20
   REJECT LIMIT UNLIMITED;
   - Transparent schema-for-read
     - Use fast C-based readers when possible
     - Use native Hadoop classes otherwise
   - Engineered to understand parallelism
     - Map external units of parallelism to Oracle
   - Architected for extensibility
     - StorageHandler capability enables future support for other data sources
     - Examples: MongoDB, HBase, Oracle NoSQL DB

23 Query Franchising: Smart Scan for Hadoop

24 What Can Big Data Learn from Exadata? Intelligent Storage Maximizes Performance
   SELECT name, SUM(purchase) FROM customers GROUP BY name;
   1 Oracle SQL query issued
     - Plan constructed
     - Query executed
   2 Smart Scan Works on Storage
     - Filter out unneeded rows
     - Project only queried columns
     - Score data models
     - Bloom filters to speed up joins
   [Diagram: Oracle Exadata Storage Servers holding the CUSTOMERS table]

25 Query Franchising: dispatch of query processing to self-similar compute agents on disparate systems without loss of operational fidelity

26 Big Data SQL Server: A New Hadoop Processing Engine
   - Processing Layer: MapReduce and Hive, Spark, Impala, Search, Big Data SQL
   - Resource Management (YARN, cgroups)
   - Storage Layer: Filesystem (HDFS), NoSQL Databases (Oracle NoSQL DB, HBase)

27 Smart Scan for Hadoop: Optimizing Performance
   Big Data SQL Server stack on each Data Node: Smart Scan, External Table Services, Data Node, Disk
   - Oracle on top
     - Apply filter predicates
     - Project columns
     - Parse semi-structured data
   - Hadoop on the bottom
     - Work close to the data
     - Schema-on-read with Hadoop classes
     - Transformation into Oracle data stream
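
The "Oracle on top" steps of slide 27, applying filter predicates and projecting columns next to the data, can be illustrated with a short Python sketch. The function smart_scan and the sample rows are hypothetical; the table and column names are borrowed from the web_logs query on slide 39. This shows only the idea of returning fewer rows and columns, not the Big Data SQL Server implementation.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def smart_scan(rows: Iterable[Row],
               predicate: Callable[[Row], bool],
               columns: List[str]) -> Iterator[tuple]:
    """Filter rows and project columns before anything leaves the data node."""
    for row in rows:
        if predicate(row):                        # filter out unneeded rows
            yield tuple(row[c] for c in columns)  # project only queried columns


# Made-up rows standing in for records already parsed on the data node.
web_logs = [
    {"sess_id": "s1", "cust_id": 1, "source_country": "Brazil", "payload": "..."},
    {"sess_id": "s2", "cust_id": 2, "source_country": "Chile", "payload": "..."},
    {"sess_id": "s3", "cust_id": 3, "source_country": "Brazil", "payload": "..."},
]

shipped = list(smart_scan(web_logs,
                          lambda r: r["source_country"] == "Brazil",
                          ["sess_id", "cust_id"]))
print(shipped)  # [('s1', 1), ('s3', 3)]
```

Only the two matching rows, and only two of their four fields, would travel to the database for the join.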

28 Big Data SQL Query Execution: How do we query Hadoop?
   [Diagram: HDFS NameNode and Hive Metastore; HDFS Data Nodes, each with a BDS Server, holding blocks (B)]
   - Query compilation determines:
     - Data locations
     - Data structure
     - Parallelism
   - Fast reads using Big Data SQL Server
     - Schema-for-read using Hadoop classes
     - Smart Scan selects only relevant data
   - Process filtered result
     - Move relevant data to database
     - Join with database tables
     - Apply database security policies

29 Mapping Hadoop to Oracle: Parallel Query and Hadoop
   [Diagram: HDFS NameNode and Hive Metastore; blocks (B) exposed as InputSplits to PX]
   - Determine Hadoop Parallelism
     - Determine schema-for-read
     - Determine InputSplits
     - Arrange splits for best performance
   - Map to Oracle Parallelism
     - Map splits to granules
     - Assign granules to PX Servers
   - PX Servers Route Work
     - Offload work to Big Data SQL Servers
     - Aggregate
     - Join
     - Apply PL/SQL
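
As a rough illustration of "map splits to granules, assign granules to PX Servers", the Python sketch below spreads split IDs across a fixed number of workers. The slide does not say how Oracle arranges or assigns granules, so the one-granule-per-split mapping and the round-robin policy are assumptions, as is the function name assign_granules. The numbers reuse the 4096 InputSplits from slide 18 and the PARALLEL 20 clause from slide 22.

```python
from typing import Dict, List


def assign_granules(num_splits: int, num_px_servers: int) -> Dict[int, List[int]]:
    """Treat each InputSplit as one granule and deal granules out round-robin."""
    assignment: Dict[int, List[int]] = {px: [] for px in range(num_px_servers)}
    for split_id in range(num_splits):
        assignment[split_id % num_px_servers].append(split_id)
    return assignment


plan = assign_granules(num_splits=4096, num_px_servers=20)
sizes = [len(granules) for granules in plan.values()]
print(min(sizes), max(sizes))  # 204 205
```

With 4096 granules and 20 servers the load is even to within one granule per server.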

30 Big Data SQL Server Dataflow
   [Diagram: Big Data SQL Server (Smart Scan, External Table Services) above SerDe and RecordReader, reading from Data Node disks]
   - Read data from HDFS Data Node
     - Direct-path reads
     - C-based readers when possible
     - Use native Hadoop classes otherwise
   - Translate bytes to Oracle
   - Apply Smart Scan to Oracle bytes
     - Apply filters
     - Project Columns
     - Parse JSON/XML
     - Score models

31 But How Does Security Work?
   DBMS_REDACT.ADD_POLICY(
     object_schema => 'MCLICK',
     object_name => 'TWEET_V',
     column_name => 'USERNAME',
     policy_name => 'tweet_redaction',
     function_type => DBMS_REDACT.PARTIAL,
     function_parameters => 'VVVVVVVVVVVVVVVVVVVVVVVVV,*,3,25',
     expression => '1=1' );
   SELECT * FROM my_bigdata_table
   WHERE SALES_REP_ID = SYS_CONTEXT('USERENV','SESSION_USER');
   [Diagram: blocks (B), redacted values (***), filter on SESSION_USER]
   - Database security for query access
     - Virtual Private Databases
     - Redaction
     - Audit Vault and Database Firewall
   - Hadoop security for Hadoop jobs
     - Kerberos Authentication
     - Apache Sentry (RBAC)
     - Audit Vault
   - System-specific encryption
     - Database tablespace encryption
     - BDA On-disk Encryption

32 SQL, Everywhere Futures

33 More Lessons from Exadata: Move less data, Go faster
   1 Storage Indexes
     - Skip reads on irrelevant data
     - Big Hadoop Blocks ~ Big Speed Up
   2 Caching
     - Cache frequently accessed columns
     - HDFS Caching
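
The idea behind "skip reads on irrelevant data" can be sketched in Python as a per-block minimum/maximum summary that is consulted before any block is read. The class BlockSummary, the function blocks_to_read, and the value ranges are hypothetical; the slide does not describe how a storage index for Hadoop blocks would actually be built or stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockSummary:
    block_id: str
    min_value: int  # smallest value of the indexed column in this block
    max_value: int  # largest value of the indexed column in this block


def blocks_to_read(index: List[BlockSummary], lo: int, hi: int) -> List[str]:
    """Keep only blocks whose [min, max] range can overlap the predicate range [lo, hi]."""
    return [b.block_id for b in index if b.max_value >= lo and b.min_value <= hi]


# Made-up summaries for three blocks of some numeric column.
index = [
    BlockSummary("B1", 1, 3),
    BlockSummary("B2", 4, 6),
    BlockSummary("B3", 7, 9),
]

print(blocks_to_read(index, 7, 8))  # ['B3']
```

With 256 MB blocks, every block ruled out this way avoids a large read, which is the "Big Hadoop Blocks ~ Big Speed Up" point on the slide.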

34 Oracle Big Data Management System: Rich, comprehensive SQL access to all enterprise data
   NoSQL

35 Oracle Big Data Management System: Unite Information Lifecycles
   - NoSQL
     - Shared REST APIs
     - App-embedded schema
   - NoSQL
     - Shared schema
   - Oracle
     - Automatic ILM
     - Roll off cold partitions to Hadoop
     - Promote hot data to Oracle

36 Oracle Big Data Management System: Unify All Query
   NoSQL

39 Big Data SQL
   SELECT w.sess_id, w.cust_id, c.name
   FROM web_logs w, customers c
   WHERE w.source_country = 'Brazil'
   AND c.customer_id = w.cust_id
   - Relevant SQL runs on BDA nodes
   - 10's of Gigabytes of Data
   - Only columns and rows needed to answer query are returned
   [Diagram: WEB_LOGS blocks (B) on the Hadoop Cluster, read by Big Data SQL; CUSTOMERS in Oracle Database]

Big Data: Are you ready?

Big Data: Are you ready? Big Data: Are you ready? Oracle Big Data SQL George Bourmas Enterprise Architect EMEA XLOB Enterprise Architects September 13, 2014 Oracle Confidential Internal/Restricted/Highly Restricted Thoughts Things

More information

Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle

Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle Oracle Big Data SQL Architectural Deep Dive Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following is

More information

Big Data Management System Solution Overview

Big Data Management System Solution Overview Big Data Management System Solution Overview Pascal GUY Pre Sales Architect Business Unit Systems Oracle France Copyright 2014 Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The

More information

Big Data SQL and Query Franchising

Big Data SQL and Query Franchising Big Data SQL and Query Franchising An Architecture for Query Beyond Hadoop Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor

More information

Oracle Big Data SQL Konference Data a znalosti 2015

Oracle Big Data SQL Konference Data a znalosti 2015 Oracle Big Data SQL Konference Data a znalosti 2015 Jakub ILLNER Information Management Architect XLOB Enterprise Cloud Architects 23 July 2015, version 2 Agenda 1 2 3 4 5 Is SQL Dead? Introducing Oracle

More information

Seamless Access from Oracle Database to Your Big Data

Seamless Access from Oracle Database to Your Big Data Seamless Access from Oracle Database to Your Big Data Brian Macdonald Big Data and Analytics Specialist Oracle Enterprise Architect September 24, 2015 Agenda Hadoop and SQL access methods What is Oracle

More information

SQL - the best analysis language for Big Data!

SQL - the best analysis language for Big Data! SQL - the best analysis language for Big Data! NoCOUG Winter Conference 2014 Hermann Bär, hermann.baer@oracle.com Data Warehousing Product Management, Oracle 1 The On-Going Evolution of SQL Introduction

More information

The Oracle Data Mining Machine Bundle: Zero to Predictive Analytics in Two Weeks Collaborate 15 IOUG

The Oracle Data Mining Machine Bundle: Zero to Predictive Analytics in Two Weeks Collaborate 15 IOUG The Oracle Data Mining Machine Bundle: Zero to Predictive Analytics in Two Weeks Collaborate 15 IOUG Presentation #730 Tim Vlamis and Dan Vlamis Vlamis Software Solutions 816-781-2880 www.vlamis.com Presentation

More information

Exadata V2 + Oracle Data Mining 11g Release 2 Importing 3 rd Party (SAS) dm models

Exadata V2 + Oracle Data Mining 11g Release 2 Importing 3 rd Party (SAS) dm models Exadata V2 + Oracle Data Mining 11g Release 2 Importing 3 rd Party (SAS) dm models Charlie Berger Sr. Director Product Management, Data Mining Technologies Oracle Corporation charlie.berger@oracle.com

More information

Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich

Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich Agenda Introduction Old Times Exadata Big Data Oracle In-Memory Headquarters Conclusions 2 sumit

More information

Statistical Analysis of Gene Expression Data With Oracle & R (- data mining)

Statistical Analysis of Gene Expression Data With Oracle & R (- data mining) Statistical Analysis of Gene Expression Data With Oracle & R (- data mining) Patrick E. Hoffman Sc.D. Senior Principal Analytical Consultant pat.hoffman@oracle.com Agenda (Oracle & R Analysis) Tools Loading

More information

Sun / Oracle Life Science Platform From Deluge to Discovery. 2011 Oracle Corporation

Sun / Oracle Life Science Platform From Deluge to Discovery. 2011 Oracle Corporation Sun / Oracle Life Science Platform From Deluge to Discovery SGI and Sun 1996 2011 Graph Algorithims Social Media We re a very tiny circle in the middle of this big universe. So it s more likely interesting

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

OLSUG Workshop Oracle Data Mining

OLSUG Workshop Oracle Data Mining OLSUG Workshop Oracle Data Mining Charlie Berger Sr. Director of Product Mgmt, Life Sciences and Data Mining Oracle Corporation charlie.berger@oracle.com Dr. Lutz Hamel Asst. Professor, Computer Science

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

How To Manage Big Data In A Microsoft Cloud (Hadoop)

How To Manage Big Data In A Microsoft Cloud (Hadoop) Oracle Database 12c and the Future of Data Warehousing in the Era of Big Data George Lumpkin Data Warehousing Neil Mendelson Big Data & Advanced AnalyEcs Vice Presidents Server Technologies September 29,

More information

Integrate Master Data with Big Data using Oracle Table Access for Hadoop

Integrate Master Data with Big Data using Oracle Table Access for Hadoop Integrate Master Data with Big Data using Oracle Table Access for Hadoop Kuassi Mensah Oracle Corporation Redwood Shores, CA, USA Keywords: Hadoop, BigData, Hive SQL, Spark SQL, HCatalog, StorageHandler

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Oracle Data Mining In-Database Data Mining Made Easy!

Oracle Data Mining In-Database Data Mining Made Easy! Oracle Data Mining In-Database Data Mining Made Easy! Charlie Berger Sr. Director Product Management, Data Mining and Advanced Analytics Oracle Corporation charlie.berger@oracle.com www.twitter.com/charliedatamine

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

Blazing BI: the Analytic Options to the Oracle Database. ODTUG Kscope 2013

Blazing BI: the Analytic Options to the Oracle Database. ODTUG Kscope 2013 Blazing BI: the Analytic Options to the Oracle Database ODTUG Kscope 2013 Dan Vlamis Tim Vlamis Vlamis Software Solutions 816-781-2880 http://www.vlamis.com Copyright 2013, Vlamis Software Solutions, Inc.

More information

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. 1 Copyright 2011, Oracle and/or its affiliates. FPO In-Database Analytics: Predictive Analytics, Data Mining, Exadata & Business Intelligence Charlie Berger Sr. Director Product Management, Data Mining

More information

Analyzing Big Data. Heartland OUG Spring Conference 2014

Analyzing Big Data. Heartland OUG Spring Conference 2014 Analyzing Big Data Heartland OUG Spring Conference 2014 Dan Vlamis Vlamis Software Solutions 816-781-2880 http://www.vlamis.com Copyright 2014, Vlamis Software Solutions, Inc. Copyright 2014, Vlamis Software

More information

Cloudera Certified Developer for Apache Hadoop

Cloudera Certified Developer for Apache Hadoop Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number

More information

Integrating Apache Spark with an Enterprise Data Warehouse

Integrating Apache Spark with an Enterprise Data Warehouse Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software

More information

SQL - The Goto Language for Big Data Analy&cs!

SQL - The Goto Language for Big Data Analy&cs! SQL - The Goto Language for Big Data Analy&cs! Analy&cal SQL SQL Made Great Hermann Bär, hermann.baer@oracle.com Product Management Data Warehousing 1 Who Am I Not?.. But Who SHOULD be Here.. Keith Laker

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

Data Domain Profiling and Data Masking for Hadoop

Data Domain Profiling and Data Masking for Hadoop Data Domain Profiling and Data Masking for Hadoop 1993-2015 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or

More information

Semantic and Data Mining Technologies. Simon See, Ph.D.,

Semantic and Data Mining Technologies. Simon See, Ph.D., Semantic and Data Mining Technologies Simon See, Ph.D., Introduction to Semantic Web and Business Use Cases 2 Lots of Scientific Resources NAR 2009 over 1170 databases Reuse, Recycling, Repurposing Paul

More information

Oracle's In-Database Statistical Functions

Oracle's In-Database Statistical Functions Oracle 11g DB Data Warehousing Oracle's In-Database Statistical Functions OLAP Statistics Data Mining Charlie Berger Sr. Director Product Management, Data Mining Technologies

More information

Complete Java Classes Hadoop Syllabus Contact No: 8888022204

Complete Java Classes Hadoop Syllabus Contact No: 8888022204 1) Introduction to BigData & Hadoop What is Big Data? Why all industries are talking about Big Data? What are the issues in Big Data? Storage What are the challenges for storing big data? Processing What

More information

Unified Query for Big Data Management Systems

Unified Query for Big Data Management Systems Unified Query for Big Data Management Systems Integrating Big Data Systems with Enterprise Data Warehouses O R A C L E W H I T E P A P E R J A N U A R Y 2 0 1 5 Table of Contents Introduction 1 The Challenge

More information

BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig

BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig Contents Acknowledgements... 1 Introduction to Hive and Pig... 2 Setup... 2 Exercise 1 Load Avro data into HDFS... 2 Exercise 2 Define an

More information

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture

More information

Welkom! Copyright 2014 Oracle and/or its affiliates. All rights reserved.

Welkom! Copyright 2014 Oracle and/or its affiliates. All rights reserved. Welkom! WIE? Bestuurslid OGh met BI / WA ervaring Bepalen activiteiten van de vereniging Deelname in organisatie commite van 1 of meerdere events Faciliteren van de SIG s Redactie van OGh-Visie Onderhouden

More information

Big Data Analytics with Oracle Advanced Analytics In-Database Option

Big Data Analytics with Oracle Advanced Analytics In-Database Option Big Data Analytics with Oracle Advanced Analytics In-Database Option Charlie Berger Sr. Director Product Management, Data Mining and Advanced Analytics charlie.berger@oracle.com www.twitter.com/charliedatamine

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

The Hadoop Eco System Shanghai Data Science Meetup

The Hadoop Eco System Shanghai Data Science Meetup The Hadoop Eco System Shanghai Data Science Meetup Karthik Rajasethupathy, Christian Kuka 03.11.2015 @Agora Space Overview What is this talk about? Giving an overview of the Hadoop Ecosystem and related

More information

Architecting for the Internet of Things & Big Data

Architecting for the Internet of Things & Big Data Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to

More information

HADOOP. Revised 10/19/2015

HADOOP. Revised 10/19/2015 HADOOP Revised 10/19/2015 This Page Intentionally Left Blank Table of Contents Hortonworks HDP Developer: Java... 1 Hortonworks HDP Developer: Apache Pig and Hive... 2 Hortonworks HDP Developer: Windows...

More information

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

More information

Real World Hadoop Use Cases

Real World Hadoop Use Cases Real World Hadoop Use Cases JFokus 2013, Stockholm Eva Andreasson, Cloudera Inc. Lars Sjödin, King.com 1 2012 Cloudera, Inc. Agenda Recap of Big Data and Hadoop Analyzing Twitter feeds with Hadoop Real

More information

A very short Intro to Hadoop

A very short Intro to Hadoop 4 Overview A very short Intro to Hadoop photo by: exfordy, flickr 5 How to Crunch a Petabyte? Lots of disks, spinning all the time Redundancy, since disks die Lots of CPU cores, working all the time Retry,

More information

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Summary Xiangzhe Li Nowadays, there are more and more data everyday about everything. For instance, here are some of the astonishing

More information

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

TE's Analytics on Hadoop and SAP HANA Using SAP Vora TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

Predictive Analytics for Better Business Intelligence

Predictive Analytics for Better Business Intelligence Oracle 11g DB Data Warehousing ETL OLAP Statistics Predictive Analytics for Better Business Intelligence Data Mining Charlie Berger Sr. Director Product Management, Data Mining Technologies

More information

Where is Hadoop Going Next?

Where is Hadoop Going Next? Where is Hadoop Going Next? Owen O Malley owen@hortonworks.com @owen_omalley November 2014 Page 1 Who am I? Worked at Yahoo Seach Webmap in a Week Dreadnaught to Juggernaut to Hadoop MapReduce Security

More information

DBMS / Business Intelligence, SQL Server

DBMS / Business Intelligence, SQL Server DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independant State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.

More information
