Seamless Access from Oracle Database to Your Big Data
1 Seamless Access from Oracle Database to Your Big Data. Brian Macdonald, Big Data and Analytics Specialist, Oracle Enterprise Architect. September 24, 2015
2 Agenda: Hadoop and SQL access methods; What is Oracle Big Data SQL; Big Data SQL Architecture; Big Data SQL Configuration; Roadmap; Customer Story; Q&A. 9/23/2015
3 First, Let's Define Big Data and Structured vs. Unstructured Data
4 SQL on Hadoop is Obvious, Although Implementations Vary: Hive; Impala, HAWQ, IBM Big SQL; Oracle SQL Connector for Hadoop (OSCH); Oracle Big Data SQL; a million more (Tez, Presto, Hadapt, Stinger, PolyBase, Drill, lots of startups)
5 SQL Analytics Challenge: Separate silos of information to analyze
6 SQL Analytics Challenge: No comprehensive SQL interface
7 Oracle Big Data SQL: Hadoop + NoSQL + Relational
8 Oracle Big Data SQL: A New Architecture. Powerful, high-performance SQL on Hadoop: full Oracle SQL capabilities on Hadoop; SQL query processing local to Hadoop nodes. Simple data integration of Hadoop and Oracle Database: single SQL point of entry to access all data; scalable joins between Hadoop and RDBMS data. Oracle security: govern all data through a single set of security policies (redaction, VPD, etc.). Tool access.
9 Use Rich Oracle SQL Dialect Over All Data. Snapshot of Oracle SQL analytic functions:
Ranking functions: rank, dense_rank, cume_dist, percent_rank, ntile
Window aggregate functions (moving and cumulative): avg, sum, min, max, count, variance, stddev, first_value, last_value
LAG/LEAD functions: direct inter-row reference using offsets
Reporting aggregate functions: sum, avg, min, max, variance, stddev, count, ratio_to_report
Statistical aggregates: correlation, linear regression family, covariance
Linear regression: fitting of an ordinary-least-squares regression line to a set of number pairs; frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions
Descriptive statistics: DBMS_STAT_FUNCS summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- n sigma values, top/bottom 5 values
Correlations: Pearson's correlation coefficients, Spearman's and Kendall's (both nonparametric)
Cross tabs: enhanced with % statistics: chi squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
Hypothesis testing: Student t-test, F-test, Binomial test, Wilcoxon Signed Ranks test, Chi-square, Mann Whitney test, Kolmogorov-Smirnov test, One-way ANOVA
Distribution fitting: Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, Normal, Uniform, Weibull, Exponential
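As a rough illustration of what a few of these functions look like in a query, here is a minimal sketch. It assumes a hypothetical table or view named movie_activity with columns custid and activity; that name is not from the slides, and the object could equally be an external table over Hadoop data.

```sql
-- Hypothetical object: movie_activity(custid, activity). Not defined in the deck.
-- Ranks customers by click volume and reports each customer's share of all clicks.
SELECT custid,
       COUNT(*)                               AS clicks,
       RANK() OVER (ORDER BY COUNT(*) DESC)   AS click_rank,
       RATIO_TO_REPORT(COUNT(*)) OVER ()      AS share_of_clicks
FROM   movie_activity
GROUP  BY custid
ORDER  BY click_rank;
```

The analytic functions are evaluated after the GROUP BY, which is why they can take COUNT(*) as their argument.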
10 Oracle Big Data SQL Architecture. Two components of Oracle Big Data SQL: external table extension; Big Data SQL Server software on Big Data Appliance
12 A Smarter Oracle External Table. Oracle table, you define: table name; Oracle types; any degree of parallelism. HDFS data, you get: automatic discovery of Hive table metadata; automatic translation from Hadoop types; automatic conversion from any InputFormat; fan-out parallelism across the Hadoop cluster.
13 Unify Metadata: Publish Hive Metadata to Oracle Catalog
  CREATE TABLE movieapp_log_json (click VARCHAR2(4000))
    ORGANIZATION EXTERNAL (TYPE ORACLE_HIVE DEFAULT DIRECTORY DEFAULT_DIR)
    REJECT LIMIT UNLIMITED;
Diagram labels: Hive Metastore; Oracle Catalog; Hive metadata; External Table; Big Data Appliance + Hadoop/NoSQL; Exadata + Oracle Database
14 Accessible through Oracle Data Dictionary Immediately, So the DBA Doesn't Need to Go to Hadoop: ALL_HIVE_DATABASES, ALL_HIVE_TABLES, ALL_HIVE_COLUMNS; DBA_HIVE_DATABASES, DBA_HIVE_TABLES, DBA_HIVE_COLUMNS; USER_HIVE_DATABASES, USER_HIVE_TABLES, USER_HIVE_COLUMNS
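A minimal sketch of browsing Hive metadata from the database side through these views. The column names are given as I recall them from the Big Data SQL documentation and should be treated as assumptions; a DESCRIBE of each view will confirm what a given release exposes. The table name filter reuses movieapp_log_json from the previous slide.

```sql
-- List Hive tables visible to the current user, per cluster and Hive database.
-- Column names are assumptions; verify with DESCRIBE ALL_HIVE_TABLES.
SELECT cluster_id, database_name, table_name, input_format
FROM   all_hive_tables
ORDER  BY cluster_id, database_name, table_name;

-- Show how the Hive column types of one table would map to Oracle types.
SELECT column_name, hive_column_type, oracle_column_type
FROM   all_hive_columns
WHERE  table_name = 'movieapp_log_json';
```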
15 Extend Oracle External Tables
  CREATE TABLE movielog (click VARCHAR2(4000))
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_HIVE
      DEFAULT DIRECTORY DEFAULT_DIR
      ACCESS PARAMETERS (
        com.oracle.bigdata.tablename=logs
        com.oracle.bigdata.cluster=mycluster
      ))
    REJECT LIMIT UNLIMITED;
New types of external tables: ORACLE_HIVE (inherit metadata); ORACLE_HDFS (specify metadata). Access parameters for Big Data: Hadoop cluster; remote Hive database/table. DBMS_HADOOP package for automatic import. SQL Developer integration (Create Table).
16 SQLDeveloper Integration
17 How Data is Stored in Hadoop: As files. Pretty simple. Example: 1TB file
  {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:07","recommended":null,"activity":8}
  {"custid": ,"movieid":1948,"genreid":9,"time":" :00:00:22","recommended":"n","activity":7}
  {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:26","recommended":null,"activity":9}
  {"custid": ,"movieid":11547,"genreid":44,"time":" :00:00:32","recommended":"y","activity":7}
  {"custid": ,"movieid":11547,"genreid":44,"time":" :00:00:42","recommended":"y","activity":6}
  {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:43","recommended":null,"activity":8}
  {"custid": ,"movieid":null,"genreid":null,"time":" :00:00:50","recommended":null,"activity":9}
  {"custid": ,"movieid":608,"genreid":6,"time":" :00:01:03","recommended":"n","activity":7}
  {"custid": ,"movieid":null,"genreid":null,"time":" :00:01:07","recommended":null,"activity":9}
  {"custid": ,"movieid":27205,"genreid":9,"time":" :00:01:18","recommended":"y","activity":7}
  {"custid": ,"movieid":1124,"genreid":9,"time":" :00:01:26","recommended":"y","activity":7}
  {"custid": ,"movieid":16309,"genreid":9,"time":" :00:01:35","recommended":"n","activity":7}
  {"custid": ,"movieid":11547,"genreid":44,"time":" :00:01:39","recommended":"y","activity":7}
  {"custid": ,"movieid":424,"genreid":1,"time":" :00:05:02","recommended":"y","activity":4}
  CREATE TABLE ORDER (custid VARCHAR2(10), recommended VARCHAR2(20), activity NUMBER(8,2))
    ORGANIZATION EXTERNAL (TYPE ORACLE_HDFS LOCATION ('hdfs:/usr/cust/summary/*'));
Assumes default values. Table options: fields; column maps; delimiters; file formats (json, textfile, sequencefile); SerDes (e.g. regex); more (see docs).
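Once JSON records like these are exposed as a single VARCHAR2 column, fields can be pulled out at query time. The following is a minimal sketch, assuming the movielog external table from slide 15 (one click column holding one JSON document per row) and a database release that supports JSON_VALUE (12.1.0.2 or later). The choice of activity code 7 is purely illustrative.

```sql
-- Extract a few fields from the raw JSON text in movielog.click.
-- The meaning of activity = 7 is not stated in the deck; it is used only as a filter example.
SELECT JSON_VALUE(click, '$.custid'  RETURNING NUMBER) AS custid,
       JSON_VALUE(click, '$.movieid' RETURNING NUMBER) AS movieid,
       JSON_VALUE(click, '$.recommended')              AS recommended
FROM   movielog
WHERE  JSON_VALUE(click, '$.activity' RETURNING NUMBER) = 7;
```

By default JSON_VALUE returns NULL for a missing or null field rather than raising an error, which suits semi-structured log data.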
18 Creating an External Table against Hive
  DBMS_HADOOP.CREATE_EXTDDL_FOR_HIVE (
    cluster_id      IN  VARCHAR2,
    db_name         IN  VARCHAR2 := NULL,
    hive_table_name IN  VARCHAR2,
    hive_partition  IN  BOOLEAN,
    table_name      IN  VARCHAR2 := NULL,
    perform_ddl     IN  BOOLEAN DEFAULT FALSE,
    text_of_ddl     OUT VARCHAR2 );

  set serveroutput on
  DECLARE
    DDLout VARCHAR2(4000);
  BEGIN
    dbms_hadoop.create_extddl_for_hive(
      CLUSTER_ID      => 'bigdatalite',
      DB_NAME         => 'brian',
      HIVE_TABLE_NAME => 'movie',
      HIVE_PARTITION  => FALSE,
      TABLE_NAME      => 'movie',
      PERFORM_DDL     => FALSE,
      TEXT_OF_DDL     => DDLout);
    dbms_output.put_line(DDLout);
  END;
  /
19 Oracle External Tables: Flexibility for Varied File Structures
  CREATE TABLE ORDER (
    cust_num    VARCHAR2(10),
    order_num   VARCHAR2(20),
    order_total NUMBER(8,2))
    ORGANIZATION EXTERNAL (TYPE ORACLE_HIVE DEFAULT DIRECTORY DEFAULT_DIR)
    PARALLEL 20
    REJECT LIMIT UNLIMITED;
Transparent schema-for-read: use fast C-based readers when possible; use native Hadoop classes otherwise. Engineered to understand parallelism: map external units of parallelism to Oracle. Architected for extensibility: StorageHandler capability enables support for other data sources (examples: MongoDB, HBase, Oracle NoSQL DB).
20 StorageHandlers: Extensibility Beyond HDFS. StorageHandlers are a metadata bridge. Diagram labels: Oracle Big Data SQL; Hive Metastore
21 Oracle Big Data SQL Architecture. Two components of Oracle Big Data SQL: external table extension; Big Data SQL Server software on Big Data Appliance
22 What gives Exadata extreme performance? Offload query to Exadata Storage Servers; small data subset quickly returned. Diagram labels: SQL; Oracle Database 12c
23 Introducing Oracle Big Data SQL: Massively parallel SQL query across Oracle, Hadoop and NoSQL. Diagram labels: Hadoop & NoSQL; Oracle Database 12c
24 Big Data Appliance X5-2. Sun Oracle X5-2L servers with, per server: 2 x 18-core Intel Xeon E5 processors; 128 GB memory; 96 TB disk space. Integrated software (4.2): Oracle Linux 6.6; Oracle Big Data SQL 1.1*; Cloudera Distribution of Apache Hadoop 5.4, EDH Edition; Cloudera Manager 5.4; Oracle R Distribution; Oracle NoSQL Database CE. * Oracle Big Data SQL is separately licensed.
25 Introducing Oracle Big Data SQL: Massively parallel SQL query across Oracle, Hadoop and NoSQL. Hadoop & NoSQL side: offload query to data nodes; data subset returned. Oracle Database 12c side: offload query to Exadata Storage Servers; small data subset quickly returned.
26 Big Data SQL Server: A New Hadoop Processing Engine. Processing layer: MapReduce and Hive; Spark; Impala; Search; Big Data SQL. Resource management (YARN, cgroups). Storage layer: filesystem (HDFS); NoSQL databases (Oracle NoSQL DB, HBase).
27 Big Data SQL Query Execution: How do we query Hadoop? Query compilation determines: data locations; data structure; parallelism. Fast reads using Big Data SQL Server: schema-for-read using Hadoop classes; Smart Scan selects only relevant data. Process filtered result: move relevant data to database; join with database tables; apply database security policies. Diagram labels: HDFS NameNode; Hive Metastore; HDFS Data Node with BDSQL (shown twice)
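The flow described on this slide (filter on the Hadoop side, move only the relevant rows, then join with database tables) corresponds to an ordinary SQL join from the user's point of view. Below is a minimal sketch, assuming the movielog external table from slide 15 and a hypothetical regular Oracle table customer(cust_id, last_name) that does not appear in the deck. Whether a given predicate is actually evaluated on the data nodes depends on the release and the predicate; the query only shows the shape.

```sql
-- movielog: external table over Hadoop data (slide 15).
-- customer: hypothetical Oracle heap table, assumed for illustration.
SELECT c.last_name,
       COUNT(*) AS recommended_views
FROM   movielog m
JOIN   customer c
       ON c.cust_id = JSON_VALUE(m.click, '$.custid' RETURNING NUMBER)
WHERE  JSON_VALUE(m.click, '$.recommended') = 'y'   -- selective filter on the Hadoop-resident rows
GROUP  BY c.last_name;
```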
28 Apply Advanced Security on Hadoop & NoSQL: Same security policies across all data. Redaction; Virtual Private Database; fine-grain access control. Diagram labels: raw JSON data in Hadoop; customer data in Oracle; SQL; redacted data subset; Hadoop; Oracle Database 12c
  DBMS_REDACT.ADD_POLICY(
    object_schema => 'sales',
    object_name   => 'customer_detail',
    column_name   => 'last_name',
    policy_name   => 'customer_privacy',
    function_type => DBMS_REDACT.FULL,
    expression    => '1=1' );
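The slide shows redaction; the Virtual Private Database side can be sketched in the same spirit with DBMS_RLS. This is a minimal sketch under stated assumptions: the sales.customer_detail object comes from the slide, while the region column, the region_filter function, and the use of CLIENT_IDENTIFIER to carry the caller's region are all hypothetical choices made for illustration.

```sql
-- Hypothetical policy function: restricts rows to the caller's region.
-- Assumes customer_detail has a REGION column (not stated in the deck).
CREATE OR REPLACE FUNCTION region_filter (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2
) RETURN VARCHAR2 IS
BEGIN
  -- The returned text is appended as a WHERE predicate to matching queries.
  RETURN 'region = SYS_CONTEXT(''USERENV'', ''CLIENT_IDENTIFIER'')';
END;
/

BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'sales',
    object_name     => 'customer_detail',
    policy_name     => 'customer_region',
    function_schema => 'sales',
    policy_function => 'region_filter',
    statement_types => 'SELECT');
END;
/
```

Because the policy is attached to the database object rather than to the storage, the same predicate would apply whether the rows come from Oracle storage or from an external table over Hadoop.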
29 Configuration. Install Oracle Big Data SQL on the BDA using Mammoth. Run the Big Data SQL-Exadata installation script on each Oracle Exadata database node. The script: sets up connectivity from Exadata to the Big Data SQL Servers on the BDA; installs a Hadoop client; configures directories and files (Big Data SQL agent, Oracle directory objects, others).
30 Directories. Two types of directories are created. Common directory: must be on a cluster-wide shared file system; holds subdirectories for jar files and bigdata.properties (paths, etc.). Cluster directory(s): configuration details for each BDA cluster; subdirectory of the common directory. Oracle directories that point to these directories: ORACLE_BIGDATA_CONFIG (common directory); ORACLE_BIGDATA_CL_XXXX (one for each cluster directory; case sensitive).
31 Big Data SQL Agents: Created by Install Script. This multi-threaded agent bridges the metadata between Oracle Database and Hadoop. It launches a single JVM instead of one for every process (which can be quite slow). Database links (XXXX is the name of each BDA cluster from the cluster directories):
  create public database link BDSQL$_XXXX using 'extproc_connection_data';
  create public database link BDSQL$_DEFAULT_CLUSTER using 'extproc_connection_data';
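After the install script has run, the directory objects and database links named on these slides can be checked from the standard data dictionary. A minimal sketch, assuming the naming patterns shown above (ORACLE_BIGDATA_... and BDSQL$_...); the exact names on a given system depend on the cluster names.

```sql
-- Oracle directory objects created for Big Data SQL configuration.
SELECT directory_name, directory_path
FROM   all_directories
WHERE  directory_name LIKE 'ORACLE_BIGDATA%';

-- Public database links used by the Big Data SQL agent, one per cluster.
SELECT owner, db_link, host
FROM   all_db_links
WHERE  db_link LIKE 'BDSQL$%';
```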
32 If Kerberos is used on BDA: must create ticket (kinit) for BDS user; BDS runs as the oracle user; need to renew tickets (cron); other automation to be released soon.
33 Requirements (for now): Exadata. Oracle Storage Servers or Exadata configured on the same InfiniBand subnet as BDA. Exadata and BDA connected by InfiniBand.
34 Roadmap: Subsequent content subject to change!
35 Enhanced Parallelism. Today: Hadoop DoP linked to RDBMS DoP; leads to many idle PQ processes; requires explicit declaration. Next: unlink Hadoop and RDBMS DoP; automatic max Hadoop parallelism, even on serial tables; an average of 40% faster, even at equivalent DoP.
36 Storage Indexing. Today: all blocks in a query must be read from disk; large (256MB) disk I/O for each block. Next: automatically create storage indexes in Big Data SQL agents; check index before reading blocks; skip unnecessary I/Os; an average of 65% faster; up to 100x faster for highly selective queries.
37 Customer Examples
38 Building Customer Loyalty. Company overview: customer loyalty marketing and programs for major retailers and consumer brands. Challenges: deliver personalized multi-channel content to every customer (example: Kroger's MyMagazine); expand to a wide variety of interaction data to build customer profiles. Benefits: 2x improvements in campaign performance; large-scale concurrent processing of complex SQL; 70% of analysis is done in SQL, uses R as well. Solution overview: Oracle Exadata X3-8; Oracle Database with Advanced Analytics; Oracle ZFS Backup Appliance; Big Data Appliance; next: Big Data SQL. Diagram labels: SQL Analysis; R-based Analysis; Machine Learning; ZFS; X3-8 (shown twice); Source Systems (at Client); BDA
39 Thank You & Q&A
Using distributed technologies to analyze Big Data Abhijit Sharma Innovation Lab BMC Software 1 Data Explosion in Data Center Performance / Time Series Data Incoming data rates ~Millions of data points/
More informationDeep Quick-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors Mark Rittman, CTO, Rittman Mead Oracle Openworld 2014, San Francisco
Deep Quick-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors Mark Rittman, CTO, Rittman Mead Oracle Openworld 2014, San Francisco About the Speaker Mark Rittman, Co-Founder of Rittman Mead
More informationBig Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016
Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible
More informationA Performance Analysis of Distributed Indexing using Terrier
A Performance Analysis of Distributed Indexing using Terrier Amaury Couste Jakub Kozłowski William Martin Indexing Indexing Used by search
More informationArchitectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
More informationOracle Big Data Appliance X5-2
Oracle Big Data Appliance X5-2 Oracle Big Data Appliance is a high-performance, secure platform for running diverse workloads on Hadoop and NoSQL systems. With Oracle Big Data SQL, Oracle Big Data Appliance
More informationSelf-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Data Is Doubling Every Two Years Unstructured data will account for more than 80% of the data
More informationReal Time Fraud Detection With Sequence Mining on Big Data Platform. Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA
Real Time Fraud Detection With Sequence Mining on Big Data Platform Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA Open Source Big Data Eco System Query (NOSQL) : Cassandra,
More informationHadoop. http://hadoop.apache.org/ Sunday, November 25, 12
Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using
More informationApache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah
Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated
More informationWell packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances
INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA
More informationApache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com
Apache Hadoop in the Enterprise Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Cloudera The Leader in Big Data Management Powered by Apache Hadoop The Leading Open Source Distribution of Apache
More informationJun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC
Jun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC Agenda Quick Overview of Impala Design Challenges of an Impala Deployment Case Study: Use Simulation-Based Approach to Design
More informationINTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
More informationThe Hadoop Eco System Shanghai Data Science Meetup
The Hadoop Eco System Shanghai Data Science Meetup Karthik Rajasethupathy, Christian Kuka 03.11.2015 @Agora Space Overview What is this talk about? Giving an overview of the Hadoop Ecosystem and related
More informationThe Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson
The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities
More informationUsing MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A
More informationHow to Choose Between Hadoop, NoSQL and RDBMS
How to Choose Between Hadoop, NoSQL and RDBMS Keywords: Jean-Pierre Dijcks Oracle Redwood City, CA, USA Big Data, Hadoop, NoSQL Database, Relational Database, SQL, Security, Performance Introduction A
More informationCloudera Backup and Disaster Recovery
Cloudera Backup and Disaster Recovery Important Note: Cloudera Manager 4 and CDH 4 have reached End of Maintenance (EOM) on August 9, 2015. Cloudera will not support or provide patches for any of the Cloudera
More informationExadata for Oracle DBAs. Longtime Oracle DBA
Exadata for Oracle DBAs Longtime Oracle DBA Why this Session? I m an Oracle DBA Familiar with RAC, 11gR2 and ASM About to become a Database Machine Administrator (DMA) How much do I have to learn? How
More informationHadoop Big Data for Processing Data and Performing Workload
Hadoop Big Data for Processing Data and Performing Workload Girish T B 1, Shadik Mohammed Ghouse 2, Dr. B. R. Prasad Babu 3 1 M Tech Student, 2 Assosiate professor, 3 Professor & Head (PG), of Computer
More informationHadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
More informationEnabling High performance Big Data platform with RDMA
Enabling High performance Big Data platform with RDMA Tong Liu HPC Advisory Council Oct 7 th, 2014 Shortcomings of Hadoop Administration tooling Performance Reliability SQL support Backup and recovery
More information