SQL - The Goto Language for Big Data Analytics!


1 SQL - The Goto Language for Big Data Analytics! Analytical SQL: SQL Made Great. Hermann Bär, hermann.baer@oracle.com, Product Management Data Warehousing

2 Who Am I Not? ... But Who SHOULD Be Here: Keith Laker, keith.laker@oracle.com. 19 years with Oracle. Worked in consulting, global support, onsite support and product management. Part of the Data Warehouse product management team. Product Manager for analytical SQL. Based in Manchester, UK. oracle-big-data blog, Twitter feed.

3 Safe Harbor Statement. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

4 Dear Reader: Due to legal constraints we are not allowed to provide any material about Oracle's exciting upcoming new technology. We sincerely apologize for this. However, there is no reason to wait and not to use the currently available version, Oracle Database 12c Release 12.1, up to its full potential. We hope you will find this material useful. Please also visit us on OTN for further information. Your Oracle team.

5 Agenda: Some things you will be able to do today and some things you will be able to do tomorrow... SQL and Big Data; Calculating approximate answers; SQL made for analysis; Summary. All new features marked as [icon].

6 SQL For Big Data

7 Conceptual View of Analytical Big Data Architecture: Requires Common Language = SQL. [Architecture diagram. Phases: Data Streams, Execution, Innovation. Components: Event Engine, Data Reservoir, Data Factory, Enterprise Information Store, Reporting, Discovery Lab. Inputs: Structured Enterprise Data, Other Data. Outputs: Actionable Events, Actionable Insights, Actionable Information, Events & Data, Discovery Output.]

8 Approximate result-sets When good enough is in fact good enough

9 Exploring Today's Big Data Lakes. Key business challenges: many queries rely on counts and/or statistical calculations (NDVs, Pareto's 80:20 rule, identifying outliers, etc.); exact processing of large data sets is resource intensive; exploratory queries don't require completely accurate results (trending analysis, social analysis, sessionization analytics). Oracle's solutions: provide approximate result capabilities in SQL. Key objectives: return approximate results faster, with minimal deviation from the actual values; use fewer resources, allowing more queries to run.

10 Getting Approximate Uniqueness Counts. Answers "how many" type questions: how many unique sessions today, how many unique customers logged on, how many unique events occurred. COUNT (DISTINCT ...): returns the exact number of rows that contain distinct values of the specified expression; can be resource intensive because it requires sorting. A significantly faster solution is APPROX_COUNT_DISTINCT (expr): processes large amounts of data significantly faster; uses the HyperLogLog algorithm; negligible deviation from the exact result; ignores rows containing null values; supports any scalar data type; does not support BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB.
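The idea behind a HyperLogLog-style estimate (hash every value, keep only a small fixed array of registers, never sort) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Oracle's implementation: the function name hll_estimate, the precision parameter p, the choice of SHA-1 as the hash, and the sessions table with its session_id column are all hypothetical, and SQLite is used only to compute the exact COUNT(DISTINCT) baseline.

```python
import hashlib
import math
import sqlite3


def hll_estimate(values, p=12):
    """Estimate the number of distinct non-null values with 2**p registers."""
    m = 1 << p                      # number of registers (bounded memory)
    registers = [0] * m
    for v in values:
        if v is None:               # nulls are ignored, as on the slide
            continue
        digest = hashlib.sha1(str(v).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        idx = h >> (64 - p)                          # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)             # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1      # position of leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)                 # bias constant for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:                     # small-range correction
        return m * math.log(m / zeros)
    return raw


def demo():
    """Compare an exact COUNT(DISTINCT) with the hash-based estimate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (session_id TEXT)")  # hypothetical table
    rows = [(f"s{i}",) for i in range(20000)] * 2            # every id appears twice
    rows += [(None,)] * 100                                  # plus some nulls
    conn.executemany("INSERT INTO sessions VALUES (?)", rows)
    exact = conn.execute(
        "SELECT COUNT(DISTINCT session_id) FROM sessions"
    ).fetchone()[0]
    stream = (r[0] for r in conn.execute("SELECT session_id FROM sessions"))
    approx = hll_estimate(stream)
    conn.close()
    return exact, approx


if __name__ == "__main__":
    exact, approx = demo()
    print(f"exact={exact} approx={approx:.0f}")
```

The sketch reads each value once and keeps 4,096 small counters regardless of input size, which is the property that lets this kind of estimate avoid sort work and temp space.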

11 Accuracy and Performance. Results for accuracy (real-world customer workload): accuracy that is typically 97% with 95% confidence. Performance results (real-world customer workload): 5-50x improvement. Notes: this approach does not use sampling, it uses a hash-based approach; ignores rows that contain a null value for the specified expression; supports any scalar data type other than BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB.

12 Count Distinct Processing

13 Count Distinct Processing. COUNT(DISTINCT) processing: SORT operations; 8GB of memory (PGA); 164GB of temp.

14 Approximate Query Processing

15 Approximate Query Processing. With approximate query processing: no sort; only 540MB PGA; zero temp. 1. 50x faster 2. 15x less memory 3. No temp

16 Approximate Count Distinct. Comparison of COUNT(DISTINCT) with APPROX_COUNT_DISTINCT. Independent exemplary performance and accuracy analysis; data courtesy of Christian Antognini. High accuracy with superior performance, using bounded memory. For complete test details, see http://antognini.ch/2014/10/the-approx_count_distinct-function-a-test-case/

17 SQL Is Made for Analysis: simpler code, faster results. Copyright 2015, Oracle and/or its affiliates. All rights reserved.

18 The On-Going Evolution of Analytical SQL. Releases on the timeline: 8i, 9i, 10g, 11g, 12c. Features added along the way: introduction of window functions; enhanced window functions (percentile, etc.); rollup, grouping sets, cube; statistical functions; SQL model clause; partition outer join; data mining; SQL pivot; recursive WITH; ListAgg; Nth value window; pattern matching; Top N clause; approx. count distinct; JSON support.

19 Making Code Simpler: Pattern Matching with SQL. Java vs. SQL: Searching for W Patterns in Stock Trade Data. [Slide shows overlapping excerpts of the Java and Pig implementation: package pigstuff, org.apache.pig imports, a V0Line class, and an outputSchema method. The code is not recoverable from the transcript.] 250+ lines of Java and Pig. Find a W-shape pattern in a ticker stream: output the beginning and ending date of the pattern; calculate the average price in the W-shape; find only patterns that lasted less than a week.

20 Making Code Simpler: Pattern Matching with SQL. Java vs. SQL: Searching for W Patterns in Stock Trade Data. [Slide repeats the overlapping Java and Pig excerpts from slide 19 next to the SQL below.]

    SELECT first_x, last_z
    FROM ticker MATCH_RECOGNIZE (
      PARTITION BY name ORDER BY time
      MEASURES FIRST(x.time) AS first_x,
               LAST(z.time) AS last_z
      ONE ROW PER MATCH
      PATTERN (X+ Y+ W+ Z+)
      DEFINE X AS (price < PREV(price)),
             Y AS (price > PREV(price)),
             W AS (price < PREV(price)),
             Z AS (price > PREV(price) AND
                   z.time - FIRST(x.time) <= 7 ))

250+ lines of Java and Pig versus 12 lines of SQL. SQL: 20x less code, 5x faster.
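What the PATTERN and DEFINE clauses of the query ask for can be made concrete with a small Python sketch. It is only a reading aid under stated assumptions, not a substitute for MATCH_RECOGNIZE and not the Java/Pig code from the slide: the function name find_w_patterns, the max_days parameter, and the use of integer day numbers instead of DATE values are hypothetical, the rows are assumed to belong to a single ticker already ordered by time, and matching resumes after the last row of each match.

```python
from typing import List, Tuple

Row = Tuple[int, float]  # (day number, price), ordered by day


def find_w_patterns(rows: List[Row], max_days: int = 7) -> List[Tuple[int, int]]:
    """Return (first_x_day, last_z_day) for each down, up, down, up run."""

    def down(i: int) -> bool:   # price < PREV(price); false on the first row
        return i > 0 and rows[i][1] < rows[i - 1][1]

    def up(i: int) -> bool:     # price > PREV(price); false on the first row
        return i > 0 and rows[i][1] > rows[i - 1][1]

    matches = []
    i, n = 0, len(rows)
    while i < n:
        start = j = i
        while j < n and down(j):            # X+: falling prices
            j += 1
        x_end = j
        while j < n and up(j):              # Y+: rising prices
            j += 1
        y_end = j
        while j < n and down(j):            # W+: falling again
            j += 1
        w_end = j
        first_x_day = rows[start][0]
        # Z+: rising again, but only within max_days of the first X row
        while j < n and up(j) and rows[j][0] - first_x_day <= max_days:
            j += 1
        if start < x_end < y_end < w_end < j:   # every leg matched at least once
            matches.append((first_x_day, rows[j - 1][0]))
            i = j                               # continue after the match
        else:
            i += 1                              # retry from the next row
    return matches


if __name__ == "__main__":
    prices = [10, 9, 8, 9, 10, 9, 8, 9, 10]
    ticker = [(day, float(p)) for day, p in enumerate(prices, start=1)]
    print(find_w_patterns(ticker))
```

Because the four legs alternate between strictly falling and strictly rising rows, a single greedy left-to-right scan is enough here; more general patterns need the backtracking that the database performs on the query's behalf.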

21 SQL Is Made for Analytics. SQL analytics for all your data: delivers faster processing and richer analytics over all your data (relational, Hadoop, Hive, NoSQL, event streams, etc.). Faster and smarter SQL: faster and resource-efficient processing for counting and statistically driven queries. Richer SQL: new, richer functions for deeper analysis of big data and IoT data sets; MATCH_RECOGNIZE. [Architecture diagram repeated from slide 7.]

22 Safe Harbor Statement. The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.



More information

AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014

AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014 AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014 Career Details Duration 105 hours Prerequisites This career requires that you meet the following prerequisites: Working knowledge

More information

Internals of Hadoop Application Framework and Distributed File System

Internals of Hadoop Application Framework and Distributed File System International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 1 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop

More information

Oracle Big Data Building A Big Data Management System

Oracle Big Data Building A Big Data Management System Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following

More information

How To Use Big Data For Telco (For A Telco)

How To Use Big Data For Telco (For A Telco) ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call

More information

Big Data Too Big To Ignore

Big Data Too Big To Ignore Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction

More information

Introducing Microsoft SQL Server 2012 Getting Started with SQL Server Management Studio

Introducing Microsoft SQL Server 2012 Getting Started with SQL Server Management Studio Querying Microsoft SQL Server 2012 Microsoft Course 10774 This 5-day instructor led course provides students with the technical skills required to write basic Transact-SQL queries for Microsoft SQL Server

More information

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Charlie Berger, MS Eng, MBA Sr. Director Product Management, Data Mining and Advanced Analytics charlie.berger@oracle.com www.twitter.com/charliedatamine

More information

Querying Microsoft SQL Server

Querying Microsoft SQL Server Course 20461C: Querying Microsoft SQL Server Module 1: Introduction to Microsoft SQL Server 2014 This module introduces the SQL Server platform and major tools. It discusses editions, versions, tools used

More information

Partitioning under the hood in MySQL 5.5

Partitioning under the hood in MySQL 5.5 Partitioning under the hood in MySQL 5.5 Mattias Jonsson, Partitioning developer Mikael Ronström, Partitioning author Who are we? Mikael is a founder of the technology behind NDB

More information

Building a BI Solution in the Cloud

Building a BI Solution in the Cloud Building a BI Solution in the Cloud Stacia Varga, Principal Consultant Email: stacia@datainspirations.com Twitter: @_StaciaV_ 2 SQLSaturday #467 Sponsors Stacia (Misner) Varga Over 30 years of IT experience,

More information

Oracle Database: SQL and PL/SQL Fundamentals NEW

Oracle Database: SQL and PL/SQL Fundamentals NEW Oracle University Contact Us: + 38516306373 Oracle Database: SQL and PL/SQL Fundamentals NEW Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals training delivers the

More information

Data-cubing made-simple! with Spark, Algebird and HBase goo.gl/dbgr0h

Data-cubing made-simple! with Spark, Algebird and HBase goo.gl/dbgr0h Data-cubing made-simple! with Spark, Algebird and HBase goo.gl/dbgr0h Vidmantas Zemleris Agenda Intro Analytics at Vinted What is Data-cubing? Why did we build it? Architecture Preliminaries Metric computation

More information

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database

More information

MapR: Best Solution for Customer Success

MapR: Best Solution for Customer Success 2015 MapR Technologies 2015 MapR Technologies 1 MapR: Best Solution for Customer Success Best Product High Growth 700+ Customers Premier Investors Apache Open Source 2X 2X Growth In Direct Customers Growth

More information

SQL. Short introduction

SQL. Short introduction SQL Short introduction 1 Overview SQL, which stands for Structured Query Language, is used to communicate with a database. Through SQL one can create, manipulate, query and delete tables and contents.

More information

Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle

Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle Oracle Big Data SQL Architectural Deep Dive Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following is

More information

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example Data warehousing in Oracle Materialized views and SQL extensions to analyze data in Oracle data warehouses SQL extensions for data warehouse analysis Available OLAP functions Computation windows window

More information

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc. Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse

More information

Chapter 13: Query Processing. Basic Steps in Query Processing

Chapter 13: Query Processing. Basic Steps in Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 21/11/2013-1- Data Warehouse design DATA PRESENTATION - 2- BI Reporting Success Factors BI platform success factors include: Performance

More information

Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Length: Delivery Method: 3 Days Instructor-led (classroom) About this Course Elements of this syllabus are subject

More information

Introduction and Overview for Oracle 11G 4 days Weekends

Introduction and Overview for Oracle 11G 4 days Weekends Introduction and Overview for Oracle 11G 4 days Weekends The uses of SQL queries Why SQL can be both easy and difficult Recommendations for thorough testing Enhancing query performance Query optimization

More information

Prof. Edwar Saliba Júnior

Prof. Edwar Saliba Júnior package Conexao; 2 3 /** 4 * 5 * @author Cynthia Lopes 6 * @author Edwar Saliba Júnior 7 */ 8 import java.io.filenotfoundexception; 9 import java.io.ioexception; 10 import java.sql.sqlexception; 11 import

More information

An Oracle White Paper June 2013. Migrating Applications and Databases with Oracle Database 12c

An Oracle White Paper June 2013. Migrating Applications and Databases with Oracle Database 12c An Oracle White Paper June 2013 Migrating Applications and Databases with Oracle Database 12c Disclaimer The following is intended to outline our general product direction. It is intended for information

More information

Big Data: Using ArcGIS with Apache Hadoop. Erik Hoel and Mike Park

Big Data: Using ArcGIS with Apache Hadoop. Erik Hoel and Mike Park Big Data: Using ArcGIS with Apache Hadoop Erik Hoel and Mike Park Outline Overview of Hadoop Adding GIS capabilities to Hadoop Integrating Hadoop with ArcGIS Apache Hadoop What is Hadoop? Hadoop is a scalable

More information

Big Data and Its Impact on the Data Warehousing Architecture

Big Data and Its Impact on the Data Warehousing Architecture Big Data and Its Impact on the Data Warehousing Architecture Sponsored by SAP Speaker: Wayne Eckerson, Director of Research, TechTarget Wayne Eckerson: Hi my name is Wayne Eckerson, I am Director of Research

More information

Oracle Big Data Spatial & Graph Social Network Analysis - Case Study

Oracle Big Data Spatial & Graph Social Network Analysis - Case Study Oracle Big Data Spatial & Graph Social Network Analysis - Case Study Mark Rittman, CTO, Rittman Mead OTN EMEA Tour, May 2016 info@rittmanmead.com www.rittmanmead.com @rittmanmead About the Speaker Mark

More information

Oracle Database 12c: SQL Tuning for Developers. Sobre o curso. Destinatários. Oracle - Linguagens. Nível: Avançado Duração: 18h

Oracle Database 12c: SQL Tuning for Developers. Sobre o curso. Destinatários. Oracle - Linguagens. Nível: Avançado Duração: 18h Oracle Database 12c: SQL Tuning for Developers Oracle - Linguagens Nível: Avançado Duração: 18h Sobre o curso In the Oracle Database: SQL Tuning for Developers course, you learn about Oracle SQL tuning

More information