Implementing the Logical Data Warehouse with Oracle Big Data SQL
|
|
|
- Arline Ellis
- 10 years ago
- Views:
Transcription
1 Implementing the Logical Data Warehouse with Oracle Big Data SQL Matthias Fuchs DWH Architekt ISE Information Systems Engineering GmbH
2 ISE Information Systems Engineering Gegründet 1991 Mitarbeiteranzahl: 60 Hauptsitz in Gräfenberg, Niederlassungen in München und Nürnberg Schwerpunkte: Oracle Engineered Systems (Exadata / Exalogic / Exalytics) Data Warehousing & Business Intelligence Oracle DB Migrationen, Optimierungen, Hochverfügbarkeit Managed Service für Datenbanken, BI und Middlewareapplikationen Oracle Partner Engineered Systems Award 2013 Copyright (C) ISE GmbH - All Rights Reserved 2
3 ISE Oracle Technology Center Copyright (C) ISE GmbH - All Rights Reserved 3
4 ISE Oracle Technology Center Erstes und einziges Exastack Technology Center in Deutschland in Nürnberg Coming soon ODA X5 Copyright (C) ISE GmbH - All Rights Reserved 4
5 Agenda LDW - Logical Datawarehouse Big Data SQL Infrastructure Sqoop - der Anfang Customer case Copyright (C) ISE GmbH - All Rights Reserved 5
6 LDW Logical Datawarehouse Copyright (C) ISE GmbH - All Rights Reserved 6
7 Logical Data Warehouse Gartner Hype Cycle for Information Infrastructure, 2012, the Logical Data Warehouse (LDW) is a new data management architecture for analytics which combines the strengths of traditional repository warehouses with alternative data management and access strategy. The LDW will form a new best practices by the end of Copyright (C) ISE GmbH - All Rights Reserved 7
8 Gartner: Logical Dataware House Repository Management Verschiedene Typen u.a. Metadaten Konsolidierung Data Virtualization Virtuelle Daten Schicht Distributed Processes Aufruf externer Prozesse z.b. Bilder oder Content Analyse, aber auch MapReduce Cloud Auditing statistics and performance Evaluation Statistik über Performance End User, Applikationen oder Verbindungen SLA Management Metadataset über erwartete Ausführungenzeiten etc. Überwachung und ggf. Änderung der Ausführung Taxonomy - Ontology resolution a taxonomy tree in an ontological forest Metadata Management Copyright (C) ISE GmbH - All Rights Reserved 8
9 Gartner: Logical Dataware House Repository Management Verschiedene Typen u.a. Metadaten Konsolidierung Data Virtualization Virtuelle Daten Schicht Distributed Processes Aufruf externer Prozesse z.b. Bilder oder Content Analyse, aber auch MapReduce Cloud Inhalte einzubeziehen Auditing statistics and performance Evaluation Statistik über Performance End User, Applikationen Höhere oder Verbindungen Flexibilität SLA Management Metadataset über erwartete Ausführungenzeiten etc. Überwachung und ggf. Änderung der Ausführung Taxonomy - Ontology resolution a taxonomy tree in an ontological forest Metadata Management Data-to-insight cycle ' schneller günstiges Framework um neue Copyright (C) ISE GmbH - All Rights Reserved 9
10 Gartner: Übersicht Aus Gartner Newsletter Logical Data Warehousing for Big Data Copyright (C) ISE GmbH - All Rights Reserved 10
11 Virtualisation & Query Federation Information Management Reference Architecture Oracle Data Reservoir & Enterprise Information Store complete view Data Sources Enterprise Performance Management Data Engines & Poly-structured sources Structured Data Sources Operational Data COTS Data Streaming & BAM Data Ingestion Access & Performance Layer Foundation Data Layer Past, current and future interpretation of enterprise data. Structured to support agile access & navigation Immutable modelled data. Business Process Neutral form. Abstracted from business process changes Pre-built & Ad-hoc BI Assets Master & Reference Data Sources Raw Data Reservoir Immutable raw data reservoir Raw data at rest is not interpreted Information Interpretation Information Services Content Docs SMS Web & Social Media Discovery Lab Sandboxes Project based data stores to support specific discovery objectives Rapid Development Sandboxes Project based data stored to facilitate rapid content / presentation delivery Data Science Auditing statistics/performance Evaluation SLA Management Copyright (C) ISE GmbH - All Rights Reserved 11
12 Big Data SQL Infrastructure Copyright (C) ISE GmbH - All Rights Reserved 12
13 Big Data Sql - Übersicht Cloudera Hadoop NOSQL R Advanced Analytics Oracle Big Data SQL Connectors ODI Exadata Advanced Analytics Advanced Security Or BigData Lite VM Copyright (C) ISE GmbH - All Rights Reserved 13
14 Big Data Systemübersicht Processing Layer Big Data SQL Resource Management YARN + MapReduce Storage Layer Filesystem (HDFS) Copyright (C) ISE GmbH - All Rights Reserved 14
15 Big Data und DB im LDW Repository Management Oracle Big Data Appliance Data Virtualization Distributed Processes Auditing statistics and performance SLA Management ODI, BPM, SOA Enterprise Metadata Management Taxonomy - Ontology resolution Copyright (C) ISE GmbH - All Rights Reserved 15
16 Sqoop - der Anfang Copyright (C) ISE GmbH - All Rights Reserved 16
17 Sqoop Sqoop = SQL- to Hadoop Paralleles kopieren von JDBC <-> HDFS MapReduce jobs zum Daten laden/schreiben HDFS DB Map Reduce Copyright (C) ISE GmbH - All Rights Reserved 17
18 Sqoop mit Oracle OraOOP Guy Harrison team Quest (Dell) Ab version (CDH 5.1) Oracle direct path (non-buffered) IO for all reads Auf mappers werden Anzahl Blöcke verteilt Bei partitionierten Tabellen, kann der Mapper pro Partition arbeiten HDFS HADOOP MAPPER ORACLE SESSION ORACLE TABLE HADOOP MAPPER ORACLE SESSION Copyright (C) ISE GmbH - All Rights Reserved 18
19 Real Time Oracle Change Data Capture Supported in 11.2 but not recommended by Oracle Desupported in 12.1 Oracle Golden Gate 1. RDBMS to HIVE 2. RDBMS to Flume 3. RDBMS to HDFS Andere Hersteller: (Dell) Quest SharePlex Auslesen redologs (VMWare) Continuent Tungsten uses CDC im Hintergrund Libelle Copyright (C) ISE GmbH - All Rights Reserved 19
20 Customer case Copyright (C) ISE GmbH - All Rights Reserved 20
21 Analyse von Infrastrukturdaten Ziel Daten von Servicecalls (OSB) auswerten Daten Historisieren Feststellen von Anomalien Mappen von Strukturierten und Unstrukturierten Daten Tabellen/View und Datei Import Auswertung mit ausgewählten Werkzeugen Analytic output R Elasticsearch YARN/MR Weblogs Flume HDFS SQOOP CC RDBMS Copyright (C) ISE GmbH - All Rights Reserved 21
22 Vorbereitung Wahl der Hadoop Distribution Cloudera Oracle supported Ohne -> sehr aufwendig Filedaten Flume Weblogic und Apache Logs Gut dokumentiert im Netz Ggf. Realtime Auswertung mit Elasticsearch or Solr Hive CDH 5.1 OCRFile Format Copyright (C) ISE GmbH - All Rights Reserved 22
23 Hive ORCFile Optimized Row Columnar File Format light-weight indexes bereits im Fileformat block-mode compression auf basis des Datentyps Größenvergleich über verschiedene Typen Encoded Text CSV File RCFile Record Columnar File Parquet Columnar Storage Format, impala ORCFile Hive TPC-DS Scale 500 Dataset GB, Hortonworks Copyright (C) ISE GmbH - All Rights Reserved 23
24 Ablauf Datenintegration Teil 1 Datenladen DB HDFS HIVE Oracle Big Data SQL Teil 2 Create Big Data SQL Layer Copyright (C) ISE GmbH - All Rights Reserved 24
25 Prozess Teil 1 DB Start sqoop job to HDFS Create external table on HDFS Files insert as select in hive ocr data table HDFS HIVE Import parallel 1, da view daten Kein primary key, keine parallelen MapReduce Prozesse Direct read notwendig, da sonst tmp Tablespace zu klein Start mit sqoop2, ende mit sqoop1 inklusiv Optimierung ODI statt oozie Copyright (C) ISE GmbH - All Rights Reserved 25
26 Prozess Teil 2 Suche Tabelle in Hive aus DB select table_name, input_format, Location from ALL_HIVE_tables where table_name like '%oem%'; Copyright (C) ISE GmbH - All Rights Reserved 26
27 Prozess Teil 2 Create Table in DB (nur in Test VM) DDL mit CREATE_EXTDDL_FOR_HIVE erzeugen DDL ausführen DDL Erzeugen dbms_hadoop.create_extddl_for_hive( CLUSTER_ID=>'bigdatalite', DB_NAME=>'default', HIVE_TABLE_NAME=>'oem_data', HIVE_PARTITION=>FALSE, TABLE_NAME=>'oem_data', PERFORM_DDL=>FALSE, TEXT_OF_DDL=>DDLout ); DDL Asuführen CREATE TABLE OEM_DATA ( target_name VARCHAR2(4000), target_guid.. key_value6 VARCHAR2(4000), collection_timestamp VARCHAR2(4000)) ORGANIZATION EXTERNAL (TYPE ORACLE_HIVE DEFAULT DIRECTORY DEFAULT_DIR ACCESS PARAMETERS ( com.oracle.bigdata.cluster=bigdatalite com.oracle.bigdata.tablename=default.oem_ data) ) ; Copyright (C) ISE GmbH - All Rights Reserved 27
28 Ausführungsplan Copyright (C) ISE GmbH - All Rights Reserved 28
29 Ergebnisse: Laden der Daten Daten für einen Tag ~ Zeilen/12 Spalten TXT Files ~100 G unkomprimiert Ladezeit ca. 1h aus CC DB OCR Files in hive ~ 27 M komprimiert ~ Ladezeit ca. 30 Minuten Teil 1 Type Größe Select count Where Oem_data BigDataSQL 2,8 MB 2,1 Mio 11s 8s Teil 2 Oem_data local kopiert Oracle 558 MB 2,1 Mio 0,5s 0,5s Oem_data Hive 57s 50s Copyright (C) ISE GmbH - All Rights Reserved 29
30 Lastverteilung Big Data SQL Only data retrieval (TABLE ACCESS FULL und Filter ) werden offloaded! Datenbearbeitung im DB Layer GROUP BY, ORDER BY, JOIN, PL/SQL etc BigDataSQL 2.0 (Aggregation in Hadoop?) Alternativ Connect über ODBC Tool Beschreibung Decompress CPU Filtering CPU Datatype Conversion Sqoop Hadoop Oracle Oracle Oracle SQL Connector für HDFS Big Data SQL Text Dateien HDFS oder DataPump HDFS 12c Exadata&BDA Oracle Oracle Hadoop Hadoop Hadoop ODBC Hadoop Hadoop Oracle Copyright (C) ISE GmbH - All Rights Reserved 30
31 Zusammenfassung Vorher: Exadata DB/EMC Nacher: Hadoop Integration Layer Exadata DB/EMC Copyright (C) ISE GmbH - All Rights Reserved 31
32 Q & A Copyright (C) ISE GmbH - All Rights Reserved 32
Constructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
Oracle Big Data Essentials
Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 40291196 Oracle Big Data Essentials Duration: 3 Days What you will learn This Oracle Big Data Essentials training deep dives into using the
Oracle Big Data Fundamentals Ed 1 NEW
Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big
Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle
Oracle Big Data SQL Architectural Deep Dive Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following is
Evolution of Information Management Architecture and Development
Evolution of Information Management Architecture and Development Stewart Bryson Chief Innovation Officer, Rittman Mead! Andrew Bond Head of Enterprise Architecture, Oracle EMEA Oracle Information Management
Architecting for the Internet of Things & Big Data
Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to
Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.
#TalendSandbox for Big Data
Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND
IOT & Big Data: The Future Information Processing Architecture
IOT & Big Data: The Future Information Processing Architecture Dr. Michael Faden Dirk Weise BASEL BERN BRUGG GENF LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
Hadoop Job Oriented Training Agenda
1 Hadoop Job Oriented Training Agenda Kapil CK [email protected] Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module
Oracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
Cloudera Certified Developer for Apache Hadoop
Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number
Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP
Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools
Hadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
<Insert Picture Here> Big Data
Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big
Big Data SQL and Query Franchising
Big Data SQL and Query Franchising An Architecture for Query Beyond Hadoop Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam [email protected]
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam [email protected] Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A
Native Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren
News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business
MySQL and Hadoop Big Data Integration
MySQL and Hadoop Big Data Integration Unlocking New Insight A MySQL White Paper December 2012 Table of Contents Introduction... 3 The Lifecycle of Big Data... 4 MySQL in the Big Data Lifecycle... 4 Acquire:
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
Connecting Hadoop with Oracle Database
Connecting Hadoop with Oracle Database Sharon Stephen Senior Curriculum Developer Server Technologies Curriculum The following is intended to outline our general product direction.
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:
Datenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database
An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct
Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera
SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce
Safe Harbor Statement
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
Move Data from Oracle to Hadoop and Gain New Business Insights
Move Data from Oracle to Hadoop and Gain New Business Insights Written by Lenka Vanek, senior director of engineering, Dell Software Abstract Today, the majority of data for transaction processing resides
Deep Quick-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors Mark Rittman, CTO, Rittman Mead Oracle Openworld 2014, San Francisco
Deep Quick-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors Mark Rittman, CTO, Rittman Mead Oracle Openworld 2014, San Francisco About the Speaker Mark Rittman, Co-Founder of Rittman Mead
Qsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
Copyright 2012, Oracle and/or its affiliates. All rights reserved.
1 Oracle Big Data Appliance Releases 2.5 and 3.0 Ralf Lange Global ISV & OEM Sales Agenda Quick Overview on BDA and its Positioning Product Details and Updates Security and Encryption New Hadoop Versions
Big Data Too Big To Ignore
Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction
From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten
From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten MC Brown, Director of Documentation Linas Virbalas, Senior Software Engineer. About Tungsten Replicator Open source drop-in
Database Performance with In-Memory Solutions
Database Performance with In-Memory Solutions ABS Developer Days January 17th and 18 th, 2013 Unterföhring metafinanz / Carsten Herbe The goal of this presentation is to give you an understanding of in-memory
Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
Big Data. Marriage of RDBMS-DWH and Hadoop & Co. Author: Jan Ott Trivadis AG. 2014 Trivadis. Big Data - Marriage of RDBMS-DWH and Hadoop & Co.
Big Data Marriage of RDBMS-DWH and Hadoop & Co. Author: Jan Ott Trivadis AG BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN 1 Mit über 600 IT- und Fachexperten
Are You Big Data Ready?
ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain
Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
The Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All
Oracle R zum Anfassen: Die Themen
R zum Anfassen: Die Themen 09:30 Begrüßung 09:45 R Zum Anfassen Einführung 10:15 Minikurs in der Sprache R Sprachmittel, Hilfen, GUIs zum Erstellen der Skripte Schnell und einfach ansprechende Grafiken
Self-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Data Is Doubling Every Two Years Unstructured data will account for more than 80% of the data
Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview
Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce
Talend Big Data. Delivering instant value from all your data. Talend 2014 1
Talend Big Data Delivering instant value from all your data Talend 2014 1 I may say that this is the greatest factor: the way in which the expedition is equipped. Roald Amundsen race to the south pole,
Hadoop. for Oracle database professionals. Alex Gorbachev Calgary, AB September 2013
Hadoop for Oracle database professionals Alex Gorbachev Calgary, AB September 2013 Alex Gorbachev Chief Technology Officer at Pythian Blogger Cloudera Champion of Big Data OakTable Network member Oracle
SharePlex for SQL Server
SharePlex for SQL Server Improving analytics and reporting with near real-time data replication Written by Susan Wong, principal solutions architect, Dell Software Abstract Many organizations today rely
Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...
Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data
Self-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Management - MCS MapR Data Platform for Hadoop and NoSQL APACHE HADOOP AND OSS ECOSYSTEM Batch
Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management
Big Data and New Paradigms in Information Management Vladimir Videnovic Institute for Information Management 2 "I am certainly not an advocate for frequent and untried changes laws and institutions must
Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 PART 2 PART 3 BIG DATA PATTERNS...253 PART 4 BEYOND MAPREDUCE...385
brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 1 Hadoop in a heartbeat 3 2 Introduction to YARN 22 PART 2 DATA LOGISTICS...59 3 Data serialization working with text and beyond 61 4 Organizing and
From Relational to Hadoop Part 1: Introduction to Hadoop. Gwen Shapira, Cloudera and Danil Zburivsky, Pythian
From Relational to Hadoop Part 1: Introduction to Hadoop Gwen Shapira, Cloudera and Danil Zburivsky, Pythian Tutorial Logistics 2 Got VM? 3 Grab a USB USB contains: Cloudera QuickStart VM Slides Exercises
Data Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,[email protected]
Data Warehousing and Analytics Infrastructure at Facebook Ashish Thusoo & Dhruba Borthakur athusoo,[email protected] Overview Challenges in a Fast Growing & Dynamic Environment Data Flow Architecture,
HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM
HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM 1. Introduction 1.1 Big Data Introduction What is Big Data Data Analytics Bigdata Challenges Technologies supported by big data 1.2 Hadoop Introduction
Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks
Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013
SQL on NoSQL (and all of the data) With Apache Drill
SQL on NoSQL (and all of the data) With Apache Drill Richard Shaw Solutions Architect @aggress Who What Where NoSQL DB Very Nice People Open Source Distributed Storage & Compute Platform (up to 1000s of
Big Data Course Highlights
Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like
Luncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
Hadoop 101. Lars George. NoSQL- Ma4ers, Cologne April 26, 2013
Hadoop 101 Lars George NoSQL- Ma4ers, Cologne April 26, 2013 1 What s Ahead? Overview of Apache Hadoop (and related tools) What it is Why it s relevant How it works No prior experience needed Feel free
Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances
INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA
Oracle Big Data Strategy Simplified Infrastrcuture
Big Data Oracle Big Data Strategy Simplified Infrastrcuture Selim Burduroğlu Global Innovation Evangelist & Architect Education & Research Industry Business Unit Oracle Confidential Internal/Restricted/Highly
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our
An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture
An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP ESG Data Systems Architecture Big Data & Analytics as a Service Components Unstructured Data / Sparse Data of Value
BIG DATA HADOOP TRAINING
BIG DATA HADOOP TRAINING DURATION 40hrs AVAILABLE BATCHES WEEKDAYS (7.00AM TO 8.30AM) & WEEKENDS (10AM TO 1PM) MODE OF TRAINING AVAILABLE ONLINE INSTRUCTOR LED CLASSROOM TRAINING (MARATHAHALLI, BANGALORE)
Hadoop Meets Exadata. Presented by: Kerry Osborne. DW Global Leaders Program Decemeber, 2012
Hi Hadoop Meets Exadata Presented by: Kerry Osborne DW Global Leaders Program Decemeber, 2012 whoami Never Worked for Oracle Worked with Oracle DB Since 1982 (V2) Working with Exadata since early 2010
HDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
MySQL and Hadoop. Percona Live 2014 Chris Schneider
MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for
Data Warehousing Metadata Management
Data Warehousing Metadata Management Spring Term 2014 Dr. Andreas Geppert Credit Suisse [email protected] Spring 2014 Slide 1 Outline of the Course Introduction DWH Architecture DWH-Design and multi-dimensional
Data Warehousing Metadata Management
Data Warehousing Metadata Management Spring Term 2014 Dr. Andreas Geppert Credit Suisse [email protected] Spring 2014 Slide 1 Outline of the Course Introduction DWH Architecture DWH-Design and multi-dimensional
Upcoming Announcements
Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC [email protected] Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within
The Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
Big Data: Are You Ready? Kevin Lancaster
Big Data: Are You Ready? Kevin Lancaster Director, Engineered Systems Oracle Europe, Middle East & Africa 1 A Data Explosion... Traditional Data Sources Billing engines Custom developed New, Non-Traditional
Hadoop and MySQL for Big Data
Hadoop and MySQL for Big Data Alexander Rubin October 9, 2013 About Me Alexander Rubin, Principal Consultant, Percona Working with MySQL for over 10 years Started at MySQL AB, Sun Microsystems, Oracle
The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson
The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities
A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle
A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle Growth in Data Diversity and Usage 1.8 Zettabytes of Data in 2011, 20x Growth by 2020
Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. [email protected], twicer: @awadallah
Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. [email protected], twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated
Where is... How do I get to...
Big Data, Fast Data, Spatial Data Making Sense of Location Data in a Smart City Hans Viehmann Product Manager EMEA ORACLE Corporation August 19, 2015 Copyright 2014, Oracle and/or its affiliates. All rights
Bringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering
MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation
Seamless Access from Oracle Database to Your Big Data
Seamless Access from Oracle Database to Your Big Data Brian Macdonald Big Data and Analytics Specialist Oracle Enterprise Architect September 24, 2015 Agenda Hadoop and SQL access methods What is Oracle
How To Manage Big Data In A Microsoft Cloud (Hadoop)
Oracle Database 12c and the Future of Data Warehousing in the Era of Big Data George Lumpkin Data Warehousing Neil Mendelson Big Data & Advanced AnalyEcs Vice Presidents Server Technologies September 29,
BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION
FACT SHEET BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION BIGDATA & HADOOP CLASS ROOM SESSION GreyCampus provides Classroom sessions for Big Data & Hadoop Developer Certification. This course will
Hadoop & Spark Using Amazon EMR
Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?
ITG Software Engineering
Introduction to Cloudera Course ID: Page 1 Last Updated 12/15/2014 Introduction to Cloudera Course : This 5 day course introduces the student to the Hadoop architecture, file system, and the Hadoop Ecosystem.
Ganzheitliches Datenmanagement
Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist
Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH
Real-time Data Analytics mit Elasticsearch Bernhard Pflugfelder inovex GmbH Bernhard Pflugfelder Big Data Engineer @ inovex Fields of interest: search analytics big data bi Working with: Lucene Solr Elasticsearch
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and
Building Scalable Big Data Pipelines
Building Scalable Big Data Pipelines NOSQL SEARCH ROADSHOW ZURICH Christian Gügi, Solution Architect 19.09.2013 AGENDA Opportunities & Challenges Integrating Hadoop Lambda Architecture Lambda in Practice
Decoding the Big Data Deluge a Virtual Approach. Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco
Decoding the Big Data Deluge a Virtual Approach Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco High-volume, velocity and variety information assets that demand
Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich
Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich Agenda Introduction Old Times Exadata Big Data Oracle In-Memory Headquarters Conclusions 2 sumit
Next-Generation Cloud Analytics with Amazon Redshift
Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional
The Top 10 7 Hadoop Patterns and Anti-patterns. Alex Holmes @
The Top 10 7 Hadoop Patterns and Anti-patterns Alex Holmes @ whoami Alex Holmes Software engineer Working on distributed systems for many years Hadoop since 2008 @grep_alex grepalex.com what s hadoop...
Big Data on Microsoft Platform
Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4
