Lofan Abrams Data Services for Big Data Session # 2987
- Opal Emmeline Walker
- 10 years ago
Transcription
1 Lofan Abrams Data Services for Big Data Session # 2987
2 Big Data: Are you ready for blast-off?
3 "Big Data, for better or worse: 90% of world's data generated over last two years." ScienceDaily, May 22, 2013.
4 Barriers to operational effectiveness: scattered information; heterogeneous / complex sources; data explosion; trustworthiness of information; handling unstructured content. Stored data: structured 15%, unstructured 85% (SAP 2008).
5 SAP Solutions for Enterprise Information Management
[Diagram labels: Information Ready for Action (Analytics, Business Processes); Before / After; GOVERN; Data Quality Management; Data Integration; Master Data Management; Big Data & IoT; Data Discovery; Information Lifecycle Management; Content Management; Compliance.]
6 SAP SOLUTIONS FOR TRUSTED DATA PROVEN LEADER IN EVERY CATEGORY
7 Data Services and Big Data Sources: Hadoop, MongoDB, Google BigQuery
8 Big Data in SAP Data Services
Value Proposition: Use a single ETL tool to move structured and unstructured data to big data stores and data warehouses. It is simple to use: the same dataflow designer serves all types of sources and targets, and data preview capabilities enhance developers' productivity.
Use Cases: Extract data, with the right filters pushed down to the source, from MongoDB or Hadoop into a data warehouse for analytics (HANA, Teradata, Google BigQuery, ...). ETL experts often don't know languages such as Pig script or MongoDB syntax and need a code-free UI.
How it works: Native datastores for MongoDB, Google BigQuery, and Hadoop (HDFS, Hive), plus an adapter SDK that is open to partners to build more adapters. Data preview and data profiling for Hadoop sources are built into the Designer user interface.
9 Hadoop Datastore: Certified with the top two Hadoop distributions: Hadoop Hortonworks 2.2 (source, target) and Hadoop Cloudera CDH 5.3 (source, target).
10 Hadoop/Hive Support since Data Services 4.1
[Diagram labels: Files; Databases; Web & Others; Data Services; Hadoop; HANA, IQ, other Target Systems.]
The same familiar, easy-to-use UI design paradigm applies to Hadoop/Hive as to other database systems, with specific behind-the-scenes extensions that leverage the power, scale, and unique functionality of Hadoop.
High-performance reading from and loading into both Hadoop (HDFS) and Hive.
Makes use of Hadoop capabilities by delegating operations to Hadoop/Hive systems (T-E-L).
The extended optimizer is fully HiveQL- and Pig-aware and generates optimized scripts for Hive and Hadoop.
11 Hive Support
Full metadata support via JDBC: browse and explore Hive tables.
DS generates HiveQL and pushes down operations to Hive: joins, sorting, filters, and functions, including aggregation functions.
High-performance, scalable reading from Hive: multi-threaded, parallel reading of Hive results (not via JDBC); all types of column partitioning are supported.
High-performance loading into Hive: support for inserts and updates; support for both static and dynamic partitioning; multi-threaded (parallel) loading.
[Diagram labels: Reading/Loading; Hive; Metadata (JDBC); Data Services; HDFS Files.]
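The difference between static and dynamic partition loading named on this slide can be illustrated with the kind of HiveQL a loader might issue. This is a minimal sketch in Python, not the statements Data Services actually generates; the table and column names (`sales`, `staging_sales`, `order_id`, `amount`, `dt`) are hypothetical. The two `SET` properties are standard Hive settings for dynamic partition inserts.

```python
def static_partition_insert(target, source, columns, partition_col, partition_value):
    """Build a HiveQL insert whose partition value is fixed in the statement."""
    cols = ", ".join(columns)
    return (
        f"INSERT INTO TABLE {target} PARTITION ({partition_col}='{partition_value}') "
        f"SELECT {cols} FROM {source} WHERE {partition_col} = '{partition_value}'"
    )


def dynamic_partition_insert(target, source, columns, partition_col):
    """Build HiveQL where Hive derives each row's partition from the last selected column."""
    cols = ", ".join(list(columns) + [partition_col])
    return [
        "SET hive.exec.dynamic.partition=true",
        "SET hive.exec.dynamic.partition.mode=nonstrict",
        f"INSERT INTO TABLE {target} PARTITION ({partition_col}) "
        f"SELECT {cols} FROM {source}",
    ]


# Hypothetical tables and columns, for illustration only
static_sql = static_partition_insert(
    "sales", "staging_sales", ["order_id", "amount"], "dt", "2015-01-01")
dynamic_sql = dynamic_partition_insert(
    "sales", "staging_sales", ["order_id", "amount"], "dt")
print(static_sql)
print(";\n".join(dynamic_sql))
```

With a static partition, one statement fills exactly one partition; with dynamic partitioning, a single statement can populate many partitions, at the cost of the extra session settings.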
12 HDFS Support
Access to metadata and structure of files in HDFS.
DS generates Pig and pushes down operations to Hadoop, including joins, sorting, filters and projections, and functions (including aggregation functions).
Reading from Hadoop: high-performance, parallel reading of the files produced by the above Pig script; ability to invoke pre-defined or custom Pig scripts.
High-performance, file-based loading into Hadoop.
13 Data Preview for Hadoop Hive Tables: Hive table preview includes data preview, profile preview, column profile preview, and filtering.
14 Data Preview for Hadoop HDFS Files: View Data (no profiling) is offered for Hadoop HDFS files in the datastore and when they are used as a source or target in a dataflow, including filtering and sorting pushed down to HDFS.
15 Enable SSL Certificate for Hadoop
To enable SSL in the Hive adapter, set SSL Enabled = yes.
SSL Trusted Store and Password:
  Trust store: the name of the trust store you are using to verify credentials and store certificates. (A trust store holds certificates from third parties that your Java application communicates with, or certificates signed by certificate authorities such as Verisign, Thawte, or Geotrust, which can be used to identify a third party.)
  Password: the password associated with the trust store.
Additional Properties: specifies any additional connection properties. Property-value pairs must be separated by a semicolon.
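As an illustration of the semicolon-separated format of Additional Properties, the following minimal Python sketch splits such a string into individual pairs. The slide only states that pairs are separated by a semicolon; the `key=value` form inside each pair and the property names used here (`prop.one`, `prop.two`) are assumptions made for the example.

```python
def parse_additional_properties(raw):
    """Split 'key=value;key=value' into a dict, ignoring empty segments."""
    props = {}
    for segment in raw.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing semicolon
        if "=" not in segment:
            raise ValueError(f"expected key=value, got {segment!r}")
        # Split on the first '=' only, so values may themselves contain '='
        key, value = segment.split("=", 1)
        props[key.strip()] = value.strip()
    return props


# Hypothetical property names, for illustration only
parsed = parse_additional_properties("prop.one=1; prop.two=a=b;")
print(parsed)
```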
16 Support SQL() function for Hadoop: Support added for Hive datastores. Used for Data Definition Language (DDL) and Data Manipulation Language (DML) statements on Hive databases. Useful for managing database objects as a precursor to DS code execution; it can also be used for post-process retrieval of database information.
17 Support SQL Transform for Hadoop: The SQL transform supports a single SELECT statement only. Used for standard SQL SELECT statements from existing scripts outside DS. SELECT statements can be parameterized.
18 Support Join pushdown operation for Hadoop
Why pushdown? Pushing transforms and functions down to the source or target database leverages the power of the database instead of performing these operations in the Data Services engine. Especially when source and target tables are in the same database, this gives the best performance, because no data is extracted from the database.
Join push-down operations are supported for Hadoop, e.g. by using a Data Transfer transform to stage data from a non-Hive source to Hive.
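The effect of pushdown can be shown with any SQL database. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for Hive (an assumption made so the example runs anywhere); the tables and data are hypothetical. Both functions return the same totals, but the first extracts every row and joins in the client, while the second lets the database perform the join and aggregation and returns only the result rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 70.0);
""")


def totals_without_pushdown(conn):
    # Extract both tables in full, then join and aggregate in the client
    regions = dict(conn.execute("SELECT id, region FROM customers"))
    totals = {}
    for customer_id, amount in conn.execute("SELECT customer_id, amount FROM orders"):
        region = regions[customer_id]
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_with_pushdown(conn):
    # The join and aggregation run inside the database; only results come back
    query = """
        SELECT c.region, SUM(o.amount)
        FROM orders o JOIN customers c ON c.id = o.customer_id
        GROUP BY c.region
    """
    return dict(conn.execute(query))


print(totals_without_pushdown(conn))
print(totals_with_pushdown(conn))
```

With three orders the difference is invisible, but the first approach moves every row across the connection, which is the cost pushdown avoids on large tables.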
19 MongoDB Datastore
20 What is MongoDB? MongoDB is a popular, open-source, document-oriented database. It's a NoSQL database with dynamic schemas that stores data in a (nested) JSON-like format. MongoDB ranked #4 among the most popular databases in February 2015 ( ).
21 MongoDB use case for Data Services: Enable customers to extract data from MongoDB as a source and load it to a target for analytics (a co-innovation with a US customer).
22 MongoDB adapter in the Management Console: Implemented as a new adapter leveraging the Data Services adapter SDK. The adapter needs to be added and started in the Management Console before it can be used in a datastore.
23 MongoDB datastore
The MongoDB adapter supports:
  Single (Primary)
  Replica set (Secondary)
  Sharded Cluster: sharding is the process of storing data across multiple machines; MongoDB uses this approach to support deployments with large data sets and high-throughput operations.
  MongoDB Credential, LDAP, and Kerberos authentication
  SSL Certificate
Since MongoDB does not have a schema definition, Data Services scans a sample set of documents ("Rows to scan") in the collection and creates a schema based on the superset of all fields.
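The "superset of all fields" idea can be sketched in a few lines of Python. This is an illustration of the general technique, not the adapter's implementation: documents are plain dictionaries, the function names and the `rows_to_scan` parameter name are hypothetical, and a field that appears as a nested document in one place and as a scalar in another is simply kept in its first-seen form.

```python
from itertools import islice


def infer_schema(documents, rows_to_scan=100):
    """Return the superset of fields seen in the first `rows_to_scan` documents.

    Nested documents become nested dicts; leaf fields map to the set of
    Python type names observed for them.
    """
    schema = {}
    for doc in islice(documents, rows_to_scan):
        _merge(schema, doc)
    return schema


def _merge(schema, doc):
    for field, value in doc.items():
        if isinstance(value, dict):
            child = schema.setdefault(field, {})
            if isinstance(child, dict):  # skip if first seen as a scalar
                _merge(child, value)
        else:
            seen = schema.setdefault(field, set())
            if isinstance(seen, set):    # skip if first seen as a nested document
                seen.add(type(value).__name__)


# Hypothetical collection contents
docs = [
    {"_id": 1, "name": "a", "address": {"city": "x"}},
    {"_id": 2, "address": {"zip": "1"}, "tags": ["t"]},
    {"_id": 3, "late_field": True},
]
schema = infer_schema(docs, rows_to_scan=2)
print(schema)
```

Because only the sampled documents are scanned, `late_field` from the third document is absent from the result here; a field that first appears beyond the sample does not make it into the schema, which is the trade-off behind choosing the sample size.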
24 MongoDB documents as source in a dataflow: Collections are imported as documents in the repository. The nested structure is preserved; with XML_Map in Data Services you can manipulate the data. Filters defined in the WHERE clause are pushed down to the database. More advanced filter conditions can be defined in the adapter parameter "Query criteria" using MongoDB syntax.
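To give a feel for what a filter in MongoDB syntax looks like, the following Python sketch turns simple WHERE-style conditions into a MongoDB query document. The comparison operators (`$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`) are standard MongoDB query operators; the function, the field names, and the way conditions are represented are hypothetical, and this is not how Data Services itself performs the translation.

```python
import json

# Mapping from SQL-style comparison symbols to MongoDB query operators
OPERATORS = {"=": "$eq", "!=": "$ne", ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}


def to_mongo_filter(conditions):
    """Turn [(field, op, value), ...], implicitly ANDed, into a MongoDB query document."""
    query = {}
    for field, op, value in conditions:
        # Several conditions on one field share a single operator document
        query.setdefault(field, {})[OPERATORS[op]] = value
    return query


# Equivalent of: WHERE status = 'shipped' AND total >= 100 AND total < 500
mongo_filter = to_mongo_filter([
    ("status", "=", "shipped"),
    ("total", ">=", 100),
    ("total", "<", 500),
])
print(json.dumps(mongo_filter))
```

Nested fields are addressed in MongoDB with dot notation (for example a field name such as `address.city`), which the sketch passes through unchanged.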
25 Google BigQuery Datastore
26 What is Google BigQuery? Google BigQuery uses Google's data storage in the cloud for fast, interactive analysis of huge amounts of data. It enables super-fast, SQL-like queries against append-only tables, using the processing power of Google's infrastructure. The main use case for Data Services is to load data into BigQuery for analytics. Note: in this release, BigQuery can be used as a target only, not as a source.
27 Google BigQuery Datastore: Native Google BigQuery datastore (in the Applications category). Certificate-based login: import the private key file and provide the password. The private key is generated from the Google BigQuery account page. When a GBQ datastore is exported, the private key is NOT exported and needs to be imported again in the target repository (with the correct passphrase).
28 Google BigQuery as target in a dataflow: Browse metadata and import tables. Tables contain nested data. Google BigQuery can be a target table only. Note: template tables are not supported, but from a query you can generate a JSON structure that can be used to create the target via the BigQuery web console.
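The slide does not show the JSON structure itself. Assuming it corresponds to a BigQuery table schema (a list of field definitions with `name`, `type`, and `mode`, where nested data uses the `RECORD` type with a `fields` list and repeated data uses the `REPEATED` mode), a minimal Python sketch of deriving such a schema from one sample record could look like this. The function names and the sample record are hypothetical, and the type mapping is deliberately simplified.

```python
import json


def bigquery_schema(record):
    """Derive a BigQuery-style schema (list of field definitions) from one sample record."""
    return [_field(name, value) for name, value in record.items()]


def _field(name, value):
    mode = "NULLABLE"
    if isinstance(value, list):
        mode = "REPEATED"
        value = value[0] if value else ""  # empty list: fall back to STRING
    if isinstance(value, dict):
        return {"name": name, "type": "RECORD", "mode": mode,
                "fields": bigquery_schema(value)}
    # bool must be tested before int: bool is a subclass of int in Python
    if isinstance(value, bool):
        type_name = "BOOLEAN"
    elif isinstance(value, int):
        type_name = "INTEGER"
    elif isinstance(value, float):
        type_name = "FLOAT"
    else:
        type_name = "STRING"
    return {"name": name, "type": type_name, "mode": mode}


# Hypothetical nested record
sample = {
    "order_id": 1,
    "paid": True,
    "customer": {"name": "x", "score": 1.5},
    "items": [{"sku": "a", "qty": 2}],
}
schema = bigquery_schema(sample)
print(json.dumps(schema, indent=2))
```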
29 Roadmap: Current, Planned Innovations, and Future
30 SAP Data Services Product road map overview - key themes and capabilities
Today (Release 4.2 SP5)
  Simple: Enhanced runtime troubleshooting process by introducing the Bypass dataflows and workflows feature. Enabled Switch repositories capability in Designer.
  Big Data: Enhanced support for IQ, HANA, and other Big Data sources. Simplified real-time CDC with SAP Replication Server. New connectivity for OData, JSON, REST, MongoDB, Google BigQuery, and JDBC. Certified Hadoop Cloudera and Hortonworks. Support for DDL, DML, and data preview for Hadoop. Added Sharded Cluster support for MongoDB. Security enhancements for Hadoop and MongoDB: SSL certificate and role-based authentication (LDAP, Kerberos).
  Enterprise Support: Support pattern variance in the Data Masking transform. Added Secured Remote File Adapter. Built-in functions for file transfer (SFTP) and file manipulation.
Planned Innovations
  Simple: Simplify Data Services software upgrades. Improve Substitution Parameters management. Add preview and select capability for importing objects into the DS repository from a file. Merge DS Workbench capabilities into DS Designer.
  Big Data: Support Hadoop on the Windows platform. Enhance existing connectivity (source/target). Token-based security.
  Enterprise Support: Integrate comprehensive runtime stats of DS batch/real-time jobs with SAP Solution Manager. Native integration with SAP NetWeaver CTS+ to deliver a single transport tool for DS, SAP, and other applications. Integrate TA 5.x to enhance the Text Data Processing engine. Data Quality global expansion in Asia Pacific.
Future Direction
  Simple: Show graphical dataflow monitor and identify bottlenecks. Self-guiding user interfaces to enhance user experience.
  Big Data: Expanding support for new sources/targets based on market traction. Data Model advisor for HANA database. Tight integration with Big Data solutions (Spark, YARN, ...).
  Enterprise Support: Resource Advisor to provide clarity of system usage. Data Services components health monitor with proactive job alerts and analysis. Data Services datastore as a service.
31 Demo
32 Why SAP?
33 SAP Solutions for Enterprise Information Management
Proven and trusted: 12,000+ SAP EIM customers worldwide.
Winner: Swiss Re, Kraft Foods Inc., and Lexmark Intl. are winners of Gartner MDM Excellence Awards.
Leader in every EIM category: Master Data Management, Data Quality, Data Integration, Enterprise Architecture, Enterprise Content Management, Enterprise Data Virtualization.
90% customer satisfaction rating.
#2 market share for data integration and data quality.
34 STAY INFORMED Follow the ASUGNews team: Tom Chris Craig
35 SESSION CODE 2987
Understanding and Leveraging Improvements in SAP Data Integration and Data Services Platform 4.2
September 9 11, 2013 Anaheim, California Understanding and Leveraging Improvements in SAP Data Integration and Data Services Platform 4.2 Tanya Milanovic Enterprise Information Management with SAP Understand
SAP Data Services 4.X. An Enterprise Information management Solution
SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification
Implement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
Enhance your Analytics using Logical Data Warehouse and Data Virtualization thru SAP HANA smart data access SESSION CODE: 0210
Enhance your Analytics using Logical Data Warehouse and Data Virtualization thru SAP HANA smart data access Balaji Krishna, Product Management SAP HANA Platform. SAP Labs @balajivkrishna SESSION CODE:
Data Integration Checklist
The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media
SAP HANA SPS 09 - What s New? HANA IM Services: SDI and SDQ
SAP HANA SPS 09 - What s New? HANA IM Services: SDI and SDQ (Delta from SPS 08 to SPS 09) SAP HANA Product Management November, 2014 2014 SAP SE or an SAP affiliate company. All rights reserved. 1 Agenda
XpoLog Competitive Comparison Sheet
XpoLog Competitive Comparison Sheet New frontier in big log data analysis and application intelligence Technical white paper May 2015 XpoLog, a data analysis and management platform for applications' IT
SAP Data Services Hacks Auto Generating Data Migration Jobs Shobhit Acharya Session# 3507
SAP Data Services Hacks Auto Generating Data Migration Jobs Shobhit Acharya Session# 3507 Learning Points Improve data migration efficiency using SAP Data Services and implementing a few custom approaches
Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview
Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce
Hadoop Job Oriented Training Agenda
1 Hadoop Job Oriented Training Agenda Kapil CK [email protected] Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module
Native Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
Data processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
What's New in SAS Data Management
Paper SAS034-2014 What's New in SAS Data Management Nancy Rausch, SAS Institute Inc., Cary, NC; Mike Frost, SAS Institute Inc., Cary, NC, Mike Ames, SAS Institute Inc., Cary ABSTRACT The latest releases
Qsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.
Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in
Exploring the Synergistic Relationships Between BPC, BW and HANA
September 9 11, 2013 Anaheim, California Exploring the Synergistic Relationships Between, BW and HANA Sheldon Edelstein SAP Database and Solution Management Learning Points SAP Business Planning and Consolidation
Tap into Hadoop and Other No SQL Sources
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
Integrate Master Data with Big Data using Oracle Table Access for Hadoop
Integrate Master Data with Big Data using Oracle Table Access for Hadoop Kuassi Mensah Oracle Corporation Redwood Shores, CA, USA Keywords: Hadoop, BigData, Hive SQL, Spark SQL, HCatalog, StorageHandler
Luncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect
The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect IT Insight podcast This podcast belongs to the IT Insight series You can subscribe to the podcast through
Dominik Wagenknecht Accenture
Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna
Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
Value Realization at Johnson Controls using SAP HANA smart data integration Steve Carpenter Johnson Controls Ryan Champlin - SAP
Value Realization at Johnson Controls using SAP HANA smart data integration Steve Carpenter Johnson Controls Ryan Champlin - SAP Agenda What is SAP HANA smart data integration? Use cases for Smart Data
IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look
IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based
Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer
Automated Data Ingestion Bernhard Disselhoff Enterprise Sales Engineer Agenda Pentaho Overview Templated dynamic ETL workflows Pentaho Data Integration (PDI) Use Cases Pentaho Overview Overview What we
Hadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
PUBLIC Performance Optimization Guide
SAP Data Services Document Version: 4.2 Support Package 6 (14.2.6.0) 2015-11-20 PUBLIC Content 1 Welcome to SAP Data Services....6 1.1 Welcome.... 6 1.2 Documentation set for SAP Data Services....6 1.3
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks
Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013
Constructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
Performance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and
ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
SAP Crystal Reports & SAP HANA: Integration & Roadmap Kenneth Li SAP SESSION CODE: 0401
SAP Crystal Reports & SAP HANA: Integration & Roadmap Kenneth Li SAP SESSION CODE: 0401 LEARNING POINTS Learn about Crystal Reports for HANA Glance at the road map for the product Overview of deploying
Deploy. Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture
Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture Apps and data source extensions with APIs Future white label, embed or integrate Power BI Deploy Intelligent
From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten
From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten MC Brown, Director of Documentation Linas Virbalas, Senior Software Engineer. About Tungsten Replicator Open source drop-in
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam [email protected]
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam [email protected] Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A
Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
Performance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics platform. PENTAHO PERFORMANCE ENGINEERING
Big Data Analytics in LinkedIn. Danielle Aring & William Merritt
Big Data Analytics in LinkedIn by Danielle Aring & William Merritt 2 Brief History of LinkedIn - Launched in 2003 by Reid Hoffman (https://ourstory.linkedin.com/) - 2005: Introduced first business lines
SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24. Data Federation Administration Tool Guide
SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24 Data Federation Administration Tool Guide Content 1 What's new in the.... 5 2 Introduction to administration
Data Governance in the Hadoop Data Lake. Michael Lang May 2015
Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales
Replicating to everything
Replicating to everything Featuring Tungsten Replicator A Giuseppe Maxia, QA Architect Vmware About me Giuseppe Maxia, a.k.a. "The Data Charmer" QA Architect at VMware Previously at AB / Sun / 3 times
Oracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
Getting Started with Hadoop. Raanan Dagan Paul Tibaldi
Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop
BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig
BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig Contents Acknowledgements... 1 Introduction to Hive and Pig... 2 Setup... 2 Exercise 1 Load Avro data into HDFS... 2 Exercise 2 Define an
#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case
Apache Hadoop: The Big Data Refinery
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
TE's Analytics on Hadoop and SAP HANA Using SAP Vora
TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -
SAP Sybase Replication Server What s New in 15.7.1 SP100. Bill Zhang, Product Management, SAP HANA Lisa Spagnolie, Director of Product Marketing
SAP Sybase Replication Server What s New in 15.7.1 SP100 Bill Zhang, Product Management, SAP HANA Lisa Spagnolie, Director of Product Marketing Agenda SAP Sybase Replication Server Overview Replication
Hadoop & Spark Using Amazon EMR
Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?
Oracle Warehouse Builder 10g
Oracle Warehouse Builder 10g Architectural White paper February 2004 Table of contents INTRODUCTION... 3 OVERVIEW... 4 THE DESIGN COMPONENT... 4 THE RUNTIME COMPONENT... 5 THE DESIGN ARCHITECTURE... 6
High-Volume Data Warehousing in Centerprise. Product Datasheet
High-Volume Data Warehousing in Centerprise Product Datasheet Table of Contents Overview 3 Data Complexity 3 Data Quality 3 Speed and Scalability 3 Centerprise Data Warehouse Features 4 ETL in a Unified
Big Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
ORACLE DATA INTEGRATOR ENTERPRISE EDITION
ORACLE DATA INTEGRATOR ENTERPRISE EDITION Oracle Data Integrator Enterprise Edition 12c delivers high-performance data movement and transformation among enterprise platforms with its open and integrated
An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
Manifest for Big Data Pig, Hive & Jaql
Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,
Self-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Data Is Doubling Every Two Years Unstructured data will account for more than 80% of the data
An Overview of SAP BW Powered by HANA. Al Weedman
An Overview of SAP BW Powered by HANA Al Weedman About BICP SAP HANA, BOBJ, and BW Implementations The BICP is a focused SAP Business Intelligence consulting services organization focused specifically
XpoLog Center Suite Data Sheet
XpoLog Center Suite Data Sheet General XpoLog is a data analysis and management platform for Applications IT data. Business applications rely on a dynamic heterogeneous applications infrastructure, such
Big Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
Oracle Data Integrator 11g New Features & OBIEE Integration. Presented by: Arun K. Chaturvedi Business Intelligence Consultant/Architect
Oracle Data Integrator 11g New Features & OBIEE Integration Presented by: Arun K. Chaturvedi Business Intelligence Consultant/Architect Agenda 01. Overview & The Architecture 02. New Features Productivity,
Business Application Services Testing
Business Application Services Testing Curriculum Structure Course name Duration(days) Express 2 Testing Concept and methodologies 3 Introduction to Performance Testing 3 Web Testing 2 QTP 5 SQL 5 Load
Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE
Spring,2015 Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Contents: Briefly About Big Data Management What is hive? Hive Architecture Working
Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload
Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload
MySQL and Hadoop. Percona Live 2014 Chris Schneider
MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.
SQL Server 2012 Performance White Paper
Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.
Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
Reference Architecture, Requirements, Gaps, Roles
Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture
Contents. Pentaho Corporation. Version 5.1. Copyright Page. New Features in Pentaho Data Integration 5.1. PDI Version 5.1 Minor Functionality Changes
Contents Pentaho Corporation Version 5.1 Copyright Page New Features in Pentaho Data Integration 5.1 PDI Version 5.1 Minor Functionality Changes Legal Notices https://help.pentaho.com/template:pentaho/controls/pdftocfooter
Roadmap Talend : découvrez les futures fonctionnalités de Talend
Roadmap Talend : découvrez les futures fonctionnalités de Talend Cédric Carbone Talend Connect 9 octobre 2014 Talend 2014 1 Connecting the Data-Driven Enterprise Talend 2014 2 Agenda Agenda Why a Unified
SAP Data Services and SAP Information Steward Document Version: 4.2 Support Package 7 (14.2.7.0) 2016-05-06 PUBLIC. Master Guide
SAP Data Services and SAP Information Steward Document Version: 4.2 Support Package 7 (14.2.7.0) 2016-05-06 PUBLIC Content 1 Getting Started....4 1.1 Products Overview.... 4 1.2 Components overview....4
Safe Harbor Statement
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
Hadoop and Map-Reduce. Swati Gore
Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data
Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata
BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING
Sentimental Analysis using Hadoop Phase 2: Week 2
Sentimental Analysis using Hadoop Phase 2: Week 2 MARKET / INDUSTRY, FUTURE SCOPE BY ANKUR UPRIT The key-value type basically uses a hash table in which there exists a unique key and a pointer to a particular
SAP HANA Cloud Platform
SAP HANA Cloud Platform SAP Forum 2015 César Martín March 12, 2015 SAP HANA Cloud Platform Build, extend, and run next-generation applications on SAP HANA in the cloud The in-memory cloud platform-as-a-service
How, What, and Where of Data Warehouses for MySQL
How, What, and Where of Data Warehouses for MySQL Robert Hodges CEO, Continuent. Introducing Continuent The leading provider of clustering and replication for open source DBMS Our Product: Continuent Tungsten
Big Data Operations Guide for Cloudera Manager v5.x Hadoop
Big Data Operations Guide for Cloudera Manager v5.x Hadoop Logging into the Enterprise Cloudera Manager 1. On the server where you have installed 'Cloudera Manager', make sure that the server is running,
Oracle Data Integrator 11g: Integration and Administration
Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 4108 4709 Oracle Data Integrator 11g: Integration and Administration Duration: 5 Days What you will learn Oracle Data Integrator is a comprehensive
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connect to Hadoop?
Integration of Apache Hive and HBase
Integration of Apache Hive and HBase Enis Soztutar enis [at] apache [dot] org @enissoz About Me User and committer of Hadoop since 2007 Contributor to Apache Hadoop, HBase, Hive and Gora Joined
OWB Users, Enter The New ODI World
OWB Users, Enter The New ODI World Kulvinder Hari Oracle Introduction Oracle Data Integrator (ODI) is a best-of-breed data integration platform focused on fast bulk data movement and handling complex data
Holistic Data Management
Holistic Data Management for Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist
Self-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies Management - MCS MapR Data Platform for Hadoop and NoSQL APACHE HADOOP AND OSS ECOSYSTEM Batch
In-memory computing with SAP HANA
In-memory computing with SAP HANA June 2015 Amit Satoor, SAP @asatoor 2015 SAP SE or an SAP affiliate company. All rights reserved. Hyperconnectivity across people, business, and devices give rise to
Talend Open Studio for Big Data. Release Notes 5.2.1
Talend Open Studio for Big Data Release Notes 5.2.1 Talend Open Studio for Big Data Copyleft This documentation is provided under the terms of the Creative Commons Public License (CCPL). For more information
Federated SQL on Hadoop and Beyond: Leveraging Apache Geode to Build a Poor Man's SAP HANA. by Christian Tzolov @christzolov
Federated SQL on Hadoop and Beyond: Leveraging Apache Geode to Build a Poor Man's SAP HANA by Christian Tzolov @christzolov Whoami Christian Tzolov Technical Architect at Pivotal, BigData, Hadoop, SpringXD,
Sisense. Product Highlights. www.sisense.com
Sisense Product Highlights Introduction Sisense is a business intelligence solution that simplifies analytics for complex data by offering an end-to-end platform that lets users easily prepare and analyze
An Oracle White Paper February 2014. Oracle Data Integrator 12c Architecture Overview
An Oracle White Paper February 2014 Oracle Data Integrator 12c Introduction Oracle Data Integrator (ODI) 12c is built on several components all working together around a centralized metadata repository.
Upcoming Announcements
Enterprise Hadoop Jeff Markham Technical Director, APAC Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within
MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering
MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation
6.0, 6.5 and Beyond. The Future of Spotfire. Tobias Lehtipalo Sr. Director of Product Management
6.0, 6.5 and Beyond The Future of Spotfire Tobias Lehtipalo Sr. Director of Product Management Key performance indicators Hundreds of Records Visual Data Discovery Millions of Records Data Mining or Data
