Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications

Size: px
Start display at page:

Download "Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications"

Transcription

1 Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications Umeshwar Dayal, Kevin Wilkinson, Alkis Simitsis, Malu Castellanos, Lupita Paz HP Labs Palo Alto, CA, USA 1 Copyright Copyright 2010 Hewlett-Packard 2010 Hewlett-Packard Development Development Company, Company, L.P. L.P.

2 OUTLINE Next Generation Business Intelligence Analytic Data Flows Challenges in Analytic Data Flow Design and Optimization QoX Framework QoX-driven Optimization of Data Integration Flows Micro-benchmarks Open Problems Summary

3 TRADITIONAL BUSINESS INTELLIGENCE ARCHITECTURE BI: Technologies, tools, and practices for collecting, integrating, analyzing, and presenting large volumes of information to enable better decision making Focus on enterprise OLTP data sources Batch ETL (Extract-Transform-Load) pipeline Front-end analytics for strategic decision making

4 EMERGENCE OF AN INFLECTION POINT Sensors, real time events, unstructured data and web contents have the promise of transforming the way we manage our customers, resources, environments, health, The shift to the web makes available unprecedented quantities of structured and unstructured databases about human activities and sensing technologies are making available great quantities of data that provide unprecedented scope and resolution the availability of rich streams of data, create(s) an inflection point.. for generating insights and guiding decision making To date, we have only scratched the surface of the potential for learning from these large-scale data sets.. - From Data to Knowledge to Action A Global Enabler for the 21 st Century, Computing Community Consortium, August 2010 Analytics, and specifically technologies that enable real-time analysis of large data volumes, social media analytics, and mobile BI will continue to dominate the enterprise agenda - Computer World, Jan

5 Live BI: Decisions at any time scale Executives Model Builders Enterprise Operational systems Business Control Systems Operations staff Operational Data Stores Enterprise Data Warehouse Decreasing time scales for response (decision latencies) Traditional Business Intelligence 5 5 Increasing need for automation Live Business Intelligence

6 ANALYTICS DATA FLOWS End-to-end Analytics Data Flow Front-end Analytics Back-end Analytics Analysis querying, reporting, data mining, modeling, analytics, visualization operations Loading Transformation Cleaning, de-noising, entity identification, etc. Information extraction, access Streaming Challenges Long development cycles Little support for optimization or automation Possibly many objectives: Correctness, Fault-tolerance, Scalability, Maintainability, Cost Many types of data sources: Structured/ Unstructured, Stored/ Streaming, Internal/ External Complex flows, involving ETL, stream processing, queries, analytics, visualization operations Possibly executed on multiple engines

7 NEXT-GENERATION: LIVE BI Development & Deployment-time Execution-time Interaction and Delivery Data Flow Composer and Editor Federation and Orchestration Optimizer Execution Engines RDBMS Hadoop ETL engine Massively parallel, large memory cluster Stream engine Analytics Engine Data Stores RDB Unstructured File Data Cloud Storage other Streaming data sources (sensor data, event logs, social media, web data )

8 EXAMPLE FLOW 1: STRUCTURED DATA -- TPC-H SOURCES Store 1 Lineitem Union Store n Lineitem Surrogate key generate Lineitem Orders Compute sales = quantity * unitcost Filter on date range Join on orderkey (get relevant sales for date range) Filter on cmpnkey (get products, date range) Join on prodkey (get relevant sales for products) 9 CmpnDim Copyright 2010 Hewlett-Packard Development Company, L.P. Rollup sales on prodkey, cmpnkey RptSalesByProdCmpn Sales of products in response to marketing campaign

9 EXAMPLE FLOW 2: UNSTRUCTURED DATA SENTIMENT ANALYSIS 10 Copyright 2010 Hewlett-Packard Development Company, L.P.

10 EXAMPLE FLOW 3 (STRUCTURED & UNSTRUCTURED DATA) Tweets Filter on any mention of enterprise or its products Relevant tweets Sentiment analysis for products by tweet RptSAbyTwtProd CmpnDim Surrogate key generate Filter on cmpnkey (get products, date range) Filter on date range (get tweets during campaign) Join on prodkey (get tweets for products) 11 RptSalesByProdCmpn Copyright 2010 Hewlett-Packard Development Company, L.P. Rollup sentimentscore on prodkey, cpmnkey, attribute (aggregate sentiment scores) Join on prodkey, cmpnkey RptSAbyProdCmpn Correlate Sales and Sentiments for products in response to marketing campaign

11 EXAMPLE FLOW 4: EVENT AND STREAM ANALYTICS Sensor data, external event Streams Stream pre-processing Detect Patterns/Events Events Event Stream Event Correlation Diagnosis & Root Cause Analysis Complex Events Diagnoses Contextual data (structured, unsructured) Event History Prediction Action Assessment Predicted Events Recommended Actions Event Discovery and Creation Event Analysis 12

12 document batch file Data Sources Preprocessing (Adaptor) (Doc Info, Text body) vertical slice profile info profile info Token, POS Tag, Negation Flag, Attribute Flag, Sentiment Flag] Relate Sentiment to Attribute (Doc Info, Text body) attribute sentiment value Doc Info = (Doc id, Time Stamp, Source type, Source URL, Author Profile, GeoTag, Score) Author Profile = (id, name, , location, gender, ) Splitter on Doc Id, hash partition, 3x parallel Sentiment Word Detection sentiment word dict. Attribute, Sentiment Value] (Doc Info, Text body) Token, POS Tag, Negation Flag, Attribute Flag] Join on doc Id profile info Remove Docs from Blacklisted Authors (Doc Info, Text body) Attribute Extraction attribute dict. Attribute, Sentiment Value] Sentence Detector Token, POS Tag, Negation flag] Aggregate by Doc document sentiment value Sentence] Negation Detection negation dict. (Doc Info, Attribute, Sentiment Value]) Tokenizer Token, POS Tag], order <Doc Id, Sentence Id, Token Loc> Token id, Token], order <Doc Id, Sentence Id, Token Loc> Compound Phrases phrase dict. DASHBOARD POS Tagging Token id, Token], order <Doc Id, Sentence Id, Token Loc> (Doc Info, [ Token id, Token, POS Tag]) recovery point Merger on Doc id, 3x parallel create recovery point FROM DESIGN TO EXECUTION [Dayal, et al., EDBT 09] [Wilkinson, Simitsis ER 10] Flow design editor xlm Logical Data Flow Graph Performance Fault-tolerance Maintainability Freshness Scalability Availability Cost QoX Objectives Optimizer Optimized Physical Data Flow Graph Sources Hadoop Scripts xlm Enginespecific codegen tool-specific execution instructions Scripts DBMS ETL Tool Target 13

13 Quality objectives functional correctness the optimization of the initial flow does not affect the flow semantics or correctness performance measures the flow completion in terms of time and resources used freshness the latency between the occurrence of an event at a source system and the reflection of that event at the target system reliability the ability of flow to complete successfully within the specified time window despite failures 14 14

14 QUALITY OBJECTIVES Performance: usually throughput, total execution time Reliability: probability that a flow will complete correctly (~ mean time between failures) Recoverability: ability to restore the flow after a failure (~mean time to recover) Scalability: ability to process higher volumes of data or higher data rates Freshness: latency between source data and data in the DW Consistency: accuracy (correctness) of the data in the DW Availability: probability that the data is available for queries Maintainability: ability to operate at specified service levels Flexibility: ability to respond to evolving requirements Affordability: cost of integration project Traceability: ability to track lineage (provenance) of data Auditability: ability to satisfy legal/regulatory compliance requirements

15 TRADEOFFS AMONG QoX OBJECTIVES Fastest Recoverable REC REC Perf Rec Rel Cost w/o RPs Performance w/ 2 RPs STATE=AZ REC REC Perf Rec Rel Cost STATE=CA REC REC w/o RPs STATE=NV REC REC w/ 6 RPs Reliable Perf Rec Rel Cost w/o RPs + n/a

16 QoX Optimization [Simitsis, et al., ICDE 10 Simitsis, et al., SIGMOD 09] Logical ] Optimization: Flow restructuring Parallelization, partitioning Collapsing chains Reordering operators Inserting recovery (persistence) points Replication Physical optimization Implementations of operators Materialize vs. Virtualize Choice of execution engines Data shipping vs.function shipping Optimization for performance and faulttolerance Given a data flow F, objective function OF w n: input recordset size w: execution time window k: number of faults to be tolerated

17 STATE SPACE SEARCH initial state min-cost state 18

18 OPTIMIZATION TACTICS: TRANSITIONS IN THE STATE SPACE GRAPH factorize/ distribute swap addrp partition replicate 19

19 OPTIMIZATION FOR MULTI-ENGINE EXECUTION Characterize the execution of operations on different execution engines Implement micro-benchmarks for individual operators Learn cost functions to plug into optimization objective Interpolate from micro-benchmarks Compose costs for individual operations into cost of flow Account for resource contention Extend transitions considered by the optimizer to include implementation and engine choices Extend heuristics for state space search

20 Example: Sentiment Analysis Flow Logical Model Data Sources Doc Info = (Doc id, Time Stamp, Source type, Source URL, Author Profile, GeoTag, Score) Author Profile = (id, name, , location, gender, ) Other Operators Preprocessing (Adaptor) (Doc Info, Text body) document batch file (Doc Info, Attribute, Sentiment Value]) Aggregate by Doc Sentence Detector Attribute, Sentiment Value] Sentence] Remove Docs from Blacklisted Authors Tokenizer Token id, Token], order <Doc Id, Sentence Id, Token Loc> Attribute, Sentiment Value] POS Tagging Relate Sentiment to Attribute (Doc Info, [ Token id, Token, POS Tag]) Token, POS Tag, Negation Flag, Attribute Flag, Sentiment Flag] phrase dict. Compound Phrases Sentiment Word Detection Token, POS Tag], order <Doc Id, Sentence Id, Token Loc> Token, POS Tag, Negation Flag, Attribute Flag] negation dict. Negation Detection Token, POS Tag, Negation flag] Attribute Extraction document sentiment value attribute sentiment value sentiment word dict. attribute dict. 21

21 Example: Sentiment Analysis Flow Physical Model Sources Data Sources Doc Info = (Doc id, Time Stamp, Source type, Source URL, Author Profile, GeoTag, Score) Author Profile = (id, name, , location, gender, ) document batch file Scripts Preprocessing (Adaptor) (Doc Info, Text body) vertical slice profile info profile info (Doc Info, Text body) Splitter on Doc Id, hash partition, 3x parallel Hadoop (Doc Info, Text body) Remove Docs from Blacklisted Authors (Doc Info, Text body) Sentence Detector Sentence] Tokenizer Token id, Token], order <Doc Id, Sentence Id, Token Loc> POS Tagging Token id, Token], order <Doc Id, Sentence Id, Token Loc> recovery point Scripts Merger on Doc id, 3x parallel create recovery point DBMS Token, POS Tag, Negation Flag, Attribute Flag, Sentiment Flag] Sentiment Word Detection sentiment word dict. Token, POS Tag, Negation Flag, Attribute Flag] Attribute Extraction attribute dict. Token, POS Tag, Negation flag] Negation Detection negation dict. Token, POS Tag], order <Doc Id, Sentence Id, Token Loc> Compound Phrases phrase dict. (Doc Info, [ Token id, Token, POS Tag]) ETL Tool Relate Sentiment to Attribute Attribute, Sentiment Value] Join on doc Id Attribute, Sentiment Value] Aggregate by Doc (Doc Info, Attribute, Sentiment Value]) Target DASHBOARD attribute sentiment value profile info document sentiment value 22

22 MICRO-BENCHMARKS FOR CONVENTIONAL (ETL, RELATIONAL) OPERATORS Experiments on sort, join, surrogate-key, etc. Example: Sort os-sort home-grown distributed sort implementation hd-sort Hadoop-based sort (used Pig Latin) pdb-sort used a commercial, parallel DBMS Dataset TPC-H lineitem SF Size (GBs) Rows (x10 6 )

23 Sort (Comparison) Hadoop vs. Custom script vs. pdb (with and without loading) Pig script (hd-sort): sf$sf = load 'lineitem.tbl.sf$sf' using PigStorage(' ') as (f1,f2,...,f17); ord1_$nd = order sf$sf by f1 parallel $nd;... ord10_$nd = order ord9_$nd by f10 parallel $nd; store ord$op_$nd into 'res_sf$sf_ord$op-$nd.dat using PigStorage(' '); C and shell scripts (os-sort): split file: filesize/#nodes (leftovers go to chunk #1) transfer data sort chunks on nodes transfer back and merge (hierarchical) (to be fair with hd-sort and etl-sort, avoid pipelining) 24 (TPC-H data SF:1,10,100)

24 Sort (Tuning) Hadoop e.g., #reducers Shell scripts e.g., analyze sort internal phases 25

25 MICRO-BENCHMARKS FOR UNCONVENTIONAL OPERATORS Experiments on Sentiment Analysis operators individual operators the whole sentiment analysis flow treated as a single operator Implementations Single node Distributed, scripts Distributed, Hadoop Datasets Small- to medium-sized document corpus: up to 200,000 documents Large corpus: ~30 million documents

26 MICRO-BENCHMARKS FOR UNCONVENTIONAL OPERATORS (SENTIMENT ANALYSIS) Single node, small to mediumsized corpus Use linear regression to learn interpolation function 27 Copyright 2010 Hewlett-Packard Development Company, L.P. Single node, large corpus; effect of limited memory 12-node Hadoop cluster

27 SENTIMENT ANALYSIS (COMPARISON) Single node vs. Distributed with Hadoop vs. Distributed without Hadoop

28 OPEN PROBLEMS: FUTURE WORK Micro-benchmarks of other operators: ETL, relational, content analytics, front-end analytics, stream processing Cost functions: interpolation, composition Benchmarks and objective functions for other QoX measures Optimization strategies for physical optimization Assign segments of the flow to execution engines Where, when, and what to materialize

29 SUMMARY Design of data flows for BI applications is a very expensive, largely manual activity: little support for automation, optimization Situation gets worse with next gen BI: fuse back-end and front-end operations into end-to-end analytic data flows Many different types of sources, more complex operators, multiple execution engines Layered methodology: QoX-driven optimization allows trading off different quality objectives Need benchmarks and micro-benchmarks to drive the design and optimization Many challenges remain

30 31 Copyright 2010 Hewlett-Packard Development Company, L.P. Thank You

Business Processes Meet Operational Business Intelligence

Business Processes Meet Operational Business Intelligence Business Processes Meet Operational Business Intelligence Umeshwar Dayal, Kevin Wilkinson, Alkis Simitsis, Malu Castellanos HP Labs, Palo Alto, CA, USA Abstract As Business Intelligence architectures evolve

More information

Data Integration Flows for Business Intelligence

Data Integration Flows for Business Intelligence Data Integration Flows for Business Intelligence Umeshwar Dayal HP Labs Palo Alto, Ca, USA umeshwar.dayal@hp.com Malu Castellanos HP Labs Palo Alto, Ca, USA malu.castellanos@hp.com Alkis Simitsis HP Labs

More information

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...

More information

Trafodion Operational SQL-on-Hadoop

Trafodion Operational SQL-on-Hadoop Trafodion Operational SQL-on-Hadoop SophiaConf 2015 Pierre Baudelle, HP EMEA TSC July 6 th, 2015 Hadoop workload profiles Operational Interactive Non-interactive Batch Real-time analytics Operational SQL

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Real Time Analytics for Big Data. NtiSh Nati Shalom @natishalom

Real Time Analytics for Big Data. NtiSh Nati Shalom @natishalom Real Time Analytics for Big Data A Twitter Inspired Case Study NtiSh Nati Shalom @natishalom Big Data Predictions Overthe next few years we'll see the adoption of scalable frameworks and platforms for

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Big Data Analytics Platform @ Nokia

Big Data Analytics Platform @ Nokia Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform

More information

Getting Real Real Time Data Integration Patterns and Architectures

Getting Real Real Time Data Integration Patterns and Architectures Getting Real Real Time Data Integration Patterns and Architectures Nelson Petracek Senior Director, Enterprise Technology Architecture Informatica Digital Government Institute s Enterprise Architecture

More information

How To Use Hp Vertica Ondemand

How To Use Hp Vertica Ondemand Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress

More information

Il mondo dei DB Cambia : Tecnologie e opportunita`

Il mondo dei DB Cambia : Tecnologie e opportunita` Il mondo dei DB Cambia : Tecnologie e opportunita` Giorgio Raico Pre-Sales Consultant Hewlett-Packard Italiana 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject

More information

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Simplifying Big Data Analytics: Unifying Batch and Stream Processing John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Streaming Analy.cs S S S Scale- up Database Data And Compute Grid

More information

Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia

Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia Unified Big Data Processing with Apache Spark Matei Zaharia @matei_zaharia What is Apache Spark? Fast & general engine for big data processing Generalizes MapReduce model to support more types of processing

More information

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

More information

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps White provides GRASP-powered big data predictive analytics that increases marketing effectiveness and customer satisfaction with API-driven adaptive apps that anticipate, learn, and adapt to deliver contextual,

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Big Data and Analytics in Government

Big Data and Analytics in Government Big Data and Analytics in Government Nov 29, 2012 Mark Johnson Director, Engineered Systems Program 2 Agenda What Big Data Is Government Big Data Use Cases Building a Complete Information Solution Conclusion

More information

Building a real-time, self-service data analytics ecosystem Greg Arnold, Sr. Director Engineering

Building a real-time, self-service data analytics ecosystem Greg Arnold, Sr. Director Engineering Building a real-time, self-service data analytics ecosystem Greg Arnold, Sr. Director Engineering Self Service at scale 6 5 4 3 2 1 ? Relational? MPP? Hadoop? Linkedin data 350M Members 25B 3.5M 4.8B 2M

More information

SQL Server Administrator Introduction - 3 Days Objectives

SQL Server Administrator Introduction - 3 Days Objectives SQL Server Administrator Introduction - 3 Days INTRODUCTION TO MICROSOFT SQL SERVER Exploring the components of SQL Server Identifying SQL Server administration tasks INSTALLING SQL SERVER Identifying

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

Data Warehouse (DW) Maturity Assessment Questionnaire

Data Warehouse (DW) Maturity Assessment Questionnaire Data Warehouse (DW) Maturity Assessment Questionnaire Catalina Sacu - csacu@students.cs.uu.nl Marco Spruit m.r.spruit@cs.uu.nl Frank Habers fhabers@inergy.nl September, 2010 Technical Report UU-CS-2010-021

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

Unified Batch & Stream Processing Platform

Unified Batch & Stream Processing Platform Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built

More information

How To Use Big Data For Telco (For A Telco)

How To Use Big Data For Telco (For A Telco) ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something

More information

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges James Campbell Corporate Systems Engineer HP Vertica jcampbell@vertica.com Big

More information

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW Roger Breu PDW Solution Specialist Microsoft Western Europe Marcus Gullberg PDW Partner Account Manager Microsoft Sweden

More information

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2 Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue

More information

THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS

THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS WHITE PAPER Successfully writing Fast Data applications to manage data generated from mobile, smart devices and social interactions, and the

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc. Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse

More information

Cloud Computing at Google. Architecture

Cloud Computing at Google. Architecture Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale

More information

Azure Data Lake Analytics

Azure Data Lake Analytics Azure Data Lake Analytics Compose and orchestrate data services at scale Fully managed service to support orchestration of data movement and processing Connect to relational or non-relational data

More information

PALANTIR CYBER An End-to-End Cyber Intelligence Platform for Analysis & Knowledge Management

PALANTIR CYBER An End-to-End Cyber Intelligence Platform for Analysis & Knowledge Management PALANTIR CYBER An End-to-End Cyber Intelligence Platform for Analysis & Knowledge Management INTRODUCTION Traditional perimeter defense solutions fail against sophisticated adversaries who target their

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are

More information

A Brief Introduction to Apache Tez

A Brief Introduction to Apache Tez A Brief Introduction to Apache Tez Introduction It is a fact that data is basically the new currency of the modern business world. Companies that effectively maximize the value of their data (extract value

More information

SQL Server 2012 Performance White Paper

SQL Server 2012 Performance White Paper Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.

More information

Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO. Big Data Everywhere Conference, NYC November 2015

Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO. Big Data Everywhere Conference, NYC November 2015 Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO Big Data Everywhere Conference, NYC November 2015 Agenda 1. Challenges with Risk Data Aggregation and Risk Reporting (RDARR) 2. How a

More information

Hadoop IST 734 SS CHUNG

Hadoop IST 734 SS CHUNG Hadoop IST 734 SS CHUNG Introduction What is Big Data?? Bulk Amount Unstructured Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per day) If a regular machine need to

More information

Advanced In-Database Analytics

Advanced In-Database Analytics Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

Trustworthiness of Big Data

Trustworthiness of Big Data Trustworthiness of Big Data International Journal of Computer Applications (0975 8887) Akhil Mittal Technical Test Lead Infosys Limited ABSTRACT Big data refers to large datasets that are challenging to

More information

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.

More information

Big Data. Fast Forward. Putting data to productive use

Big Data. Fast Forward. Putting data to productive use Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize

More information

Beyond Lambda - how to get from logical to physical. Artur Borycki, Director International Technology & Innovations

Beyond Lambda - how to get from logical to physical. Artur Borycki, Director International Technology & Innovations Beyond Lambda - how to get from logical to physical Artur Borycki, Director International Technology & Innovations Simplification & Efficiency Teradata believe in the principles of self-service, automation

More information

XpoLog Competitive Comparison Sheet

XpoLog Competitive Comparison Sheet XpoLog Competitive Comparison Sheet New frontier in big log data analysis and application intelligence Technical white paper May 2015 XpoLog, a data analysis and management platform for applications' IT

More information

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

Big Data Architect Certification Self-Study Kit Bundle

Big Data Architect Certification Self-Study Kit Bundle Big Data Architect Certification Bundle This certification bundle provides you with the self-study materials you need to prepare for the exams required to complete the Big Data Architect Certification.

More information

Data Management in the Cloud

Data Management in the Cloud Data Management in the Cloud Ryan Stern stern@cs.colostate.edu : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server

More information

Benchmarking No One Size Fits All Big Data Analytics

Benchmarking No One Size Fits All Big Data Analytics Benchmarking No One Size Fits All Big Data Analytics Zhian He, Mayuresh Kunjir, Harold Lim, Eric Lo, Shivnath Babu, Meichun Hsu, and Malu Castellanos The Hong Kong Polytechnic University and Duke University

More information

Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata

Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata Up Your R Game James Taylor, Decision Management Solutions Bill Franks, Teradata Today s Speakers James Taylor Bill Franks CEO Chief Analytics Officer Decision Management Solutions Teradata 7/28/14 3 Polling

More information

CONNECTING DATA WITH BUSINESS

CONNECTING DATA WITH BUSINESS CONNECTING DATA WITH BUSINESS Big Data and Data Science consulting Business Value through Data Knowledge Synergic Partners is a specialized Big Data, Data Science and Data Engineering consultancy firm

More information

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social

More information

Data Virtualization for Agile Business Intelligence Systems and Virtual MDM. To View This Presentation as a Video Click Here

Data Virtualization for Agile Business Intelligence Systems and Virtual MDM. To View This Presentation as a Video Click Here Data Virtualization for Agile Business Intelligence Systems and Virtual MDM To View This Presentation as a Video Click Here Agenda Data Virtualization New Capabilities New Challenges in Data Integration

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

Microsoft Business Intelligence

Microsoft Business Intelligence Microsoft Business Intelligence P L A T F O R M O V E R V I E W M A R C H 1 8 TH, 2 0 0 9 C H U C K R U S S E L L S E N I O R P A R T N E R C O L L E C T I V E I N T E L L I G E N C E I N C. C R U S S

More information

Agile Business Intelligence Data Lake Architecture

Agile Business Intelligence Data Lake Architecture Agile Business Intelligence Data Lake Architecture TABLE OF CONTENTS Introduction... 2 Data Lake Architecture... 2 Step 1 Extract From Source Data... 5 Step 2 Register And Catalogue Data Sets... 5 Step

More information

What's New in SAS Data Management

What's New in SAS Data Management Paper SAS034-2014 What's New in SAS Data Management Nancy Rausch, SAS Institute Inc., Cary, NC; Mike Frost, SAS Institute Inc., Cary, NC, Mike Ames, SAS Institute Inc., Cary ABSTRACT The latest releases

More information

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013 SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

From Spark to Ignition:

From Spark to Ignition: From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for

More information

Welcome. Host: Eric Kavanagh. eric.kavanagh@bloorgroup.com. The Briefing Room. Twitter Tag: #briefr

Welcome. Host: Eric Kavanagh. eric.kavanagh@bloorgroup.com. The Briefing Room. Twitter Tag: #briefr The Briefing Room Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com Twitter Tag: #briefr The Briefing Room Mission! Reveal the essential characteristics of enterprise software, good and bad! Provide

More information

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server 70-467: Designing Business Intelligence Solutions with Microsoft SQL Server The following tables show where changes to exam 70-467 have been made to include updates that relate to SQL Server 2014 tasks.

More information

SAP and Hortonworks Reference Architecture

SAP and Hortonworks Reference Architecture SAP and Hortonworks Reference Architecture Hortonworks. We Do Hadoop. June Page 1 2014 Hortonworks Inc. 2011 2014. All Rights Reserved A Modern Data Architecture With SAP DATA SYSTEMS APPLICATIO NS Statistical

More information

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

How, What, and Where of Data Warehouses for MySQL

How, What, and Where of Data Warehouses for MySQL How, What, and Where of Data Warehouses for MySQL Robert Hodges CEO, Continuent. Introducing Continuent The leading provider of clustering and replication for open source DBMS Our Product: Continuent Tungsten

More information

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All

More information

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc. Oracle BI EE Implementation on Netezza Prepared by SureShot Strategies, Inc. The goal of this paper is to give an insight to Netezza architecture and implementation experience to strategize Oracle BI EE

More information

Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics

Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics Paper 1828-2014 Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics John Cunningham, Teradata Corporation, Danville, CA ABSTRACT SAS High Performance Analytics (HPA) is a

More information

Mobile Storage and Search Engine of Information Oriented to Food Cloud

Mobile Storage and Search Engine of Information Oriented to Food Cloud Advance Journal of Food Science and Technology 5(10): 1331-1336, 2013 ISSN: 2042-4868; e-issn: 2042-4876 Maxwell Scientific Organization, 2013 Submitted: May 29, 2013 Accepted: July 04, 2013 Published:

More information

Building Scalable Big Data Infrastructure Using Open Source Software. Sam William sampd@stumbleupon.

Building Scalable Big Data Infrastructure Using Open Source Software. Sam William sampd@stumbleupon. Building Scalable Big Data Infrastructure Using Open Source Software Sam William sampd@stumbleupon. What is StumbleUpon? Help users find content they did not expect to find The best way to discover new

More information

Innovative technology for big data analytics

Innovative technology for big data analytics Technical white paper Innovative technology for big data analytics The HP Vertica Analytics Platform database provides price/performance, scalability, availability, and ease of administration Table of

More information

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc. Beyond Web Application Log Analysis using Apache TM Hadoop A Whitepaper by Orzota, Inc. 1 Web Applications As more and more software moves to a Software as a Service (SaaS) model, the web application has

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.

More information

Practical Considerations for Real-Time Business Intelligence. Donovan Schneider Yahoo! September 11, 2006

Practical Considerations for Real-Time Business Intelligence. Donovan Schneider Yahoo! September 11, 2006 Practical Considerations for Real-Time Business Intelligence Donovan Schneider Yahoo! September 11, 2006 Outline Business Intelligence (BI) Background Real-Time Business Intelligence Examples Two Requirements

More information

Oracle Real Time Decisions

Oracle Real Time Decisions A Product Review James Taylor CEO CONTENTS Introducing Decision Management Systems Oracle Real Time Decisions Product Architecture Key Features Availability Conclusion Oracle Real Time Decisions (RTD)

More information

Augmented Search for Software Testing

Augmented Search for Software Testing Augmented Search for Software Testing For Testers, Developers, and QA Managers New frontier in big log data analysis and application intelligence Business white paper May 2015 During software testing cycles,

More information

SQL Server 2005 Features Comparison

SQL Server 2005 Features Comparison Page 1 of 10 Quick Links Home Worldwide Search Microsoft.com for: Go : Home Product Information How to Buy Editions Learning Downloads Support Partners Technologies Solutions Community Previous Versions

More information

Big Data and Advanced Analytics Technologies for the Smart Grid

Big Data and Advanced Analytics Technologies for the Smart Grid 1 Big Data and Advanced Analytics Technologies for the Smart Grid Arnie de Castro, PhD SAS Institute IEEE PES 2014 General Meeting July 27-31, 2014 Panel Session: Using Smart Grid Data to Improve Planning,

More information

SQL Server 2012 Parallel Data Warehouse. Solution Brief

SQL Server 2012 Parallel Data Warehouse. Solution Brief SQL Server 2012 Parallel Data Warehouse Solution Brief Published February 22, 2013 Contents Introduction... 1 Microsoft Platform: Windows Server and SQL Server... 2 SQL Server 2012 Parallel Data Warehouse...

More information

PARC and SAP Co-innovation: High-performance Graph Analytics for Big Data Powered by SAP HANA

PARC and SAP Co-innovation: High-performance Graph Analytics for Big Data Powered by SAP HANA PARC and SAP Co-innovation: High-performance Graph Analytics for Big Data Powered by SAP HANA Harnessing the combined power of SAP HANA and PARC s HiperGraph graph analytics technology for real-time insights

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:

More information

Big Data JAMES WARREN. Principles and best practices of NATHAN MARZ MANNING. scalable real-time data systems. Shelter Island

Big Data JAMES WARREN. Principles and best practices of NATHAN MARZ MANNING. scalable real-time data systems. Shelter Island Big Data Principles and best practices of scalable real-time data systems NATHAN MARZ JAMES WARREN II MANNING Shelter Island contents preface xiii acknowledgments xv about this book xviii ~1 Anew paradigm

More information

SQL Server 2012 Business Intelligence Boot Camp

SQL Server 2012 Business Intelligence Boot Camp SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations

More information