Massive Predictive Modeling using Oracle R Technologies Mark Hornick, Director, Oracle Advanced Analytics


1 Massive Predictive Modeling using Oracle R Technologies Mark Hornick, Director, Oracle Advanced Analytics

2 Safe Harbor Statement
  The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

3 Agenda
  Massive Predictive Modeling
  Use cases
  Enabling technologies

4 Quick Survey: How many models have you built in your lifetime?
  > 10, > 100, > 1000, > [missing], > [missing], > [missing]

5 [Figure: Data Size (rows), from millions to billions, versus # Models, from 1 to 100s; annotations: Generalized, Specialized, Massive Predictive Modeling]

6 [Figure: axes Data Size (rows), # Models, and # Models per Entity; tick labels: millions, billions, 1, 100s, 1000s; annotations: Targeted, Broad coverage]

7 Massive Predictive Modeling - Goals
  Build one or more models per entity, e.g., per customer
  Understand and/or predict entity behavior
  Aggregate results across entities, e.g., to assess future demand
  [Figure: one model per customer; results summed over customers, Σ (cust = 1..n), to give demand over time]
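The three goals on this slide can be sketched locally in base R. This is only an illustration under stated assumptions: the data is synthetic, the column names (CUST_ID, temp, consumption) and the five customers are hypothetical, and plain split() and lapply() stand in for the in-database parallel execution the talk describes.

```r
# Synthetic usage data for a handful of hypothetical customers.
set.seed(42)
n_cust <- 5
usage <- do.call(rbind, lapply(seq_len(n_cust), function(id) {
  temp <- runif(48, min = 10, max = 30)
  data.frame(CUST_ID = id,
             temp = temp,
             consumption = id + 0.1 * id * temp + rnorm(48, sd = 0.2))
}))

# Goal 1: one model per entity. Split by customer, fit each piece independently.
models <- lapply(split(usage, usage$CUST_ID),
                 function(dat) lm(consumption ~ temp, data = dat))

# Goal 2: predict each customer's behavior under the same future conditions.
future <- data.frame(period = 1:3, temp = c(12, 20, 28))
per_customer <- sapply(models, predict, newdata = future)  # periods x customers

# Goal 3: aggregate across entities to estimate total demand per period.
total_demand <- rowSums(per_customer)
print(data.frame(period = future$period, total_demand = total_demand))
```

Because each customer's model is fit on its own partition, the fitting step is independent across customers, which is what makes it a candidate for parallel execution.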

8 Massive Predictive Modeling - Challenges
  Effectively dealing with Big Data
  Hardware, software, network, storage
  Algorithms that scale and perform with Big Data
  Building many models in parallel
  Production deployment
  Storing and managing models
  Backup, recovery, and security

9 Use Cases

10 Predicting Customer Electricity Usage

11 Motivation: Energy Theft
  Detecting patterns of meter tampering
  Storage of information about which meters have been tampered with
  Analysis and decision making
  SA country loses US$4 billion per year due to energy theft
  Forecast future behavior

12 Motivation: Different customers, different demands
  Creation of a demand and consumption curve for each customer
  Analysis: in which period will the company have to deliver more energy?
  Price electricity in a given period
  Storage of information about the consumption of each customer in different periods of the day
  Each customer has different demand and consumption patterns
  Customer decides when to use energy to reduce cost
  Company redirects the energy to where it is most needed at the moment, saving on generation

13 Sensor Data Analysis
  Model each customer's usage to understand behavior and predict individual usage and overall aggregate demand
  Consider 200K customers, each with a utility smart meter
  1 reading / meter / hour
  200K x 8760 hours / year = 1.752B readings
  3 years' worth of data = 5.256B readings (26,280 readings per customer)
  10 seconds to build each model: 200K x 10 s = 2,000,000 s, about 556 hours (23.2 days)
  With 128 DOP: 4.3 hours
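The sizing figures on this slide follow from simple arithmetic, reproduced below as a short R sketch. The variable names are illustrative, and the parallel estimate assumes perfectly linear speedup with the degree of parallelism (DOP), which a real system would only approximate.

```r
customers         <- 200e3     # one smart meter per customer
readings_per_year <- 24 * 365  # one reading per meter per hour = 8760
years             <- 3
build_secs        <- 10        # seconds to build one model (from the slide)
dop               <- 128       # degree of parallelism (from the slide)

readings_per_customer <- readings_per_year * years
total_readings        <- customers * readings_per_customer

# Serial build time for one model per customer, then an idealized
# parallel estimate that assumes linear speedup with DOP.
serial_hours   <- customers * build_secs / 3600
parallel_hours <- serial_hours / dop

cat("readings per customer:", readings_per_customer, "\n")
cat("total readings:       ", format(total_readings, big.mark = ","), "\n")
cat("serial build hours:   ", round(serial_hours, 1), "\n")
cat("hours at DOP", dop, ":   ", round(parallel_hours, 1), "\n")
```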

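14 Database-centric architecture: Smart meter scenario (model build)
  [Figure: inside Oracle Database, the data is partitioned by customer (c1, c2, ..., ci, ..., cn); an R script from the R Script Repository, f(dat, args, ...) { build model }, is invoked once per partition; the resulting models (Model c1, Model c2, ..., Model ci, ..., Model cn) are stored in the R Datastore]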

15 Database-centric architecture: Smart meter scenario (scoring)
  [Figure: inside Oracle Database, the data is partitioned by customer (c1, c2, ..., ci, ..., cn); an R script from the R Script Repository, f(dat, args, ...) { score data }, is invoked once per partition using the model from the R Datastore; the outputs are scores c1, scores c2, ..., scores ci, ..., scores cn]

16 How many lines of code do you think it should take to implement this?

17 Build models and store in database, partition on CUST_ID (14 lines)

    ore.groupApply(CUST_USAGE_DATA,
      CUST_USAGE_DATA$CUST_ID,
      function(dat, ds.name) {
        cust_id <- dat$CUST_ID[1]
        # Fit a linear model on all columns except the customer ID
        mod <- lm(consumption ~ . -CUST_ID, dat)
        # Drop per-row components to shrink the stored model
        mod$effects <- mod$residuals <- mod$fitted.values <- NULL
        name <- paste("mod", cust_id, sep="")
        assign(name, mod)
        # One datastore per customer, e.g. "mydatastore.<cust_id>"
        ds.name1 <- paste(ds.name, ".", cust_id, sep="")
        ore.save(list=paste("mod", cust_id, sep=""), name=ds.name1, overwrite=TRUE)
        TRUE
      },
      ds.name="mydatastore", ore.connect=TRUE, parallel=TRUE
    )
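The build script sets the effects, residuals, and fitted.values components of each lm object to NULL before saving it. A minimal base R sketch, on synthetic data with hypothetical column names x and y, suggests why this is safe for scoring: those components grow with the number of training rows, and predict() on new data does not need them.

```r
set.seed(1)
dat <- data.frame(x = rnorm(1000))
dat$y <- 2 + 3 * dat$x + rnorm(1000)

full <- lm(y ~ x, data = dat)

# Same trimming as in the build script: drop the per-row components.
slim <- full
slim$effects <- slim$residuals <- slim$fitted.values <- NULL

# Predictions on new data are unchanged by the trimming.
newdata <- data.frame(x = c(-1, 0, 1))
print(all.equal(predict(full, newdata), predict(slim, newdata)))

# The trimmed object is smaller, which matters when storing one model per customer.
print(c(full = as.numeric(object.size(full)),
        slim = as.numeric(object.size(slim))))
```

Note that the trimmed model can no longer report residuals or standard errors, so this only suits models kept purely for point predictions.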

18 Score customers in database, partition on CUST_ID (16 lines)

    ore.groupApply(CUST_USAGE_DATA_NEW,
      CUST_USAGE_DATA_NEW$CUST_ID,
      function(dat, ds.name) {
        cust_id <- dat$CUST_ID[1]
        # Load this customer's model from its datastore
        ds.name1 <- paste(ds.name, ".", cust_id, sep="")
        ore.load(ds.name1)
        name <- paste("mod", cust_id, sep="")
        mod <- get(name)
        prd <- predict(mod, newdata=dat)
        prd[as.integer(rownames(prd))] <- prd
        # Column names must match FUN.VALUE below
        res <- cbind(CUST_ID=cust_id, PRED=prd)
        data.frame(res)
      },
      ds.name="mydatastore", ore.connect=TRUE, parallel=TRUE,
      FUN.VALUE=data.frame(CUST_ID=numeric(0), PRED=numeric(0))
    )

19 Execution Examples (with DOP=24)

  Models    Data (rows)      Total build time     Total scoring time (all data)
  1,000     26,280,000       65.2 seconds         25.7 seconds
  10,000    262,800,000      516 seconds          217 seconds
  50,000    1,314,000,000    [missing] minutes    18 minutes

  [Chart: Execution (sec) versus # rows (millions), 1 Model/Customer; series: Build Time, Score Time]
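To compare the runs on this slide, the totals can be normalized per model and per second. The sketch below uses only the two runs whose build and scoring times are both legible in the transcript (1,000 and 10,000 models); the derived column names are illustrative.

```r
runs <- data.frame(
  models     = c(1000, 10000),
  rows       = c(26280000, 262800000),
  build_secs = c(65.2, 516),
  score_secs = c(25.7, 217)
)

# Rows of training data behind each model (3 years of hourly readings).
runs$rows_per_model <- runs$rows / runs$models

# Wall-clock build time per model at DOP=24, and overall throughput.
runs$build_secs_per_model <- runs$build_secs / runs$models
runs$build_rows_per_sec   <- runs$rows / runs$build_secs
runs$score_rows_per_sec   <- runs$rows / runs$score_secs

print(runs)
```

On these two data points the elapsed build time per model falls as the number of models grows, consistent with the slide's claim of scalable performance, though two runs are too few to characterize the scaling curve.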

20 Simulation

21 Compute distribution of generated random normal values

    simulation <- function(index, n) {
      set.seed(index)                  # one reproducible stream per trial
      x <- rnorm(n)
      res <- data.frame(t(matrix(summary(x))))
      names(res) <- c("min","q1","median","mean","q3","max")
      res$id <- index
      res
    }
    (res <- simulation(1,1000))

22 Simulation with sample size 1000 over 10 trials

    res <- ore.indexApply(10, simulation, n=1000, FUN.VALUE=res[1,], parallel=TRUE)
    stats <- ore.pull(res)
    library(reshape2)
    melt.stats <- melt(stats, id.vars="id")
    boxplot(value~variable, data=melt.stats,
            main="distribution of Stats - sample 1000, 10 trials")

23 Simulation with sample sizes 10^(1:6) and 100 trials

    num.trials <- 100
    for(n in 10^(1:6)){
      # Elapsed time to run all trials in the database and pull the results
      t1 <- system.time(stats <- ore.pull(ore.indexApply(num.trials, simulation, n=n,
                          FUN.VALUE=res[1,], parallel=TRUE)))[3]
      cat("n=",n,", time=",t1,"\n")
      melt.stats <- melt(stats, id.vars="id")
      boxplot(value~variable, data=melt.stats,
              main=paste("distribution of Stats - sample",n,",", num.trials, "trials"))
      gc()
    }
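For readers without an Oracle R Enterprise installation, the index-driven pattern used in these simulation slides can be tried locally. The sketch below is an assumption-laden stand-in, not ORE: local_index_apply is a hypothetical helper that calls the function once per index with lapply() and row-binds the results, whereas the slides run the trials in parallel inside the database.

```r
# Same simulation function as on the slide.
simulation <- function(index, n) {
  set.seed(index)
  x <- rnorm(n)
  res <- data.frame(t(matrix(summary(x))))
  names(res) <- c("min", "q1", "median", "mean", "q3", "max")
  res$id <- index
  res
}

# Hypothetical local stand-in: call FUN with index 1..times, then
# combine the one-row data frames into a single result.
local_index_apply <- function(times, FUN, ...) {
  do.call(rbind, lapply(seq_len(times), FUN, ...))
}

stats <- local_index_apply(10, simulation, n = 1000)
print(stats)
```

Seeding each trial with its index keeps every trial reproducible regardless of the order or the process in which it runs, which is what makes the trials safe to distribute.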

24 Plot Results: sample sizes 10^(1:6) and 100 trials

25 Scalable Performance varying number of trials (10^x)

26 Enabling Technologies

27 Oracle R Enterprise
  Oracle Advanced Analytics Option to Oracle Database
  Eliminate memory constraint of client R engine
  Minimize or eliminate data movement latency
  Execute R scripts through database server machine for scalability and performance
  Achieve scalability and performance by leveraging Oracle Database as HPC environment
  Enable integration and management of R scripts through SQL
  Operationalize entire R scripts in production applications; eliminate porting R code
  Avoid reinventing code to integrate R results into existing applications
  [Figure labels: Client R Engine; Transparency Layer; ORE packages; Oracle Database; User tables; In-db stats; SQL Interfaces (SQL*Plus, SQLDeveloper, ...); Database Server Machine]

28 Oracle's R Technologies
  Oracle R Distribution
  ROracle
  Software available to R Community for free
  Oracle R Enterprise
  Oracle R Advanced Analytics for Hadoop
  Come to our booth to learn more

29 Resources
  Oracle R Distribution
  ROracle
  Oracle R Enterprise
  Oracle R Advanced Analytics for Hadoop
  Book: Using R to Unlock the Value of Big Data
  Blog:
  Forum:

30 FastR
  New implementation of R in Java
  Uses the new Truffle interpreter framework and Graal optimizing compiler in conjunction with the HotSpot JVM for high performance, scalability and portability
  Dynamically compiles, adaptively optimizes and deoptimizes at run time
  Joint effort: Oracle Labs (Germany, USA, Austria), JKU Linz (Austria), Purdue University (USA), TU Dortmund (Germany)
  Open-source project (research prototype!), GPLv2
  More info at the poster session


A Perfect Storm. Oracle Big Data Science for Enterprise R and SAS Users. Marcos Arancibia, Consulting Product Manager marcos.arancibia@oracle.

A Perfect Storm. Oracle Big Data Science for Enterprise R and SAS Users. Marcos Arancibia, Consulting Product Manager marcos.arancibia@oracle. A Perfect Storm Oracle Big Data Science for Enterprise R and SAS Users Mark Hornick, Director, Advanced Analytics mark.hornick@oracle.com @MarkHornick Marcos Arancibia, Consulting Product Manager marcos.arancibia@oracle.com

More information

Big Data Analytics Scaling R to Enterprise Data user! 2013 Albacete Spain #user2013

Big Data Analytics Scaling R to Enterprise Data user! 2013 Albacete Spain #user2013 Big Analytics Scaling R to Enterprise user! 2013 Albacete Spain #user2013 Luis Campos Mark Hornick 1 Big Solutions Lead, Oracle EMEA Director, Oracle base Advanced Analytics @luigicampos @MarkHornick 2

More information

Starting Smart with Oracle Advanced Analytics

Starting Smart with Oracle Advanced Analytics Starting Smart with Oracle Advanced Analytics Great Lakes Oracle Conference Tim Vlamis Thursday, May 19, 2016 Vlamis Software Solutions Vlamis Software founded in 1992 in Kansas City, Missouri Developed

More information

High-Performance Analytics

High-Performance Analytics High-Performance Analytics David Pope January 2012 Principal Solutions Architect High Performance Analytics Practice Saturday, April 21, 2012 Agenda Who Is SAS / SAS Technology Evolution Current Trends

More information

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Charlie Berger, MS Eng, MBA Sr. Director Product Management, Data Mining and Advanced Analytics charlie.berger@oracle.com www.twitter.com/charliedatamine

More information

Oracle Advanced Analytics - Option to Oracle Database: Oracle R Enterprise and Oracle Data Mining. Data Warehouse Global Leaders Winter 2013

Oracle Advanced Analytics - Option to Oracle Database: Oracle R Enterprise and Oracle Data Mining. Data Warehouse Global Leaders Winter 2013 Oracle Advanced Analytics - Option to Oracle Database: Oracle R Enterprise and Oracle Data Mining Data Warehouse Global Leaders Winter 2013 Dan Vlamis, Vlamis Software Solutions Tim Vlamis, Vlamis Software

More information

Learning R Series Session 4: Oracle R Enterprise 1.3 Predictive Analytics Mark Hornick Oracle Advanced Analytics

Learning R Series Session 4: Oracle R Enterprise 1.3 Predictive Analytics Mark Hornick Oracle Advanced Analytics Learning R Series Session 4: Oracle R Enterprise 1.3 Predictive Analytics Mark Hornick Oracle Advanced Analytics Learning R Series 2012 Session Title Session 1 Introduction to Oracle's

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Session 1: Introduction to Oracle's R Technologies

Session 1: Introduction to Oracle's R Technologies Session 1: Introduction to Oracle's R Technologies Mark Hornick, Director, Oracle Advanced Analytics Development Oracle Advanced Analytics Topics What is R? Oracle R Enterprise motivation

More information

Outils pour l'analyse prédictive parallèle de multiples sources de données non structurées

Outils pour l'analyse prédictive parallèle de multiples sources de données non structurées Outils pour l'analyse prédictive parallèle de multiples sources de données non structurées Forum Ter@tec Mercredi 25 juin 2015 Marc Wolff Application Engineer HPC & Big Data 2015 The MathWorks, Inc. 1

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra

JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra January 2014 Legal Notices Apache Cassandra, Spark and Solr and their respective logos are trademarks or registered trademarks

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

More information

Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining

Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining R The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Java/Scala Engineer Internet of Iot Competitors

Java/Scala Engineer Internet of Iot Competitors JOB 1 Sr. Java/Scala Engineer Internet of Things, IoT, is a true digital revolution. Predictions of 20, 50 or 100 billion connected devices in 2020 are pointing to massive changes for people and industries.

More information

Big Data and Advanced Analytics Technologies for the Smart Grid

Big Data and Advanced Analytics Technologies for the Smart Grid 1 Big Data and Advanced Analytics Technologies for the Smart Grid Arnie de Castro, PhD SAS Institute IEEE PES 2014 General Meeting July 27-31, 2014 Panel Session: Using Smart Grid Data to Improve Planning,

More information

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel Big Data and Analytics: Getting Started with ArcGIS Mike Park Erik Hoel Agenda Overview of big data Distributed computation User experience Data management Big data What is it? Big Data is a loosely defined

More information

Towards Smart and Intelligent SDN Controller

Towards Smart and Intelligent SDN Controller Towards Smart and Intelligent SDN Controller - Through the Generic, Extensible, and Elastic Time Series Data Repository (TSDR) YuLing Chen, Dell Inc. Rajesh Narayanan, Dell Inc. Sharon Aicler, Cisco Systems

More information

INTRODUCTION TO CLOUD COMPUTING CEN483 PARALLEL AND DISTRIBUTED SYSTEMS

INTRODUCTION TO CLOUD COMPUTING CEN483 PARALLEL AND DISTRIBUTED SYSTEMS INTRODUCTION TO CLOUD COMPUTING CEN483 PARALLEL AND DISTRIBUTED SYSTEMS CLOUD COMPUTING Cloud computing is a model for enabling convenient, ondemand network access to a shared pool of configurable computing

More information

Performance And Scalability In Oracle9i And SQL Server 2000

Performance And Scalability In Oracle9i And SQL Server 2000 Performance And Scalability In Oracle9i And SQL Server 2000 Presented By : Phathisile Sibanda Supervisor : John Ebden 1 Presentation Overview Project Objectives Motivation -Why performance & Scalability

More information

Streaming Big Data Performance Benchmark for Real-time Log Analytics in an Industry Environment

Streaming Big Data Performance Benchmark for Real-time Log Analytics in an Industry Environment Streaming Big Data Performance Benchmark for Real-time Log Analytics in an Industry Environment SQLstream s-server The Streaming Big Data Engine for Machine Data Intelligence 2 SQLstream proves 15x faster

More information

Five Essential Components for Highly Reliable Data Centers

Five Essential Components for Highly Reliable Data Centers GE Intelligent Platforms Five Essential Components for Highly Reliable Data Centers Ensuring continuous operations with an integrated, holistic technology strategy that provides high availability, increased

More information

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process ORACLE OLAP KEY FEATURES AND BENEFITS FAST ANSWERS TO TOUGH QUESTIONS EASILY KEY FEATURES & BENEFITS World class analytic engine Superior query performance Simple SQL access to advanced analytics Enhanced

More information

Using the Coherence Cloud Service

Using the Coherence Cloud Service Using the Coherence Cloud Service An introduction Dave Felcey Coherence Product Manager July 2, 2015 Safe Harbor Statement The following is intended to outline our general product direction. It is intended

More information

SAP and Hortonworks Reference Architecture

SAP and Hortonworks Reference Architecture SAP and Hortonworks Reference Architecture Hortonworks. We Do Hadoop. June Page 1 2014 Hortonworks Inc. 2011 2014. All Rights Reserved A Modern Data Architecture With SAP DATA SYSTEMS APPLICATIO NS Statistical

More information

Streaming Big Data Performance Benchmark. for

Streaming Big Data Performance Benchmark. for Streaming Big Data Performance Benchmark for 2 The V of Big Data Velocity means both how fast data is being produced and how fast the data must be processed to meet demand. Gartner Static Big Data is a

More information

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1 CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level -ORACLE TIMESTEN 11gR1 CASE STUDY Oracle TimesTen In-Memory Database and Shared Disk HA Implementation

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

SAP SE - Legal Requirements and Requirements

SAP SE - Legal Requirements and Requirements Finding the signals in the noise Niklas Packendorff @packendorff Solution Expert Analytics & Data Platform Legal disclaimer The information in this presentation is confidential and proprietary to SAP and

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

Scalable Architecture on Amazon AWS Cloud

Scalable Architecture on Amazon AWS Cloud Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

Mobile RFID solutions

Mobile RFID solutions A TAKE Solutions White Paper Mobile RFID solutions small smart solutions Introduction Mobile RFID enables unique RFID use-cases not possible with fixed readers. Mobile data collection devices such as scanners

More information

Harnessing the power of advanced analytics with IBM Netezza

Harnessing the power of advanced analytics with IBM Netezza IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced

More information

Holistic Performance Analysis of J2EE Applications

Holistic Performance Analysis of J2EE Applications Holistic Performance Analysis of J2EE Applications By Madhu Tanikella In order to identify and resolve performance problems of enterprise Java Applications and reduce the time-to-market, performance analysis

More information

<Insert Picture Here> Move to Oracle Database with Oracle SQL Developer Migrations

<Insert Picture Here> Move to Oracle Database with Oracle SQL Developer Migrations Move to Oracle Database with Oracle SQL Developer Migrations The following is intended to outline our general product direction. It is intended for information purposes only, and

More information

Microsoft Research Windows Azure for Research Training

Microsoft Research Windows Azure for Research Training Copyright 2013 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the Apache License, Version 2.0. You may use it according to the

More information

Testing & Assuring Mobile End User Experience Before Production. Neotys

Testing & Assuring Mobile End User Experience Before Production. Neotys Testing & Assuring Mobile End User Experience Before Production Neotys Agenda Introduction The challenges Best practices NeoLoad mobile capabilities Mobile devices are used more and more At Home In 2014,

More information

Understanding the Benefits of IBM SPSS Statistics Server

Understanding the Benefits of IBM SPSS Statistics Server IBM SPSS Statistics Server Understanding the Benefits of IBM SPSS Statistics Server Contents: 1 Introduction 2 Performance 101: Understanding the drivers of better performance 3 Why performance is faster

More information

Jun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC

Jun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC Jun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC Agenda Quick Overview of Impala Design Challenges of an Impala Deployment Case Study: Use Simulation-Based Approach to Design

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Microsoft Research Microsoft Azure for Research Training

Microsoft Research Microsoft Azure for Research Training Copyright 2014 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the Apache License, Version 2.0. You may use it according to the

More information

Enabling R for Big Data with PL/R and PivotalR Real World Examples on Hadoop & MPP Databases

Enabling R for Big Data with PL/R and PivotalR Real World Examples on Hadoop & MPP Databases Enabling R for Big Data with PL/R and PivotalR Real World Examples on Hadoop & MPP Databases Woo J. Jung Principal Data Scientist Pivotal Labs 1 All In On Open Source Still can t believe we did this. Truly

More information

2009 Oracle Corporation 1

2009 Oracle Corporation 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

More information

Lecture 9: Data Mining, Data Analytics and Big Data

Lecture 9: Data Mining, Data Analytics and Big Data Lecture 9: Data Mining, Data Analytics and Big Data Maaike Limper, Antonio Romero, Manuel Martin 1 Introduction Two openlab Projects in IT-DB Data Analytics In-Database Physics Analysis Both using data

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

Oracle Communications WebRTC Session Controller: Basic Admin. Student Guide

Oracle Communications WebRTC Session Controller: Basic Admin. Student Guide Oracle Communications WebRTC Session Controller: Basic Admin Student Guide Edition 1.0 April 2015 Copyright 2015, Oracle and/or its affiliates. All rights reserved. Disclaimer This document contains proprietary

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

Application of Predictive Analytics for Better Alignment of Business and IT

Application of Predictive Analytics for Better Alignment of Business and IT Application of Predictive Analytics for Better Alignment of Business and IT Boris Zibitsker, PhD bzibitsker@beznext.com July 25, 2014 Big Data Summit - Riga, Latvia About the Presenter Boris Zibitsker

More information

What s Cool in the SAP JVM (CON3243)

What s Cool in the SAP JVM (CON3243) What s Cool in the SAP JVM (CON3243) Volker Simonis, SAP SE September, 2014 Public Agenda SAP JVM Supportability SAP JVM Profiler SAP JVM Debugger 2014 SAP SE. All rights reserved. Public 2 SAP JVM SAP

More information

Time series IoT data ingestion into Cassandra using Kaa

Time series IoT data ingestion into Cassandra using Kaa Time series IoT data ingestion into Cassandra using Kaa Andrew Shvayka ashvayka@cybervisiontech.com Agenda Data ingestion challenges Why Kaa? Why Cassandra? Reference architecture overview Hands-on Sandbox

More information

Oracle BI Publisher Enterprise Cluster Deployment. An Oracle White Paper August 2007

Oracle BI Publisher Enterprise Cluster Deployment. An Oracle White Paper August 2007 Oracle BI Publisher Enterprise Cluster Deployment An Oracle White Paper August 2007 Oracle BI Publisher Enterprise INTRODUCTION This paper covers Oracle BI Publisher cluster and high availability deployment.

More information

An Oracle White Paper May 2012. Oracle Database Cloud Service

An Oracle White Paper May 2012. Oracle Database Cloud Service An Oracle White Paper May 2012 Oracle Database Cloud Service Executive Overview The Oracle Database Cloud Service provides a unique combination of the simplicity and ease of use promised by Cloud computing

More information

BW-EML SAP Standard Application Benchmark

BW-EML SAP Standard Application Benchmark BW-EML SAP Standard Application Benchmark Heiko Gerwens and Tobias Kutning (&) SAP SE, Walldorf, Germany tobas.kutning@sap.com Abstract. The focus of this presentation is on the latest addition to the

More information

An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise

An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise An Oracle White Paper June 2013 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

COMPUTER MEASUREMENT GROUP - India Hyderabad Chapter. Strategies to Optimize Cloud Costs By Cloud Performance Monitoring

COMPUTER MEASUREMENT GROUP - India Hyderabad Chapter. Strategies to Optimize Cloud Costs By Cloud Performance Monitoring COMPUTER MEASUREMENT GROUP - India Hyderabad Chapter Strategies to Optimize Cloud Costs By Cloud Performance Monitoring October 2013 www.cmgindia.org Computer Measurement Group, India 1 About Me Credentials

More information

ORACLE DATABASE 10G ENTERPRISE EDITION

ORACLE DATABASE 10G ENTERPRISE EDITION ORACLE DATABASE 10G ENTERPRISE EDITION OVERVIEW Oracle Database 10g Enterprise Edition is ideal for enterprises that ENTERPRISE EDITION For enterprises of any size For databases up to 8 Exabytes in size.

More information

Search Big Data with MySQL and Sphinx. Mindaugas Žukas www.ivinco.com

Search Big Data with MySQL and Sphinx. Mindaugas Žukas www.ivinco.com Search Big Data with MySQL and Sphinx Mindaugas Žukas www.ivinco.com Agenda Big Data Architecture Factors and Technologies MySQL and Big Data Sphinx Search Server overview Case study: building a Big Data

More information

IoT Security Platform

IoT Security Platform IoT Security Platform 2 Introduction Wars begin when the costs of attack are low, the benefits for a victor are high, and there is an inability to enforce law. The same is true in cyberwars. Today there

More information
