PMU Time Series Data Mining
|
|
- Sibyl Stone
- 7 years ago
- Views:
Transcription
1 PMU Time Series Data Mining Natasha Balac, Ph.D Chuck Wells, Ph.D Nicole Wolter Albert Nguyen Jake Schurmeier Predictive Analytics Center of Excellence (PACE) San Diego Supercomputer Center University of California, San Diego
2 Brief History of SDSC : NSF national supercomputer center; managed by General Atomics : NSF PACI program leadership center; managed by UCSD PACI: Partnerships for Advanced Computational Infrastructure : Internal transition to support more diversified research computing Still NSF national resource provider 2009-future: Multi-constituency cyberinfrastructure (CI) center Provide data-intensive CI resources, services, and expertise for campus, state, and nation Approaching $1B in lifetime contract and grant activity
3 Gordon Speeds and Feeds INTEL SANDY BRIDGE COMPUTE NODE Sockets & Cores 2 & 16 Clock speed 2.6 GHz DRAM capacity and speed 64 GB, 1,333 MHz INTEL710 emlc FLASH I/O NODE NAND flash SSD drives 16 SSD capacity per drive & per node 16 * 300 GB = 4.8 TB SMP SUPER-NODE (VIA VSMP) Compute nodes / I/O Nodes 32 / 2 Addressable DRAM 2 TB Addressable memory including flash 11.6 TB GORDON (AGGREGATE) Compute Nodes 1,024 Compute cores 16,384 Peak performance 341 TF DRAM/SSD memory Architecture Link Bandwidth Vendor INFINIBAND INTERCONNECT 64 TB DRAM; 300 TB SSD Dual-Rail, 3D torus QDR Mellanox LUSTRE-BASED DISK I/O SUBSYSTEM (SHARED) Total storage: current/planned 4 PB/6 PB (raw) Total bandwidth 100 GB/s
4 PMU Frequency Data
5 Sampling PMU Frequency Data and Fast Fourier Transformation (FFT) Transforming Frequency Data to FFT Data 23 samples of Frequency Data was taken from the PMU at different times The FFT was computed for each sample Each FFT was standardized by setting the max value to 1 The following slides are the standardized FFT for the various time samples
6 FFT at Various Time (1 of 4) X-Axis = Frequency Y-Axis: Magnitude
7 FFT at Various Time (2 of 4) X-Axis = Frequency Y-Axis: Magnitude
8 FFT at Various Time (3 of 4) X-Axis = Frequency Y-Axis: Magnitude
9 FFT at Various Time (4 of 4) X-Axis = Frequency Y-Axis: Magnitude
10 Time Series Representation and Similarity Measure Transforming FFT Data into FFT Bins For each preceding sample, FFT Frequencies are discretized into 25 bins For each bin the mean and the sum are calculated Correlation matrix comparing the corresponding event and control frequency bins
11 FFT Correlation Matrix Control Group Event Group
12 Simple Anomaly Detection Benford s Law Also called the First-Digit Law, refers to the frequency distribution of digits in many (but not all) real-life sources of data. In this distribution, the number 1 occurs as the leading digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time Benford's Law also concerns the expected distribution for digits beyond the first, which approach a uniform distribution
13 40% Benford Distribution Between Compressed and Uncompressed Data 35% 30% 25% Benford Compressed Uncompress 20% 15% 10% 5% 0% First Digit
14 40% Benford Distribution Between Control and Event 35% 30% 25% Control Event 20% 15% 10% 5% 0% First Digit
15 Next Steps Alternate time series representation and dimensionality reduction Discrete wavelet transform Discrete Haar Wavelet Transform (DTWT) Adaptive Piecewise Constant Approximation Symbolic Aggregate Approximation (SAX) representation
16 Time Series Data Mining Pattern Discovery and Clustering for Motif discovery K-motif detection Anomaly detection or finding discords Distance-based Clustering Self Organizing Map (SOM) Multi-resolution Clustering (MPAA) ARIMA EM-Clustering Hidden Markov Model (HMM) Motif-based clustering
17 Classification Descriptive techniques Supervised learning - maps data into predefined categories/classes Nearest Neighbor classifier Applies the similarity measures to the object to be classified to determine its best classification based on the existing data that has already been classified Decision trees A set of rules are inferred from the training data, and this set of rules is then applied to any new data to be classified
18 Clustering K-means X X X X Hierarchical Clustering
19 Scalability From one to multiple PMUs multivariate time series mining Sub-second data collection and processing
20 Analytics Architecture OSIsoft PI server direct export to Hadoop Hadoop & myhadoop with Mahout KNIME batch job Revolution Analytics R libraries for Big Data Once the models are trained (near)real-time scoring can be implemented on the sensor streams enabling large prediction window horizon
21 Thank you! For further information, contact: Chuck Wells Natasha Balac
PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD. Natasha Balac, Ph.D.
PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD Natasha Balac, Ph.D. Brief History of SDSC 1985-1997: NSF national supercomputer center; managed by General Atomics
More informationHadoop on the Gordon Data Intensive Cluster
Hadoop on the Gordon Data Intensive Cluster Amit Majumdar, Scientific Computing Applications Mahidhar Tatineni, HPC User Services San Diego Supercomputer Center University of California San Diego Dec 18,
More informationApplication and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster
Application and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster Mahidhar Tatineni (mahidhar@sdsc.edu) MVAPICH User Group Meeting August 27, 2014 NSF grants: OCI #0910847 Gordon: A Data
More informationAppro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales
Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes Anthony Kenisky, VP of North America Sales About Appro Over 20 Years of Experience 1991 2000 OEM Server Manufacturer 2001-2007
More informationHadoop Deployment and Performance on Gordon Data Intensive Supercomputer!
Hadoop Deployment and Performance on Gordon Data Intensive Supercomputer! Mahidhar Tatineni, Rick Wagner, Eva Hocks, Christopher Irving, and Jerry Greenberg! SDSC! XSEDE13, July 22-25, 2013! Overview!!
More informationAn Introduction to the Gordon Architecture
An Introduction to the Gordon Architecture Gordon Summer Institute & Cyberinfrastructure Summer Institute for Geoscientists August 8-11, 2011 Shawn Strande Gordon Project Manager San Diego Supercomputer
More informationDemystifying The Data Scientist
Demystifying The Data Scientist Natasha Balac, Ph.D. Predictive Analytics Center of Excellence, Director San Diego Supercomputer Center University of California, San Diego Brief History of SDSC 1985-1997:
More informationBIG DATA What it is and how to use?
BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14
More informationHow to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
More informationHPC and Big Data. EPCC The University of Edinburgh. Adrian Jackson Technical Architect a.jackson@epcc.ed.ac.uk
HPC and Big Data EPCC The University of Edinburgh Adrian Jackson Technical Architect a.jackson@epcc.ed.ac.uk EPCC Facilities Technology Transfer European Projects HPC Research Visitor Programmes Training
More informationJUROPA Linux Cluster An Overview. 19 May 2014 Ulrich Detert
Mitglied der Helmholtz-Gemeinschaft JUROPA Linux Cluster An Overview 19 May 2014 Ulrich Detert JuRoPA JuRoPA Jülich Research on Petaflop Architectures Bull, Sun, ParTec, Intel, Mellanox, Novell, FZJ JUROPA
More informationData Mining for Big Data: Tools and Approaches. Pace SDSC SAN DIEGO SUPERCOMPUTER CENTER
Data Mining for Big Data: Tools and Approaches Pace SDSC Todo R domc exercise? Test train account Paradigm stream eg fro mbook? And mapred or join and vector mult? Outline Scaling What is Big Data Parallel
More information4 th Workshop on Big Data Benchmarking
4 th Workshop on Big Data Benchmarking MPP SQL Engines: architectural choices and their implications on benchmarking 09 Oct 2013 Agenda: Big Data Landscape Market Requirements Benchmark Parameters Benchmark
More informationHadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
More informationScalable Cloud Computing Solutions for Next Generation Sequencing Data
Scalable Cloud Computing Solutions for Next Generation Sequencing Data Matti Niemenmaa 1, Aleksi Kallio 2, André Schumacher 1, Petri Klemelä 2, Eija Korpelainen 2, and Keijo Heljanko 1 1 Department of
More informationFLOW-3D Performance Benchmark and Profiling. September 2012
FLOW-3D Performance Benchmark and Profiling September 2012 Note The following research was performed under the HPC Advisory Council activities Participating vendors: FLOW-3D, Dell, Intel, Mellanox Compute
More informationPSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation
PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation 1. Overview of NEC PCIe SSD Appliance for Microsoft SQL Server Page 2 NEC Corporation
More informationWrangler: A New Generation of Data-intensive Supercomputing. Christopher Jordan, Siva Kulasekaran, Niall Gaffney
Wrangler: A New Generation of Data-intensive Supercomputing Christopher Jordan, Siva Kulasekaran, Niall Gaffney Project Partners Academic partners: TACC Primary system design, deployment, and operations
More informationInge Os Sales Consulting Manager Oracle Norway
Inge Os Sales Consulting Manager Oracle Norway Agenda Oracle Fusion Middelware Oracle Database 11GR2 Oracle Database Machine Oracle & Sun Agenda Oracle Fusion Middelware Oracle Database 11GR2 Oracle Database
More informationDIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION
DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION A DIABLO WHITE PAPER AUGUST 2014 Ricky Trigalo Director of Business Development Virtualization, Diablo Technologies
More informationLenovo Database Configuration for Microsoft SQL Server 2014 37TB
Database Lenovo Database Configuration for Microsoft SQL Server 2014 37TB Data Warehouse Fast Track Solution Data Warehouse problem and a solution The rapid growth of technology means that the amount of
More informationIntel Solid- State Drive Data Center P3700 Series NVMe Hybrid Storage Performance
Intel Solid- State Drive Data Center P3700 Series NVMe Hybrid Storage Performance Hybrid Storage Performance Gains for IOPS and Bandwidth Utilizing Colfax Servers and Enmotus FuzeDrive Software NVMe Hybrid
More informationNews and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren
News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business
More informationTUT NoSQL Seminar (Oracle) Big Data
Timo Raitalaakso +358 40 848 0148 rafu@solita.fi TUT NoSQL Seminar (Oracle) Big Data 11.12.2012 Timo Raitalaakso MSc 2000 Work: Solita since 2001 Senior Database Specialist Oracle ACE 2012 Blog: http://rafudb.blogspot.com
More informationThe Hardware Dilemma. Stephanie Best, SGI Director Big Data Marketing Ray Morcos, SGI Big Data Engineering
The Hardware Dilemma Stephanie Best, SGI Director Big Data Marketing Ray Morcos, SGI Big Data Engineering April 9, 2013 The Blurring of the Lines Business Applications and High Performance Computing Are
More informationIntel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband
Intel Cluster Ready Appro Xtreme-X Computers with Mellanox QDR Infiniband A P P R O I N T E R N A T I O N A L I N C Steve Lyness Vice President, HPC Solutions Engineering slyness@appro.com Company Overview
More informationOutline. What is Big data and where they come from? How we deal with Big data?
What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,
More informationOBJECTIVE ANALYSIS WHITE PAPER MATCH FLASH. TO THE PROCESSOR Why Multithreading Requires Parallelized Flash ATCHING
OBJECTIVE ANALYSIS WHITE PAPER MATCH ATCHING FLASH TO THE PROCESSOR Why Multithreading Requires Parallelized Flash T he computing community is at an important juncture: flash memory is now generally accepted
More informationSMB Direct for SQL Server and Private Cloud
SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server
More informationMain Memory Data Warehouses
Main Memory Data Warehouses Robert Wrembel Poznan University of Technology Institute of Computing Science Robert.Wrembel@cs.put.poznan.pl www.cs.put.poznan.pl/rwrembel Lecture outline Teradata Data Warehouse
More informationStovepipes to Clouds. Rick Reid Principal Engineer SGI Federal. 2013 by SGI Federal. Published by The Aerospace Corporation with permission.
Stovepipes to Clouds Rick Reid Principal Engineer SGI Federal 2013 by SGI Federal. Published by The Aerospace Corporation with permission. Agenda Stovepipe Characteristics Why we Built Stovepipes Cluster
More informationSR-IOV: Performance Benefits for Virtualized Interconnects!
SR-IOV: Performance Benefits for Virtualized Interconnects! Glenn K. Lockwood! Mahidhar Tatineni! Rick Wagner!! July 15, XSEDE14, Atlanta! Background! High Performance Computing (HPC) reaching beyond traditional
More informationHow To Build A Supermicro Computer With A 32 Core Power Core (Powerpc) And A 32-Core (Powerpc) (Powerpowerpter) (I386) (Amd) (Microcore) (Supermicro) (
TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 7 th CALL (Tier-0) Contributing sites and the corresponding computer systems for this call are: GCS@Jülich, Germany IBM Blue Gene/Q GENCI@CEA, France Bull Bullx
More informationI/O Considerations in Big Data Analytics
Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very
More informationData Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com
Data Warehousing and Analytics Infrastructure at Facebook Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com Overview Challenges in a Fast Growing & Dynamic Environment Data Flow Architecture,
More informationMinimize cost and risk for data warehousing
SYSTEM X SERVERS SOLUTION BRIEF Minimize cost and risk for data warehousing Microsoft Data Warehouse Fast Track for SQL Server 2014 on System x3850 X6 (55TB) Highlights Improve time to value for your data
More informationPreview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.
Preview of Oracle Database 12c In-Memory Option 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any
More information2. COMPUTER SYSTEM. 2.1 Introduction
2. COMPUTER SYSTEM 2.1 Introduction The computer system at the Japan Meteorological Agency (JMA) has been repeatedly upgraded since IBM 704 was firstly installed in 1959. The current system has been completed
More informationDEPLOYING AND MONITORING HADOOP MAP-REDUCE ANALYTICS ON SINGLE-CHIP CLOUD COMPUTER
DEPLOYING AND MONITORING HADOOP MAP-REDUCE ANALYTICS ON SINGLE-CHIP CLOUD COMPUTER ANDREAS-LAZAROS GEORGIADIS, SOTIRIOS XYDIS, DIMITRIOS SOUDRIS MICROPROCESSOR AND MICROSYSTEMS LABORATORY ELECTRICAL AND
More informationUtilizing the SDSC Cloud Storage Service
Utilizing the SDSC Cloud Storage Service PASIG Conference January 13, 2012 Richard L. Moore rlm@sdsc.edu San Diego Supercomputer Center University of California San Diego Traditional supercomputer center
More informationApplication of Predictive Analytics for Better Alignment of Business and IT
Application of Predictive Analytics for Better Alignment of Business and IT Boris Zibitsker, PhD bzibitsker@beznext.com July 25, 2014 Big Data Summit - Riga, Latvia About the Presenter Boris Zibitsker
More informationBayesian networks - Time-series models - Apache Spark & Scala
Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly
More informationAmadeus SAS Specialists Prove Fusion iomemory a Superior Analysis Accelerator
WHITE PAPER Amadeus SAS Specialists Prove Fusion iomemory a Superior Analysis Accelerator 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com SAS 9 Preferred Implementation Partner tests a single Fusion
More informationIs a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
More informationOpenMP Programming on ScaleMP
OpenMP Programming on ScaleMP Dirk Schmidl schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum (RZ) MPI vs. OpenMP MPI distributed address space explicit message passing typically code redesign
More informationLecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
More informationHADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW
HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW 757 Maleta Lane, Suite 201 Castle Rock, CO 80108 Brett Weninger, Managing Director brett.weninger@adurant.com Dave Smelker, Managing Principal dave.smelker@adurant.com
More informationMississippi State University High Performance Computing Collaboratory Brief Overview. Trey Breckenridge Director, HPC
Mississippi State University High Performance Computing Collaboratory Brief Overview Trey Breckenridge Director, HPC Mississippi State University Public university (Land Grant) founded in 1878 Traditional
More informationData Mining with Hadoop at TACC
Data Mining with Hadoop at TACC Weijia Xu Data Mining & Statistics Data Mining & Statistics Group Main activities Research and Development Developing new data mining and analysis solutions for practical
More informationPentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System
Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System By Jake Cornelius Senior Vice President of Products Pentaho June 1, 2012 Pentaho Delivers High-Performance
More informationWhat s the Big Deal? Big Data, Cloud & the Internet of Things. Christine Kirkpatrick San Diego Supercomputer Center, UC San Diego
What s the Big Deal? Big Data, Cloud & the Internet of Things Christine Kirkpatrick San Diego Supercomputer Center, UC San Diego A Futurist s Near-Term View The Future Depends on Data Self-driving car
More informationBuilding a Top500-class Supercomputing Cluster at LNS-BUAP
Building a Top500-class Supercomputing Cluster at LNS-BUAP Dr. José Luis Ricardo Chávez Dr. Humberto Salazar Ibargüen Dr. Enrique Varela Carlos Laboratorio Nacional de Supercómputo Benemérita Universidad
More informationHigh Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates
High Performance Computing (HPC) CAEA elearning Series Jonathan G. Dudley, Ph.D. 06/09/2015 2015 CAE Associates Agenda Introduction HPC Background Why HPC SMP vs. DMP Licensing HPC Terminology Types of
More informationUnstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012
Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 1 Market Trends Big Data Growing technology deployments are creating an exponential increase in the volume
More informationAnalytics on Big Data
Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis
More informationCloud Data Center Acceleration 2015
Cloud Data Center Acceleration 2015 Agenda! Computer & Storage Trends! Server and Storage System - Memory and Homogenous Architecture - Direct Attachment! Memory Trends! Acceleration Introduction! FPGA
More informationMaximum performance, minimal risk for data warehousing
SYSTEM X SERVERS SOLUTION BRIEF Maximum performance, minimal risk for data warehousing Microsoft Data Warehouse Fast Track for SQL Server 2014 on System x3850 X6 (95TB) The rapid growth of technology has
More informationIncreasing Flash Throughput for Big Data Applications (Data Management Track)
Scale Simplify Optimize Evolve Increasing Flash Throughput for Big Data Applications (Data Management Track) Flash Memory 1 Industry Context Addressing the challenge A proposed solution Review of the Benefits
More informationHiBench Introduction. Carson Wang (carson.wang@intel.com) Software & Services Group
HiBench Introduction Carson Wang (carson.wang@intel.com) Agenda Background Workloads Configurations Benchmark Report Tuning Guide Background WHY Why we need big data benchmarking systems? WHAT What is
More informationBITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand?
BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? The Big Data Buzz big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database
More informationComparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014
Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet Anand Rangaswamy September 2014 Storage Developer Conference Mellanox Overview Ticker: MLNX Leading provider of high-throughput,
More informationThe Greenplum Analytics Workbench
The Greenplum Analytics Workbench External Overview 1 The Greenplum Analytics Workbench Definition Is a 1000-node Hadoop Cluster. Pre-configured with publicly available data sets. Contains the entire Hadoop
More informationBig Data Performance Growth on the Rise
Impact of Big Data growth On Transparent Computing Michael A. Greene Intel Vice President, Software and Services Group, General Manager, System Technologies and Optimization 1 Transparent Computing (TC)
More informationIntroduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.
Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus
More informationIntegrating Apache Spark with an Enterprise Data Warehouse
Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software
More informationOracle Database In-Memory The Next Big Thing
Oracle Database In-Memory The Next Big Thing Maria Colgan Master Product Manager #DBIM12c Why is Oracle do this Oracle Database In-Memory Goals Real Time Analytics Accelerate Mixed Workload OLTP No Changes
More informationRevoScaleR Speed and Scalability
EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution
More informationElasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack
Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper
More informationHadoop SNS. renren.com. Saturday, December 3, 11
Hadoop SNS renren.com Saturday, December 3, 11 2.2 190 40 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December
More informationBenchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
More informationBuilding a Scalable Storage with InfiniBand
WHITE PAPER Building a Scalable Storage with InfiniBand The Problem...1 Traditional Solutions and their Inherent Problems...2 InfiniBand as a Key Advantage...3 VSA Enables Solutions from a Core Technology...5
More informationAgenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC
HPC Architecture End to End Alexandre Chauvin Agenda HPC Software Stack Visualization National Scientific Center 2 Agenda HPC Software Stack Alexandre Chauvin Typical HPC Software Stack Externes LAN Typical
More informationJournée Thématique Big Data 13/03/2015
Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets
More informationBig Data: Image & Video Analytics
Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)
More informationOracle Exadata Database Machine for SAP Systems - Innovation Provided by SAP and Oracle for Joint Customers
Oracle Exadata Database Machine for SAP Systems - Innovation Provided by SAP and Oracle for Joint Customers Masood Ahmed EMEA Infrastructure Solutions Oracle/SAP Relationship Overview First SAP R/3 release
More information2009 Oracle Corporation 1
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,
More informationCan High-Performance Interconnects Benefit Memcached and Hadoop?
Can High-Performance Interconnects Benefit Memcached and Hadoop? D. K. Panda and Sayantan Sur Network-Based Computing Laboratory Department of Computer Science and Engineering The Ohio State University,
More informationThe Maui High Performance Computing Center Department of Defense Supercomputing Resource Center (MHPCC DSRC) Hadoop Implementation on Riptide - -
The Maui High Performance Computing Center Department of Defense Supercomputing Resource Center (MHPCC DSRC) Hadoop Implementation on Riptide - - Hadoop Implementation on Riptide 2 Table of Contents Executive
More informationCSE-E5430 Scalable Cloud Computing Lecture 2
CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 14.9-2015 1/36 Google MapReduce A scalable batch processing
More informationData Mining. Dr. Saed Sayad. University of Toronto 2010 saed.sayad@utoronto.ca. http://chem-eng.utoronto.ca/~datamining/
Data Mining Dr. Saed Sayad University of Toronto 2010 saed.sayad@utoronto.ca http://chem-eng.utoronto.ca/~datamining/ 1 Data Mining Data mining is about explaining the past and predicting the future by
More informationInfrastructure Matters: POWER8 vs. Xeon x86
Advisory Infrastructure Matters: POWER8 vs. Xeon x86 Executive Summary This report compares IBM s new POWER8-based scale-out Power System to Intel E5 v2 x86- based scale-out systems. A follow-on report
More informationU"lizing the SDSC Cloud Storage Service
U"lizing the SDSC Cloud Storage Service PASIG Conference January 13, 2012 Richard L. Moore rlm@sdsc.edu San Diego Supercomputer Center University of California San Diego SAN DIEGO SUPERCOMPUTER CENTER
More informationIntroduction to Hadoop on the SDSC Gordon Data Intensive Cluster"
Introduction to Hadoop on the SDSC Gordon Data Intensive Cluster" Mahidhar Tatineni SDSC Summer Institute August 06, 2013 Overview "" Hadoop framework extensively used for scalable distributed processing
More informationBig + Fast + Safe + Simple = Lowest Technical Risk
Big + Fast + Safe + Simple = Lowest Technical Risk The Synergy of Greenplum and Isilon Architecture in HP Environments Steffen Thuemmel (Isilon) Andreas Scherbaum (Greenplum) 1 Our problem 2 What is Big
More informationCurrent Status of FEFS for the K computer
Current Status of FEFS for the K computer Shinji Sumimoto Fujitsu Limited Apr.24 2012 LUG2012@Austin Outline RIKEN and Fujitsu are jointly developing the K computer * Development continues with system
More informationData Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
More informationOutline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging
Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging
More informationBig Data in Test and Evaluation by Udaya Ranawake (HPCMP PETTT/Engility Corporation)
Big Data in Test and Evaluation by Udaya Ranawake (HPCMP PETTT/Engility Corporation) Approved for Public Release. Distribution Unlimited. Data Intensive Applications in T&E Win-T at ATC Automotive Data
More informationClusters: Mainstream Technology for CAE
Clusters: Mainstream Technology for CAE Alanna Dwyer HPC Division, HP Linux and Clusters Sparked a Revolution in High Performance Computing! Supercomputing performance now affordable and accessible Linux
More informationAdvanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
More informationKNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it
KNIME TUTORIAL Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it Outline Introduction on KNIME KNIME components Exercise: Market Basket Analysis Exercise: Customer Segmentation Exercise:
More informationOracle Exadata: The World s Fastest Database Machine Exadata Database Machine Architecture
Oracle Exadata: The World s Fastest Database Machine Exadata Database Machine Architecture Ron Weiss, Exadata Product Management Exadata Database Machine Best Platform to Run the
More informationCloud Computing through Virtualization and HPC technologies
Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC
More informationIn-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps. Yu Su, Yi Wang, Gagan Agrawal The Ohio State University
In-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps Yu Su, Yi Wang, Gagan Agrawal The Ohio State University Motivation HPC Trends Huge performance gap CPU: extremely fast for generating
More informationAdvances in Virtualization In Support of In-Memory Big Data Applications
9/29/15 HPTS 2015 1 Advances in Virtualization In Support of In-Memory Big Data Applications SCALE SIMPLIFY OPTIMIZE EVOLVE Ike Nassi Ike.nassi@tidalscale.com 9/29/15 HPTS 2015 2 What is the Problem We
More informationCloud Computing Where ISR Data Will Go for Exploitation
Cloud Computing Where ISR Data Will Go for Exploitation 22 September 2009 Albert Reuther, Jeremy Kepner, Peter Michaleas, William Smith This work is sponsored by the Department of the Air Force under Air
More informationHETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK
HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK Steve Oberlin CTO, Accelerated Computing US to Build Two Flagship Supercomputers SUMMIT SIERRA Partnership for Science 100-300 PFLOPS Peak Performance
More informationDriving Revenue Growth Across Multiple Markets
Driving Revenue Growth Across Multiple Markets Kevin Deierling Vice President, Marketing October 24, 2014 Multiple Growing Market Segments Key Markets Driving Growth High-Performance Computing Web 2.0
More informationHadoop s Advantages for! Machine! Learning and. Predictive! Analytics. Webinar will begin shortly. Presented by Hortonworks & Zementis
Webinar will begin shortly Hadoop s Advantages for Machine Learning and Predictive Analytics Presented by Hortonworks & Zementis September 10, 2014 Copyright 2014 Zementis, Inc. All rights reserved. 2
More informationBIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
More information