Sense Making in an IoT World: Sensor Data Analysis with Deep Learning
Natalia Vassilieva, PhD, Senior Research Manager
GTC 2016
Deep learning proof points as of today
- Vision: search & information extraction; security/video surveillance; self-driving cars; robotics
- Speech: interactive voice response (IVR) systems; voice interfaces (mobile, cars, gaming, home); security (speaker identification); health care; people with disabilities
- Text: search and ranking; sentiment analysis; machine translation; question answering
- Other: recommendation engines; advertising; fraud detection; AI challenges; drug discovery; sensor data analysis; diagnostic support
Why Deep Learning & Sensor Data?
Deep Learning is about:
- Huge volumes of training data (labeled and unlabeled)
- Multidimensional, complex data with non-trivial patterns (spatial or temporal)
- Replacing manual feature engineering with unsupervised feature learning
- Cross-modality feature learning
Sensor data is about:
- Huge volumes of data (mostly unlabeled)
- Complex data with non-trivial patterns (mostly temporal)
- A variety of data representations; feature engineering is hard
- Multiple modalities
Deep learning works well for speech, and most sensor data is time series.
This talk
- Does Deep Learning work for sensor data?
- Do existing infrastructure and algorithms fit sensor data?
- The Machine and Distributed Mesh Computing
Part I. Case Study: Sensor Data Analysis with Deep Learning
Patient activity recognition from accelerometer data
- Scripted video and accelerometer data from one sensor and 52 subjects (~20 min per subject)
- Accelerometer data: 500 Hz x 4 dimensions = 120,000 measurements per minute per person
- 16 classes
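As a quick check of the data-volume arithmetic, the sketch below computes the raw sample counts from the stated rate and, as an illustration only, a frame count under an assumed 1 s window with 50% overlap (the framing parameters are not given in the talk):

```python
# Sample-count arithmetic for the accelerometer stream described above.
# The 1 s frame length and 50% overlap below are assumptions for illustration.

SAMPLE_RATE_HZ = 500      # per axis
AXES = 4                  # 4-dimensional accelerometer stream
MINUTES_PER_SUBJECT = 20
SUBJECTS = 52

samples_per_minute = SAMPLE_RATE_HZ * AXES * 60
print(samples_per_minute)          # 120000 measurements per minute per person

total_seconds = SUBJECTS * MINUTES_PER_SUBJECT * 60

# Hypothetical framing: 1 s windows with a 0.5 s hop (50% overlap).
frame_len_s, hop_s = 1.0, 0.5
frames = int((total_seconds - frame_len_s) // hop_s) + 1
print(frames)
```

The per-minute figure follows directly from 500 Hz x 4 axes x 60 s; the frame count depends entirely on the assumed window and hop.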
Accelerometer data (X axis)
Data distribution. Total number of frames: ~3.35M
[bar chart of per-class frame frequencies omitted]
Approaches
Baselines:
- ZeroR (majority class predictor)
- Support Vector Machines
- Decision Trees (C50 implementation)
- Shallow Neural Networks (FANN library)
- Features: manually engineered
Deep Neural Networks:
- Fully-connected hidden layers (pre-trained with stacked sparse autoencoders) + softmax
- Time-delay layers
- Recurrent layers
- Deep Neural Networks and Conditional Random Fields
- Multiple meta-parameter configurations
- Features: amplitude spectrum
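The deep models above consume amplitude-spectrum features. A minimal sketch of that feature extraction for one signal frame, with an illustrative 512-sample window (the actual window length in the study is not stated on the slide):

```python
import numpy as np

def amplitude_spectrum(frame: np.ndarray) -> np.ndarray:
    """Magnitude of the one-sided FFT of a 1-D signal frame."""
    return np.abs(np.fft.rfft(frame))

# One assumed 512-sample window of a single accelerometer axis.
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
features = amplitude_spectrum(frame)
print(features.shape)              # (257,): 512 // 2 + 1 frequency bins
```

In practice the spectra of all axes would be concatenated into one input vector for the network.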
Growing class-separation power of deeper representations
[scatter plots: raw data; first level of representations; second level of representations]
Results: single person
Data split: train 388,518 | cv 129,510 | test 129,510 | total 647,538

Baseline methods, engineered features:
  Method        Accuracy (%)
  ZeroR         71.2
  SVM (binary)  97.6
  Shallow NN    98.6
  c50           99.6
  SVM           98.03

Deep neural network, amplitude spectrum features (1533-200-200-16): 99.7
Results: 52 subjects, subject-independent
Baselines vs. DNN
Data split: train 2,608,637 | cv 0 | test 738,180 | total 3,346,817

  Method      Accuracy (%)
  ZeroR       69.7
  c50         71.6
  DNN         84.5
  DNN + CRF   95.1

[confusion matrices: DNN and DNN + CRF vs. true labels]
Results: 52 subjects, subject-independent
Deep models:
- are better at classification on sensor data (they generalize better)
- do not require sophisticated feature engineering
- require a significant number of iterations to converge
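The gain from adding a CRF on top of the DNN comes from decoding the per-frame class scores as a sequence instead of independently. The sketch below shows the idea with Viterbi decoding over a hand-set "sticky" transition matrix; in the reported system the CRF parameters are trained, so both the transition matrix and the toy scores here are assumptions for illustration:

```python
import numpy as np

def viterbi(log_emissions: np.ndarray, log_trans: np.ndarray) -> np.ndarray:
    """Most likely label path. log_emissions: (T, K) per-frame log scores;
    log_trans: (K, K) log transition scores."""
    T, K = log_emissions.shape
    score = log_emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans      # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emissions[t]
    path = np.zeros(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 2, -1, -1):             # backtrack
        path[t] = back[t + 1, path[t + 1]]
    return path

K = 3
stay, switch = np.log(0.9), np.log(0.05)      # toy "sticky" transitions
log_trans = np.full((K, K), switch)
np.fill_diagonal(log_trans, stay)

# Noisy per-frame DNN scores: mostly class 0, one spurious frame of class 2.
em = np.log(np.array([[.8, .1, .1], [.8, .1, .1], [.2, .1, .7], [.8, .1, .1]]))
print(viterbi(em, log_trans))                 # [0 0 0 0]: the outlier is smoothed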
Part II. Today's Infrastructure and Deep Learning
Today's scale: model size, data size, compute requirements

  Application | Model (parameters)  | Training data                                        | FLOP per epoch                      | Training time
  Vision      | 1.7x10^9 (~6.8 GB)  | 14x10^6 images; ~2.5 TB (256x256), ~10 TB (512x512)  | 6 x 1.7x10^9 x 14x10^6 ≈ 1.4x10^17  | 3 days x 16,000 cores; 2 days x 16 servers x 4 GPUs; 8 hours x 36 servers x 4 GPUs
  Speech      | 60x10^6 (~240 MB)   | 100K hours of audio; ~34x10^9 frames; ~50 TB         | 6 x 60x10^6 x 34x10^9 ≈ 1.2x10^19   | days x 8 GPUs
  Text        | 6.5x10^6 (~260 MB)  | 856x10^6 words                                       | 6 x 6.5x10^6 x 856x10^6 ≈ 3.3x10^16 | 4 weeks
  Signals     | 1.2x10^6 (~4.8 MB)  | 3x10^6 frames                                        | ~6.5x10^13                          | days
Challenges of DNN training: slow and expensive
- Very large number of parameters (>10^6); huge (unlabeled) training sets (10^6-10^9 examples)
- Computationally expensive: requires O(model size x data size) FLOPs per epoch
- Needs many iterations (and epochs) to converge
- Needs frequent synchronization to converge fast
Compute requirements today: 10^13-10^19 FLOPs per epoch; one epoch per hour requires tens of TFLOPS SP or more, sustained.
Today's hardware:
- NVIDIA Titan X: 7 TFLOPS SP, 12 GB memory
- NVIDIA Tesla M40: 7 TFLOPS SP, 12 GB memory
- NVIDIA Tesla K40: 4.29 TFLOPS SP, 12 GB memory
- NVIDIA Tesla K80: 5.6 TFLOPS SP, 24 GB memory
- Intel Xeon Phi: 2.4 TFLOPS SP
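The FLOP figures in the scale table follow a back-of-envelope rule of roughly 6 FLOPs per parameter per training example per epoch (forward plus backward pass). A short sketch of that estimate, checked against the vision row:

```python
# Back-of-envelope compute estimate: ~6 FLOPs per parameter per example
# per epoch (forward + backward), as used in the table above.

def flops_per_epoch(params: float, examples: float) -> float:
    return 6 * params * examples

def tflops_for_one_hour_epoch(params: float, examples: float) -> float:
    """Sustained single-precision TFLOPS needed to finish one epoch in an hour."""
    return flops_per_epoch(params, examples) / 3600 / 1e12

vision = flops_per_epoch(1.7e9, 14e6)
print(f"{vision:.2e} FLOPs per epoch")                 # ~1.4e17, matching the table
print(f"{tflops_for_one_hour_epoch(1.7e9, 14e6):.1f} TFLOPS SP sustained")
```

At roughly 40 sustained TFLOPS for the vision workload alone, a single ~7 TFLOPS GPU of the era cannot reach one epoch per hour, which is why the training-time column spans multi-GPU, multi-server setups.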
Scalability of DNN training for time series: hard to scale
- Google Brain: 1,000 machines (16,000 CPU cores) x 3 days
- COTS HPC systems: 16 machines x 4 GPUs x 2 days
- Deep Image by Baidu: 36 machines x 4 GPUs x ~8 hours
- Deep Speech by Baidu: 8 GPUs x ~weeks
- Deep Speech 2 by Baidu: 8 or 16 GPUs x 3 to 5 days
Limited scalability of training for speech/time-series data!
(J. Dean et al., Large Scale Distributed Deep Networks)
Types of artificial neural networks: topology to fit data characteristics
- Images: locally connected, convolutional
- Speech, time series, sequences: fully connected, recurrent
[network diagrams: input, hidden layers 1-3, output]
Today's hardware (scale-out or scale-up)
- CPU/GPU cluster: nodes of CPUs and GPUs (GPUs attached over PCIe), connected by InfiniBand (~12 GB/s)
- Multi-socket large-memory machine: NUMA nodes connected by QPI links (~12.8 GB/s per direction); GPUs attached over PCIe (~16 GB/s)
Part III. The Machine and Distributed Mesh Computing
[diagram] Processor-centric computing: each SoC owns its own memory. Memory-Driven Computing: many SoCs share a massive memory pool over a fabric.
[diagram] Memory-Driven Computing extended to heterogeneous compute: SoC, GPU, RISC-V open architecture, ASIC and quantum devices all attached to the shared memory fabric, vs. processor-centric computing.
The Machine will be ported to different scales and form factors
- Many cores
- Massive pool of NVM (non-volatile memory)
- Photonics
The evolution of the IoT
- Gen 0 (yesteryears): things on a network. Still works well for small, local, custom systems with strict performance needs.
- Gen 1 (today): the cloud-centric IoT. A good choice for low-cost things where data can easily be moved, with few ramifications.
- Gen 2 (tomorrow): edge analytics. Ideal for things producing large volumes of data that are difficult, costly or sensitive to move.
- Gen 3 (the future): Distributed Mesh Computing. Multi-party things autonomously collaborate with privacy intact.
Tomorrow: Deep Learning and Edge Analytics
Edge node:
- Gets the trained model
- Uses the model in real time
- Collects data
- Sends some data to the center
Center:
- Collects all data
- Trains the model
- Sends the model to edge nodes
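The center/edge split above can be sketched in a few lines. This is a toy illustration, not the actual system: a nearest-centroid classifier stands in for the trained DNN, and "sending the model" is just passing the fitted array to the edge-side function:

```python
import numpy as np

def center_train(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Center side: train on the pooled data (one centroid per class)."""
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def edge_infer(model: np.ndarray, x: np.ndarray) -> int:
    """Edge side: real-time inference with the model shipped from the center."""
    return int(np.argmin(np.linalg.norm(model - x, axis=1)))

# Center collects all data and trains...
X = np.array([[0., 0.], [0., 1.], [5., 5.], [6., 5.]])
y = np.array([0, 0, 1, 1])
model = center_train(X, y)

# ...and the edge node uses the received model on fresh measurements.
print(edge_infer(model, np.array([5.5, 4.9])))   # 1
```

Only the compact model crosses the network; the raw measurement stream stays local unless the edge chooses to forward some of it, which is the point of the edge-analytics generation.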
The future: Deep Learning and Distributed Mesh Computing
Edge node:
- Participates in training
- Uses the model in real time
- Collects data
- Sends some data into the mesh
The mesh:
- Performs distributed training
- Sends the model as needed
Summary
- Does Deep Learning work for sensor data? Yes: we have proof points.
- Do existing infrastructure and algorithms fit sensor data? No: training deep models for sensor data is slow and expensive today.
- The Machine and Distributed Mesh Computing: we believe this changes everything.
Thank you!
Natalia Vassilieva, nvassilieva@hpe.com
To learn more about Hewlett Packard Labs, visit: http://www.labs.hpe.com
To learn more, visit www.hpe.com/themachine