Simplified Machine Learning for CUDA
Umar Arshad (@arshad_umar)
ArrayFire (@arrayfire)
ArrayFire
- CUDA and OpenCL experts since 2007
- Headquartered in Atlanta, GA
- In search of the best and the brightest
- Expert domain experience in a wide variety of fields:
  Computer Vision, Machine Learning, Finance, etc.
ArrayFire Consulting Services
- Custom software development services
- Deep experience working with thousands of customers:
  analysis, acceleration, algorithm development
- Expert software engineers
- Large-scale software development experience
- Production-quality code
- Extensive domain knowledge
ArrayFire Training
- 2-4 day CUDA or OpenCL training, on site or at our headquarters
- Taught by a performance engineer at your side; we have seen it all and know how to fix things
- Hands-on labs: you will not be copying code; run on GPU hardware
- Customized for your application: examples target your use case
- Only C/C++ experience required
ArrayFire the Library
- A general-purpose computational library
- Backends for CUDA, OpenCL, and CPU
- Cross platform, open source (BSD 3-clause)
- Concentrates on performance and ease of use
- JIT (element-wise operations fused into single kernels)
- Hundreds of functions
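A minimal sketch of the library's style (randu, sin, and af_print are real ArrayFire calls; the example itself is illustrative, not from the deck):

    #include <arrayfire.h>
    using namespace af;

    int main() {
        array x = randu(5, 3);     // 5x3 matrix of random values on the device
        array y = sin(x) + 0.5f;   // element-wise ops are JIT-fused into one kernel
        af_print(y);               // print the result
        return 0;
    }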
Machine Learning
- Excellent for modeling high-dimensional data
- Pattern recognition
- Decision making
- Requires lots of data for good results
MNIST Dataset
- Dataset of handwritten digits
- 60,000 training samples from ~250 writers
- 28x28 grayscale pixels

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 86(11):2278-2324, November 1998.
Loading MNIST

    array train_images, train_targets;
    array test_images, test_targets;
    int num_train, num_test, num_classes;

    // Load mnist data
    setup_mnist<true>(&num_classes, &num_train, &num_test,
                      train_images, test_images,
                      train_targets, test_targets, 1.0);

[Diagram: the images are loaded as a width x height x samples volume; the targets hold one label column per sample (e.g. 2, 8, 6, 7)]
Perceptron
- Introduced in the late 1950s
- Linear classifier

[Diagram: inputs are scaled by weights W1-W8, summed, and passed through an activation function to produce the output]
Teaching a Perceptron
1. Initialize weights to zero
2. Generate response
3. Calculate error
4. Update weights
5. Repeat
Perceptron Weights

[Diagram: the weight matrix has one row per class (10) and one column per pixel (784)]
ArrayFire API
Creating arrays in ArrayFire:

    array zeros   = constant(0, 5);    // 0 0 0 0 0
    array zeros2d = constant(0, 2, 3); // 0 0 0
                                       // 0 0 0
Perceptron

    // Initialize weights to 0 (one row per class, one column per pixel)
    const int pixel_count = 28 * 28; // train_feats.dims(0)
    const int num_labels  = 10;      // train_targets.dims(0)
    array weights = constant(0, num_labels, pixel_count);
Teaching a Perceptron
1. Initialize weights to zero
2. Generate response
3. Calculate error
4. Update weights
5. Repeat
Response
- Multiply the inputs by the weights and sum
- Send the result into an activation function
Response

[Diagram: flatten each height x width image into a column of pixels; the image volume becomes a pixels x samples matrix]
ArrayFire API
Reshaping a volume into a matrix:

    array out = moddims(in, dim0, dim1, dim2, dim3);

    // Reshape images into feature vectors
    array train_feats = moddims(train_images, pixel_count, num_train);
Response

    // Reshape images into feature vectors
    array train_feats = moddims(train_images, pixel_count, num_train);
    array test_feats  = moddims(test_images,  pixel_count, num_test);
Response: Matrix Multiply!

[Diagram: response (classes(10) x samples) = weights (classes(10) x pixels(784)) x features (pixels(784) x samples)]
ArrayFire API

    array response = matmul(weights, train_feats);
Activation Function
Sigmoid function:

    // Activation function
    array sigmoid(const array &val) {
        return 1 / (1 + exp(-val));
    }
Prediction

    array prediction = sigmoid(matmul(weights, train_feats));
Teaching a Perceptron
1. Initialize weights to zero
2. Generate response
3. Calculate error
4. Update weights
5. Repeat
Calculating Error
- Subtract the prediction from the expected output
- Scale the update by the learning rate

    array err = train_targets - prediction;
    weights += learning_rate * matmulNT(err, train_feats);
Teaching a Perceptron
1. Initialize weights to zero
2. Generate response
3. Calculate error
4. Update weights
5. Repeat
Repeat

    for (int i = 0; i < 100; i++) {
        array prediction = sigmoid(matmul(weights, train_feats));
        array err = train_targets - prediction;

        float mean_abs_error = mean<float>(abs(err));
        printf("err: %0.4f\n", mean_abs_error);

        weights += learning_rate * matmulNT(err, train_feats);
    }
Results
Measure accuracy:

    float accuracy(const array& predicted, const array& target) {
        array val, plabels, tlabels;
        max(val, tlabels, target,    0);
        max(val, plabels, predicted, 0);
        return 100 * count<float>(plabels == tlabels) / tlabels.elements();
    }

Accuracy: 82.03%
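For instance, the trained weights could be scored on the held-out set (a hedged usage sketch; all the names come from the earlier slides):

    // Evaluate the trained perceptron on the test set
    array test_pred = sigmoid(matmul(weights, test_feats));
    printf("Test accuracy: %0.2f%%\n", accuracy(test_pred, test_targets));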
Perceptron Improvements
- Smaller batches and a variable learning rate (see the sketch below)
- It remains a linear classifier! Handwritten digit recognition cannot be solved by a linear classifier
- Additional layers are required
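A minimal sketch of what those two improvements might look like, reusing the arrays from the earlier slides; batch_size and the decay factor are illustrative choices, not from the deck:

    #include <algorithm> // std::min

    const int batch_size = 100;   // illustrative mini-batch size
    float lr = learning_rate;
    for (int epoch = 0; epoch < 100; epoch++) {
        for (int s = 0; s < num_train; s += batch_size) {
            // Take the next batch of columns (samples)
            seq cols(s, std::min(s + batch_size, num_train) - 1);
            array feats   = train_feats(span, cols);
            array targets = train_targets(span, cols);

            array err = targets - sigmoid(matmul(weights, feats));
            weights += lr * matmulNT(err, feats);
        }
        lr *= 0.99f; // variable (decaying) learning rate
    }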
Neural Networks
- Made up of one or more layers of neurons
- Hidden layers are updated using back propagation

[Diagram: inputs feed through hidden layers to outputs]
Back Propagation
- Hidden layers do not have an expected output
- Calculate the error at the output layer
- Propagate that error from the output layer back through the network
- Update the weights by gradient descent
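The back_propagate code on the next slide consumes the per-layer activations ("signal"). A minimal sketch of how they might be produced, assuming the same ann members (num_layers, weights) and the add_bias helper; forward_propagate itself is an assumed name:

    vector<array> ann::forward_propagate(const array &input) {
        vector<array> signal(num_layers);
        signal[0] = input;                      // input: samples x features
        for (int i = 0; i < num_layers - 1; i++) {
            array in = add_bias(signal[i]);     // prepend a bias column
            signal[i + 1] = sigmoid(matmul(in, weights[i]));
        }
        return signal;
    }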
Back Propagation

    void ann::back_propagate(const vector<array> signal,
                             const array &target,
                             const double &alpha) {
        // Get error for output layer
        array out = signal[num_layers - 1];
        array err = (out - target);
        int m = target.dims(0);

        for (int i = num_layers - 2; i >= 0; i--) {
            array in    = add_bias(signal[i]);
            array delta = (deriv(out) * err).T();

            // Adjust weights
            array grad = -(alpha * matmul(delta, in)) / m;
            weights[i] += grad.T();

            // Input to current layer is output of previous
            out = signal[i];
            err = matmul(weights[i], delta).T();

            // Remove the error of bias and propagate backward
            err = err(span, seq(1, out.dims(1)));
        }
    }
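Tying the two passes together might look like this (a hypothetical loop; the network object, alpha, and the transposes into samples-major layout are assumptions):

    // Assumes an already-constructed `ann` named network;
    // train_feats/train_targets come from the earlier slides
    const double alpha = 0.1; // illustrative learning rate
    for (int epoch = 0; epoch < 250; epoch++) {
        vector<array> signal = network.forward_propagate(train_feats.T());
        network.back_propagate(signal, train_targets.T(), alpha);
    }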
Results
- Accuracy: 93.90%
- Time: 31.30 seconds
- Epochs: 250
Back Propagation
- Effective
- Slow for deeper networks
- Requires labeled data
Deep Belief Nets
- A neural network made of multiple layers of Restricted Boltzmann Machines (RBMs)
- Deep belief networks are great with unlabeled data

[Diagram: stacked RBMs feed an ANN layer that produces the labels]
Restricted Boltzmann Machine
- A neural network with one hidden layer
- Each hidden neuron is connected to every visible neuron
- Each connection has a weight which represents how strongly the neuron reacts to that input
- A bias is associated with both hidden and visible neurons

[Diagram: bipartite graph of hidden and visible neurons]
Restricted Boltzmann Machine
Data representation:

[Diagram: the visible layer holds the data; the hidden layer holds the learned representation]
RBM
Let's create our RBM:

    class rbm {
        array weights;
        array h_bias;
        array v_bias;

    public:
        rbm(int visible_size, int hidden_size)
            : weights(constant(0, hidden_size, visible_size))
            , h_bias(constant(0, 1, hidden_size))
            , v_bias(constant(0, 1, visible_size)) {}
    };
Training an RBM
1. Feed input into the RBM
2. Calculate the response
3. Feed the response back through the network
4. Calculate the error of the reconstruction
5. Adjust the weights

(a sketch of one such step follows below)
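A minimal sketch of one such training step as contrastive divergence (CD-1), assuming the rbm class above; the train method name, the mean-field (non-sampled) hidden states, and the samples x visible input layout are assumptions, not from the deck:

    // Hypothetical member function (would be declared in the class above)
    void rbm::train(const array &v, float lr) {
        int batch = v.dims(0); // v: batch x visible

        // Feed input in; calculate the hidden response
        array h  = sigmoid(matmulNT(v, weights) + tile(h_bias, batch));

        // Feed the response back; reconstruct the visible layer,
        // then compute the hidden response of the reconstruction
        array vr = sigmoid(matmul(h, weights) + tile(v_bias, batch));
        array hr = sigmoid(matmulNT(vr, weights) + tile(h_bias, batch));

        // Adjust weights and biases by the gap between the data
        // statistics and the reconstruction statistics
        weights += lr * (matmulTN(h, v) - matmulTN(hr, vr)) / batch;
        h_bias  += lr * sum(h - hr, 0) / batch;
        v_bias  += lr * sum(v - vr, 0) / batch;
    }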
Building the DBN
- Feed the output of the previous layer into the next
- Learns higher-level features
- Use back propagation to fine-tune the network (a greedy layer-wise sketch follows below)
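Greedy layer-wise pretraining might be sketched like this, assuming the rbm class and train method above plus a hypothetical rbm::forward that returns the hidden activations; the layer sizes and epoch count are illustrative:

    vector<rbm> layers;
    layers.push_back(rbm(pixel_count, 500)); // illustrative layer sizes
    layers.push_back(rbm(500, 250));

    array input = train_feats.T();           // samples x pixels
    for (size_t i = 0; i < layers.size(); i++) {
        for (int epoch = 0; epoch < 15; epoch++)
            layers[i].train(input, 0.1f);
        // The output of this layer becomes the input to the next
        // (forward() is a hypothetical helper returning the hidden response)
        input = layers[i].forward(input);
    }
    // Finally, fine-tune the whole stack (plus a label layer)
    // with back propagation, as in the ann code above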
Results
- Accuracy: 93.46%
- Time: 13.27 seconds
- Epochs: 108
Improvements
- More data
- Larger network
- Tuned learning rate
- More iterations
Questions