Introduction to PyTorch. Jingfeng Yang (Part of tutorials and slides are made by Nihal Singh)

Similar documents
Simplified Machine Learning for CUDA. Umar

CUDAMat: a CUDA-based matrix class for Python

Lecture 6: Classification & Localization. boris. ginzburg@intel.com

Advanced analytics at your hands

Predict Influencers in the Social Network

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15

Supporting Online Material for

Implementation of Neural Networks with Theano.

Pedestrian Detection with RCNN

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Lecture 8 February 4

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS

Sequence to Sequence Weather Forecasting with Long Short-Term Memory Recurrent Neural Networks

Analytic Modeling in Python

Application of Neural Network in User Authentication for Smart Home System

PhD in Computer Science and Engineering Bologna, April Machine Learning. Marco Lippi. Marco Lippi Machine Learning 1 / 80

STA 4273H: Statistical Machine Learning

Supervised Feature Selection & Unsupervised Dimensionality Reduction

Data Mining Practical Machine Learning Tools and Techniques

Getting Started with Caffe Julien Demouth, Senior Engineer

Intro to scientific programming (with Python) Pietro Berkes, Brandeis University

Self Organizing Maps: Fundamentals

Unlocking the True Value of Hadoop with Open Data Science

Introduction to GPU Programming Languages

Programming for Big Data

Data Mining. Nonlinear Classification

DEEP LEARNING WITH GPUS

Efficient online learning of a non-negative sparse autoencoder

Programming Exercise 3: Multi-class Classification and Neural Networks

Calling R Externally : from command line, Python, Java, and C++

Journée Thématique Big Data 13/03/2015

NEURAL NETWORKS A Comprehensive Foundation

Parallel & Distributed Optimization. Based on Mark Schmidt s slides

1. Classification problems

GPU Programming in Computer Vision

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

Anomaly detection. Problem motivation. Machine Learning

1 Topic. 2 Scilab. 2.1 What is Scilab?

Performance Evaluation On Human Resource Management Of China S Commercial Banks Based On Improved Bp Neural Networks

Big Data Paradigms in Python

New Work Item for ISO Predictive Analytics (Initial Notes and Thoughts) Introduction

WORKSPACE WEB DEVELOPMENT & OUTSOURCING TRAINING CENTER

Big Data Analytics. Lucas Rego Drumond

MACHINE LEARNING IN HIGH ENERGY PHYSICS

Computational Mathematics with Python

Peach Fuzzer Platform

KNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa

Data Mining Techniques for Prognosis in Pancreatic Cancer

Real-Time Analytics on Large Datasets: Predictive Models for Online Targeted Advertising

Masters in Information Technology

HPC with Multicore and GPUs

Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations

Lecture 6. Artificial Neural Networks

University of Cambridge Engineering Part IIB Module 4F10: Statistical Pattern Processing Handout 8: Multi-Layer Perceptrons

Collaborative Filtering Scalable Data Analysis Algorithms Claudia Lehmann, Andrina Mascher

Big Data Analytics Using Neural networks

Exercise 0. Although Python(x,y) comes already with a great variety of scientic Python packages, we might have to install additional dependencies:

ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node

Parallel and Large Scale Learning with scikit-learn

SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH

BIG DATA What it is and how to use?

Scalable Developments for Big Data Analytics in Remote Sensing

T : Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari :

Part I Courses Syllabus

Machine Learning in Python with scikit-learn. O Reilly Webcast Aug. 2014

Polynomial Neural Network Discovery Client User Guide

Assignment 2: Option Pricing and the Black-Scholes formula The University of British Columbia Science One CS Instructor: Michael Gelbart

Introduction to Learning & Decision Trees

Neural network software tool development: exploring programming language options

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

Computational intelligence in intrusion detection systems

CS Matters in Maryland CS Principles Course

L25: Ensemble learning

Chapter 6. The stacking ensemble approach

Learning to Process Natural Language in Big Data Environment

Neural Network Add-in

Image Captioning A survey of recent deep-learning approaches

An Introduction to APGL

Tutorial on Markov Chain Monte Carlo

Computational Mathematics with Python

COMP 598 Applied Machine Learning Lecture 21: Parallelization methods for large-scale machine learning! Big Data by the numbers

IBM SPSS Neural Networks 22

How To Develop Software

Name Spaces. Introduction into Python Python 5: Classes, Exceptions, Generators and more. Classes: Example. Classes: Briefest Introduction

Table of Contents. Chapter No. 1 Introduction 1. iii. xiv. xviii. xix. Page No.

TAACO Quick Start Guide: Windows 7 Kristopher Kyle and Scott Crossley

Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering

Parallel Computing: Strategies and Implications. Dori Exterman CTO IncrediBuild.

Data Mining Techniques Chapter 7: Artificial Neural Networks

Apache Spark 11/10/15. Context. Reminder. Context. What is Spark? A GrowingStack

Postprocessing with Python

Analecta Vol. 8, No. 2 ISSN

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

Image and Video Understanding

Feature Engineering in Machine Learning

SIGNAL INTERPRETATION

Parallel Data Mining. Team 2 Flash Coders Team Research Investigation Presentation 2. Foundations of Parallel Computing Oct 2014

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90

CHAPTER 6 IMPLEMENTATION OF CONVENTIONAL AND INTELLIGENT CLASSIFIER FOR FLAME MONITORING

Transcription:

Introduction to PyTorch Jingfeng Yang (Part of tutorials and slides are made by Nihal Singh)

Outline Pytorch Introduction Basics Examples

Introduction to PyTorch

What is PyTorch? Open source machine learning library Developed by Facebook's AI Research lab It leverages the power of GPUs Automatic computation of gradients Makes it easier to test and develop new ideas.

Other libraries?

Why PyTorch? It is pythonic - concise, close to Python conventions Strong GPU support Autograd - automatic differentiation Many algorithms and components are already implemented Similar to NumPy

Why PyTorch?

Getting Started with PyTorch Installation Via Anaconda/Miniconda: conda install pytorch Via pip: pip3 install torch

PyTorch Basics

ipython Notebook Tutorial bit.ly/pytorchbasics

Tensors Tensors are similar to NumPy s ndarrays, with the addition being that Tensors can also be used on a GPU to accelerate computing. Common operations for creation and manipulation of these Tensors are similar to those for ndarrays in NumPy. (rand, ones, zeros, indexing, slicing, reshape, transpose, cross product, matrix product, element wise multiplication)

Tensors Attributes of a tensor 't': t= torch.randn(1) requires_grad- making a trainable parameter By default False Turn on: t.requires_grad_()or t = torch.randn(1, requires_grad=true) Accessing tensor value: t.data Accessingtensor gradient t.grad grad_fn- history of operations for autograd t.grad_fn

Loading Data, Devices and CUDA Numpy arrays to PyTorch tensors torch.from_numpy(x_train) Returns a cpu tensor! PyTorch tensor to numpy t.numpy() Using GPU acceleration t.to() Sends to whatever device (cuda or cpu) Fallback to cpu if gpu is unavailable: torch.cuda.is_available() Check cpu/gpu tensor OR numpy array? type(t)or t.type()returns numpy.ndarray torch.tensor CPU - torch.cpu.floattensor GPU - torch.cuda.floattensor

Autograd Automatic Differentiation Package Don t need to worry about partial differentiation, chain rule etc. backward() does that Gradients are accumulated for each step by default: Need to zero out gradients after each update tensor.grad_zero()

Optimizer and Loss Optimizer Adam, SGD etc. An optimizer takes the parameters we want to update, the learning rate we want to use along with other hyper-parameters and performs the updates Loss Various predefined loss functions to choose from L1, MSE, Cross Entropy

Model In PyTorch, a model is represented by a regular Python class that inherits from the Module class. Two components init (self): it defines the parts that make up the model- in our case, two parameters, a and b forward(self, x) : it performs the actual computation, that is, it outputs a prediction, given the inputx

PyTorch Example (neural bag-of-words (ngrams) text classification) bit.ly/pytorchexample

Overview Sentence Embedding Layer Linear Layer Training Cross Entropy Softmax Evaluation Prediction

Design Model Initilaize modules. Use linear layer here. Can change it to RNN, CNN, Transformer etc. Randomly initilaize parameters Foward pass

Preprocess Build and preprocess dataset Build vocabulary

Preprocess One example of dataset: Create batch ( Used in SGD ) Choose pad or not ( Using [PAD] )

Training each epoch Iterable batches Before each optimization, make previous gradients zeros Forward pass to compute loss Backforward propagation to compute gradients and update parameters After each epoch, do learning rate decay ( optional )

Test process Do not need back propagation or parameter update!

The whole training process Use CrossEntropyLoss() as the criterion. The input is the output of the model. First do logsoftmax, then compute cross-entropy loss. Use SGD as optimizer. Use exponential decay to decrease learning rate Print information to monitor the training process

Evaluation with test dataset or random news