Steven C.H. Hoi, School of Information Systems, Singapore Management University. Email: chhoi@smu.edu.sg
Introduction (http://stevenhoi.org/). Machine Learning for Finance, Recommender Systems, Cyber Security, Visual Recognition, Social Media, and Multimedia Analytics. Contact: chhoi@smu.edu.sg. 18/9/2015 Machine Learning (Steven Hoi)
What is Big Data? Volume, Velocity, Variety. Source: http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-vs-of-big-data.jpg
Data-Driven Decision Making
Big Data Analytics: Challenges for Machine Learning. Volume (Efficiency): handle vast volumes of data (millions or even billions of records) with limited computing capacity (CPU/RAM/disk). Velocity (Scalability): scale up to handle explosively growing data (e.g., real-time stream data). Variety (Adaptability): adapt to complex and fast-changing environments to deal with diverse data and evolving concepts.
Roadmap: Introduction; Machine Learning (Online Learning, Deep Learning); Applications (Cyber-security Analytics, Image Analytics); Conclusions.
What is Machine Learning? Herbert Simon: "Learning is any process by which a system improves performance from experience." Tom M. Mitchell: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." A well-defined learning task: <P, T, E>
Types of Learning. Supervised (inductive) learning: training data includes desired outputs. Unsupervised learning: training data does not include desired outputs. Semi-supervised learning: training data includes a few desired outputs. Reinforcement learning: rewards from a sequence of actions.
Supervised Learning. Given (input, correct output) pairs, predict the output for a new input: (input, ?). Classification (discrete output): in binary classification, given input x, find y in {-1, +1}; in multi-class classification, given input x, find y in {1, ..., k}. Regression (continuous output): given input x, find y in the real-valued space R (or R^d).
Traditional Supervised ML: two major drawbacks. Batch learning, addressed by Online Learning. Shallow models, addressed by Deep Learning.
What is Online Learning? Batch/Offline Learning vs. Online Learning. In the online loop, feedback on each prediction goes to the learner, which updates the predictor.
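The predict, feedback, update loop can be sketched with a classic perceptron. This is a minimal illustration, not code from the talk or from LIBOL; the function names `dot` and `online_perceptron` are hypothetical, and labels are assumed to be in {-1, +1}.

```python
from typing import Iterable, List, Sequence, Tuple


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Inner product of the weight vector and one input."""
    return sum(wi * xi for wi, xi in zip(w, x))


def online_perceptron(
    stream: Iterable[Tuple[Sequence[float], int]], dim: int
) -> Tuple[List[float], int]:
    """Process (x, y) pairs one at a time; y must be -1 or +1.

    Returns the final weights and the number of online mistakes.
    """
    w = [0.0] * dim
    mistakes = 0
    for x, y in stream:
        y_hat = 1 if dot(w, x) >= 0 else -1  # 1. predict with the current model
        if y_hat != y:                       # 2. receive feedback (the true label)
            mistakes += 1
            # 3. update the predictor immediately; no re-training on old data
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes


if __name__ == "__main__":
    data = [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([1.0, 0.0], 1), ([0.0, 1.0], -1)]
    print(online_perceptron(data, dim=2))
```

Each example is seen once and then discarded, which is why no re-training is needed when new data arrives.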
Why Online Learning? Avoids re-training when adding new data; high efficiency; excellent scalability; strong adaptability to changing environments; simple to understand; trivial to implement; easy to parallelize; theoretical guarantees.
What is Deep Learning? A family of machine learning algorithms based on multi-layer networks. The origin dates back to the early work on Artificial Neural Networks in the 1980s or earlier. Partially inspired by the biological architecture of the brain in neuroscience. Higher layers form higher levels of abstraction. Source: https://grey.colorado.edu/compcogneuro/index.php/ccnbook/perception
Shallow Learning vs Deep Learning. Figure panels: (a) linear models f(x); (b) non-linear models with shallow architecture, where f(x) combines kernel units K(x1,x), ..., K(xi,x), ..., K(xn,x); (c) non-linear models with deep architecture.
Why Deep Learning? Figure: Accuracy against Scalability (small data to big data), comparing Traditional Learning (simple model) with Deep Learning (advanced model). Annotation: Machine Learning + Big Data -> Good Products.
Machine Learning for Big Data Analytics Applications: Finance, Recommender Systems, Cyber Security, Video Analytics, Social Media, Image Analytics (People to Machine, People to People, Machine to Machine).
Machine Learning for Big Data Analytics Applications: Cyber Security
Machine Learning for Cyber Security Analytics. Cyber security analytics is moving from traditional log- and rule-based approaches to machine-learning-based approaches, supported by Big Data Analytics, to make informed decisions in real time.
Machine Learning for Cyber Security: Online Anomaly Detection (outlier/intrusion/fraud). Examples: fraudulent credit card transactions; malicious web / spam email filtering; network intrusion detection systems.
Machine Learning for Cyber Security: Challenges of Online Anomaly Detection. Manage real-time data and respond instantly; highly imbalanced classes (#anomalies << #normal); different misclassification costs; labeling can be expensive; anomaly concepts/patterns often evolve over time. Reference: Cost-Sensitive Online Active Learning for Malicious URLs (KDD 13)
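Three of the challenges above (different misclassification costs, expensive labels, streaming data) can be illustrated together in a few lines. The sketch below is NOT the algorithm of the KDD 13 paper; it combines a perceptron with class-dependent step sizes and a standard margin-based query rule (ask for a label with probability delta / (delta + |margin|)). The function name and the parameters `cost_pos`, `cost_neg`, and `delta` are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional, Sequence, Tuple


def cost_sensitive_active_perceptron(
    stream: Iterable[Tuple[Sequence[float], int]],
    dim: int,
    cost_pos: float = 5.0,   # assumed cost of missing an anomaly (y = +1)
    cost_neg: float = 1.0,   # assumed cost of a false alarm (y = -1)
    delta: float = 1.0,      # larger delta means more labels are requested
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], int]:
    """Return the final weights and the number of labels queried."""
    rng = rng or random.Random(0)
    w = [0.0] * dim
    queried = 0
    for x, y in stream:
        margin = sum(wi * xi for wi, xi in zip(w, x))
        # Active learning: uncertain predictions (small |margin|) are more
        # likely to be sent for labeling; confident ones are mostly skipped.
        if rng.random() < delta / (delta + abs(margin)):
            queried += 1  # the label y is only "paid for" in this branch
            if y * margin <= 0:
                # Cost-sensitive update: mistakes on the rare, expensive
                # class move the model further than mistakes on the other.
                step = cost_pos if y == 1 else cost_neg
                w = [wi + step * y * xi for wi, xi in zip(w, x)]
    return w, queried
```

With a zero weight vector the margin is 0, so the first example is always queried; as the model grows more confident, fewer labels are requested.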
Machine Learning for Big Data Analytics Applications: Image Analytics
The Rising Visual Internet. Social trend: more VISUAL, less TEXT content.
The Era of Deep Learning. Figure: Large Scale Visual Recognition Challenge error rate from 2010 to 2014, with the Deep Learning period marked. Source: https://medium.com/global-silicon-valley/machine-learning-yesterday-today-tomorrow-3d3023c7b519
Deep Learning for Visual Recognition. Classical computer vision pipeline: feature extraction (SIFT, HoG, ...) followed by detection, classification, recognition. Deep learning pipeline: a stack of deep NN layers followed by detection, classification, recognition.
Deep Learning: Basics. Deep Learning is a set of machine learning algorithms based on multi-layer networks. Example output labels: CAT, DOG.
Convolutional Neural Nets (CNN) (LeCun et al. 1989). Each output neuron is connected only to a local patch of input pixels. The same parameters are shared across different locations: a filter scans the input image to generate a feature map. Multiple filters scan the image to generate multiple feature maps. Example: LeNet (feature extraction followed by classification).
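Local connectivity and parameter sharing can be seen in a plain-Python sliding-window filter. This is a minimal sketch (no padding, stride 1, single channel, and, as in most deep learning libraries, cross-correlation rather than a flipped-kernel convolution); the function name `conv2d_valid` is hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> List[List[float]]:
    """Slide one kernel over the image and return one feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    fmap = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends only on a local kh x kw patch, and
            # the SAME kernel weights are reused at every location (i, j).
            row.append(sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw)
            ))
        fmap.append(row)
    return fmap


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    filters = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]
    # Multiple filters give multiple feature maps, one per filter.
    feature_maps = [conv2d_valid(image, k) for k in filters]
    print(feature_maps)
```

A 2x2 kernel has only 4 weights regardless of the image size, which is the point of sharing parameters across locations.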
Deep Convolutional Neural Nets. AlexNet (Krizhevsky et al. NIPS 12): a deep convolutional neural network trained on ILSVRC12. 60M parameters, 650K neurons, 832M FLOPs in total. Key ingredients: + data, + GPU, + non-saturating nonlinearity, + regularization. Training took 5~6 days on two NVIDIA GTX 580 GPUs.
Layers from prediction (top) to input (bottom):
Layer                  #parameters   #FLOPs
LINEAR                 4M            4M
FULLY CONNECTED        16M           16M
FULLY CONNECTED        37M           37M
MAX POOLING
CONV                   442K          74M
CONV                   1.3M          224M
CONV                   884K          149M
MAX POOLING
LOCAL RESPONSE NORM
CONV                   307K          223M
MAX POOLING
LOCAL RESPONSE NORM
CONV                   35K           105M
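Several of the per-layer parameter counts on the AlexNet slide can be reproduced with simple arithmetic. The layer shapes below are NOT given on the slide; they are assumptions taken from the commonly cited AlexNet configuration (for example, 48 input channels for the second convolution because of the two-GPU split). Only weights are counted, biases are ignored, and the two deepest convolutions are left out because their counts depend on how the GPU grouping is treated.

```python
def conv_weights(kernel_h: int, kernel_w: int, in_ch: int, out_ch: int) -> int:
    """Weights in a convolutional layer: one kh x kw x in_ch filter per output channel."""
    return kernel_h * kernel_w * in_ch * out_ch


def fc_weights(n_in: int, n_out: int) -> int:
    """Weights in a fully connected layer: one per (input, output) pair."""
    return n_in * n_out


# Assumed shapes (not stated on the slide).
layers = {
    "conv1": conv_weights(11, 11, 3, 96),    # slide: 35K
    "conv2": conv_weights(5, 5, 48, 256),    # slide: 307K
    "conv3": conv_weights(3, 3, 256, 384),   # slide: 884K
    "fc1": fc_weights(6 * 6 * 256, 4096),    # slide: 37M
    "fc2": fc_weights(4096, 4096),           # slide: 16M
    "fc3": fc_weights(4096, 1000),           # slide: 4M
}

if __name__ == "__main__":
    for name, count in layers.items():
        print(f"{name}: {count:,}")
```

The fully connected layers hold most of the parameters, while the slide's FLOP column shows the convolutional layers doing most of the computation.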
More Deep Conv Nets: VGG Nets (16 layers / 19 layers), Zeiler-Fergus Net, GoogLeNet. Figure legend: Convolution, Pooling, Softmax, Other.
How to train a BIG deep net quickly? GPU-based training on a GPU cluster: NVIDIA K80 cards; a head node (external access) connected via InfiniBand to Compute Node 1 and Compute Node 2. Parallelism principles: model parallelism, data parallelism. Empirical evaluation.
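Data parallelism can be sketched without any GPU: split a mini-batch into equal shards, compute a gradient per shard, and average. The example below uses a linear model with mean squared error purely for illustration; on a real cluster each shard would be processed on a different GPU and the averaging would be a communication step. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]


def grad_mse(w: Sequence[float], batch: Sequence[Example]) -> List[float]:
    """Gradient of the mean of (w.x - y)^2 over the batch."""
    n = len(batch)
    g = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for k, xk in enumerate(x):
            g[k] += 2.0 * err * xk / n
    return g


def data_parallel_grad(
    w: Sequence[float], batch: Sequence[Example], n_workers: int
) -> List[float]:
    """Average the per-shard gradients of equally sized shards."""
    if len(batch) % n_workers != 0:
        raise ValueError("batch size must be divisible by n_workers")
    size = len(batch) // n_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(n_workers)]
    # Every worker holds a full copy of w and sees only its own shard.
    grads = [grad_mse(w, shard) for shard in shards]
    return [sum(g[k] for g in grads) / n_workers for k in range(len(w))]
```

With equal shard sizes the averaged gradient equals the full-batch gradient up to floating-point rounding. Model parallelism, by contrast, splits the network itself across devices and is not shown here.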
Content-Based Image Retrieval (CBIR)
CBIR Pipelines and Challenges. Pipeline: query image -> feature representation (color, texture, BoW; or a deep NN) -> similarity search (Euclidean, cosine; or a deep NN) over an indexed image DB -> top k similar images. Key open questions: How to represent an image? How to measure image similarity? Deep Learning for CBIR: how to adapt to a new CBIR domain/task?
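The similarity-search stage of the pipeline can be sketched as a brute-force top-k ranking by cosine similarity. The feature vectors here are toy values standing in for whatever representation is used (color, BoW, or deep features); the function names and the dictionary-based "index" are assumptions for illustration, and a real system would use a proper index instead of a linear scan.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float], database: Dict[str, Sequence[float]], k: int
) -> List[str]:
    """Return the ids of the k database images most similar to the query."""
    scored = [
        (cosine_similarity(query, feat), image_id)
        for image_id, feat in database.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


if __name__ == "__main__":
    db = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k_similar([1.0, 0.1], db, k=2))
```

Swapping `cosine_similarity` for a negated Euclidean distance changes the ranking rule without touching the rest of the pipeline.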
Deep Learning for Image Retrieval. Convolutional Neural Network (Krizhevsky et al., NIPS 12), total # parameters: 60 million. Training data: massive source images in various categories (e.g., ImageNet ILSVRC2012). Architecture: input raw RGB image -> loops of convolutional filtering, local contrast normalization, and sample pooling for high-level features (normalization and pooling are optional), producing low-level, mid-level, and high-level features -> fully connected layer (FC1) -> fully connected layer (FC2) -> final output labels (FC3).
Deep Learning for Image Retrieval: Deep Learning for CBIR on a NEW Domain. Apply CNN models (fully connected layers FC1 and FC2, final output labels FC3) to new image datasets (New Image Retrieval Dataset 1, 2, ..., n). Feature representation for CBIR: SCHEME-I: Direct Representation; SCHEME-II: Similarity/Metric Learning; SCHEME-III: Deep Model Fine-tuning. Reference: Wan, Ji, Dayong Wang, Steven Chu Hong Hoi, Pengcheng Wu, Jianke Zhu, Yongdong Zhang, and Jintao Li. "Deep learning for content-based image retrieval: A comprehensive study." In ACM Multimedia, pp. 157-166. ACM, 2014.
Scheme III: Deep Model Fine-tuning. Fine-tuning w/ labeled data (explicit class labels, e.g., dog, cat). Fine-tuning w/ side information (weakly labeled pairs: similar, dissimilar).
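Side information in the form of similar/dissimilar pairs needs a loss defined on pairs rather than on class labels. One common choice is a contrastive loss, sketched below on two feature vectors; this is an assumption for illustration and not necessarily the loss used in the cited study. The function name and the `margin` parameter are hypothetical.

```python
import math
from typing import Sequence


def contrastive_loss(
    feat_a: Sequence[float],
    feat_b: Sequence[float],
    similar: bool,
    margin: float = 1.0,
) -> float:
    """Pairwise loss on two feature vectors.

    Similar pairs are penalized by their squared distance (pulled together);
    dissimilar pairs are penalized only when closer than the margin (pushed apart).
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if similar:
        return dist ** 2
    return max(0.0, margin - dist) ** 2
```

During fine-tuning, the gradient of such a loss would be propagated back through the network that produced `feat_a` and `feat_b`; with explicit class labels, a standard classification loss on the output layer plays the same role.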
Comparison of three schemes. All three use the same network stack, from input to output: Input, Conv1, LRNorm, MP, Conv2, LRNorm, MP, Conv3, Conv4, Conv5, MP, DF.FC1, DF.FC2, DF.FC3. Scheme I: direct output, w/o training data. Scheme II: similarity learning w/ side info (Similarity/Metric Learning). Scheme III: deep model fine-tuning w/ side info (loss on side info) vs. w/ labeled data (loss on labeled data).
Evaluation of Retrieval Accuracy: landmark image retrieval performance on Paris (Similarity Learning vs. Deep Fine-tuning).
DEMO: Visual Search and Image Recognition via Deep Convolutional Neural Networks http://dcnn.stevenhoi.org/
Conclusions. Machine Learning for Big Data: Online Learning, Deep Learning. Deep Learning applications: large-scale image retrieval. Ongoing research and applications: large-scale deep learning on many GPUs; many more exciting applications.
Q&A. Thank You! Email: chhoi@smu.edu.sg. LIBOL: An open-source Library of Online Learning Algorithms, http://libol.stevenhoi.org