Pedestrian Detection with RCNN
Matthew Chen
Department of Computer Science
Stanford University

Abstract

In this paper we evaluate the effectiveness of using a Region-based Convolutional Neural Network (R-CNN) approach for the problem of pedestrian detection. Our dataset is composed of manually annotated video sequences from the ETH vision lab. Using selective search as our proposal method, we evaluate the performance of several neural network architectures as well as a baseline logistic regression unit. We find that the best result was split between the AlexNet architecture with weights pre-trained on ImageNet and a variant of this network trained from scratch.

1 Introduction

Pedestrian tracking has numerous applications, from autonomous vehicles to surveillance. Traditionally, many detection systems were based on hand-tuned features that were then fed into a learning algorithm. Here we take advantage of recent work in Convolutional Neural Networks (CNNs) to pose the problem as a classification and localization task. In particular, we explore the use of Region-based Convolutional Neural Networks. The process starts with separating the video into frames, which are processed individually. For each frame we generate class-independent proposal boxes, in our case with a method called selective search. We then train a deep neural network classifier to classify each proposal as either pedestrian or background.

2 Related Work

This paper largely follows the Region-based Convolutional Neural Network approach introduced in [6]. This method tackles the problem of classifying and localizing objects in an image by running detection on a series of proposal boxes. These proposal boxes are generally precomputed offline using low-level, class-independent segmentation methods such as selective search [12], though recent work has incorporated this process into the neural network pipeline [11]. Given the proposals, a deep convolutional neural network is trained to generate features which are fed into class-specific SVM classifiers. This approach has proved successful in localizing a large class of items for the PASCAL VOC challenge.

For the architecture of our CNN we test a baseline logistic method and compare it to results from implementations of well-known CNNs. These include CifarNet, which was developed in [9] for the CIFAR-10 dataset, and AlexNet, which won the 2012 ImageNet challenge [10]. Additionally, we look at the effect of using pre-trained weights for AlexNet and fine-tuning only the last layer.

The ETH pedestrian tracking dataset was established through a sequence of papers [4] [3] [5]. These papers use additional collected information, including stereo vision and odometry data, as further inputs to their models. We use only the monocular camera data from each of the collected video sequences.
3 Dataset and Features

The dataset is composed of several sequences of video produced by a camera on a moving platform. Each frame has hand-labelled annotations denoting bounding boxes of pedestrians. Overall there are seven video sequences with a combined 4,534 frames. The data was split into a training set containing the frames from five sequences (3,105 frames) and a test set containing the frames from two sequences (1,429 frames), as shown in Table 1. The annotations are not complete in that they do not strictly label all pedestrians in a given image. It is usually the case that only pedestrians which occupy a certain subjective fraction of the screen are labelled. This leads to what could be some false negatives in the training set.

Table 1: Data statistics split up by training and test sets (rows: Num Images, Avg Pedestrians, Avg Proposals, Pos Proposals, Neg Proposals; columns: Train, Test).

Figure 1: Original image on top left; positive selective search bounding boxes on top right; warped background and pedestrian images on bottom left and right, respectively.

Only a subset of the data was used. For the two sequences in the test set, the annotations were sparse in that they were recorded only on every fourth frame, so we included only the frames for which annotations existed. Additionally, for training, we set the number of proposals used proportional to the number of positive proposals. Specifically, we set the ratio at 1:2 positive pedestrian images to negative background images.

4 Methods

The complete pipeline from video frame to bounding-box output is shown in Figure 2. We start with a given video sequence and split it up into frames. We then run an algorithm to generate proposal bounding boxes, in this case selective search [12], which we cache for use across the process. We pass these bounding boxes along with the original image to the detector, which is our convolutional neural network. The CNN produces softmax scores for each bounding box, which are used in the final non-maximal suppression step.
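To make the flow concrete, the sketch below mirrors this frame-to-boxes pipeline. It is a minimal illustration, not the author's implementation: the proposal generator, CNN scorer, and non-maximal suppression are passed in as callables because the paper's own components (cached selective search, a TensorFlow CNN) are not reproduced here, and the 0.5 score threshold is an assumption for illustration.

```python
# Minimal sketch of the detection pipeline described above.
# `propose`, `score`, and `suppress` stand in for cached selective search,
# the CNN detector, and non-maximal suppression; they are hypothetical hooks.

def detect_pedestrians(frames, propose, score, suppress, score_threshold=0.5):
    """frames: iterable of images; propose(frame) -> list of boxes;
    score(frame, boxes) -> softmax pedestrian probability per box;
    suppress(scored_boxes) -> scored boxes after non-maximal suppression."""
    results = []
    for frame in frames:
        boxes = propose(frame)                      # class-independent proposal boxes
        probs = score(frame, boxes)                 # warp each box and classify with the CNN
        kept = [(b, p) for b, p in zip(boxes, probs) if p >= score_threshold]
        results.append(suppress(kept))              # drop overlapping, lower-scored boxes
    return results
```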
Figure 2: Pedestrian detection pipeline: the raw image and cached proposal bounding boxes are fed to the pedestrian detector, whose scores are passed to non-maximal suppression.

Figure 3: Comparison of the Selective Search and EdgeBoxes proposal algorithms by intersection over union (IOU) on the training and test sets.

4.1 Proposal Boxes

We tested two different proposal box methods which are popular in the literature [8]. The EdgeBoxes method [14] actually performed better in terms of the average intersection over union (IOU) across all images in the set, as shown in Figure 3. However, we opted to use selective search because it proposed fewer boxes and hence reduced the runtime of our algorithm.

We started by precomputing proposal boxes for all frames in the dataset. We then created an image processor to preprocess and warp these proposals, which varied in size and aspect ratio, into a fixed size for input into our neural network. The preprocessing involved mean subtraction and whitening for each frame. Using these sub-sampled images, we trained a convolutional neural network to classify each proposal as either background or pedestrian. Given the proposals and the ground-truth bounding boxes, we generated a training set in which each proposal was labeled based on its overlap with the ground truth. We used a threshold of 0.3, so that proposals with at least this much overlap with a ground-truth bounding box were considered positive examples and the rest negative.
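As a concrete illustration of this labeling rule, the sketch below computes the IOU between each proposal and the ground-truth boxes, marks proposals with IOU of at least 0.3 as positives, and sub-samples negatives at the 1:2 positive-to-negative ratio used for training. The (x1, y1, x2, y2) box format and the helper names are assumptions for illustration, not the paper's code.

```python
import random

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_proposals(proposals, ground_truth, pos_iou=0.3, neg_per_pos=2):
    """Label proposals against ground truth, then keep two negatives per positive."""
    positives = [p for p in proposals
                 if any(iou(p, gt) >= pos_iou for gt in ground_truth)]
    negatives = [p for p in proposals
                 if all(iou(p, gt) < pos_iou for gt in ground_truth)]
    negatives = random.sample(negatives, min(len(negatives), neg_per_pos * len(positives)))
    return positives, negatives
```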
4.2 Detector

We start by baselining our results relative to a logistic regression network that we train on our images. Additionally, we experiment with various neural network architectures and measure their performance on our task. We begin with the CifarNet architecture, which takes images at a 32x32x3 scale [9]. For the pre-trained AlexNet architecture we maintained the exact architecture specified by the original paper, with the exception of the last layer, which was replaced by a softmax function with two outputs initialized with random weights [10]. The softmax function generates what can be interpreted as the probability of each class and is defined as follows:

p(y = i \mid x; \theta) = \frac{e^{\theta_i^T x}}{\sum_{j=1}^{k} e^{\theta_j^T x}}

Next, we implement a simplified variant of the AlexNet architecture in which we remove the grouped convolution and local response normalization layers. We modify the middle convolutional layers to maintain full spatial depth from the previous layers. Doing so simplifies the implementation, as grouping was used primarily due to memory constraints from training on two separate GPUs in the original implementation. These networks were built using the TensorFlow library [1] and trained on multiple CPUs in parallel (the exact number of CPUs and their specifications varied, as training was run across multiple servers).
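A minimal NumPy sketch of the two-output softmax defined above is shown below, with k = 2 for the background/pedestrian classes. The weight matrix and the 4096-dimensional feature vector (the size of AlexNet's fully connected activations) are assumptions for illustration; the max subtraction is only for numerical stability and does not change the result.

```python
import numpy as np

def softmax_probs(x, theta):
    """p(y = i | x; theta) = exp(theta_i^T x) / sum_j exp(theta_j^T x).

    x     : feature vector from the last fully connected layer, shape (d,)
    theta : per-class weight vectors, shape (k, d); k = 2 here
    """
    logits = theta @ x                  # theta_i^T x for each class i
    logits -= logits.max()              # numerical stability only
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum()

# Example with randomly initialized weights for the two-class head.
rng = np.random.default_rng(0)
theta = rng.normal(scale=0.01, size=(2, 4096))   # assumed 4096-d fc features
x = rng.normal(size=4096)
print(softmax_probs(x, theta))                   # two probabilities summing to 1
```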
Figure 4: Precision-recall curves for the various architectures (alexnet, alexnet_pretrained, cifarnet, logitnet) on the test set.

Figure 5: Miss rate versus false positives for the various architectures (alexnet, alexnet_pretrained, cifarnet, logitnet).

Figure 6: Example final output of our algorithm.

5 Results

After training each net for 4,000 iterations, where each iteration used a random proportional sub-sample of 50 images across the entire dataset, we tested each net on the complete test set. Figure 4 shows a comparison of the precision and recall curves of each net as the threshold for classifying a proposal as pedestrian rather than background is adjusted. Precision and recall are calculated on the test set using the same pedestrian overlap threshold to denote positive proposal boxes. For AlexNet with pre-trained weights, all weights besides the fully connected layers and the last convolutional layer were frozen at initialization for fine-tuning. All other networks were trained from scratch with weights initialized from a truncated normal distribution with standard deviation proportional to the input size.

We find that AlexNet with pre-trained weights from ImageNet performs best at moderate prediction score thresholds, while a slightly larger variant of AlexNet, trained from scratch, performs better at higher acceptance thresholds. The CifarNet performance is not far behind, which is interesting as it has an order of magnitude fewer parameters and takes input images of size 32x32x3 compared to 227x227x3 for AlexNet. For the non-maximal suppression step we aimed for a minimal threshold for our bounding boxes. We then did a final evaluation of our algorithm by looking at the miss rate versus false positives curve shown in Figure 5. We can see that our best configuration still has a 70 percent miss rate. An example of the final output for a given frame is shown in Figure 6.

6 Discussion

We find that AlexNet with pre-trained weights from ImageNet and fine-tuning of the last layers performs best over a majority of the threshold levels. Overall, our approach still has a miss rate that is too high for many real-world applications. This rate can be due to several factors. First is the proposal box method: as we have shown, the best bounding boxes only had a mean IOU of around 0.6, which serves as an upper bound on the accuracy of our overall system. Next, it is likely that our neural networks could have benefited from additional training time to reach convergence, as the number of iterations we used was relatively small compared to the amount of training performed on ImageNet and other large vision benchmarks. Additional tuning of hyperparameters, such as regularization on the fully connected layers, would also likely improve the results.

The use of Region-based Convolutional Neural Networks for the task of pedestrian detection does show promise. Our example output image shows that the technique produces reasonable bounding boxes. However, additional work needs to be done to improve the overall performance of the system.

7 Future Work

The main constraint on the work presented in this paper was the lack of GPU support for running these models due to resource constraints. Most CNN training is currently implemented on one or more GPUs, which should give an order of magnitude speed-up. Training on a GPU would allow more iterations through the dataset to increase the chance of convergence, as well as runtime comparisons to current R-CNN results in other domains. Additionally, the added computational power would enable us to use a larger pedestrian dataset such as [2]. Training on a larger dataset would allow the nets to generalize better. This is especially true for the larger network architectures, which require larger training sets corresponding to their larger number of parameters.
References

[1] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
[2] Piotr Dollár et al. "Pedestrian detection: An evaluation of the state of the art". In: IEEE Transactions on Pattern Analysis and Machine Intelligence 34.4 (2012).
[3] A. Ess et al. "A Mobile Vision System for Robust Multi-Person Tracking". In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08). IEEE Press, 2008.
[4] Andreas Ess, Bastian Leibe, and Luc Van Gool. "Depth and appearance for mobile scene analysis". In: IEEE 11th International Conference on Computer Vision (ICCV). IEEE, 2007.
[5] Andreas Ess et al. "Moving obstacle detection in highly dynamic scenes". In: IEEE International Conference on Robotics and Automation (ICRA '09). IEEE, 2009.
[6] Ross Girshick. "Fast R-CNN". In: International Conference on Computer Vision (ICCV). 2015.
[7] Ross Girshick et al. "Rich feature hierarchies for accurate object detection and semantic segmentation". In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2014.
[8] Jan Hosang, Rodrigo Benenson, and Bernt Schiele. "How good are detection proposals, really?" In: arXiv preprint (2014).
[9] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, 2009.
[10] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks". In: Advances in Neural Information Processing Systems. 2012.
[11] Shaoqing Ren et al. "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks". In: Advances in Neural Information Processing Systems (NIPS). 2015.
[12] Koen E. A. van de Sande et al. "Segmentation as selective search for object recognition". In: IEEE International Conference on Computer Vision (ICCV). IEEE, 2011.
[13] Stéfan van der Walt, S. Chris Colbert, and Gaël Varoquaux. "The NumPy array: a structure for efficient numerical computation". In: Computing in Science & Engineering 13.2 (2011).
[14] C. Lawrence Zitnick and Piotr Dollár. "Edge boxes: Locating object proposals from edges". In: Computer Vision - ECCV 2014. Springer, 2014.