3D Object Recognition using Convolutional Neural Networks with Transfer Learning between Input Channels
Luís A. Alexandre
Department of Informatics and Instituto de Telecomunicações, Univ. Beira Interior, Covilhã, Portugal

Abstract. RGB-D data is attracting ever more interest from the research community as cheap cameras appear on the market and applications of this type of data become more common. A current trend in processing image data is the use of convolutional neural networks (CNNs), which have consistently beaten the competition on most benchmark data sets. In this paper we investigate the possibility of transferring knowledge between CNNs when processing RGB-D data, with the goal of both improving accuracy and reducing training time. We present experiments showing that our proposed approach can achieve both these goals.

Keywords: 3D object recognition, transfer learning, convolutional neural networks

1 Introduction

The use of RGB plus depth (RGB-D) data has been increasing recently due to the availability of cheap cameras that produce this type of data. The possible applications of RGB-D data are many; among them we can cite robotic vision, where it allows a robot to better perceive its surrounding environment, recognizing both the type of environment [1] it is currently located in and the objects [2] within it.

Deep learning [3] has also received strong attention lately due to its capacity to create multilevel representations of the data, allowing for a possibly more efficient way to solve some problems [4]. Several approaches have been put forward in this area: deep belief networks [5], stacked (denoising) autoencoders [6] and convolutional neural networks (CNNs) [7, 8] are among them. CNNs have been shown to be particularly well adapted to processing image data, since they are inspired by the way the human visual system works [9], and have demonstrated these capabilities by winning several international competitions involving object recognition [8].

This work was partially financed by FEDER funds through the Programa Operacional Factores de Competitividade - COMPETE and by Portuguese funds through FCT - Fundação para a Ciência e a Tecnologia in the framework of the projects PTDC/EIA-EIA/119004/2010 and PEst-OE/EEI/LA0008/2013.
In this paper we investigate the possibility of performing transfer learning while training CNNs to solve object recognition tasks using RGB-D data. The main idea is to use four independent CNNs, one for each channel, instead of a single CNN receiving the four input channels, and to train these four CNNs in sequence rather than in parallel, using the weights of an already trained CNN as the starting point for training the CNNs that will process the remaining channels. We compare this approach against training four CNNs in parallel and also against a single CNN that processes the four input channels simultaneously, and conclude that our proposal not only saves training time but also increases recognition accuracy.

The paper is organized as follows: the next section presents related work; section 3 contains a detailed explanation of our proposal; section 4 presents the experiments and the last section lists our conclusions.

2 Related Work

2.1 Convolutional Neural Networks

A convolutional neural network (CNN) is a type of deep neural network (DNN), inspired by the human visual system, used for processing images. CNNs were originally proposed by Fukushima [10] and later further developed by LeCun [7]. A CNN has several stages (typically two or three), each composed of two layers: the first layer convolves the input image with a filter and the second layer downsamples the result of the first, using a pooling operation. These stages build increasingly abstract representations of the input pattern: the first might be sensitive to edges, the second to corners and intersections, and so on. The idea is that these representations become both more abstract and more invariant as the pattern data goes through the CNN. The output of the last stage is usually a vector (not an image) that is fed to a multi-layer perceptron (MLP), which produces the final network output, usually a class label.

The possibility of using CNNs to process RGB-D data was investigated in [11]. In that paper the authors combined a CNN with a recursive neural network and obtained state-of-the-art results on an RGB-D data set.

2.2 Transfer Learning with CNNs

Transfer learning (TL) consists of using knowledge acquired while solving a source problem to facilitate the resolution of a target problem. This can take many different forms [12]. In the case of neural networks, one way to do TL is to reuse layers from the source problem to solve the target problem. These layers can be reused as they are or they can be fine-tuned.

The possibility of doing transfer learning on deep neural networks (DNNs) has been investigated before. In [13], the authors proposed to train a DNN on a given problem and reuse it by fine-tuning only the last layer, while keeping the remaining weights unchanged. They compared the results against randomly initialized DNNs and also against fine-tuning the last 2 layers, the last 3, and so on. They focused on character recognition and concluded that transfer learning is viable for this task, since it allows for faster training and smaller classification errors.
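In neural-network terms, this kind of layer reuse amounts to copying trained weights into a new network and then deciding which layers are fine-tuned. A minimal sketch follows (this is not the code of [13] nor the implementation used in this paper; the layer names, shapes and helper functions are hypothetical):

```python
import numpy as np

def init_random_weights(layer_shapes, seed=0):
    # Random initialisation: the baseline against which reused weights are compared.
    rng = np.random.default_rng(seed)
    return {name: rng.normal(0.0, 0.01, size=shape)
            for name, shape in layer_shapes.items()}

def transfer_weights(source_weights, reused_layers):
    # Copy only the selected source layers; the caller decides which layers to fine-tune.
    return {name: w.copy() for name, w in source_weights.items()
            if name in reused_layers}

# Hypothetical layer names/shapes for a 3-stage CNN followed by an MLP head.
shapes = {"conv1": (32, 1, 5, 5), "conv2": (32, 32, 5, 5),
          "conv3": (32, 32, 5, 5), "fc": (128, 10)}
source = init_random_weights(shapes)          # stands in for a trained source network
target = init_random_weights(shapes, seed=1)  # fresh target network
target.update(transfer_weights(source, reused_layers={"conv1", "conv2", "conv3"}))
trainable = {"fc"}                            # e.g. fine-tune only the last layer
```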
In [14], the authors addressed transfer learning between deep networks used as classifiers of digit images. They considered cases where only the set of class labels, or only the data distribution, changed from the source to the target problem. They concluded that reusing networks trained for a different label set led to better results than reusing networks trained for a different data distribution. In particular, reusing a network trained on more classes for a problem with fewer classes was beneficial for virtually any amount of training data.

While these previous studies showed that, in general, transfer learning can yield some improvement in accuracy and/or overall processing time for different configurations of learning architectures and data sets, none of these approaches considers the possibility put forward in the current paper: transferring learned networks between channels when processing RGB-D data.

3 Our Proposal

The straightforward way to apply a CNN to RGB-D data is simply to train it to process images with four channels (three color plus one depth). An alternative (which we also investigate in the experiments section) is to use one CNN to process each channel independently and then combine the decisions of the networks to obtain the final classification result. This combination can be done in many different ways and we explore two common possibilities in the experiments.

When training one CNN per channel, it is clear that the task each of these networks is learning to perform is very similar (especially among the three color channels). In this paper we propose to take advantage of the similarity of the tasks performed on each input channel and avoid training a CNN from scratch for each channel: the trained weights from one initial channel become the starting point for training the networks that will process the remaining channels.

Formally, consider a color image $I(x, y)$ composed of three color channels, such that $I(x, y) = (R(x, y), G(x, y), B(x, y))$, where each channel is a mapping from the set of image coordinates to the set $D_c = \{0, \ldots, 255\}$. RGB-D data adds to the color image a depth image $D(x, y)$, usually a 16-bit grayscale image, which maps the image coordinates to the set $D_d = \{0, \ldots, 65535\}$. A CNN trained on one of these channels learns a function from $D_c^m$ or $D_d^m$, where $m$ is the number of pixels, to the set of class labels $\{0, 1, \ldots, C-1\}$, for a $C$-class classification problem.

Some of the approaches that we test imply a combination of the outputs of several CNNs. For this, we used two combination methods, which we now describe. Let $y_{j,k}$ denote the output of neuron $j$ of network $k$, and consider $n$ classifiers and a problem with $C$ classes. The decision of CNN $k$ corresponds to the output with the largest value:

$$d_k = \arg\max_{j=1,\ldots,C} y_{j,k}$$

The combination using majority vote is given by:

$$i = \arg\max_{j=1,\ldots,C} \sum_{k=1}^{n} 1_{\{d_k = j\}}$$

where $i$ is the class label obtained through the combination and $1_{\{d_k = j\}}$ is the indicator function that gives 1 when $d_k$ equals $j$ and zero otherwise.

The second combination method takes the maximum of the mean of the outputs produced by each CNN. The decision using this approach is the class

$$i = \arg\max_{j=1,\ldots,C} \frac{1}{n} \sum_{k=1}^{n} y_{j,k} = \arg\max_{j=1,\ldots,C} \sum_{k=1}^{n} y_{j,k}$$
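Both fusion rules are straightforward to compute. The following is a minimal NumPy sketch (not the paper's code), where `outputs` is a hypothetical array of shape (n, C) holding the last-layer outputs $y_{j,k}$ of the $n$ CNNs:

```python
import numpy as np

def combine_majority(outputs):
    # Majority vote over the per-network decisions d_k = argmax_j y_{j,k}.
    decisions = outputs.argmax(axis=1)                   # d_k for each of the n CNNs
    votes = np.bincount(decisions, minlength=outputs.shape[1])
    return votes.argmax()                                # class with the most votes

def combine_mean(outputs):
    # Maximum of the mean output; the 1/n factor does not change the argmax.
    return outputs.mean(axis=0).argmax()

outputs = np.array([[0.1, 0.7, 0.2],                     # e.g. n = 4 CNNs, C = 3 classes
                    [0.2, 0.5, 0.3],
                    [0.6, 0.3, 0.1],
                    [0.1, 0.6, 0.3]])
print(combine_majority(outputs), combine_mean(outputs))  # both select class 1 here
```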
4 Experiments

We used an open source CNN implementation, cuda-convnet (code.google.com/p/cuda-convnet), that takes advantage of graphics processing units (GPUs). All experiments were run on Fedora 17, using an NVIDIA GeForce 680 GPU.

4.1 The Data Set

We used a subset of the large data set of 3D point clouds from [15]. The original data set contains 300 objects from 51 different categories captured on a turntable from three different camera poses. We used 48 objects representing 10 categories. The training data contains clouds captured from two different camera views, while the validation and test data contain clouds captured from a third view. The training set has a total of 946 clouds, the validation set has 240 clouds and the test set contains 475 clouds. The data used for training consisted of two 2D images per point cloud: one with the RGB data and another with the depth information (16-bit data). These images were re-scaled to squares with 64-pixel sides. Figure 1 shows some example images of the dataset objects and Fig. 2 shows the corresponding depth maps. It is easy to see from these examples that a CNN that uses only the depth maps will have a harder task than a CNN using a color channel.

Fig. 1. One sample RGB image of each of the 10 object categories used in the experiments. Note that in each category there are usually more than 3 or 4 different objects that can differ significantly from each other.

Fig. 2. One sample depth image (equalized for display purposes) of each of the 10 object categories used in the experiments. These sample images correspond to the RGB images presented in Fig. 1.

4.2 Network Architecture and Training

All networks use three stages (three convolution plus sub-sampling layers) followed by a final one-hidden-layer MLP. The convolution layers use 32 filters and the sub-sampling layers use the relu function. Since there are too many parameters to present here, a full description can be found in the configuration files available online, together with the lists of the file names used in the training, validation and test sets; this allows our experiments to be reproduced. The starting values of the learning rates of the convolution layers and the MLP layer, and some of the other meta-parameters (L2 weight decay for the convolutional and MLP layers), were obtained by a grid search using the depth data (the channel with the worst performance) and were kept for all channels. The other parameters (number and sizes of convolutional filters, initial values of the weights and biases, momentum, etc.) used the default values. This grid search used only the data from the training and validation sets.
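For orientation, a rough PyTorch-style sketch of a network with this overall shape follows. The paper used cuda-convnet, so the kernel sizes, padding and hidden-layer width here are illustrative assumptions; only the three 32-filter convolution/pooling stages, the one-hidden-layer MLP and the 64x64 inputs follow the description above:

```python
import torch.nn as nn

def make_cnn(in_channels=1, num_classes=10, hidden=128):
    # in_channels = 1 for a per-channel network, 4 for the single RGBD baseline.
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),  # 64 -> 32
        nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),           # 32 -> 16
        nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),           # 16 -> 8
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, hidden), nn.ReLU(),   # one-hidden-layer MLP
        nn.Linear(hidden, num_classes),
    )
```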
The training of all networks was done according to the following algorithm:

1. restart = 0
2. while restart < 3 do
   (a) train for 10 epochs
   (b) evaluate the validation error, VE
   (c) if VE increased for two consecutive evaluations:
       - restart = restart + 1
       - divide the learning rate of the convolution layers by 10
       - go back to the weights the network had 20 epochs before
3. evaluate the test error and stop.
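A minimal sketch of this restart schedule is shown below. The four callables are hypothetical hooks standing in for the training, validation and checkpointing calls of whatever toolkit is used (the paper drove cuda-convnet with an equivalent procedure):

```python
def train_with_restarts(net, conv_lr, train_epochs, validation_error,
                        save_weights, load_weights, max_restarts=3, block=10):
    history, snapshots, restarts = [], [], 0
    while restarts < max_restarts:
        train_epochs(net, epochs=block, conv_lr=conv_lr)  # step (a): train for 10 epochs
        history.append(validation_error(net))             # step (b): validation error
        snapshots.append(save_weights(net))
        # step (c): two consecutive increases of the validation error trigger a restart
        if len(history) >= 3 and history[-1] > history[-2] > history[-3]:
            restarts += 1
            conv_lr /= 10.0                                # shrink the convolution-layer rate
            load_weights(net, snapshots[-3])               # roll back to 20 epochs before
    return net
```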
Note that the goal of these experiments was not to obtain the best possible result on this dataset, but to illustrate other possible ways to train a CNN to deal with RGB-D data, namely using transfer learning. Even so, the results presented are comparable with the color descriptor results obtained in [2], and are better than the results obtained with the grayscale descriptors. Note also that in that paper the validation set was not used, since no meta-parameter search was performed for the methods presented there.

4.3 Two Baselines

We consider two baseline results against which we compare the transfer learning proposal. For the first baseline, named RGBD in table 1, we consider a single CNN that receives all four channels as input. For the second baseline, we consider four independent CNNs, each processing one of the input channels. The outputs of these four networks are then combined through simple majority vote (marked maj in table 1) and also through the maximum of the mean value of the output of the last layer of the CNNs (marked mean in table 1), to yield the final decision.

4.4 Transferring Learned Networks

We train, as in baseline 2, four networks, one for each input channel. In this case, however, we transfer the weights from the network that had the smallest error among the 40 trained with individual channels. These weights are used as the starting point for training (fine-tuning) the networks of the other three channels, instead of random weights. The individual decisions are then combined by the same two approaches used in baseline 2.

Since the data used in the CNN that processes the depth channel is different in nature (different range of values and a different type of underlying data distribution) from the color channels, the weights transferred from a color channel network could be inadequate as a starting point for training the depth channel CNN. We therefore also tested the combination of the channel that had the smallest error (R) with the G and B channels learned with TL and the depth channel without transfer learning. The outputs of these networks are combined with the same two approaches as before.

4.5 Discussion

The results obtained in the experiments are presented in table 1. Regarding the error rates, we see that the best result was obtained with a transfer learning approach (R,G+TL,B+TL,D+TL mean), reducing the error obtained with the RGBD approach (single CNN, first row of table 1) by 1.07 percentage points from 29.87% (a 3.6% relative reduction), while also being faster than the RGBD approach by more than 22%.

When training individual CNNs for each channel we found that the best results were obtained using channel R. So we transferred the weights of the R-channel network that achieved the smallest error (we trained 10 networks for each channel) to be used as the starting point for training new networks for the other three channels, as sketched below.
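A minimal sketch of this transfer-and-fuse procedure follows; `train_cnn`, `fine_tune_cnn` and `predict` are hypothetical hooks around whatever CNN toolkit is used (the paper used cuda-convnet), and the fusion step is the maximum of the mean defined in section 3:

```python
import numpy as np

def train_with_channel_transfer(data, train_cnn, fine_tune_cnn):
    # 1. Train the source channel from random weights (R gave the smallest error here).
    nets = {"R": train_cnn(data["R"])}
    # 2. Its weights become the starting point (fine-tuning) for the remaining channels.
    for ch in ("G", "B", "D"):
        nets[ch] = fine_tune_cnn(data[ch], init_from=nets["R"])
    return nets

def classify(nets, sample, predict):
    # Stack the four last-layer output vectors and fuse them (maximum of the mean).
    outputs = np.vstack([predict(nets[ch], sample[ch]) for ch in ("R", "G", "B", "D")])
    return outputs.mean(axis=0).argmax()
```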
These networks are listed in table 1 with + TL appended to the channel name. The networks that reused the weights of the best R-channel network achieved better results than the originally trained networks for the two remaining color channels (B and G) and slightly worse results for the depth channel. The standard deviations of their errors show that they all produced very similar final results.

In the three combination settings, the decision obtained with the maximum of the mean was always better than the decision obtained with majority voting. These decision fusion approaches using the maximum of the mean were also always better than using a single CNN to process the four channels, both in terms of error rates and in terms of time spent.

The combination of the channel with the smallest error (R) with the G and B channels learned with TL and the depth channel without transfer learning (R,G+TL,B+TL,D mean) did not outperform R,G+TL,B+TL,D+TL mean, showing that even when the original depth-channel CNN trained without transfer learning was used (which had a smaller error than the D + TL CNN), the combination did not beat the combination of the R channel with the other three TL networks.

The best overall time was obtained when fusing the four independently trained CNNs without TL: an improvement of more than 29% over the single CNN. Regarding time, this approach can be used in real-time applications. The fastest time was around 500s, but this includes training, validation and testing for 10 repetitions of the experiment. Even if we consider this to be the time for testing only, and since we used nearly 500 point clouds in the test set, this gives around 0.1 seconds per point cloud. Note that the training time is much longer than the test time, so if the networks were trained offline, the test speed would be substantially higher than this value.

5 Conclusions

In this paper we studied alternative approaches to training a CNN with RGB-D data. First we proposed the use of four independently trained CNNs, one for each channel, whose outputs were then combined to produce a decision by two different methods. We concluded that using one of the combination methods we can obtain both a smaller error and a faster training time than using a single CNN for all channels. Then we proposed the use of transfer learning between the CNNs trained on each channel. This again showed better results than the original single CNN approach, both in terms of error and time, and also better results than the first combination proposal in terms of error, although not in terms of time.

We conclude that it is advantageous to split RGB-D data into four separate data sets, train CNNs on each channel and fuse their results. Better results in terms of error rates can be achieved if one transfers the weights from the CNN with the smallest error and uses them as the starting point for training the CNNs of the remaining channels, implementing a transfer learning approach.
Table 1. Average (and standard deviation) of the error and time used on 10 repetitions for each of the evaluated approaches. The best results for error and time spent are in bold.

Approach                  Error [%]   Time [s]
RGBD                      (3.30)      (191.95)
Channel R                 (3.17)      (36.05)
Channel G                 (2.40)      (41.79)
Channel B                 (5.58)      (24.83)
Channel D                 (0.77)      (17.16)
R,G,B,D maj               (4.31)      (119.83)
R,G,B,D mean              (1.66)      (119.83)
Channel G + TL            (0.00)      (0.70)
Channel B + TL            (0.00)      (0.57)
Channel D + TL            (0.00)      (0.67)
R,G+TL,B+TL,D+TL maj      (0.75)      (37.99)
R,G+TL,B+TL,D+TL mean     (0.61)      (37.99)
R,G+TL,B+TL,D maj         (0.79)      (54.48)
R,G+TL,B+TL,D mean        (0.63)      (54.48)

References

1. Pronobis, A., Mozos, O.M., Caputo, B., Jensfelt, P.: Multi-modal semantic place classification. I. J. Robotic Res. 29(2-3) (2010)
2. Alexandre, L.A.: 3D descriptors for object and category recognition: a comparative evaluation. In: Workshop on Color-Depth Camera Fusion in Robotics at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Portugal (October 2012)
3. Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1) (2009)
4. Le Roux, N., Bengio, Y.: Deep belief networks are compact universal approximators. Neural Computation 22(8) (2010)
5. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Computation 18(7) (2006)
6. Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), Helsinki, Finland, June 5-9 (2008)
7. LeCun, Y., Bengio, Y.: Convolutional networks for images, speech, and time-series. In: Arbib, M.A. (ed.): The Handbook of Brain Theory and Neural Networks, MIT Press (1995)
8. Ciresan, D.C., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA (June 2012)
9. Filipe, S., Alexandre, L.A.: From the human visual system to the computational models of visual attention: a survey. Artificial Intelligence Review (January 2013)
10. Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics 36(4) (1980)
11. Socher, R., Huval, B., Bath, B., Manning, C., Ng, A.: Convolutional-recursive deep learning for 3D object classification. In: Bartlett, P., Pereira, F., Burges, C., Bottou, L., Weinberger, K. (eds.): Advances in Neural Information Processing Systems 25 (2012)
12. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22(10) (2010)
13. Ciresan, D.C., Meier, U., Schmidhuber, J.: Transfer learning for Latin and Chinese characters with deep neural networks. In: The 2012 International Joint Conference on Neural Networks, Brisbane, Australia (June 2012)
14. Amaral, T., Sá, J., Silva, L., Alexandre, L.A., Santos, J.: Improving performance on problems with few labelled data by reusing stacked auto-encoders. Submitted (2014)
15. Lai, K., Bo, L., Ren, X., Fox, D.: A large-scale hierarchical multi-view RGB-D object dataset. In: Proc. of the IEEE International Conference on Robotics & Automation (ICRA) (2011)