# Efficient Regression of General-Activity Human Poses from Depth Images

Save this PDF as:

Size: px
Start display at page:

## Transcription

4 l. shoulder joint l. elbow joint l. hand joint r. knee joint Figure 2. Empirical offset distributions. We visualize the set of 3D relative offsets for four body joints at two different leaf nodes each. For each set of axes we plot an orange square at the offset from each training pixel reaching a particular node to the target joint. (The red, green, and blue squares indicate respectively the positive x, y, and z axes; each half-axis represents 0.5m in world space). We show training images for each node illustrating the pixel that reached the leaf node as a cyan cross, and the offset vector as an orange arrow. Note how the decision trees tend to cluster pixels with similar local appearance at the leaves, but the inherent remaining ambiguity results in multi-modal offset distributions. Our algorithm compresses these distributions to a very small number of modes while maintaining high test accuracy. to Eq. 1, though here defined over relative offsets, without weighting, and using a learned bandwidth b. The positions of the modes form the relative votes ljk and the numbers of offsets that reached each mode form the vote weights w ljk. In Sec. 4.3, we show that there is no benefit to storing more than K = 2 relative votes per leaf. To maintain practical training times and keep memory consumption reasonable we use reservoir sampling [20] to maintain a fixed-size unbiased sample of C offsets. We discuss the effect of varying the reservoir capacity in Sec In our unoptimized implementation, learning these relative votes for 16 joints in 3 trees trained with 10k images took approximately 45 minutes on a single 8-core machine. The vast majority of that time is spent traversing the tree; the use of reservoir sampling ensures the time spent running mean shift totals only about 2 minutes. The trained model can be thought of as a Gaussian mixture model (GMM) with shared isotropic variance b 2 j. Note however that taking the top K modes found by mean shift Algorithm 2 Learning relative votes 1: // Collect relative offsets 2: initialize R lj = for all leaf nodes l and joints j 3: for all pixels q in all training images i do 4: lookup ground truth joint positions z ij 5: lookup 3D pixel position x iq 6: compute relative offset iq j = z ij x iq 7: descend tree to reach leaf node l 8: store iq j in R lj with reservoir sampling 9: // Cluster 10: for all leaf nodes l and joints j do 11: cluster offsets R lj using mean shift 12: take top K weighted modes as V lj 13: return relative votes V lj for all nodes and joints gives a very different output to learning a GMM using EM with a fixed K. In particular, fitting a model with K = 1 is not the same as fitting a single Gaussian to all the data Learning the hyper-parameters The training-time clustering bandwidth b and length threshold ρ, and the test-time per-joint aggregation bandwidth b j and vote length thresholds λ j were optimized by grid search to maximize mean average precision over a 5000 image validation set. The optimal bandwidth b was 0.05m. The optimized length thresholds λ j fall between 0.1m and 0.55m, indicating that some joints benefit from fairly long range offsets, while other joints require shorter range predictions. For comparison with [18], the main error metric does not penalize failures to predict occluded joints. Interestingly, when the error metric is changed to penalize such failures, the optimized λ j are typically larger (in some cases dramatically so) than when optimizing with the main error metric. We include a table of the λ j found under both error metrics in the supplementary material Learning the tree structure To train the tree structure and image feature tests used at the split nodes, we use the standard greedy decision tree training algorithm. The set of all training pixels Q = {(i, q)} (pixels q in images i) is recursively partitioned into left Q l (φ) and right Q r (φ) subsets by evaluating many splitting function candidates φ under an error function E(Q) (see below). The best splitting function is selected according to φ = argmin φ s {l,r} Q s (φ) E(Q s (φ)) (2) Q which minimizes the error while balancing the sizes of the left and right partitions. If the tree is not too deep, the al-

5 gorithm recurses on the examples Q l (φ ) and Q r (φ ) for the left and right children respectively. See e.g. [12, 18] for more details. As proxies to our ideal objective of maximizing joint detection accuracy on the training set, we investigated both regression and classification objective functions, as described below. Regression. Here, the objective is to partition the examples to give nodes with minimal uncertainty in their joint offset distributions [9, 7]. In our problem, the offset distribution for a given tree node is likely to be highly multimodal. A good approach might be to fit a GMM to the offsets (perhaps similarly to how the leaf regression models are learned) and use the negative log likelihood of the offsets under this model as the objective. However, GMM fitting would need to be repeated at each node for thousands of splitting candidates, making this prohibitively expensive. Following existing work [7], we employ the much cheaper sum-of-squared-differences objective: E reg (Q) = iq j µ j 2 2, (3) j (i,q) Q j 1 µ j = iq j, where (4) Q j (i,q) Q j Q j = { (i, q) Q iq j 2 < ρ } (5) Unlike [7], we introduce an offset vector length threshold ρ to remove offsets that are large and thus likely to be outliers (results in Sec. 4.1 highlight its importance). While this model implicitly assumes a uni-modal Gaussian, which we know to be unrealistic, for learning the tree structure, this assumption can still produce satisfactory results. Classification. Since the tree structure is learned independently from the regression models at the leaves, we explored a second objective E cls (Q) that minimizes the Shannon entropy for a classification task. Entropy is computed over the normalized histogram of a set of ground truth body part labels c iq for all (i, q) Q. We re-use the parts proposed in [18], a set of 31 body parts defined to localize particular regions of the body close to joints of interest. Optimizing for body part classification is a good proxy for the regression task, as image patches reaching the resulting leaf nodes tend to have both similar appearances and local body joint configurations. This means that for nearby joints the leaf node offsets are likely to be small and tightly clustered. The objective further avoids the assumption of the offset vectors being Gaussian distributed. We experimented with other node splitting objectives, including various forms of mixing body part classification and regression (as used in [7]), as well as separate regression forests for each joint. As we discuss in Sec. 4.1, we found that trees trained for the body part classification task outperformed trees trained with all other tested methods, including single joint regression. 4. Experimental Results and Discussion In this section we evaluate several aspects of our work and compare with existing techniques. We evaluate on several datasets including the MSRC dataset of 5000 synthetic depth images [18] and the Stanford dataset of real depth images [8], obtaining state of the art results. We follow the protocol from [18] as follows. Joint prediction is cast as a detection problem, and average precision (AP) and its mean across joints (map) is used to measure accuracy. The most confident joint hypothesis within distance D = 0.1m of the ground truth counts as a true positive; any others count as false positives. Missing predictions for occluded joints do not count as false negatives, except in one experiment below Tree structure training objectives We evaluated several objective functions for training the structure of the decision trees, using forests of 3 trees each trained to depth 20 with 5000 images. The results, comparing average precision on all joints, are summarized in Fig. 3. The task of predicting continuous joint locations from depth pixels is fundamentally a regression problem. Intuitively, we might expect a regression-style objective function to produce the best trees for our approach. Perhaps surprisingly then, for all joints except head, neck, and shoulders, trees trained using the body part classification objective from [18] gave the highest accuracy. We believe the uni-modal assumption implicit in the regression objective may be causing this, and that body part classification is a reasonable proxy for a regression objective that correctly accounts for multi-modality. Investigating efficient methods for fitting multi-modal distributions in a regression objective remains future work. In the remainder of this section we base all of our experiments on 3 trees trained to depth 20 using the body part classification objective Comparisons Hough forests [7]. We compare our use of offset clustering during training and a continuous voting space for testing, with the approach taken by [7] where all offset vectors are stored during training and a discretized voting volume is used for testing, given a fixed tree structure. The diverse MSRC-5000 test data covers a large voting volume of 4m 4m 5m. To allow accurate localization we used a voxel resolution of 2cm per side, resulting in a voting volume with 10 million bins. Note that the inherent 3D nature of the problem makes discrete voting much less attractive than for 2D prediction. At test time, we smooth the voting volume using a Gaussian of fixed standard deviation 1.3cm. We re-trained the leaf nodes, storing up to 400 offset votes for each joint, uniformly sampled (using all votes would have been prohibitive). Due to memory and runtime constraints we compare on two representative joints (head and left hand) in Fig. 4. Interestingly, even with discrete

8 Mean average precision Mean average precision Mean average precision Tree depth 5 5 shared threshold threshold tuned per joint Offset vote length threshold (m) Our algorithm [Shotton et al.] Mean Shift [Shotton et al.] Agglomerative Frames per second (a) (b) (c) Figure 7. Effect of various system parameters. (a) Mean average precision versus tree depth; (b) mean average precision versus a single, shared vote length threshold for all joints; and (c) mean average precision versus frames per second as a function of the number of votes retained before running mean shift in the voting space at test time. ages. To test 2D prediction accuracy, we flattened our training and test images to a fixed depth, 2m, producing silhouettes in which the pose scale is unknown. To compute average precision for this 2D prediction task, we modified the true positive radius to an absolute pixel distance, D = 10 pixels. Our algorithm compares very favorably to [18] on this task, achieving map vs Example inferences from silhouettes appear in the right column of Fig. 5. Note how the hypothesis confidences correlate well with the ambiguity in the signal. 5. Discussion We have presented a new, extremely efficient regression forest based method for human pose estimation in single depth or silhouette images. Unlike previous work [18], our method does not predict segmentations of the surface of the body, but instead directly predicts the positions of interior body joints even under occlusion. Our method extends Hough forests [7] by using compact models for storing and aggregating relative offset votes. We investigate different objectives for learning the structure of the forest and choices of vote representation and aggregation. Due to our emphasis on efficiency of training, these models can be automatically learned from large training sets. Results on several challenging real and synthetic datasets show considerable improvement over the state of the art while maintaining super-realtime speeds. Our experimental evaluation reveals that, perhaps surprisingly, for learning the structure of the regression forest on our difficult multiple body joint prediction problem, the Hough forest objective function performs poorly relative to the body part classification objective used in [18]. We believe that this effect is due to the errors introduced by the implicit modeling of the inherently multi-modal offset distributions at the interior tree nodes as uni-modal Gaussians in the tree structure regression objective. This also may be compounded by the greedy nature of the tree learning procedure. Further investigation of alternative efficient tree structure learning regression objectives is a promising direction for future work. Acknowledgements. We thank Toby Sharp, Mat Cook, John Winn, and Duncan Robertson for their help and interesting discussions. References [1] A. Agarwal and B. Triggs. 3D human pose from silhouettes by relevance vector regression. In Proc. CVPR, [2] L. Bourdev and J. Malik. Poselets: Body part detectors trained using 3D human pose annotations. In Proc. ICCV, [3] L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen. Classification and Regression Trees. Chapman and Hall/CRC, [4] M. Brubaker, D. Fleet, and A. Hertzmann. Physics-based person tracking using the anthropomorphic walker. IJCV, [5] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE TPAMI, 24(5): , [6] A. Criminisi, J. Shotton, D. Robertson, and E. Konukoglu. Regression forests for efficient anatomy detection and localization in CT studies. In MCV, MICCAI workshop, [7] J. Gall and V. Lempitsky. Class-specific Hough forests for object detection. In PAMI. IEEE, , 2, 5, 6, 8 [8] V. Ganapathi, C. Plagemann, D. Koller, and S. Thrun. Real time motion capture using a single time-of-flight camera. In Proc. CVPR, pages IEEE, , 6 [9] T. Hastie, R. Tibshirani, J. Friedman, and J. Franklin. The elements of statistical learning: data mining, inference and prediction. The Mathematical Intelligencer, 27(2):83 85, [10] A. Kanaujia, C. Sminchisescu, and D. Metaxas. Semi-supervised hierarchical models for 3D human pose reconstruction. In Proc. CVPR, [11] B. Leibe, A. Leonardis, and B. Schiele. Robust object detection with interleaved categorization and segmentation. IJCV, 77(1-3): , , 2 [12] V. Lepetit, P. Lagger, and P. Fua. Randomized trees for real-time keypoint recognition. In Proc. CVPR, pages 2: , , 5 [13] Microsoft Corp. Redmond WA. Kinect for Xbox , 2 [14] J. Müller and M. Arens. Human pose estimation with implicit shape models. In ARTEMIS, , 2 [15] R. Navaratnam, A. W. Fitzgibbon, and R. Cipolla. The joint manifold model for semi-supervised multi-valued regression. In Proc. ICCV, [16] C. Plagemann, V. Ganapathi, D. Koller, and S. Thrun. Real-time identification and localization of body parts from depth images. In Proc. ICRA, [17] G. Rogez, J. Rihan, S. Ramalingam, C. Orrite, and P. Torr. Randomized trees for human pose detection. In Proc. CVPR, [18] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake. Real-time human pose recognition in parts from a single depth image. In Proc. CVPR. IEEE, , 2, 3, 4, 5, 6, 7, 8 [19] R. Urtasun and T. Darrell. Local probabilistic regression for activityindependent human pose inference. In Proc. CVPR, [20] J. S. Vitter. Random sampling with a reservoir. ACM Trans. Math. Software, 11(1):37 57,

### Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC

Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain

### Human Pose Estimation from RGB Input Using Synthetic Training Data

Human Pose Estimation from RGB Input Using Synthetic Training Data Oscar Danielsson and Omid Aghazadeh School of Computer Science and Communication KTH, Stockholm, Sweden {osda02, omida}@kth.se arxiv:1405.1213v2

### The Scientific Data Mining Process

Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

### Mean-Shift Tracking with Random Sampling

1 Mean-Shift Tracking with Random Sampling Alex Po Leung, Shaogang Gong Department of Computer Science Queen Mary, University of London, London, E1 4NS Abstract In this work, boosting the efficiency of

### How does the Kinect work? John MacCormick

How does the Kinect work? John MacCormick Xbox demo Laptop demo The Kinect uses structured light and machine learning Inferring body position is a two-stage process: first compute a depth map (using structured

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### 3D Model based Object Class Detection in An Arbitrary View

3D Model based Object Class Detection in An Arbitrary View Pingkun Yan, Saad M. Khan, Mubarak Shah School of Electrical Engineering and Computer Science University of Central Florida http://www.eecs.ucf.edu/

### Real-Time Human Pose Recognition in Parts from Single Depth Images

Real-Time Human Pose Recognition in Parts from Single Depth Images Jamie Shotton Andrew Fitzgibbon Mat Cook Toby Sharp Mark Finocchio Richard Moore Alex Kipman Andrew Blake Microsoft Research Cambridge

### The Visual Internet of Things System Based on Depth Camera

The Visual Internet of Things System Based on Depth Camera Xucong Zhang 1, Xiaoyun Wang and Yingmin Jia Abstract The Visual Internet of Things is an important part of information technology. It is proposed

### Accurate fully automatic femur segmentation in pelvic radiographs using regression voting

Accurate fully automatic femur segmentation in pelvic radiographs using regression voting C. Lindner 1, S. Thiagarajah 2, J.M. Wilkinson 2, arcogen Consortium, G.A. Wallis 3, and T.F. Cootes 1 1 Imaging

### Data, Measurements, Features

Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

### Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006

Practical Tour of Visual tracking David Fleet and Allan Jepson January, 2006 Designing a Visual Tracker: What is the state? pose and motion (position, velocity, acceleration, ) shape (size, deformation,

### DYNAMIC RANGE IMPROVEMENT THROUGH MULTIPLE EXPOSURES. Mark A. Robertson, Sean Borman, and Robert L. Stevenson

c 1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or

### A Genetic Algorithm-Evolved 3D Point Cloud Descriptor

A Genetic Algorithm-Evolved 3D Point Cloud Descriptor Dominik Wȩgrzyn and Luís A. Alexandre IT - Instituto de Telecomunicações Dept. of Computer Science, Univ. Beira Interior, 6200-001 Covilhã, Portugal

### Classification and Regression Trees

Classification and Regression Trees Bob Stine Dept of Statistics, School University of Pennsylvania Trees Familiar metaphor Biology Decision tree Medical diagnosis Org chart Properties Recursive, partitioning

### Gerry Hobbs, Department of Statistics, West Virginia University

Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

### Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

### A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

### Class-Specific Hough Forests for Object Detection

Class-Specific Hough Forests for Object Detection Juergen Gall BIWI, ETH Zürich and MPI Informatik gall@vision.ee.ethz.ch Victor Lempitsky Microsoft Research Cambridge victlem@microsoft.com Abstract We

### Lecture 10: Regression Trees

Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

### Fast Matching of Binary Features

Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been

### Environmental Remote Sensing GEOG 2021

Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class

### Kinect Gesture Recognition for Interactive System

1 Kinect Gesture Recognition for Interactive System Hao Zhang, WenXiao Du, and Haoran Li Abstract Gaming systems like Kinect and XBox always have to tackle the problem of extracting features from video

### VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS

VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS Norbert Buch 1, Mark Cracknell 2, James Orwell 1 and Sergio A. Velastin 1 1. Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE,

### Virtual Fitting by Single-shot Body Shape Estimation

Virtual Fitting by Single-shot Body Shape Estimation Masahiro Sekine* 1 Kaoru Sugita 1 Frank Perbet 2 Björn Stenger 2 Masashi Nishiyama 1 1 Corporate Research & Development Center, Toshiba Corporation,

### Distributed forests for MapReduce-based machine learning

Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

### Edge tracking for motion segmentation and depth ordering

Edge tracking for motion segmentation and depth ordering P. Smith, T. Drummond and R. Cipolla Department of Engineering University of Cambridge Cambridge CB2 1PZ,UK {pas1001 twd20 cipolla}@eng.cam.ac.uk

### Data Mining Cluster Analysis: Advanced Concepts and Algorithms. ref. Chapter 9. Introduction to Data Mining

Data Mining Cluster Analysis: Advanced Concepts and Algorithms ref. Chapter 9 Introduction to Data Mining by Tan, Steinbach, Kumar 1 Outline Prototype-based Fuzzy c-means Mixture Model Clustering Density-based

### Color Segmentation Based Depth Image Filtering

Color Segmentation Based Depth Image Filtering Michael Schmeing and Xiaoyi Jiang Department of Computer Science, University of Münster Einsteinstraße 62, 48149 Münster, Germany, {m.schmeing xjiang}@uni-muenster.de

### Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

### Tracking in flussi video 3D. Ing. Samuele Salti

Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking

### A Unified Framework for 3D Hand Tracking

A Unified Framework for 3D Hand Tracking Rudra P K Poudel, Jose A S Fonseca, Jian J Zhang, Hammadi Nait-Charif Bournemouth University, Pool BH12 5BB, UK, {rpoudel, jfonseca, jzhang, hncharif} @bournemouth.ac.uk

### Bootstrapping Big Data

Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu

### The Artificial Prediction Market

The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

### Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

### Finding people in repeated shots of the same scene

Finding people in repeated shots of the same scene Josef Sivic 1 C. Lawrence Zitnick Richard Szeliski 1 University of Oxford Microsoft Research Abstract The goal of this work is to find all occurrences

### Automatic and Efficient Long Term Arm and Hand Tracking for Continuous Sign Language TV Broadcasts

PFISTER et al.: AUTOMATIC AND EFFICIENT LONG TERM ARM AND HAND TRACKING 1 Automatic and Efficient Long Term Arm and Hand Tracking for Continuous Sign Language TV Broadcasts Tomas Pfister 1 tp@robots.ox.ac.uk

### Pedestrian Detection with RCNN

Pedestrian Detection with RCNN Matthew Chen Department of Computer Science Stanford University mcc17@stanford.edu Abstract In this paper we evaluate the effectiveness of using a Region-based Convolutional

### Building an Advanced Invariant Real-Time Human Tracking System

UDC 004.41 Building an Advanced Invariant Real-Time Human Tracking System Fayez Idris 1, Mazen Abu_Zaher 2, Rashad J. Rasras 3, and Ibrahiem M. M. El Emary 4 1 School of Informatics and Computing, German-Jordanian

### Tracking of Small Unmanned Aerial Vehicles

Tracking of Small Unmanned Aerial Vehicles Steven Krukowski Adrien Perkins Aeronautics and Astronautics Stanford University Stanford, CA 94305 Email: spk170@stanford.edu Aeronautics and Astronautics Stanford

### Tracking Groups of Pedestrians in Video Sequences

Tracking Groups of Pedestrians in Video Sequences Jorge S. Marques Pedro M. Jorge Arnaldo J. Abrantes J. M. Lemos IST / ISR ISEL / IST ISEL INESC-ID / IST Lisbon, Portugal Lisbon, Portugal Lisbon, Portugal

### Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

### Spatio-Temporally Coherent 3D Animation Reconstruction from Multi-view RGB-D Images using Landmark Sampling

, March 13-15, 2013, Hong Kong Spatio-Temporally Coherent 3D Animation Reconstruction from Multi-view RGB-D Images using Landmark Sampling Naveed Ahmed Abstract We present a system for spatio-temporally

### EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM

EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM Amol Ambardekar, Mircea Nicolescu, and George Bebis Department of Computer Science and Engineering University

### BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

### Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde

### Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

### Chapter 20: Data Analysis

Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification

### High Quality Image Magnification using Cross-Scale Self-Similarity

High Quality Image Magnification using Cross-Scale Self-Similarity André Gooßen 1, Arne Ehlers 1, Thomas Pralow 2, Rolf-Rainer Grigat 1 1 Vision Systems, Hamburg University of Technology, D-21079 Hamburg

### Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

### Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail

### Speed Performance Improvement of Vehicle Blob Tracking System

Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu, nevatia@usc.edu Abstract. A speed

### Chapter 6. The stacking ensemble approach

82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

### Automatic Labeling of Lane Markings for Autonomous Vehicles

Automatic Labeling of Lane Markings for Autonomous Vehicles Jeffrey Kiske Stanford University 450 Serra Mall, Stanford, CA 94305 jkiske@stanford.edu 1. Introduction As autonomous vehicles become more popular,

### Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr

Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Image classification Image (scene) classification is a fundamental

### Removing Moving Objects from Point Cloud Scenes

1 Removing Moving Objects from Point Cloud Scenes Krystof Litomisky klitomis@cs.ucr.edu Abstract. Three-dimensional simultaneous localization and mapping is a topic of significant interest in the research

### Employer Health Insurance Premium Prediction Elliott Lui

Employer Health Insurance Premium Prediction Elliott Lui 1 Introduction The US spends 15.2% of its GDP on health care, more than any other country, and the cost of health insurance is rising faster than

### ColorCrack: Identifying Cracks in Glass

ColorCrack: Identifying Cracks in Glass James Max Kanter Massachusetts Institute of Technology 77 Massachusetts Ave Cambridge, MA 02139 kanter@mit.edu Figure 1: ColorCrack automatically identifies cracks

### Outlier Ensembles. Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY 10598. Keynote, Outlier Detection and Description Workshop, 2013

Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY 10598 Outlier Ensembles Keynote, Outlier Detection and Description Workshop, 2013 Based on the ACM SIGKDD Explorations Position Paper: Outlier

### TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

### Model Combination. 24 Novembre 2009

Model Combination 24 Novembre 2009 Datamining 1 2009-2010 Plan 1 Principles of model combination 2 Resampling methods Bagging Random Forests Boosting 3 Hybrid methods Stacking Generic algorithm for mulistrategy

### Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

### Decision Trees from large Databases: SLIQ

Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values

### Local features and matching. Image classification & object localization

Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

### Interactive Offline Tracking for Color Objects

Interactive Offline Tracking for Color Objects Yichen Wei Jian Sun Xiaoou Tang Heung-Yeung Shum Microsoft Research Asia, Beijing, China {yichenw,jiansun,xitang,hshum}@microsoft.com Abstract In this paper,

### Image Classification using Random Forests and Ferns

Image Classification using Random Forests and Ferns Anna Bosch Computer Vision Group University of Girona aboschr@eia.udg.es Andrew Zisserman Dept. of Engineering Science University of Oxford az@robots.ox.ac.uk

### Prime Sensor NITE 1.3 Algorithms notes

Giving Devices a Prime Sense Prime Sensor NITE 1.3 Algorithms notes Version 1.0 COPYRIGHT 2010. PrimeSense Inc. ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT MAY REPRODUCED, PHOTOCOPIED, STORED ON A RETRIEVAL

### EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

### CS231M Project Report - Automated Real-Time Face Tracking and Blending

CS231M Project Report - Automated Real-Time Face Tracking and Blending Steven Lee, slee2010@stanford.edu June 6, 2015 1 Introduction Summary statement: The goal of this project is to create an Android

### Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection

CSED703R: Deep Learning for Visual Recognition (206S) Lecture 6: CNNs for Detection, Tracking, and Segmentation Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 3 Object detection

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### Determining optimal window size for texture feature extraction methods

IX Spanish Symposium on Pattern Recognition and Image Analysis, Castellon, Spain, May 2001, vol.2, 237-242, ISBN: 84-8021-351-5. Determining optimal window size for texture feature extraction methods Domènec

### Interpreting American Sign Language with Kinect

Interpreting American Sign Language with Kinect Frank Huang and Sandy Huang December 16, 2011 1 Introduction Accurate and real-time machine translation of sign language has the potential to significantly

### Scale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search

in DAGM 04 Pattern Recognition Symposium, Tübingen, Germany, Aug 2004. Scale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search Bastian Leibe ½ and Bernt Schiele ½¾ ½ Perceptual Computing

### Learning Motion Categories using both Semantic and Structural Information

Learning Motion Categories using both Semantic and Structural Information Shu-Fai Wong, Tae-Kyun Kim and Roberto Cipolla Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, UK {sfw26,

### 6. If there is no improvement of the categories after several steps, then choose new seeds using another criterion (e.g. the objects near the edge of

Clustering Clustering is an unsupervised learning method: there is no target value (class label) to be predicted, the goal is finding common patterns or grouping similar examples. Differences between models/algorithms

### Data Mining with R. Decision Trees and Random Forests. Hugh Murrell

Data Mining with R Decision Trees and Random Forests Hugh Murrell reference books These slides are based on a book by Graham Williams: Data Mining with Rattle and R, The Art of Excavating Data for Knowledge

### TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

### Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

### Human Pose Estimation using Body Parts Dependent Joint Regressors

Human Pose Estimation using Body Parts Dependent Joint Regressors Matthias Dantone 1 Juergen Gall 2 Christian Leistner 3 Luc Van Gool 1 ETH Zurich, Switzerland 1 MPI for Intelligent Systems, Germany 2

### CHAPTER 3 DATA MINING AND CLUSTERING

CHAPTER 3 DATA MINING AND CLUSTERING 3.1 Introduction Nowadays, large quantities of data are being accumulated. The amount of data collected is said to be almost doubled every 9 months. Seeking knowledge

### Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

### Template-based Eye and Mouth Detection for 3D Video Conferencing

Template-based Eye and Mouth Detection for 3D Video Conferencing Jürgen Rurainsky and Peter Eisert Fraunhofer Institute for Telecommunications - Heinrich-Hertz-Institute, Image Processing Department, Einsteinufer

### Segmentation & Clustering

EECS 442 Computer vision Segmentation & Clustering Segmentation in human vision K-mean clustering Mean-shift Graph-cut Reading: Chapters 14 [FP] Some slides of this lectures are courtesy of prof F. Li,

### Data Preprocessing. Week 2

Data Preprocessing Week 2 Topics Data Types Data Repositories Data Preprocessing Present homework assignment #1 Team Homework Assignment #2 Read pp. 227 240, pp. 250 250, and pp. 259 263 the text book.

### Tracking performance evaluation on PETS 2015 Challenge datasets

Tracking performance evaluation on PETS 2015 Challenge datasets Tahir Nawaz, Jonathan Boyle, Longzhen Li and James Ferryman Computational Vision Group, School of Systems Engineering University of Reading,

### 10-810 /02-710 Computational Genomics. Clustering expression data

10-810 /02-710 Computational Genomics Clustering expression data What is Clustering? Organizing data into clusters such that there is high intra-cluster similarity low inter-cluster similarity Informally,

### Final Exam, Spring 2007

10-701 Final Exam, Spring 2007 1. Personal info: Name: Andrew account: E-mail address: 2. There should be 16 numbered pages in this exam (including this cover sheet). 3. You can use any material you brought:

### Fast Multipole Method for particle interactions: an open source parallel library component

Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,

### Lecture 20: Clustering

Lecture 20: Clustering Wrap-up of neural nets (from last lecture Introduction to unsupervised learning K-means clustering COMP-424, Lecture 20 - April 3, 2013 1 Unsupervised learning In supervised learning,

### CS 534: Computer Vision 3D Model-based recognition

CS 534: Computer Vision 3D Model-based recognition Ahmed Elgammal Dept of Computer Science CS 534 3D Model-based Vision - 1 High Level Vision Object Recognition: What it means? Two main recognition tasks:!

### CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES

CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,

### Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis

Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Abdun Mahmood, Christopher Leckie, Parampalli Udaya Department of Computer Science and Software Engineering University of

### CS229 Project Final Report. Sign Language Gesture Recognition with Unsupervised Feature Learning

CS229 Project Final Report Sign Language Gesture Recognition with Unsupervised Feature Learning Justin K. Chen, Debabrata Sengupta, Rukmani Ravi Sundaram 1. Introduction The problem we are investigating

### Efficient visual search of local features. Cordelia Schmid

Efficient visual search of local features Cordelia Schmid Visual search change in viewing angle Matches 22 correct matches Image search system for large datasets Large image dataset (one million images

June 2010 From: StatSoft Analytics White Papers To: Internal release Re: Performance comparison of STATISTICA Version 9 on multi-core 64-bit machines with current 64-bit releases of SAS (Version 9.2) and

### Supporting Online Material for

www.sciencemag.org/cgi/content/full/313/5786/504/dc1 Supporting Online Material for Reducing the Dimensionality of Data with Neural Networks G. E. Hinton* and R. R. Salakhutdinov *To whom correspondence

### Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov

Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main