Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC



Similar documents
Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) ( ) Roman Kern. KTI, TU Graz

Active Learning SVM for Blogs recommendation

Data Mining. Nonlinear Classification

Image Registration and Fusion. Professor Michael Brady FRS FREng Department of Engineering Science Oxford University

2. MATERIALS AND METHODS

Norbert Schuff Professor of Radiology VA Medical Center and UCSF

How does the Kinect work? John MacCormick

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Semantic Image Segmentation and Web-Supervised Visual Learning

Diffeomorphic Diffusion Registration of Lung CT Images

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

Face Model Fitting on Low Resolution Images

Image Segmentation and Registration

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning

Deformable Registration for Image-Guided Radiation Therapy

Distributed forests for MapReduce-based machine learning

Machine Learning with MATLAB David Willingham Application Engineer

Lecture 10: Regression Trees

Lecture 3: Linear methods for classification

Alessandro Laio, Maria d Errico and Alex Rodriguez SISSA (Trieste)

Multisensor Data Fusion and Applications

Environmental Remote Sensing GEOG 2021

Vision based Vehicle Tracking using a high angle camera

FlowMergeCluster Documentation

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

Learning from Diversity

Human Pose Estimation from RGB Input Using Synthetic Training Data

Data Mining Practical Machine Learning Tools and Techniques

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

Probabilistic Latent Semantic Analysis (plsa)

The Problem With Atlas Encoding and Network Marketing

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

MACHINE LEARNING IN HIGH ENERGY PHYSICS

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

COMPARISON OF OBJECT BASED AND PIXEL BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES USING ARTIFICIAL NEURAL NETWORKS

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Leveraging Ensemble Models in SAS Enterprise Miner

Using Ensemble of Decision Trees to Forecast Travel Time

An Overview of Knowledge Discovery Database and Data mining Techniques

ENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS

Linear Threshold Units

CS570 Data Mining Classification: Ensemble Methods

Generalizing Random Forests Principles to other Methods: Random MultiNomial Logit, Random Naive Bayes, Anita Prinzie & Dirk Van den Poel

Module 5. Deep Convnets for Local Recognition Joost van de Weijer 4 April 2016

Predict Influencers in the Social Network

Knowledge Discovery and Data Mining

How To Make A Credit Risk Model For A Bank Account

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data

Data Mining and Neural Networks in Stata

Better credit models benefit us all

Morphological analysis on structural MRI for the early diagnosis of neurodegenerative diseases. Marco Aiello On behalf of MAGIC-5 collaboration

Subspace Analysis and Optimization for AAM Based Face Alignment

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

ALLEN Mouse Brain Atlas

Principles of Dat Da a t Mining Pham Tho Hoan hoanpt@hnue.edu.v hoanpt@hnue.edu. n

Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease

Chapter 11 Boosting. Xiaogang Su Department of Statistics University of Central Florida - 1 -

Data Mining Algorithms Part 1. Dejan Sarka

Blog Post Extraction Using Title Finding

Probabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur

Efficient Regression of General-Activity Human Poses from Depth Images

Volume visualization I Elvins

Clustering Artificial Intelligence Henry Lin. Organizing data into clusters such that there is

STA 4273H: Statistical Machine Learning

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

Lecture 6: Logistic Regression

The Scientific Data Mining Process

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu

Simple and efficient online algorithms for real world applications

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

Colour Image Segmentation Technique for Screen Printing

Manifold Learning with Variational Auto-encoder for Medical Image Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis

Crowdclustering with Sparse Pairwise Labels: A Matrix Completion Approach

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Data Mining: An Overview. David Madigan

Machine Learning. Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos)

Image Analysis for Volumetric Industrial Inspection and Interaction

Advanced Ensemble Strategies for Polynomial Models

An Order-Invariant Time Series Distance Measure [Position on Recent Developments in Time Series Analysis]

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute

Cees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree.

Part-Based Recognition

Why do we have so many brain coordinate systems? Lilla ZölleiZ WhyNHow seminar 12/04/08

The Wondrous World of fmri statistics

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

Exploratory Data Analysis with MATLAB

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

QAV-PET: A Free Software for Quantitative Analysis and Visualization of PET Images

A Learning Based Method for Super-Resolution of Low Resolution Images

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

Transcription:

Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC

Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain - MR Normal or diseased? How big is the diseased area? How fast is it progressing? Is the treatment working? Fast data retrieval from databases Retrieving similar cases Big data, 2D, 3D, 4D images Spine - CT Lungs - XRay Diagnosing is easy. Measuring is hard!

Medical image analysis do we need machine learning? Example: how big is the tumour? - need to segment the tumour automatically Actively proliferating tumour region Healthy region Need to look beyond single pixels, to context A cascade of manual if-then, if-then statements too brittle We need to select discriminative features and their combination from training data.

Medical image analysis what kind of machine learning? Efficient training classifier Efficient testing/runtime Interpretability of classifier. No black boxes. (Why does it work? What are the limitations?)?

Medical image analysis what about the data? We need labelled training data (expensive) Non ML-based algorithms are no different. You need labelled data for validation So? Data being collected and organized in http://research.microsoft.com/medimaging

Machine learning in medical image analysis @ MSRC

The InnerEye project Measuring brain tumours Localizing and identifying vertebrae Kinect for surgery

Decision forests

Decision trees A decision tree Is top part blue? Is bottom part green? Is bottom part blue? A. Criminisi, J. Shotton Springer 2013

Decision trees A decision tree A decision tree is nothing else but a typical if-then algorithm. However, it is learned automatically. Is bottom part green? Is top part blue? Is bottom part blue? Interpretability A. Criminisi, J. Shotton Springer 2013

11 Decision forests Classification Regression Density estimation Manifold learning Semi-supervised classification The Sherwood forest software library is also available for download.

Regression forests Training data? Input data point Output/labels are continuous Node weak learner Regularized training loss Randomness provides regularization Obj. funct. for node j Information gain defined on continuous variables. Training node j Leaf model A. Criminisi, J. Shotton Springer 2013

Regression forests: training objective At node j Computing the regression information gain at node j Regression information gain Differential entropy of Gaussian Our regression information gain A. Criminisi, J. Shotton Springer 2013

Regression forests: the ensemble model Tree t=1 t=2 t=3 The ensemble model Forest output probability A. Criminisi, J. Shotton Springer 2013

Regression forests: probabilistic, non-linear regression Training points Training different trees in the forest (max tree depth D=2) Testing different trees in the forest Tree posteriors Forest posterior Generalization properties: - Smooth interpolating behaviour in gaps between training data. - Uncertainty increases with distance from training data. A. Criminisi, J. Shotton Springer 2013

Regression forests: multi-model posteriors Experiment 5 Exp. 6 Exp. 7 Exp. 8 Forest posteriors for six different experiments and varying depth D. mean mode Exp. 9 Exp. 10 A. Criminisi, J. Shotton Springer 2013

Regression forests: comparisons with Gaussian Processes Regression forests Gaussian processes (GP) In regr. forests the posterior can be multimodal In GP the posterior is Gaussian (unimodal) Observations - Similar behaviour of the mean. Forests can capture multimodality. - Forests are a parallel algorithm that does not need large matrix inversion. - In forests the quality and shape of the uncertainty depends on the predictor model. A. Criminisi, J. Shotton Springer 2013

Regression forests: comparisons with Gaussian Processes Regression forests Gaussian processes (GP) In regr. forests the posterior can be multimodal In GP the posterior is Gaussian (unimodal) Observations - Similar behaviour of the mean. Forests can capture multimodality. - Forests are a parallel algorithm that does not need large matrix inversion. - In forests the quality and shape of the uncertainty depends on the leaf model. A. Criminisi, J. Shotton Springer 2013

Anatomy localization in CT scans

Anatomy localization - Direct mapping of voxels to organ bounding boxes. - No search, no sliding window. - No atlas registration. Input CT scan Output anatomy localization Key idea: each voxel votes (probabilistically) for the position of each organ s bounding box. A. Criminisi, et al. Med Image Analysis 2013

Anatomy localization the labelled database Different image cropping, noise, contrast/no-contrast, resolution, scanners, body shapes/sizes, patient position

Anatomy localization via regression forests Each voxel in the volume votes for the position of the 6 box sides We wish to learn a set of discriminative points (landmarks, clusters) which can predict the kidney position with high confidence. Input data point Output Multiple organs Node split function Node optimization Node training Feature response (voxel position in volume) (bound. box continuous pos.) (mean over displaced 3D boxes) Error in model fit (weighted uncertainty for all organs) (relative displacement) (Gaussian repres. of distribs) Regressing an n-d piece-wise constant model

Anatomy localization context-rich features Possible visual features Computing the feature response Capturing spatial context

Anatomy localization quantitative results a b Experimental setup - Forest trained on 55 volumes and tested on remaining 45. - MedINRIA used for atlas-based registration. Best atlas selected to run optimally on testing set! - Errors are defined as differences between detected box sides and ground truth. Computational efficiency - About 1 2 sec per tree (unoptimized C++ code). - All trees can be trained/tested in parallel and independently from one another.

Anatomy localization landmark discovery Discovery of landmark regions Here the system is trained to detect left and right kidneys. The system learns to use bottom of lung and top of pelvis to localize kidneys with highest confidence. interpretability Input CT scan and detected landmark regions A. Criminisi, et al. Med Image Analysis 2013

Segmentation of Multiple Sclerosis Lesions

Segmentation of multiple sclerosis lesions E. Geremia, et al. Neuroimage 2011

Segmentation of multiple sclerosis lesions interpretability E. Geremia, et al. Neuroimage 2011

Semantic image registration

Further applications semantic image registration Running anatomy localization Img.1 - Apr 26 th Img.2 - Apr 21 st A. Criminisi, et al. Med Image Analysis 2013

Further applications semantic image registration Registration of longitudinal (same patient) CT images The energy function The energy. A 1D illustration Mutual information T = arg min T E T, E T = MI I x, J T x + KL I x, J T x Running semantic registration KL divergence of organ location probabilities Shifting one image wrt the other in the X direction and computing the energy. ------------- Warping img.1 into img.2 ------------ E. Konukoglu et al 2011

Further applications semantic image registration Registration of different patients CT images Elastix (no semantics) The moving image remains stuck in an incorrect local minimum. Target image Moving image (after convergence) S. Klein, M. Staring, K. Murphy, M.A. Viergever, J.P.W. Pluim, "elastix: a toolbox for intensity based medical image registration," IEEE Transactions on Medical Imaging, vol. 29, no. 1, pp. 196-205, January 201012 Our semantic registration Target image Moving image (after convergence) Better alignment of organs thanks to anatomy localization Probabilities. In the correct basin of convergence. E. Konukoglu, A. Criminisi, S. Pathak, D. Robertson, S. White, D. Haynor, and K. Siddiqui, Robust Linear Registration of CT Images using Random Regression Forests, in SPIE Medical Imaging, February 2011

Further applications semantic image registration Registration of different patients CT images Our proposed energy (top curve, solid) has fewer local minima than the conventional mutual information based one (bottom curve, dotted) T = arg min T E T, E T = MI I x, J T x + KL I x, J T x E. Konukoglu et al 2011

Further applications semantic image registration Quantitative comparison on hundreds of image pair registrations Percentage of failed pair-wise registrations for different error thresholds E. Konukoglu et al 2011

2013 Microsoft Corporation. All rights reserved.