White Matter Supervoxel Segmentation by Axial DP-means Clustering
|
|
|
- Violet Bridges
- 10 years ago
- Views:
Transcription
1 White Matter Supervoxel Segmentation by Axial DP-means Clustering Ryan P. Cabeen, David H. Laidlaw Computer Science Department, Brown University, RI, USA Abstract. A powerful aspect of diffusion MR imaging is the ability to reconstruct fiber orientations in brain white matter; however, the application of traditional learning algorithms is challenging due to the directional nature of the data. In this paper, we present an algorithmic approach to clustering such spatial and orientation data and apply it to brain white matter supervoxel segmentation. This approach is an extension of the DP-means algorithm to support axial data, and we present its theoretical connection to probabilistic models, including the Gaussian and Watson distributions. We evaluate our method with the analysis of synthetic data and an application to diffusion tensor atlas segmentation. We find our approach to be efficient and effective for the automatic extraction of regions of interest that respect the structure of brain white matter. The resulting supervoxel segmentation could be used to map regional anatomical changes in clinical studies or serve as a domain for more complex modeling. Keywords: diffusion tensor imaging, atlasing, segmentation, parcellation, white matter, clustering, supervoxels, bregman divergence, directional statistics 1 Introduction and Related Work Diffusion MR imaging enables the quantitative measurement of water molecule diffusion, which exhibits anisotropy in brain white matter due to axonal morphometry and coherence. Consequently, the orientation of fibers passing through a voxel can be estimated from the diffusion signal, allowing the local analysis of tissue or more global analysis of fiber bundles. Methods from computer vision and machine learning offer many opportunities to understand the structure captured by diffusion MRI; however, the directional nature of the data poses a challenge to traditional methods. A successful approach for dealing with this directional data has been probabilistic models on the sphere; the two most common being the von Mises-Fisher and Watson distributions. The Watson distinguishes itself by being defined for axial variables, that is, points on the sphere with anti-podal equivalence. These models have been known in the statistics community for decades and have applications to a number of disciplines, including geophysics, chemistry, genomics, and information retrieval [16]. Recently, there has been increasing interest in exploring directional mixture models for neuroimage analysis [6, 10]. They can be
2 computationally challenging, however, as their normalization constants have no closed form and require either strong assumptions [11] or approximations [12]. This problem is not unique to directional data, however, and there has been much work to develop efficient and scalable alternatives to probabilistic models. Some success in this area applies to the exponential families, which constitute most distributions in use today. Banerjee et al. made a powerful finding which established a bijection between the exponential families and Bregman divergences. They also explored the asymptotic relationship between mixture models and hard clustering algorithms [2]. These hard clustering algorithms tend to be more efficient, scalable, and easy to implement, at the cost of some flexibility in data modeling. We build on these ideas to derive hard clustering for data that has both spatial and directional components. Our approach extends this idea to axial data by considering the Bregman divergergence of the Watson distribution [12]. We employ additional prior work in this area that derived the DP-means algorithm from the asymptotic limit of a Dirichlet process mixture [7, 5], providing a data-driven way to select the number of clusters. We define our segmentation from the hard clustering of voxels, so we use the terms interchangably for the rest of the paper. Segmentation algorithms offer an opportunity to study neuroanatomy in an automated way, reducing the cost of manual delineation of anatomical structures. We consider voxelwise segmentation, as opposed to methods that cluster curves extracted by tractography. One common application of the voxelwise approach is the segmentation of the thalamic nuclei, which has been achieved by mean shift analysis, spectral clustering, level-sets, and modified k-means [17]. The work of Weigell et al. is most similar to the approach we propose, as they apply a k- means-like algorithm, modified to operate in the joint spatial-tensor domain. In contrast, we include optimization for model complexity, instead of selecting a fixed number of clusters as in k-means. We also operate on the fiber orientation, instead of the full tensor, a reasonable simplification for the segmentation of white matter, which is more anisotropic than gray matter. We consider a segmentation of whole brain white matter in which numerous small and homogenous regions (or supervoxels) are extracted, an approach that is similar to superpixel segmentation in the computer vision literature [13]. Superpixels have been found to offer both a more natural representation of images compared to pixels and a simpler domain for more complex models [9]. In the field of biomedical imaging, this idea has recently been successfully applied to spectral label fusion [15], cellular imaging [8] and fiber orientation distributionbased segmentation [3], to which our work bears similarity. The rest of the paper is as follows. First, we review the exponential families and their relation to Bregman divergences. We then present the models and Bregman divergences of the Gaussian and Watson distributions. From these, we derive an axial DP-means algorithm that clusters voxels in the joint spatial-axial domain. Finally, we present an evaluation with synthetic data and an application to diffusion tensor atlas white matter supervoxel segmentation.
3 2 Methods 2.1 Exponential Families and Bregman Divergences In this section, we review the exponential families, its relationship to Bregman divergences, and applications of this divergence to clustering problems. A parameteric family of distributions is considered exponential when members have a density of the following form [1]: P G (x θ) = P 0 (x) exp(< θ, x > G(θ)) (1) where θ is the natural (or canonical) parameter, x is the sufficient statistic, G(θ) is the cumulant (or log-partition) function, and P 0 (x) is the carrier measure. The exponential families have a variety of useful properties, but here we consider their relation to Bregman divergences, a measure which can be interpreted as relative entropy. Banerjee et al. showed a bijection between the exponential families and Bregman divergences [2], and consequently, the divergence corresponding to a given member of the exponential family can be defined from the cumulant [1]: G (ˆθ, θ) = G(ˆθ) G(θ) < ˆθ θ, θ G(θ) > (2) This result gives a probabilistic interpretation to many distance measures. In particular, a number of hard clustering objectives may be expressed in terms of Bregman divergences associated with exponential mixture models, showing a relationship between soft and hard clustering algorithms [5]. This result can also be used to derive hard clustering algorithms for directional data, as described in the following sections. 2.2 Gaussian and Watson Distributions We now consider two probabilistic models for our data, namely the Gaussian and Watson distributions, which represent spatial and axial data, respectively. For each distribution, we show the form of their distribution, derive their associated Bregman divergences, and discuss their relationship to hard clustering algorithms in the literature. The isotropic Gaussian distribution is defined on R n and is commonly used to represent spatial data. Its probability density N and associated Bregman divergence D N are given by: N (p q, σ 2 ) = ( 1 exp 1 ) p (2πσ 2 q 2 ) n/2 2σ2 D N (ˆp, p) = 1 2σ 2 ˆp p 2 (4) (3)
4 given input positions p, ˆp R n, mean position q R n, and constant variance parameter σ 2 > 0. The associated Bregman divergence D N is the scaled Euclidean distance between two positions, an observation which gives a probabilistic interpretation to the k-means and DP-means algorithms, which are asymptotic limits of Gaussian mixture and Dirichlet process mixture models, respectively [7]. The Watson distribution is analagous to a Gaussian but defined on the hypersphere S n 1 R n with anti-podal symmetry, i.e. d d. This structure is well-suited to diffusion MR, for which a sign is not associated with the diffusion direction. Samples from this distribution are often called axial variables to distinguish them from spherical data without symmetry. Its probability density W and associated Bregman divergence D W are given by: W (d v, κ) = Γ (n/2) ( (2π) n/2 M( 1 2, n 2, κ) exp κ ( v T d ) ) 2 (5) D W (ˆd, d) = κ M(1 2, n 2, κ) M ( 1 2, n 2, κ) ( (ˆdT ) ) 2 1 d for input axial directions d, ˆd S n 1, mean axial direction v S n 1, and Kummer s confluent hypergeometric function M(a, b, z). We also assume a constant and positive concentration parameter κ > 0. The associated Bregman divergence D W is then a scaled cosine-squared dissimilarity measure, which is equivalent to the measure used for diametrical clustering the asymptotic limit of a mixture of Watsons [4, 12]. The Bregman divergences for the Gaussian and Watson distributions can then be used to define hard clustering algorithms, which we ll describe next. (6) 2.3 Hard Clustering We now present an objective function for clustering axial data and an iterative algorithm for optimizing it. In particular, this approach is a hard clustering algorithm that behaves similarly to a Dirichlet process (DP) mixture model learned with Gibbs sampling, as a result of recent work on small-variance asymptotic analysis of the exponential family and Bregman divergences by Jiang et al. [5]. We apply their work to our case of spatial and axial data, which is assumed to be modeled jointly by the Gaussian and Watson distributions. For imaging applications, segmentation is often performed in the joint spatialintensity space. A common application is superpixel segmentation, which provides a domain for image understanding that is both more simple and natural than the original pixel domain [9]. This is also the case for diffusion MR, where we want to segment voxels based on both proximity and fiber orientation similarity, perhaps to aid more complex anatomical modeling [3]. This motivates the development of a clustering algorithm that accounts for these two aspects, which we achieve by the linear combination of the Bregman divergences presented in the previous section. The resulting objective function E can be defined similarly to the DP-means algorithm:
5 K E = E = k=1 K k=1 (p,d) l k D N (p, q k ) + D W (d, v k ) + λk (7) (p,d) l k α p q k 2 + β ( 1 ( v T k d ) 2 ) + λk (8) where l k is the set of voxels in the k-th cluster, λ is a cluster-penalty term that controls model complexity, and the parameters α and β control the relative contributions of the spatial and axial terms to the total cost. These parameters have probabilistic interpretations, where λ relates to the Dirichlet process mixture prior, and α and β relate to the Gaussian σ and Watson κ, respectively. A procedure to minimize this objective is presented in Algorithm 1. In a modification of the DP-means algorithm [7], the assignment step measures distance by the linear combination of divergences. In the update step, the spatial cluster center q c is computed by the Euclidean average. The axial cluster center v c is computed by the maximum likelihood approach of Schwartzman et al. [11], where dyadic tensors are computed by the outer product of each axial variable and the mean axial direction is found from the principal eigenvector of the mean tensor, as shown by the prineig function. Algorithm 1: joint spatial-axial DP-means clustering Input: (p 1, d 1),..., (p N, d N ): input position/axial direction pairs, α, β, λ: objective weighting parameters Output: K: number of clusters, L 1,..., L N : labels, (q 1, v 1),..., (q K, v K): cluster centers Initialize: K 1, q 1 i pi/n, v1 prineig ( i didt i /N ) while not converged do Assign cluster labels: for i=1 to N do for j=1 to K do D ij α p i q j 2 + β (1 ( ) ) d T 2 i v j if min j D ij > λ then K K + 1, q K p i, v K d i, L i K else L i argmin j D ij Update cluster centers: for j=1 to K do q j ( i δ(j, ) Li)pi / i δ(j, Li) v j prineig (( i δ(j, ) Li)didT i / i δ(j, Li)) return
6 3 Experiments and Results 3.1 Synthetic Data In our first experiment, we investigate the choice of the cluster penalty parameter λ by synthesizing axial data and testing performance of the axial DP-means algorithm across varying numbers and sizes of clusters. Here, we ignore the spatial component and only test the relationship between λ and the axial clustering. The number of clusters N ranged from 3 to 10 and were generated by sampling a Gaussian and normalizing with size σ ranging from 0.1 to 0.3. We evaluated ground truth agreement with the adjusted mutual information score (AMI) [14], a statistical measure of similarity between clusterings that accounts for chance groupings and takes a maximum value of one when clusterings are equivalent. For a constant number of clusters, we found the optimal choice of λ increased with cluster size σ. For a constant cluster size σ, we found the optimal λ to be relatively stable across variable numbers of clusters N. In Fig. 1, we show examples of the clustering for the two conditions. In Fig. 2, we show plots of the relationship between the AMI and λ. We found our implementation to converge in fewer than 20 iterations on average. All cases except one found the correct number of clusters for the optimal λ. This could be due to the presence of local minima, an issue that could possibly be addressed with a randomized initialization scheme and restarts. 3.2 White Matter Segmentation In our second experiment, we apply the axial DP-means algorithm to the supervoxel segmentation of brain white matter in a diffusion tensor atlas. We used the IXI aging brain atlas, which was constructed by deformable tensor-based registration with DTI-TK [19]. White matter was extracted by thresholding the fractional anisotropy map at a value of 0.2, which excludes white matter voxels with complex fiber configurations that are not accurately represented by tensors. We then performed white matter segmentation of the remaining voxels with the joint spatial-axial DP-means algorithm. For each voxel, the spatial component was taken to be the voxel center and the directional component was taken to be the principal direction of the voxel s tensor. We found several structures which were spatially disconnected but given the same label, for example, the bilateral cingulum bundles, which are both close and similarly oriented. To account for this, we finally performed connected components labeling using an efficient two-pass procedure [18]. We investigated the effect of λ and β (holding α constant) by measuring the mean cluster orientation dispersion and cluster volume. We found that as λ was increased, both the volume and angular dispersion increased. As β increased, we observed decreased cluster volume and angular dispersion. These results are shown in the plots of Fig. 3. We also generated visualizations from a segmentation with λ = 25 and β = 15, which are shown in Fig. 3. From slice views, we found the segmentation to reflect known anatomical boundaries, such as the
7 cingulum/corpus callosum and corona radiata/superior longitudinal fasciculus. By overlaying fiber models, we see the region boundaries tended to coincide with large changes in fiber orientation. From an boundary surface rendering, we found lateral symmetry and a separation of gyral white matter from deeper white matter. Our serial implementation on a 2.3 GHz Intel i5 ran in several minutes end converged in 125 iterations. 4 Discussion and Conclusions In this paper, we presented an efficient approach to hard clustering of spatial and axial data that is effective for segmenting brain white matter. This algorithm has a probabilistic interpretation that relates its objective to the Bregman divergences of the Gaussian and Watson distributions. Through our experiments, we first found the parameter λ to be more affected by cluster size σ than the number of clusters N. This may suggest the need to account for noise level when selecting λ and possible limitations when applied to datasets with clusters of heterogeneous size. In our second experiment, we found our approach to efficiently perform white matter atlas segmentation, producing regions that respect anatomical boundies in white matter structure. We also found considerable variablility across different hyperparameters. On one hand, this offers fine-grained control of region size, but it also suggests that a comparison with a simpler k- means type approach could be valuable. A limitation is the restriction of this approach to single fiber voxels and extensions to multiple fibers could be valuable. This could be possibly used with other methods for reconstucting fiber orientations, such as orientation distribution function (ODF) and fiber orientation distribution (FOD) approaches. This method may also aid several aspects of clinical studies of white matter. The resulting segmentation could provide automatically defined regions-ofinterest for clinical study, similar to voxel-based population studies of neurological disease. This approach may also offers a domain for more efficient inference of complex anatomical models, such as graph-based methods for measuring brain connectivity. One interesting extension would generalize the clustering objective to include variable concentration κ, which may enable more sensitive segmentation for diffusion models that account for fiber orientation dispersion in each voxel. In conclusion, we find this approach to be an efficient and valuable tool for segmenting white matter with a desirable probabilistic interpretation and a number of applications to brain connectivity mapping.
8 Variable σ, Constant N Constant σ, Variable N Fig. 1: First experiment: visualizations of the synthetic axial data and the optimal lambda mean adjusted mutual information Sigma mean adjusted mutual information clusterings computed from the proposed method. The top shows results for variable cluster size σ = {0.05, , 0.125, , 0.2}, and constant number of clusters N = 4. The bottom shows results for constant cluster size σ = 0.10, and variable number of clusters N = {3, 4, 5, 6, 7}. The bottom right shows a single mislabeled cluster, possibly caused by finding a local minima in the optimization. N lambda Fig. 2: First experiment: clustering performance as a function of cluster penalty parameter λ [0, 1] given ground truth generated with cluster size σ, and number of clusters N, which are visualized in Fig. 1. We measured the adjusted mutual information (AMI), a statistical measure takes a maximal value when clusterings are equivalent. Shown are plots of the AMI vs. λ for two conditions. The first tested with variable σ = {0.05, , 0.125, , 0.2} and constant N = 4. The second tested with constant σ = 0.10 and variable N = {3, 4, 5, 6, 7}. These results indicate that the optimal λ depends more on σ than N, suggesting that performance may depend on the noise level and may dimishish when differing clusters sizes are present. Also note that the maximum AMI decreases with increasing σ, which may be due to cluster overlap or over-sensitivity in the AMI measure.
9 mean cluster volume beta mean angular dispersion beta lambda lambda Fig. 3: Second experiment: white matter supervoxel segmentation by axial DP-means clustering. The top row shows mean cluster volume (mm 3 ) and mean cluster angular dispersion (degrees) as a function of cluster penalty λ [10, 40] and axial weighting β [0, 30]. We found increasing λ also increased the volume and dispersion, while increasing β reduced the volume and dispersion. The middle and bottom rows show an example result given λ = 25 and β = 15. The middle shows boundary surfaces of the regions, which illustrates the symmetry and separation of gyral and deep white matter. The bottom shows a partial coronal slice with the fractional anisotropy (left) and computed segmentation (right), whish shows region boundaries that match known anatomical interfaces, such as the corpus callosum/cingulum bundle and corona radiata/superior longitudinal fasciculus.
10 References 1. Azoury, K.S., Warmuth, M.K.: Relative loss bounds for on-line density estimation with the exponential family of distributions. Machine Learning 43(3) (2001) 2. Banerjee, A., Merugu, S., Dhillon, I.S., Ghosh, J.: Clustering with Bregman divergences. J. Mach. Learn. Res. 6, (Dec 2005) 3. Bloy, L., Ingalhalikar, M., Eavani, H., Schultz, R.T., Roberts, T.P., Verma, R.: White matter atlas generation using HARDI based automated parcellation. NeuroImage 59(4), (2012) 4. Dhillon, I.S., Marcotte, E.M., Roshan, U.: Diametrical clustering for identifying anti-correlated gene clusters. Bioinformatics 19(13), (2003) 5. Jiang, K., Kulis, B., Jordan, M.: Small-variance asymptotics for exponential family Dirichlet process mixture models. In: NIPS 2012 (2012) 6. Kaden, E., Kruggel, F.: Nonparametric Bayesian inference of the fiber orientation distribution from diffusion-weighted MR images. Med. Image Anal. 16(4), (2012) 7. Kulis, B., Jordan, M.I.: Revisiting k-means: New algorithms via Bayesian nonparametrics. In: ICML-12. pp (2012) 8. Lucchi, A., Smith, K., Achanta, R., Lepetit, V., Fua, P.: A fully automated approach to segmentation of irregularly shaped cellular structures in em images. In: Jiang, T., Navab, N., Pluim, J., Viergever, M. (eds.) MICCAI 2010, LNCS, vol. 6362, pp Springer Berlin Heidelberg (2010) 9. Mori, G.: Guiding model search using segmentation. In: ICCV vol. 2, pp Vol. 2 (2005) 10. Rathi, Y., Michailovich, O., Shenton, M.E., Bouix, S.: Directional functions for orientation distribution estimation. Med. Image Anal. 13(3), (2009) 11. Schwartzman, A., Dougherty, R.F., Taylor, J.E.: Cross-subject comparison of principal diffusion direction maps. Magnet. Reson. in Med. 53(6), (2005) 12. Sra, S., Karp, D.: The multivariate Watson distribution: Maximum-likelihood estimation and other aspects. J. of Multivariate Analysis 114(0), (2013) 13. Veksler, O., Boykov, Y., Mehrani, P.: Superpixels and supervoxels in an energy optimization framework. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, LNCS, vol. 6315, pp Springer Berlin Heidelberg (2010) 14. Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, (Dec 2010) 15. Wachinger, C., Golland, P.: Spectral label fusion. In: Ayache, N., Delingette, H., Golland, P., Mori, K. (eds.) MICCAI 2012, LNCS, vol. 7512, pp Springer Berlin Heidelberg (2012) 16. Watson, G.S.: Statistics on spheres, vol. 6. Wiley New York (1983) 17. Wiegell, M.R., Tuch, D.S., Larsson, H.B., Wedeen, V.J.: Automatic segmentation of thalamic nuclei from diffusion tensor magnetic resonance imaging. NeuroImage 19(2), (2003) 18. Wu, K., Otoo, E., Suzuki, K.: Optimizing two-pass connected-component labeling algorithms. Pattern Analysis and Applications 12(2), (2009) 19. Zhang, H., Yushkevich, P., Rueckert, D., Gee, J.: A computational white matter atlas for aging with surface-based representation of fasciculi. In: Fischer, B., Dawant, B., Lorenz, C. (eds.) Biomedical Image Registration, LNCS, vol. 6204, pp Springer Berlin Heidelberg (2010)
Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease
1 Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease The following contains a summary of the content of the neuroimaging module I on the postgraduate
2. MATERIALS AND METHODS
Difficulties of T1 brain MRI segmentation techniques M S. Atkins *a, K. Siu a, B. Law a, J. Orchard a, W. Rosenbaum a a School of Computing Science, Simon Fraser University ABSTRACT This paper looks at
Diffusione e perfusione in risonanza magnetica. E. Pagani, M. Filippi
Diffusione e perfusione in risonanza magnetica E. Pagani, M. Filippi DW-MRI DIFFUSION-WEIGHTED MRI Principles Diffusion results from a microspic random motion known as Brownian motion THE RANDOM WALK How
Adaptively Constrained Convex Optimization for Accurate Fiber Orientation Estimation with High Order Spherical Harmonics
Adaptively Constrained Convex Optimization for Accurate Fiber Orientation Estimation with High Order Spherical Harmonics Giang Tran 1 and Yonggang Shi 2, 1 Dept. of Mathematics, UCLA, Los Angeles, CA,
Norbert Schuff Professor of Radiology VA Medical Center and UCSF [email protected]
Norbert Schuff Professor of Radiology Medical Center and UCSF [email protected] Medical Imaging Informatics 2012, N.Schuff Course # 170.03 Slide 1/67 Overview Definitions Role of Segmentation Segmentation
Environmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
Image Segmentation and Registration
Image Segmentation and Registration Dr. Christine Tanner ([email protected]) Computer Vision Laboratory, ETH Zürich Dr. Verena Kaynig, Machine Learning Laboratory, ETH Zürich Outline Segmentation
How To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
Probabilistic Latent Semantic Analysis (plsa)
Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg [email protected] www.multimedia-computing.{de,org} References
Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected].
Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected] Outline Motivation why do we need GPUs? Past - how was GPU programming
Learning Vector Quantization: generalization ability and dynamics of competing prototypes
Learning Vector Quantization: generalization ability and dynamics of competing prototypes Aree Witoelar 1, Michael Biehl 1, and Barbara Hammer 2 1 University of Groningen, Mathematics and Computing Science
Linköping University Electronic Press
Linköping University Electronic Press Book Chapter Multi-modal Image Registration Using Polynomial Expansion and Mutual Information Daniel Forsberg, Gunnar Farnebäck, Hans Knutsson and Carl-Fredrik Westin
Software Packages The following data analysis software packages will be showcased:
Analyze This! Practicalities of fmri and Diffusion Data Analysis Data Download Instructions Weekday Educational Course, ISMRM 23 rd Annual Meeting and Exhibition Tuesday 2 nd June 2015, 10:00-12:00, Room
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
Diffusion MRI imaging for brain connectivity analysis
Diffusion MRI imaging for brain connectivity analysis principles and challenges Prof. Jean-Philippe Thiran EPFL - http://lts5www.epfl.ch - [email protected] From Brownian motion to Diffusion MRI - Motivation
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
Segmentation & Clustering
EECS 442 Computer vision Segmentation & Clustering Segmentation in human vision K-mean clustering Mean-shift Graph-cut Reading: Chapters 14 [FP] Some slides of this lectures are courtesy of prof F. Li,
Binary Image Scanning Algorithm for Cane Segmentation
Binary Image Scanning Algorithm for Cane Segmentation Ricardo D. C. Marin Department of Computer Science University Of Canterbury Canterbury, Christchurch [email protected] Tom
LIBSVX and Video Segmentation Evaluation
CVPR 14 Tutorial! 1! LIBSVX and Video Segmentation Evaluation Chenliang Xu and Jason J. Corso!! Computer Science and Engineering! SUNY at Buffalo!! Electrical Engineering and Computer Science! University
CLUSTER ANALYSIS FOR SEGMENTATION
CLUSTER ANALYSIS FOR SEGMENTATION Introduction We all understand that consumers are not all alike. This provides a challenge for the development and marketing of profitable products and services. Not every
Statistical Considerations in Magnetic Resonance Imaging of Brain Function
Statistical Considerations in Magnetic Resonance Imaging of Brain Function Brian D. Ripley Professor of Applied Statistics University of Oxford [email protected] http://www.stats.ox.ac.uk/ ripley Acknowledgements
Clustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is
Clustering 15-381 Artificial Intelligence Henry Lin Modified from excellent slides of Eamonn Keogh, Ziv Bar-Joseph, and Andrew Moore What is Clustering? Organizing data into clusters such that there is
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will
Dirichlet Processes A gentle tutorial
Dirichlet Processes A gentle tutorial SELECT Lab Meeting October 14, 2008 Khalid El-Arini Motivation We are given a data set, and are told that it was generated from a mixture of Gaussian distributions.
Morphological analysis on structural MRI for the early diagnosis of neurodegenerative diseases. Marco Aiello On behalf of MAGIC-5 collaboration
Morphological analysis on structural MRI for the early diagnosis of neurodegenerative diseases Marco Aiello On behalf of MAGIC-5 collaboration Index Motivations of morphological analysis Segmentation of
Semi-Supervised Learning in Inferring Mobile Device Locations
Semi-Supervised Learning in Inferring Mobile Device Locations Rong Duan, Olivia Hong, Guangqin Ma rduan,ohong,gma @research.att.com AT&T Labs 200 S Laurel Ave Middletown, NJ 07748 Abstract With the development
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: [email protected]
Graph Embedding to Improve Supervised Classification and Novel Class Detection: Application to Prostate Cancer
Graph Embedding to Improve Supervised Classification and Novel Class Detection: Application to Prostate Cancer Anant Madabhushi 1, Jianbo Shi 2, Mark Rosen 2, John E. Tomaszeweski 2,andMichaelD.Feldman
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Creation of an Unlimited Database of Virtual Bone. Validation and Exploitation for Orthopedic Devices
Creation of an Unlimited Database of Virtual Bone Population using Mesh Morphing: Validation and Exploitation for Orthopedic Devices Najah Hraiech 1, Christelle Boichon 1, Michel Rochette 1, 2 Thierry
MAGNETIC resonance (MR) image segmentation is fundamental
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 29, NO. 12, DECEMBER 2010 1959 Coupled Nonparametric Shape and Moment-Based Intershape Pose Priors for Multiple Basal Ganglia Structure Segmentation Mustafa Gökhan
IMA Preprint Series # 2233
ODF RECONSTRUCTION IN Q-BALL IMAGING WITH SOLID ANGLE CONSIDERATION By Iman Aganj Christophe Lenglet and Guillermo Sapiro IMA Preprint Series # 2233 ( January 2009 ) INSTITUTE FOR MATHEMATICS AND ITS APPLICATIONS
Multisensor Data Fusion and Applications
Multisensor Data Fusion and Applications Pramod K. Varshney Department of Electrical Engineering and Computer Science Syracuse University 121 Link Hall Syracuse, New York 13244 USA E-mail: [email protected]
COMPARISON OF OBJECT BASED AND PIXEL BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES USING ARTIFICIAL NEURAL NETWORKS
COMPARISON OF OBJECT BASED AND PIXEL BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES USING ARTIFICIAL NEURAL NETWORKS B.K. Mohan and S. N. Ladha Centre for Studies in Resources Engineering IIT
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization Ángela Blanco Universidad Pontificia de Salamanca [email protected] Spain Manuel Martín-Merino Universidad
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Part I: Factorizations and Statistical Modeling/Inference Amnon Shashua School of Computer Science & Eng. The Hebrew University
HT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
QAV-PET: A Free Software for Quantitative Analysis and Visualization of PET Images
QAV-PET: A Free Software for Quantitative Analysis and Visualization of PET Images Brent Foster, Ulas Bagci, and Daniel J. Mollura 1 Getting Started 1.1 What is QAV-PET used for? Quantitative Analysis
TDA and Machine Learning: Better Together
TDA and Machine Learning: Better Together TDA AND MACHINE LEARNING: BETTER TOGETHER 2 TABLE OF CONTENTS The New Data Analytics Dilemma... 3 Introducing Topology and Topological Data Analysis... 3 The Promise
MULTI-LEVEL SEMANTIC LABELING OF SKY/CLOUD IMAGES
MULTI-LEVEL SEMANTIC LABELING OF SKY/CLOUD IMAGES Soumyabrata Dev, Yee Hui Lee School of Electrical and Electronic Engineering Nanyang Technological University (NTU) Singapore 639798 Stefan Winkler Advanced
Stock Trading by Modelling Price Trend with Dynamic Bayesian Networks
Stock Trading by Modelling Price Trend with Dynamic Bayesian Networks Jangmin O 1,JaeWonLee 2, Sung-Bae Park 1, and Byoung-Tak Zhang 1 1 School of Computer Science and Engineering, Seoul National University
Diffeomorphic Diffusion Registration of Lung CT Images
Diffeomorphic Diffusion Registration of Lung CT Images Alexander Schmidt-Richberg, Jan Ehrhardt, René Werner, and Heinz Handels Institute of Medical Informatics, University of Lübeck, Lübeck, Germany,
Neural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm [email protected] Rome, 29
Fast Matching of Binary Features
Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
A Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 [email protected] Abstract The main objective of this project is the study of a learning based method
Model-Based Cluster Analysis for Web Users Sessions
Model-Based Cluster Analysis for Web Users Sessions George Pallis, Lefteris Angelis, and Athena Vakali Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece [email protected]
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA [email protected]
Towards running complex models on big data
Towards running complex models on big data Working with all the genomes in the world without changing the model (too much) Daniel Lawson Heilbronn Institute, University of Bristol 2013 1 / 17 Motivation
Machine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
A Survey of Document Clustering Techniques & Comparison of LDA and movmf
A Survey of Document Clustering Techniques & Comparison of LDA and movmf Yu Xiao December 10, 2010 Abstract This course project is mainly an overview of some widely used document clustering techniques.
MEDIMAGE A Multimedia Database Management System for Alzheimer s Disease Patients
MEDIMAGE A Multimedia Database Management System for Alzheimer s Disease Patients Peter L. Stanchev 1, Farshad Fotouhi 2 1 Kettering University, Flint, Michigan, 48504 USA [email protected] http://www.kettering.edu/~pstanche
Probabilistic user behavior models in online stores for recommender systems
Probabilistic user behavior models in online stores for recommender systems Tomoharu Iwata Abstract Recommender systems are widely used in online stores because they are expected to improve both user
Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach
Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach S. M. Ashraful Kadir 1 and Tazrian Khan 2 1 Scientific Computing, Royal Institute of Technology (KTH), Stockholm, Sweden [email protected],
High-Dimensional Image Warping
Chapter 4 High-Dimensional Image Warping John Ashburner & Karl J. Friston The Wellcome Dept. of Imaging Neuroscience, 12 Queen Square, London WC1N 3BG, UK. Contents 4.1 Introduction.................................
SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs
SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs Fabian Hueske, TU Berlin June 26, 21 1 Review This document is a review report on the paper Towards Proximity Pattern Mining in Large
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
Comparing large datasets structures through unsupervised learning
Comparing large datasets structures through unsupervised learning Guénaël Cabanes and Younès Bennani LIPN-CNRS, UMR 7030, Université de Paris 13 99, Avenue J-B. Clément, 93430 Villetaneuse, France [email protected]
Making the Most of Missing Values: Object Clustering with Partial Data in Astronomy
Astronomical Data Analysis Software and Systems XIV ASP Conference Series, Vol. XXX, 2005 P. L. Shopbell, M. C. Britton, and R. Ebert, eds. P2.1.25 Making the Most of Missing Values: Object Clustering
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
Forschungskolleg Data Analytics Methods and Techniques
Forschungskolleg Data Analytics Methods and Techniques Martin Hahmann, Gunnar Schröder, Phillip Grosse Prof. Dr.-Ing. Wolfgang Lehner Why do we need it? We are drowning in data, but starving for knowledge!
A Computational Framework for Exploratory Data Analysis
A Computational Framework for Exploratory Data Analysis Axel Wismüller Depts. of Radiology and Biomedical Engineering, University of Rochester, New York 601 Elmwood Avenue, Rochester, NY 14642-8648, U.S.A.
Exploratory data analysis for microarray data
Eploratory data analysis for microarray data Anja von Heydebreck Ma Planck Institute for Molecular Genetics, Dept. Computational Molecular Biology, Berlin, Germany [email protected] Visualization
Component Ordering in Independent Component Analysis Based on Data Power
Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals
IN current film media, the increase in areal density has
IEEE TRANSACTIONS ON MAGNETICS, VOL. 44, NO. 1, JANUARY 2008 193 A New Read Channel Model for Patterned Media Storage Seyhan Karakulak, Paul H. Siegel, Fellow, IEEE, Jack K. Wolf, Life Fellow, IEEE, and
GE Medical Systems Training in Partnership. Module 8: IQ: Acquisition Time
Module 8: IQ: Acquisition Time IQ : Acquisition Time Objectives...Describe types of data acquisition modes....compute acquisition times for 2D and 3D scans. 2D Acquisitions The 2D mode acquires and reconstructs
Local outlier detection in data forensics: data mining approach to flag unusual schools
Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential
Image Registration and Fusion. Professor Michael Brady FRS FREng Department of Engineering Science Oxford University
Image Registration and Fusion Professor Michael Brady FRS FREng Department of Engineering Science Oxford University Image registration & information fusion Image Registration: Geometric (and Photometric)
Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca
Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?
Neovision2 Performance Evaluation Protocol
Neovision2 Performance Evaluation Protocol Version 3.0 4/16/2012 Public Release Prepared by Rajmadhan Ekambaram [email protected] Dmitry Goldgof, Ph.D. [email protected] Rangachar Kasturi, Ph.D.
Detection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
Subspace Analysis and Optimization for AAM Based Face Alignment
Subspace Analysis and Optimization for AAM Based Face Alignment Ming Zhao Chun Chen College of Computer Science Zhejiang University Hangzhou, 310027, P.R.China [email protected] Stan Z. Li Microsoft
FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not
Part-Based Recognition
Part-Based Recognition Benedict Brown CS597D, Fall 2003 Princeton University CS 597D, Part-Based Recognition p. 1/32 Introduction Many objects are made up of parts It s presumably easier to identify simple
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
Clarify Some Issues on the Sparse Bayesian Learning for Sparse Signal Recovery
Clarify Some Issues on the Sparse Bayesian Learning for Sparse Signal Recovery Zhilin Zhang and Bhaskar D. Rao Technical Report University of California at San Diego September, Abstract Sparse Bayesian
Machine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
How To Cluster Of Complex Systems
Entropy based Graph Clustering: Application to Biological and Social Networks Edward C Kenley Young-Rae Cho Department of Computer Science Baylor University Complex Systems Definition Dynamically evolving
Visualization by Linear Projections as Information Retrieval
Visualization by Linear Projections as Information Retrieval Jaakko Peltonen Helsinki University of Technology, Department of Information and Computer Science, P. O. Box 5400, FI-0015 TKK, Finland [email protected]
How To Solve The Cluster Algorithm
Cluster Algorithms Adriano Cruz [email protected] 28 de outubro de 2013 Adriano Cruz [email protected] () Cluster Algorithms 28 de outubro de 2013 1 / 80 Summary 1 K-Means Adriano Cruz [email protected]
Advanced Ensemble Strategies for Polynomial Models
Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer
Fast and Robust Variational Optical Flow for High-Resolution Images using SLIC Superpixels
Fast and Robust Variational Optical Flow for High-Resolution Images using SLIC Superpixels Simon Donné, Jan Aelterman, Bart Goossens, and Wilfried Philips iminds-ipi-ugent {Simon.Donne,Jan.Aelterman,Bart.Goossens,philips}@telin.ugent.be
Calculation of Minimum Distances. Minimum Distance to Means. Σi i = 1
Minimum Distance to Means Similar to Parallelepiped classifier, but instead of bounding areas, the user supplies spectral class means in n-dimensional space and the algorithm calculates the distance between
Manifold Learning with Variational Auto-encoder for Medical Image Analysis
Manifold Learning with Variational Auto-encoder for Medical Image Analysis Eunbyung Park Department of Computer Science University of North Carolina at Chapel Hill [email protected] Abstract Manifold
2.2 Creaseness operator
2.2. Creaseness operator 31 2.2 Creaseness operator Antonio López, a member of our group, has studied for his PhD dissertation the differential operators described in this section [72]. He has compared
