Functional Data Analysis of Big Neuroimaging Data


 Kelly Burns
 2 years ago
 Views:
Transcription
1 Functional Data Analysis of Big Neuroimaging Data Hongtu Zhu, Ph.D Department of Biostatistics and Biomedical Research Imaging Center The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA Acknowledgement: Some pictures were copied from multiple resources including Suetens (2009), Fass (2008), Dr. Niethammer, Drs. Lindquist, Rowe Huettel, Wiki, google, etc.
2 Big Neuroimaging Data Is it really big?
3 aims to simulate the complete human brain on Supercomputers to better understand how it functions. The Brain Research through Advancing Innovative Neurotechnologies or BRAIN, aims to reconstruct the activity of every single neuron as they fire simultaneously in different brain circuits, or perhaps even whole brains.
4 Big Neuroimaging Data NIH normal brain development 1000 Functional Connectome Project Alzheimer s Disease Neuroimaging Initiative National Database for Autism Research (NDAR) Human Connectome Project
5 Complex Study Design crosssectional studies; clustered studies including longitudinal and twin/familial studies; ρ 1 5 ρ 5 9 ρ 1 9
6 Complex Data Structure Multivariate Imaging Measures Smooth Functional Imaging Measures Wholebrain Imaging Measures 4DTime Series Imaging Measures
7 Big Data Integration E IS I II IG IS SI G D Selection
8 Models for Big Data Integration ImageonScalar (IS) model Image data as response, clinical variables as predictors. ScalaronImage (SI) model Clinical variables as response, image data as predictors ImageonGenetic (IG) model Image data as response, genetic data as predictors ImageonImage (II) model Image data as response, image data as predictors
9 Noisy Imaging Data Spatial Maps Estimation `Registration Inference `Smoothing Prediction `Correlation `Spatial Heterogeneity
10 ImagingonScalar Regression
11 VBA versus FDA Data {(x i,y i ) :i =1,!, n} Y i = {Y i (d 0 ) : d 0! D 0 } Intrinsic Discrete Approach (VBA) Y i = {Y i (d 0 ) : d 0! D 0 } Intrinsic Functional Approach (FDA) Y i ( ) = {Y i (d) : d! D}
12 n Functional Data Analysis (FDA) y i (d) = x i T B (d)+! i (d)! i ( ) ~ SP(0,!! ) Hotellingtype Test Statistics Y =! Y i / n S Y = (Y i!y )! i=1 Big data Pro: Incorporate spatial smoothness and spatial correlation T n 2 Con: Computational and theoretical difficulties n! (Y i!y ) / n T 2 < Y,u > 2 n = nsup u =1 i=1 < u,s Y u > P(T n 2 =!) =1 S Y!!(S Y ) S Y! diag(s Y )
13 Highdimensional Regression Models Big data Y i = BX i +! i! i ( ) ~ SP(0,!! ) dim(y i ) >> n Hotellingtype Test Statistics T n 2 n n Y =! Y i / n S Y =!(Y i!y )(Y i!y ) T / n T 2 < Y,u > 2 n = nsup u =1 < u,s Y u > i=1 i=1 P(T n 2 =!) =1 (Y,S Y )!!(Y,S Y )???
14 Voxel Based Analysis (VBA) Data {(xi,yi ) : i = 1,!, n} Yi = {Yi (d0 ) : d0! D0 } VBA Stage 0: Gaussian Kernel Smoothing Stage 1: Model Fitting n n! p(yi xi ) =! i=1 i=1! p(y (d ) x,! (d )) i 0 i 0 d"d0 Ignore spatial smoothness Stage 2: Hypothesis Testing H 0 :! (d) =!* (d) for all voxels H1 :! (d)!!* (d) for some voxels Random Field Theory: functional data and local smoothness FDR
15 Cons VBA Potential large smoothing errors. Treat voxels as independent units/images as a collection of independent voxels. Ignore spatial correlation and smoothness in statistical analysis. Inaccurate for both Prediction and Estimation. Decrease statistical power.
16 VBA Bayesian Extensions Bayesian Modeling Spatial smooth prior (MRF) p(!) = p({!(d 0 ) : d 0! D 0 }) n p(! Y )!{" p(y i x i,!)}p(!) = {" " p(y i (d) x i,!(d))}p(!) i=1 i=1 Pro: Computationally straightforward; Bayesian inference based on MCMC samples n d#d 0 Con: Computationally heavy; Lack of understanding for Bayesian inference tools.
17 Varying Coefficient Models Reading materials: 1. Zhu, H. T., Chen, K. H., Yuan, Y. and Wang, J. L. (2014). Functional Mixed Processes Models for Repeated Functional Data. In submission. 2. Liang, J. L., Huang, C., and Zhu, H.T. (2014). Functional singleindex varying coefficient models. In submission. 3. Yuan, Y., Gilmore, J., Geng, X. J., Styner, M., Chen, K. H., Wang, J. L., and Zhu, H.T. (2013). A longitudinal functional analysis framework for analysis of white matter tract statistics. NeuroImage, in press. 4. Yuan, Y., Zhu, H.T., Styner, M., J. H. Gilmore., and Marron, J. S. (2013). Varying coefficient model for modeling diffusion tensors along white matter bundles. Annals of Applied Statistics. 7(1): Zhu, H.T., Li, R. Z., Kong, L.L. (2012). Multivariate varying coefficient models for functional responses. Ann. Stat. 40, Hua, Z.W., Dunson, D., Gilmore, J.H., Styner, M., and Zhu, HT. (2012). Semiparametric Bayesian local functional models for diffusion tensor tract statistics. NeuroImage, 63, Zhu, HT., Kong, L., Li, R., Styner, M., Gerig, G., Lin, W. and Gilmore, J. H. (2011). FADTTS: Functional Analysis of Diffusion Tensor Tract Statistics, NeuroImage, 56, Zhu, H.T., Styner, M., Tang, N.S., Liu, Z.X., Lin, W.L., Gilmore, J.H. (2010). FRATS: functional regression analysis of DTI tract statistics. IEEE Transactions on Medical Imaging, 29, Greven, S., Crainiceanu, C., Caffo, B., Reich, D. (2010). Longitudinal principal component analysis. E.J.Statist. 4, Goodlett, C.B., Fletcher, P. T., Gilmore, J. H., Gerig, G. (2009). Group analysis of dti fiber tract statistics with application to neurodevelopement. NeuroImage, 45, S133S Yushkevich, P. A., Zhang, H., Simon, T., Gee, J. C. (2008). Structurespecific statistical mapping of white matter tracts. NeuroImage, 41, Ramsay, J. O., Silverman, B. W. (2005). Functional Data Analysis, SpringerVerlag, New York.
18 Smoothed Functional Data Covariates (e.g., age, gender, diagnostic)
19 DTI Fiber Tract Data FA Data Diffusion properties (e.g., FA, RA)!! MD Y i (s j ) = (y i,1 (s j ),!, y i,m (s j )) T {s 1,!,s ng } Grids (e) Covariates (e.g., age, gender, diagnostic) x 1,!, x n " 1 " 2 " 3!!!!!!
20 Decomposition: MVCM y i,k (s) = x i T B k (s) + " i,k (s) + # i,k (s) Coefficients Longrange Correlation Shortrange Correlation x 1,!,x n " i,k ( ) ~ SP(0,# " )! i,k ( ) ~ SP(0,!! ), Covariance operator: Zhu, Li, and Kong (2012). AOS! y (s, s') =!! (s, s')+!! (s, s') n{vec( ˆB(d)! B(d)! 0.5O(H 2 L )) : d! D}! "! G(0,! B (d, d '))!
21 Motivation Diffusion Tensor Tract Statistics FA Tensor 2 week 2 week 1 year 1 year 2 year 2 year
22 Longitudinal Data Longitudinal Extensions Spatialtemporal Process t y i (s,t 3 ) y i (s,t 2 ) y i (s,t 1 ) Functional Mixed Effect Models s y i (s,t) = x i (t) T B (s)+ z i (t) T! i (s)+" i (s,t)+# i (s,t) Objectives: Dynamic functional effects of covariates of interest on functional response.
23 Decomposition: FMEM y i (d,t) = x i (t) T B (d)+ z i (t) T! i (d)+! i (d,t)+! i (d,t) Global Noise Components Local Correlated Noise! i (, ) ~ SP(0,!! ), " i ( ) ~ SP(0,!! )! i ( ) ~ SP(0,!! ), n{vec( ˆB(d)! B(d)! 0.5O(H 2 )) : d! D} L! "! G(0,! B (d, d ')) Ying et al. (2014). NeuroImage. Zhu, Chen, Yuan, and Wang (2014). Arxiv.
24 Real Data genu DTImaging parameters: TR/TE = 5200/73 ms Slice thickness = 2mm Inplane resolution = 2x2 mm^2 b = 1000 s/mm^2 One reference scan b = 0 s/mm^2 Repeated 5 times when 6 gradient directions applied.
25 Real Data
26 Real Data Analysis Results
27 Prediction y i,k (s) = f (x i, B k (s)+! i,k (s)) Longrange Correlation +" i,k (s) Smallrange Correlation Missing Big Data??? Hyun,J.W., Li, Y. M., J. H. Gilmore, Z. Lu, M. Styner, H. Zhu (2014) NeuroImage
28 Real Data Prediction Accuracy is much improved
29 Multiscale Adaptive Regression Models Reading materials: 1. Zhu, HT., Fan, J., and Kong, L. (2014). Spatial varying coefficient model and its applications in neuroimaging data with jump discontinuity. JASA, in press. 2. Li, YM, John Gilmore, JA Lin, Shen DG, Martin, S., Weili Lin, and Zhu, HT. (2013). Multiscale adaptive generalized estimating equations for longitudinal neuroimaging data. NeuroImage, 72, Li, YM, John Gilmore, JP Wang, M. Styner, Weili Lin, and Zhu, HT. (2012). Twostage spatial adaptive analysis of twin neuroimaging data. IEEE Transactions on Medical Imaging. 31, Skup, M., Zhu, H.T., and Zhang HP. (2012). Multiscale adaptive marginal analysis of longitudinal neuroimaging data with timevarying covariates. Biometrics, 68(4): Shi, XY, Ibrahim JG, Styner M., Yimei Li, and Zhu, HT. (2011). Twostage adjusted exponential tilted empirical likelihood for neuroimaging data. Annals of Applied Statistics, 5, Li, YM, Zhu HT, Shen DG, Lin WL, Gilmore J, and Ibrahim JG. (2011). Multiscale adaptive regression models for neuroimaging data. JRSS, Series B, 73, Polzehl, Jörg; Voss, Henning U.; Tabelow, Karsten. Structural adaptive segmentation for statistical parametric mapping. NeuroImage, 52 (2010) pp Polzehl, J. and Spokoiny, V. G. (2006). Propagationseparation approach for local likelihood estimation. Probability Theory and Related Fields, 135, J. Polzehl, V. Spokoiny, (2000) Adaptive Weights Smoothing with applications to image restoration, J. R. Stat. Soc. Ser. B Stat. Methodol., 62 pp
30 Piecewise Smooth Data Mathematics. Noisy Piecewise Smooth Functions with Unknown Jumps and Edges Image is the point or set of points in the range corresponding to a designated point in the domain of a given function. " is a compact set. f ( x ) " M # R m x "# $ R k f : " # M $ R m!!! " f (!x) k d!x < # for some k>0!
31 Neuroimaging Data with Discontinuity Noisy Piecewise Smooth Function with Unknown Jumps and Edges Subject1 Subject2 Covariates (e.g., age, gender, diagnostic, stimulus)
32 Decomposition: SVCM y i (d) = f (x i,b(d) +! i (d)) +! i (d),d! D Piecewise Smooth Shortrange Correlation Varying Coefficients Longrange Correlation B(d)! L K Covariance operator: η ij ()~ SP(0, Ση )! y (d,d ') =!! (d,d ') +!! (d,d)! ij ( ) ~ SP(0,!! ), 3D volume/ 2D surface
33 SVCM Cartoon Model Disjoint Partition B k (d) L D = l= 1 Dl and Dl Dl ' = φ Piecewise Smoothness: Lipschitz condition Smoothed Boundary Local Patch Degree of Jumps
34 Challenging Issues Smoothing coefficient images, while preserving unknown boundaries Different patterns in different coefficient images Calculating standard deviation images Asymptotic theory
35 Smoothing Methods PropogationSeperation Method J. Polzehl and V. Spokoiny, (2000,2005) Features Increasing Bandwidth 0 < h < h < < h = S r 0 1 Adaptive Weights 0 Adaptive Estimates
36 Smoothing Methods MARM At each voxel d Increasing Bandwidth Adaptive Weights Adaptive Estimates 0 < h < h < < h r S = 0 ˆµ(d;h 0 ) 1!(d, d ';h 1 )!(d, d ';h 2 ) ˆµ(d;h 1 ) 0!(d, d ';h 2 )!(d, d ';h s ) = K loc ( d! d ' /h s )K st (D µ (d, d ';h s!1 ) / C n ) D µ (d, d ';h s!1 ) = "( ˆµ(d;h s!1 ), ˆµ(d ';h s!1 )) Stopping Rule ˆµ(d;h S )
37 Simulation
38 Simulation
39 Simulation
40 ScalaronImaging Regression
41 ScalaronImage Models Reading materials: 1. Miranda, M. F., Zhu, H.T. and Ibrahim, J. G. (2014). Bayesian partial supervised tensor decomposition with applications in Neuorimaging data analysis. 2. Yang, H., Zhu, H. T., and Ibrahim, J. G. (2014). Multiscale projection model in RKHS. 3. Zhu, H. T. and Shen, D. (2014). Multiscale Weighted PCA for Imaging Prediction. 4. Guo, R.X., Ahye M., and Zhu, H. (2014). Spatially weighted PCA for imaging classification. JCGS. In revision. 5. Zhou, H., Li, L., and Zhu, H. (2013). Tensor regression with applications in Neuorimaging data analysis. JASA. In press. 6. Cuingnet, R., Glaunes, J. A., Chupin, M., Benali, H., Colliot, O., and ADNI. (2012). Spatial and anatomical regularization of SVM: a general framework for neuroimaging data. IEEE PAMI. In press. 7. Cai, T. & Yuan, M. (2012). Minimax and adaptive prediction for functional linear regression. JASA. 107, Fan, J. and Lv, J. (2010). A selective overview of variable selection in highdimensional feature space. Statistica Sinica, 20, Li, B., Kim, M. K., and Altman, N. (2010). On dimension folding of matrix or array valued statistical objects. Ann. Stat., 38, Reiss, P. T., and Ogden, R. T. (2007). Functional principal component regression and functional partial least squares. Journal of the American Statistical Association 102, Bair, E., Hastie, T., Paul, D. and Tibshirani, R. (2006). Prediction by supervised principal components. JASA. 101, Ramsay, J.O. and Silverman, B.W. (2005). Functional Data Analysis, 2nd Edition. Springer, New York. 13.James, G. (2002). Generalized linear models with functional predictor variables. JRSSB. 64, Tibshirani, Robert (1996). Regression shrinkage and selection via the lasso. JRSSB. 58,
42 HRM versus FRM Data {(y i, X i ) :i =1,!, n} X i = {X i (d) : d! D} y i =< X i,! > +" i Strategy 1: Discrete Approach (Highdimension Regression Model (HRM)) Strategy 2: Functional Regression Model (FRM) y i =! 0 +!!(d) X i (d)m(d)+" i D
43 Highdimension Regression Model Approach 1: Regularization Methods Key Conditions: Sparsity of S Restricted nullspace property for design matrix X
44 Highdimension Regression Model Tensor Structure: Ultrahigh dimensionality (256^3) Spatial structure Zhou, Li, and Zhu (2013) Li, Zhou, and Li (2013) CP decomposition Tucker decomposition!
45 ScalaronImage Models Simulations Key Conditions: Tensor Approximation B Restricted space property for X and B
46 ScalaronImage Models ADHD 200
47 ScalaronImage Models Strategy 2: Functional Approach y i =! 0 +!!!(d) X i (d)m(d)+" i D y i =! 0 + "! # k " k (d) X i (d)m(d)+# i k=1 D!!(d) = "! k " k (d) Basis Methods: fixed and datadriven basis functions! K! = {!(d) = "! k " k (d) : (! 1,!) # " 2! } C(d, d ') = Cov(X(d), X(d ')) =! k " k (d)" k (d ') k=1 k=1 " k=1
48 Key Conditions Key Conditions: an excellent set of basis functions Sparsity of {! k : k =1,!} Decay rate of spectral of C(d, d ') = Cov(X(d), X(d '))!(d)! K "! k " k (d) k=1 K << n
49 The ADNI data Mild Cognitive Impairment subjects Interested in predicting the timing of an MCI patient that converts to AD by integrating the imaging data, the clinical variables, and genetic covariates. Full Model: AUC=0.96 Partial Model: AUC=0.82
50 Limitations Is this the right space for statistical inference? {X i (d) : d! D} {!! k (d) X i (d)m(d) : d " D} D
51 Rabbit and Wolf Story {y i :i =1,!, n} y i = f 0 (X i )+! i X i = {X i (d) : d! D} y i = f (G i )+! i G i = G{!X i (d) : d! D 0 }!X i = {! X i (d) : d! D 0 }
52 Feature Space Determination X i = {X i (d) : d! D}!X i = {!X i (d) : d! D 0 } G i = G{! X i (d) : d! D 0 } Splitting/Weighting methods Level sets Factor Models y i =! 0 + "!!(d) X i (d)m(d) +" i y i =! 0 +!!(d) w(d)x i (d)m(d)+" i k Splitting D k D Weighting
53 Spatially Weighted PCA
54 Spatially Weighted PCA
55 Multiscale Factor Prediction Model
56 Multiscale Factor Prediction Model
57 Multiscale Factor Prediction Model
58 ASA: Statistics in Imaging Section SAMSI 2013 Neuroimaging Data Analysis Challenges in Computational Neuroscience
59 Acknowledgement
Big Data Integration in Biomedical Studies: A Few Personal Views and Experiences
Big Data Integration in Biomedical Studies: A Few Personal Views and Experiences Hongtu Zhu, Ph.D Department of Biostatistics and Biomedical Research Imaging Center The University of North Carolina at
More informationBig Data Integration in Biomedical Studies
Big Data Integration in Biomedical Studies Hongtu Zhu, Ph.D Department of Biostatistics and Biomedical Research Imaging Center The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
More informationBayesian Penalized Methods for High Dimensional Data
Bayesian Penalized Methods for High Dimensional Data Joseph G. Ibrahim Joint with Hongtu Zhu and Zakaria Khondker What is Covered? Motivation GLRR: Bayesian Generalized Low Rank Regression L2R2: Bayesian
More informationPublication List. Chen Zehua Department of Statistics & Applied Probability National University of Singapore
Publication List Chen Zehua Department of Statistics & Applied Probability National University of Singapore Publications Journal Papers 1. Y. He and Z. Chen (2014). A sequential procedure for feature selection
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002Topics in StatisticsBiological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationLasso on Categorical Data
Lasso on Categorical Data Yunjin Choi, Rina Park, Michael Seo December 14, 2012 1 Introduction In social science studies, the variables of interest are often categorical, such as race, gender, and nationality.
More informationFunctional Data Analysis of MALDI TOF Protein Spectra
Functional Data Analysis of MALDI TOF Protein Spectra Dean Billheimer dean.billheimer@vanderbilt.edu. Department of Biostatistics Vanderbilt University Vanderbilt Ingram Cancer Center FDA for MALDI TOF
More informationNeuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease
1 Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease The following contains a summary of the content of the neuroimaging module I on the postgraduate
More informationImproved Fuzzy Cmeans Clustering Algorithm Based on Cluster Density
Journal of Computational Information Systems 8: 2 (2012) 727 737 Available at http://www.jofcis.com Improved Fuzzy Cmeans Clustering Algorithm Based on Cluster Density Xiaojun LOU, Junying LI, Haitao
More informationData Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a
More informationThe Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
More informationUnsupervised and supervised dimension reduction: Algorithms and connections
Unsupervised and supervised dimension reduction: Algorithms and connections Jieping Ye Department of Computer Science and Engineering Evolutionary Functional Genomics Center The Biodesign Institute Arizona
More informationDR. YUEXIAO DONG. Department of Statistics Fox School of Business Temple University 1810 N. 13th Street Philadelphia, PA 19122
Department of Statistics Fox School of Business Temple University 1810 N. 13th Street Philadelphia, PA 19122 Phone: (1)2152040670 Email: ydong@temple.edu http://astro.temple.edu/ ydong/index.html Employment
More informationMapping Brain Changes Over Time during Development: Challenges, Limits and Potential
Mapping Brain Changes Over Time during Development: Challenges, Limits and Potential Guido Gerig University of Utah Scientific Computing and Imaging (SCI) Institute Gerig 092008 Outline 1. Imaging Technology
More informationBIG DATA AND HIGH DIMENSIONAL DATA ANALYSIS
BIG DATA AND HIGH DIMENSIONAL DATA ANALYSIS B.L.S. Prakasa Rao CR RAO ADVANCED INSTITUTE OF MATHEMATICS, STATISTICS AND COMPUTER SCIENCE (AIMSCS) University of Hyderabad Campus GACHIBOWLI, HYDERABAD 500046
More informationContribute 22 Classification of EEG recordings in auditory brain activity via a logistic functional linear regression model
Contribute 22 Classification of EEG recordings in auditory brain activity via a logistic functional linear regression model Irène Gannaz Abstract We want to analyse EEG recordings in order to investigate
More informationNonparametric Time Series Analysis: A review of Peter Lewis contributions to the field
Nonparametric Time Series Analysis: A review of Peter Lewis contributions to the field Bonnie Ray IBM T. J. Watson Research Center Joint Statistical Meetings 2012 Outline Background My connection to Peter
More informationIntroduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones
Introduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation
More informationTwo Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
More informationBayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More informationThe Wondrous World of fmri statistics
Outline The Wondrous World of fmri statistics FMRI data and Statistics course, Leiden, 1132008 The General Linear Model Overview of fmri data analysis steps fmri timeseries Modeling effects of interest
More informationEmpowering Anatomical Shape Analysis with Medial Curves and 1D GroupWise Registration
Empowering Anatomical Shape Analysis with Medial Curves and 1D GroupWise Registration Abstract No: 5943 Authors: Boris Gut man 1, Yalin Wang 2, Priya Rajagopalan 3, Arthur Toga 4, Paul Thompson 5 Institutions:
More informationLABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING. Changsheng Liu 10302014
LABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING Changsheng Liu 10302014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
More informationApplications to Data Smoothing and Image Processing I
Applications to Data Smoothing and Image Processing I MA 348 Kurt Bryan Signals and Images Let t denote time and consider a signal a(t) on some time interval, say t. We ll assume that the signal a(t) is
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation:  Feature vector X,  qualitative response Y, taking values in C
More informationA GAUSSIAN COMPOUND DECISION BAKEOFF
A GAUSSIAN COMPOUND DECISION BAKEOFF ROGER KOENKER Abstract. A nonparametric mixture model approach to empirical Bayes compound decisions for the Gaussian location model is compared with a parametric empirical
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network?  Perceptron learners  Multilayer networks What is a Support
More informationPrinciple Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate
More informationBayesX  Software for Bayesian Inference in Structured Additive Regression
BayesX  Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, LudwigMaximiliansUniversity Munich
More informationExact Inference for Gaussian Process Regression in case of Big Data with the Cartesian Product Structure
Exact Inference for Gaussian Process Regression in case of Big Data with the Cartesian Product Structure Belyaev Mikhail 1,2,3, Burnaev Evgeny 1,2,3, Kapushev Yermek 1,2 1 Institute for Information Transmission
More informationBootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
More informationStructural brain imaging is a window on postnatal brain development in children. Terry Jernigan
Structural brain imaging is a window on postnatal brain development in children Terry Jernigan Old (but still widely held) View of Human Brain Structure All neurons are present at birth. Myelination of
More informationOverview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models
Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear
More informationAdvanced Spatial Statistics Fall 2012 NCSU. Fuentes Lecture notes
Advanced Spatial Statistics Fall 2012 NCSU Fuentes Lecture notes Areal unit data 2 Areal Modelling Areal unit data Key Issues Is there spatial pattern? Spatial pattern implies that observations from units
More informationLassobased Spam Filtering with Chinese Emails
Journal of Computational Information Systems 8: 8 (2012) 3315 3322 Available at http://www.jofcis.com Lassobased Spam Filtering with Chinese Emails Zunxiong LIU 1, Xianlong ZHANG 1,, Shujuan ZHENG 2 1
More informationCancer Biostatistics Workshop Science of Doing Science  Biostatistics
Cancer Biostatistics Workshop Science of Doing Science  Biostatistics Yu Shyr, PhD Jan. 18, 2008 Cancer Biostatistics Center VanderbiltIngram Cancer Center Yu.Shyr@vanderbilt.edu Aims Cancer Biostatistics
More informationSupplementary Material: Covariateadjusted matrix visualization via correlation decomposition
Supplementary Material: Covariateadjusted matrix visualization via correlation decomposition HanMing Wu 1, YinJing Tien 2, MengRu Ho 3,4,5, HaiGwo Hwu 6, Wenchang Lin 5, MiHua Tao 5, and ChunHouh
More informationWorkshop on current challenges in statistical learning (11w5051)
Workshop on current challenges in statistical learning (11w5051) Hugh Chipman (Acadia University) Robert Tibshirani (Stanford University) Ji Zhu (University of Michigan) Xiaotong Shen (University of Minnesota)
More informationClass #6: Nonlinear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Nonlinear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Nonlinear classification Linear Support Vector Machines
More informationOn the degrees of freedom in shrinkage estimation
On the degrees of freedom in shrinkage estimation Kengo Kato Graduate School of Economics, University of Tokyo, 731 Hongo, Bunkyoku, Tokyo, 1130033, Japan kato ken@hkg.odn.ne.jp October, 2007 Abstract
More informationOneshot learning and big data with n = 2
Oneshot learning and big data with n = Lee H. Dicker Rutgers University Piscataway, NJ ldicker@stat.rutgers.edu Dean P. Foster University of Pennsylvania Philadelphia, PA dean@foster.net Abstract We model
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationHT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
More informationDTI Fiber TractOriented Quantitative and Visual Analysis of White Matter Integrity
DTI Fiber TractOriented Quantitative and Visual Analysis of White Matter Integrity Xuwei Liang, Ning Cao, and Jun Zhang Department of Computer Science, University of Kentucky, USA, jzhang@cs.uky.edu.
More informationStatistical issues in the analysis of microarray data
Statistical issues in the analysis of microarray data Daniel Gerhard Institute of Biostatistics Leibniz University of Hannover ESNATS Summerschool, Zermatt D. Gerhard (LUH) Analysis of microarray data
More informationDATA SCIENCE Workshop November 1213, 2015
DATA SCIENCE Workshop November 1213, 2015 ParisDauphine University Place du Maréchal de Lattre de Tassigny, 75016 Paris DATA SCIENCE Workshop will be held at ParisDauphine university, on November 12
More informationAcknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues
Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the
More informationYerevan State University, Yerevan, Armenia B.S., Applied Mathematics and Computer Science, 2004
Ani Eloyan Contact Information Department of Biostatistics Bloomberg School of Public Health Johns Hopkins University 615 North Wolfe Street, E3640 Baltimore, MD, 21205 Email: aeloyan1@jhu.edu Phone: 4105022988
More informationA New Sparse Simplex Model for Brain Anatomical and Genetic Network Analysis
A New Sparse Simplex Model for Brain Anatomical and Genetic Network Analysis Heng Huang 1, Jingwen Yan 2, Feiping Nie 1, Jin Huang 1, Weidong Cai 3, Andrew J. Saykin 2, and Li Shen 2 1 Computer Science
More informationDetection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
More informationNonnegative Matrix Factorization (NMF) in Semisupervised Learning Reducing Dimension and Maintaining Meaning
Nonnegative Matrix Factorization (NMF) in Semisupervised Learning Reducing Dimension and Maintaining Meaning SAMSI 10 May 2013 Outline Introduction to NMF Applications Motivations NMF as a middle step
More informationBig Data, Statistics, and the Internet
Big Data, Statistics, and the Internet Steven L. Scott April, 4 Steve Scott (Google) Big Data, Statistics, and the Internet April, 4 / 39 Summary Big data live on more than one machine. Computing takes
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationEffective Linear Discriminant Analysis for High Dimensional, Low Sample Size Data
Effective Linear Discriant Analysis for High Dimensional, Low Sample Size Data Zhihua Qiao, Lan Zhou and Jianhua Z. Huang Abstract In the socalled high dimensional, low sample size (HDLSS) settings, LDA
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationJournal of Statistical Software
JSS Journal of Statistical Software August 26, Volume 16, Issue 8. http://www.jstatsoft.org/ Some Algorithms for the Conditional Mean Vector and Covariance Matrix John F. Monahan North Carolina State University
More informationA Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under TwoLevel Models
A Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under TwoLevel Models Grace Y. Yi 13, JNK Rao 2 and Haocheng Li 1 1. University of Waterloo, Waterloo, Canada
More informationIterative Solvers for Linear Systems
9th SimLab Course on Parallel Numerical Simulation, 4.10 8.10.2010 Iterative Solvers for Linear Systems Bernhard Gatzhammer Chair of Scientific Computing in Computer Science Technische Universität München
More informationStatistical Learning THE TENTH EUGENE LUKACS SYMPOSIUM DEPARTMENT OF MATHEMATICS AND STATISTICS BOWLING GREEN STATE UNIVERSITY
Statistical Learning THE TENTH EUGENE LUKACS SYMPOSIUM DEPARTMENT OF MATHEMATICS AND STATISTICS BOWLING GREEN STATE UNIVERSITY MAY 14, 2016 The 10th Eugene Lukacs Symposium Program Schedule (All symposium
More informationConservation, Constraints, and Comparisons
Conservation, Constraints, and Comparisons Dani S. Bassett Department of Physics University of California Santa Barbara Complexity in the Human Brain The human brain is complex over multiple scales of
More informationNON PARAMETRIC MODELING OF STOCK INDEX
National Journal on Advances in Computing and Management, Vol. 1, No.2, October 2010 49 Abstract NON PARAMETRIC MODELING OF STOCK INDEX Sujatha K.V. 1 and Meenakshi Sundaram S. 2 1 Research Scholar, Sathyabama
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More informationSyllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015
Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Lecture: MWF: 1:001:50pm, GEOLOGY 4645 Instructor: Mihai
More informationMachine Learning in FX Carry Basket Prediction
Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines
More informationShape Feature Extraction of Wheat Leaf Disease based on Invariant Moment Theory
Shape Feature Extraction of Wheat Leaf Disease based on Invariant Moment Theory Zhihua Diao 1,1, Anping Zheng 1, Yuanyuan Wu 1 1 Zhengzhou University of Light Industry, College of Electric and Information
More informationList of Ph.D. Courses
Research Methods Courses (5 courses/15 hours) List of Ph.D. Courses The research methods set consists of five courses (15 hours) that discuss the process of research and key methodological issues encountered
More informationData analysis in supersaturated designs
Statistics & Probability Letters 59 (2002) 35 44 Data analysis in supersaturated designs Runze Li a;b;, Dennis K.J. Lin a;b a Department of Statistics, The Pennsylvania State University, University Park,
More informationUniversity of California San Francisco, CA, USA 4 Department of Veterans Affairs Medical Center, San Francisco, CA, USA
Disrupted Brain Connectivity in Alzheimer s Disease: Effects of Network Thresholding Madelaine Daianu 1, Emily L. Dennis 1, Neda Jahanshad 1, Talia M. Nir 1, Arthur W. Toga 1, Clifford R. Jack, Jr. 2,
More informationNetwork Analysis I & II
Network Analysis I & II Dani S. Bassett Department of Physics University of California Santa Barbara Outline Lecture One: 1. Complexity in the Human Brain from processes to patterns 2. Graph Theory and
More informationMedical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute andek@vtc.vt.
Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute andek@vtc.vt.edu Outline Motivation why do we need GPUs? Past  how was GPU programming
More informationChallenges in Medical Imaging
Challenges in Medical Imaging PhD Candidate Faculty of Engineering and Natural Sciences TÜBITAK 112E320 (20132016): New computational mathematical methods in preand postoperative brainstem white matter
More informationExamples. David Ruppert. April 25, 2009. Cornell University. Statistics for Financial Engineering: Some R. Examples. David Ruppert.
Cornell University April 25, 2009 Outline 1 2 3 4 A little about myself BA and MA in mathematics PhD in statistics in 1977 taught in the statistics department at North Carolina for 10 years have been in
More informationSVMBased Approaches for Predictive Modeling of Survival Data
SVMBased Approaches for Predictive Modeling of Survival Data HanTai Shiao and Vladimir Cherkassky Department of Electrical and Computer Engineering University of Minnesota, Twin Cities Minneapolis, Minnesota
More informationTreelets A Tool for Dimensionality Reduction and MultiScale Analysis of Unstructured Data
Treelets A Tool for Dimensionality Reduction and MultiScale Analysis of Unstructured Data Ann B Lee Department of Statistics Carnegie Mellon University Pittsburgh, PA 56 annlee@statcmuedu Abstract In
More informationU.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540
U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540 ENTERPRISE FINANCIAL DISTRESS PREDICTION BASED ON BACKWARD PROPAGATION NEURAL NETWORK: AN EMPIRICAL STUDY ON THE CHINESE LISTED EQUIPMENT
More informationPrinciples of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
More informationNonlinear Regression in Parametric Activation Studies
NEUROIMAGE 4, 60 66 (1996) ARTICLE NO. 0029 Nonlinear Regression in Parametric Activation Studies C. BÜCHEL, R. J. S. WISE, C. J. MUMMERY, J.B. POLINE, AND K. J. FRISTON The Wellcome Department of Cognitive
More informationThe fastclime Package for Linear Programming and LargeScale Precision Matrix Estimation in R
Journal of Machine Learning Research 15 (2014) 489493 Submitted 3/13; Revised 8/13; Published 2/14 The fastclime Package for Linear Programming and LargeScale Precision Matrix Estimation in R Haotian
More informationIdentifying Groupwise Consistent White Matter Landmarks via Novel Fiber Shape Descriptor
Identifying Groupwise Consistent White Matter Landmarks via Novel Fiber Shape Descriptor Hanbo Chen, Tuo Zhang, Tianming Liu the University of Georgia, US. Northwestern Polytechnical University, China.
More informationEfficient variable selection in support vector machines via the alternating direction method of multipliers
Efficient variable selection in support vector machines via the alternating direction method of multipliers GuiBo Ye Yifei Chen Xiaohui Xie University of California, Irvine University of California, Irvine
More informationAssessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall
Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin
More informationStatistical Considerations in Magnetic Resonance Imaging of Brain Function
Statistical Considerations in Magnetic Resonance Imaging of Brain Function Brian D. Ripley Professor of Applied Statistics University of Oxford ripley@stats.ox.ac.uk http://www.stats.ox.ac.uk/ ripley Acknowledgements
More informationComparing Functional Data Analysis Approach and Nonparametric MixedEffects Modeling Approach for Longitudinal Data Analysis
Comparing Functional Data Analysis Approach and Nonparametric MixedEffects Modeling Approach for Longitudinal Data Analysis Hulin Wu, PhD, Professor (with Dr. Shuang Wu) Department of Biostatistics &
More informationHANDLING DROPOUT AND WITHDRAWAL IN LONGITUDINAL CLINICAL TRIALS
HANDLING DROPOUT AND WITHDRAWAL IN LONGITUDINAL CLINICAL TRIALS Mike Kenward London School of Hygiene and Tropical Medicine Acknowledgements to James Carpenter (LSHTM) Geert Molenberghs (Universities of
More informationComparison of Nonlinear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Nonlinear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Nonlinear
More informationFrom Sparse Approximation to Forecast of Intraday Load Curves
From Sparse Approximation to Forecast of Intraday Load Curves Mathilde Mougeot Joint work with D. Picard, K. Tribouley (P7)& V. Lefieux, L. TeyssierMaillard (RTE) 1/43 Electrical Consumption Time series
More informationMachine learning for Neuroimaging: an introduction
Machine learning for Neuroimaging: an introduction Bertrand Thirion, INRIA Saclay Île de France, Parietal team http://parietal.saclay.inria.fr bertrand.thirion@inria.fr 1 Outline Neuroimaging in 5' Some
More information4. GPCRs PREDICTION USING GREY INCIDENCE DEGREE MEASURE AND PRINCIPAL COMPONENT ANALYIS
4. GPCRs PREDICTION USING GREY INCIDENCE DEGREE MEASURE AND PRINCIPAL COMPONENT ANALYIS The GPCRs sequences are made up of amino acid polypeptide chains. We can also call them sub units. The number and
More informationrunl I IUI%I/\L Magnetic Resonance Imaging
runl I IUI%I/\L Magnetic Resonance Imaging SECOND EDITION Scott A. HuetteS Brain Imaging and Analysis Center, Duke University Allen W. Song Brain Imaging and Analysis Center, Duke University Gregory McCarthy
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.
Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationIntroduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011
Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning
More informationBLOCKWISE SPARSE REGRESSION
Statistica Sinica 16(2006), 375390 BLOCKWISE SPARSE REGRESSION Yuwon Kim, Jinseog Kim and Yongdai Kim Seoul National University Abstract: Yuan an Lin (2004) proposed the grouped LASSO, which achieves
More informationGraduate Programs in Statistics
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More informationL10: Probability, statistics, and estimation theory
L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian
More informationA Survey of Classification Techniques in the Area of Big Data.
A Survey of Classification Techniques in the Area of Big Data. 1PrafulKoturwar, 2 SheetalGirase, 3 Debajyoti Mukhopadhyay 1Reseach Scholar, Department of Information Technology 2Assistance Professor,Department
More informationFeature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier
Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract  This paper presents,
More informationMachine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
More information