Functional Data Analysis of Big Neuroimaging Data
|
|
|
- Kelly Burns
- 10 years ago
- Views:
Transcription
1 Functional Data Analysis of Big Neuroimaging Data Hongtu Zhu, Ph.D Department of Biostatistics and Biomedical Research Imaging Center The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA Acknowledgement: Some pictures were copied from multiple resources including Suetens (2009), Fass (2008), Dr. Niethammer, Drs. Lindquist, Rowe Huettel, Wiki, google, etc.
2 Big Neuroimaging Data Is it really big?
3 aims to simulate the complete human brain on Supercomputers to better understand how it functions. The Brain Research through Advancing Innovative Neurotechnologies or BRAIN, aims to reconstruct the activity of every single neuron as they fire simultaneously in different brain circuits, or perhaps even whole brains.
4 Big Neuroimaging Data NIH normal brain development 1000 Functional Connectome Project Alzheimer s Disease Neuroimaging Initiative National Database for Autism Research (NDAR) Human Connectome Project
5 Complex Study Design cross-sectional studies; clustered studies including longitudinal and twin/familial studies; ρ 1 5 ρ 5 9 ρ 1 9
6 Complex Data Structure Multivariate Imaging Measures Smooth Functional Imaging Measures Whole-brain Imaging Measures 4D-Time Series Imaging Measures
7 Big Data Integration E IS I II IG IS SI G D Selection
8 Models for Big Data Integration Image-on-Scalar (IS) model Image data as response, clinical variables as predictors. Scalar-on-Image (SI) model Clinical variables as response, image data as predictors Image-on-Genetic (IG) model Image data as response, genetic data as predictors Image-on-Image (II) model Image data as response, image data as predictors
9 Noisy Imaging Data Spatial Maps Estimation `Registration Inference `Smoothing Prediction `Correlation `Spatial Heterogeneity
10 Imaging-on-Scalar Regression
11 VBA versus FDA Data {(x i,y i ) :i =1,!, n} Y i = {Y i (d 0 ) : d 0! D 0 } Intrinsic Discrete Approach (VBA) Y i = {Y i (d 0 ) : d 0! D 0 } Intrinsic Functional Approach (FDA) Y i ( ) = {Y i (d) : d! D}
12 n Functional Data Analysis (FDA) y i (d) = x i T B (d)+! i (d)! i ( ) ~ SP(0,!! ) Hotelling-type Test Statistics Y =! Y i / n S Y = (Y i!y )! i=1 Big data Pro: Incorporate spatial smoothness and spatial correlation T n 2 Con: Computational and theoretical difficulties n! (Y i!y ) / n T 2 < Y,u > 2 n = nsup u =1 i=1 < u,s Y u > P(T n 2 =!) =1 S Y!!(S Y ) S Y! diag(s Y )
13 High-dimensional Regression Models Big data Y i = BX i +! i! i ( ) ~ SP(0,!! ) dim(y i ) >> n Hotelling-type Test Statistics T n 2 n n Y =! Y i / n S Y =!(Y i!y )(Y i!y ) T / n T 2 < Y,u > 2 n = nsup u =1 < u,s Y u > i=1 i=1 P(T n 2 =!) =1 (Y,S Y )!!(Y,S Y )???
14 Voxel Based Analysis (VBA) Data {(xi,yi ) : i = 1,!, n} Yi = {Yi (d0 ) : d0! D0 } VBA Stage 0: Gaussian Kernel Smoothing Stage 1: Model Fitting n n! p(yi xi ) =! i=1 i=1! p(y (d ) x,! (d )) i 0 i 0 d"d0 Ignore spatial smoothness Stage 2: Hypothesis Testing H 0 :! (d) =!* (d) for all voxels H1 :! (d)!!* (d) for some voxels Random Field Theory: functional data and local smoothness FDR
15 Cons VBA Potential large smoothing errors. Treat voxels as independent units/images as a collection of independent voxels. Ignore spatial correlation and smoothness in statistical analysis. Inaccurate for both Prediction and Estimation. Decrease statistical power.
16 VBA Bayesian Extensions Bayesian Modeling Spatial smooth prior (MRF) p(!) = p({!(d 0 ) : d 0! D 0 }) n p(! Y )!{" p(y i x i,!)}p(!) = {" " p(y i (d) x i,!(d))}p(!) i=1 i=1 Pro: Computationally straightforward; Bayesian inference based on MCMC samples n d#d 0 Con: Computationally heavy; Lack of understanding for Bayesian inference tools.
17 Varying Coefficient Models Reading materials: 1. Zhu, H. T., Chen, K. H., Yuan, Y. and Wang, J. L. (2014). Functional Mixed Processes Models for Repeated Functional Data. In submission. 2. Liang, J. L., Huang, C., and Zhu, H.T. (2014). Functional single-index varying coefficient models. In submission. 3. Yuan, Y., Gilmore, J., Geng, X. J., Styner, M., Chen, K. H., Wang, J. L., and Zhu, H.T. (2013). A longitudinal functional analysis framework for analysis of white matter tract statistics. NeuroImage, in press. 4. Yuan, Y., Zhu, H.T., Styner, M., J. H. Gilmore., and Marron, J. S. (2013). Varying coefficient model for modeling diffusion tensors along white matter bundles. Annals of Applied Statistics. 7(1): Zhu, H.T., Li, R. Z., Kong, L.L. (2012). Multivariate varying coefficient models for functional responses. Ann. Stat. 40, Hua, Z.W., Dunson, D., Gilmore, J.H., Styner, M., and Zhu, HT. (2012). Semiparametric Bayesian local functional models for diffusion tensor tract statistics. NeuroImage, 63, Zhu, HT., Kong, L., Li, R., Styner, M., Gerig, G., Lin, W. and Gilmore, J. H. (2011). FADTTS: Functional Analysis of Diffusion Tensor Tract Statistics, NeuroImage, 56, Zhu, H.T., Styner, M., Tang, N.S., Liu, Z.X., Lin, W.L., Gilmore, J.H. (2010). FRATS: functional regression analysis of DTI tract statistics. IEEE Transactions on Medical Imaging, 29, Greven, S., Crainiceanu, C., Caffo, B., Reich, D. (2010). Longitudinal principal component analysis. E.J.Statist. 4, Goodlett, C.B., Fletcher, P. T., Gilmore, J. H., Gerig, G. (2009). Group analysis of dti fiber tract statistics with application to neurodevelopement. NeuroImage, 45, S133-S Yushkevich, P. A., Zhang, H., Simon, T., Gee, J. C. (2008). Structure-specific statistical mapping of white matter tracts. NeuroImage, 41, Ramsay, J. O., Silverman, B. W. (2005). Functional Data Analysis, Springer-Verlag, New York.
18 Smoothed Functional Data Covariates (e.g., age, gender, diagnostic)
19 DTI Fiber Tract Data FA Data Diffusion properties (e.g., FA, RA)!! MD Y i (s j ) = (y i,1 (s j ),!, y i,m (s j )) T {s 1,!,s ng } Grids (e) Covariates (e.g., age, gender, diagnostic) x 1,!, x n " 1 " 2 " 3!!!!!!
20 Decomposition: MVCM y i,k (s) = x i T B k (s) + " i,k (s) + # i,k (s) Coefficients Long-range Correlation Short-range Correlation x 1,!,x n " i,k ( ) ~ SP(0,# " )! i,k ( ) ~ SP(0,!! ), Covariance operator: Zhu, Li, and Kong (2012). AOS! y (s, s') =!! (s, s')+!! (s, s') n{vec( ˆB(d)! B(d)! 0.5O(H 2 L )) : d! D}! "! G(0,! B (d, d '))!
21 Motivation Diffusion Tensor Tract Statistics FA Tensor 2 week 2 week 1 year 1 year 2 year 2 year
22 Longitudinal Data Longitudinal Extensions Spatial-temporal Process t y i (s,t 3 ) y i (s,t 2 ) y i (s,t 1 ) Functional Mixed Effect Models s y i (s,t) = x i (t) T B (s)+ z i (t) T! i (s)+" i (s,t)+# i (s,t) Objectives: Dynamic functional effects of covariates of interest on functional response.
23 Decomposition: FMEM y i (d,t) = x i (t) T B (d)+ z i (t) T! i (d)+! i (d,t)+! i (d,t) Global Noise Components Local Correlated Noise! i (, ) ~ SP(0,!! ), " i ( ) ~ SP(0,!! )! i ( ) ~ SP(0,!! ), n{vec( ˆB(d)! B(d)! 0.5O(H 2 )) : d! D} L! "! G(0,! B (d, d ')) Ying et al. (2014). NeuroImage. Zhu, Chen, Yuan, and Wang (2014). Arxiv.
24 Real Data genu DTImaging parameters: TR/TE = 5200/73 ms Slice thickness = 2mm In-plane resolution = 2x2 mm^2 b = 1000 s/mm^2 One reference scan b = 0 s/mm^2 Repeated 5 times when 6 gradient directions applied.
25 Real Data
26 Real Data Analysis Results
27 Prediction y i,k (s) = f (x i, B k (s)+! i,k (s)) Long-range Correlation +" i,k (s) Small-range Correlation Missing Big Data??? Hyun,J.W., Li, Y. M., J. H. Gilmore, Z. Lu, M. Styner, H. Zhu (2014) NeuroImage
28 Real Data Prediction Accuracy is much improved
29 Multiscale Adaptive Regression Models Reading materials: 1. Zhu, HT., Fan, J., and Kong, L. (2014). Spatial varying coefficient model and its applications in neuroimaging data with jump discontinuity. JASA, in press. 2. Li, YM, John Gilmore, JA Lin, Shen DG, Martin, S., Weili Lin, and Zhu, HT. (2013). Multiscale adaptive generalized estimating equations for longitudinal neuroimaging data. NeuroImage, 72, Li, YM, John Gilmore, JP Wang, M. Styner, Weili Lin, and Zhu, HT. (2012). Two-stage spatial adaptive analysis of twin neuroimaging data. IEEE Transactions on Medical Imaging. 31, Skup, M., Zhu, H.T., and Zhang HP. (2012). Multiscale adaptive marginal analysis of longitudinal neuroimaging data with time-varying covariates. Biometrics, 68(4): Shi, XY, Ibrahim JG, Styner M., Yimei Li, and Zhu, HT. (2011). Two-stage adjusted exponential tilted empirical likelihood for neuroimaging data. Annals of Applied Statistics, 5, Li, YM, Zhu HT, Shen DG, Lin WL, Gilmore J, and Ibrahim JG. (2011). Multiscale adaptive regression models for neuroimaging data. JRSS, Series B, 73, Polzehl, Jörg; Voss, Henning U.; Tabelow, Karsten. Structural adaptive segmentation for statistical parametric mapping. NeuroImage, 52 (2010) pp Polzehl, J. and Spokoiny, V. G. (2006). Propagation-separation approach for local likelihood estimation. Probability Theory and Related Fields, 135, J. Polzehl, V. Spokoiny, (2000) Adaptive Weights Smoothing with applications to image restoration, J. R. Stat. Soc. Ser. B Stat. Methodol., 62 pp
30 Piecewise Smooth Data Mathematics. Noisy Piecewise Smooth Functions with Unknown Jumps and Edges Image is the point or set of points in the range corresponding to a designated point in the domain of a given function. " is a compact set. f ( x ) " M # R m x "# $ R k f : " # M $ R m!!! " f (!x) k d!x < # for some k>0!
31 Neuroimaging Data with Discontinuity Noisy Piecewise Smooth Function with Unknown Jumps and Edges Subject1 Subject2 Covariates (e.g., age, gender, diagnostic, stimulus)
32 Decomposition: SVCM y i (d) = f (x i,b(d) +! i (d)) +! i (d),d! D Piecewise Smooth Short-range Correlation Varying Coefficients Long-range Correlation B(d)! L K Covariance operator: η ij ()~ SP(0, Ση )! y (d,d ') =!! (d,d ') +!! (d,d)! ij ( ) ~ SP(0,!! ), 3D volume/ 2D surface
33 SVCM Cartoon Model Disjoint Partition B k (d) L D = l= 1 Dl and Dl Dl ' = φ Piecewise Smoothness: Lipschitz condition Smoothed Boundary Local Patch Degree of Jumps
34 Challenging Issues Smoothing coefficient images, while preserving unknown boundaries Different patterns in different coefficient images Calculating standard deviation images Asymptotic theory
35 Smoothing Methods Propogation-Seperation Method J. Polzehl and V. Spokoiny, (2000,2005) Features Increasing Bandwidth 0 < h < h < < h = S r 0 1 Adaptive Weights 0 Adaptive Estimates
36 Smoothing Methods MARM At each voxel d Increasing Bandwidth Adaptive Weights Adaptive Estimates 0 < h < h < < h r S = 0 ˆµ(d;h 0 ) 1!(d, d ';h 1 )!(d, d ';h 2 ) ˆµ(d;h 1 ) 0!(d, d ';h 2 )!(d, d ';h s ) = K loc ( d! d ' /h s )K st (D µ (d, d ';h s!1 ) / C n ) D µ (d, d ';h s!1 ) = "( ˆµ(d;h s!1 ), ˆµ(d ';h s!1 )) Stopping Rule ˆµ(d;h S )
37 Simulation
38 Simulation
39 Simulation
40 Scalar-on-Imaging Regression
41 Scalar-on-Image Models Reading materials: 1. Miranda, M. F., Zhu, H.T. and Ibrahim, J. G. (2014). Bayesian partial supervised tensor decomposition with applications in Neuorimaging data analysis. 2. Yang, H., Zhu, H. T., and Ibrahim, J. G. (2014). Multiscale projection model in RKHS. 3. Zhu, H. T. and Shen, D. (2014). Multiscale Weighted PCA for Imaging Prediction. 4. Guo, R.X., Ahye M., and Zhu, H. (2014). Spatially weighted PCA for imaging classification. JCGS. In revision. 5. Zhou, H., Li, L., and Zhu, H. (2013). Tensor regression with applications in Neuorimaging data analysis. JASA. In press. 6. Cuingnet, R., Glaunes, J. A., Chupin, M., Benali, H., Colliot, O., and ADNI. (2012). Spatial and anatomical regularization of SVM: a general framework for neuroimaging data. IEEE PAMI. In press. 7. Cai, T. & Yuan, M. (2012). Minimax and adaptive prediction for functional linear regression. JASA. 107, Fan, J. and Lv, J. (2010). A selective overview of variable selection in high-dimensional feature space. Statistica Sinica, 20, Li, B., Kim, M. K., and Altman, N. (2010). On dimension folding of matrix or array valued statistical objects. Ann. Stat., 38, Reiss, P. T., and Ogden, R. T. (2007). Functional principal component regression and functional partial least squares. Journal of the American Statistical Association 102, Bair, E., Hastie, T., Paul, D. and Tibshirani, R. (2006). Prediction by supervised principal components. JASA. 101, Ramsay, J.O. and Silverman, B.W. (2005). Functional Data Analysis, 2nd Edition. Springer, New York. 13.James, G. (2002). Generalized linear models with functional predictor variables. JRSSB. 64, Tibshirani, Robert (1996). Regression shrinkage and selection via the lasso. JRSSB. 58,
42 HRM versus FRM Data {(y i, X i ) :i =1,!, n} X i = {X i (d) : d! D} y i =< X i,! > +" i Strategy 1: Discrete Approach (High-dimension Regression Model (HRM)) Strategy 2: Functional Regression Model (FRM) y i =! 0 +!!(d) X i (d)m(d)+" i D
43 High-dimension Regression Model Approach 1: Regularization Methods Key Conditions: Sparsity of S Restricted null-space property for design matrix X
44 High-dimension Regression Model Tensor Structure: Ultra-high dimensionality (256^3) Spatial structure Zhou, Li, and Zhu (2013) Li, Zhou, and Li (2013) CP decomposition Tucker decomposition!
45 Scalar-on-Image Models Simulations Key Conditions: Tensor Approximation B Restricted space property for X and B
46 Scalar-on-Image Models ADHD 200
47 Scalar-on-Image Models Strategy 2: Functional Approach y i =! 0 +!!!(d) X i (d)m(d)+" i D y i =! 0 + "! # k " k (d) X i (d)m(d)+# i k=1 D!!(d) = "! k " k (d) Basis Methods: fixed and data-driven basis functions! K! = {!(d) = "! k " k (d) : (! 1,!) # " 2! } C(d, d ') = Cov(X(d), X(d ')) =! k " k (d)" k (d ') k=1 k=1 " k=1
48 Key Conditions Key Conditions: an excellent set of basis functions Sparsity of {! k : k =1,!} Decay rate of spectral of C(d, d ') = Cov(X(d), X(d '))!(d)! K "! k " k (d) k=1 K << n
49 The ADNI data Mild Cognitive Impairment subjects Interested in predicting the timing of an MCI patient that converts to AD by integrating the imaging data, the clinical variables, and genetic covariates. Full Model: AUC=0.96 Partial Model: AUC=0.82
50 Limitations Is this the right space for statistical inference? {X i (d) : d! D} {!! k (d) X i (d)m(d) : d " D} D
51 Rabbit and Wolf Story {y i :i =1,!, n} y i = f 0 (X i )+! i X i = {X i (d) : d! D} y i = f (G i )+! i G i = G{!X i (d) : d! D 0 }!X i = {! X i (d) : d! D 0 }
52 Feature Space Determination X i = {X i (d) : d! D}!X i = {!X i (d) : d! D 0 } G i = G{! X i (d) : d! D 0 } Splitting/Weighting methods Level sets Factor Models y i =! 0 + "!!(d) X i (d)m(d) +" i y i =! 0 +!!(d) w(d)x i (d)m(d)+" i k Splitting D k D Weighting
53 Spatially Weighted PCA
54 Spatially Weighted PCA
55 Multiscale Factor Prediction Model
56 Multiscale Factor Prediction Model
57 Multiscale Factor Prediction Model
58 ASA: Statistics in Imaging Section SAMSI 2013 Neuroimaging Data Analysis Challenges in Computational Neuroscience
59 Acknowledgement
Big Data Integration in Biomedical Studies
Big Data Integration in Biomedical Studies Hongtu Zhu, Ph.D Department of Biostatistics and Biomedical Research Imaging Center The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Bayesian Penalized Methods for High Dimensional Data
Bayesian Penalized Methods for High Dimensional Data Joseph G. Ibrahim Joint with Hongtu Zhu and Zakaria Khondker What is Covered? Motivation GLRR: Bayesian Generalized Low Rank Regression L2R2: Bayesian
Publication List. Chen Zehua Department of Statistics & Applied Probability National University of Singapore
Publication List Chen Zehua Department of Statistics & Applied Probability National University of Singapore Publications Journal Papers 1. Y. He and Z. Chen (2014). A sequential procedure for feature selection
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Lasso on Categorical Data
Lasso on Categorical Data Yunjin Choi, Rina Park, Michael Seo December 14, 2012 1 Introduction In social science studies, the variables of interest are often categorical, such as race, gender, and nationality.
Functional Data Analysis of MALDI TOF Protein Spectra
Functional Data Analysis of MALDI TOF Protein Spectra Dean Billheimer [email protected]. Department of Biostatistics Vanderbilt University Vanderbilt Ingram Cancer Center FDA for MALDI TOF
Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease
1 Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease The following contains a summary of the content of the neuroimaging module I on the postgraduate
The Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
Nonparametric Time Series Analysis: A review of Peter Lewis contributions to the field
Nonparametric Time Series Analysis: A review of Peter Lewis contributions to the field Bonnie Ray IBM T. J. Watson Research Center Joint Statistical Meetings 2012 Outline Background My connection to Peter
DR. YUEXIAO DONG. Department of Statistics Fox School of Business Temple University 1810 N. 13th Street Philadelphia, PA 19122
Department of Statistics Fox School of Business Temple University 1810 N. 13th Street Philadelphia, PA 19122 Phone: (1-)215-204-0670 Email: [email protected] http://astro.temple.edu/ ydong/index.html Employment
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate
BayesX - Software for Bayesian Inference in Structured Additive Regression
BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich
Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University [email protected]
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University [email protected] 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
BIG DATA AND HIGH DIMENSIONAL DATA ANALYSIS
BIG DATA AND HIGH DIMENSIONAL DATA ANALYSIS B.L.S. Prakasa Rao CR RAO ADVANCED INSTITUTE OF MATHEMATICS, STATISTICS AND COMPUTER SCIENCE (AIMSCS) University of Hyderabad Campus GACHIBOWLI, HYDERABAD 500046
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
Applications to Data Smoothing and Image Processing I
Applications to Data Smoothing and Image Processing I MA 348 Kurt Bryan Signals and Images Let t denote time and consider a signal a(t) on some time interval, say t. We ll assume that the signal a(t) is
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
Unsupervised and supervised dimension reduction: Algorithms and connections
Unsupervised and supervised dimension reduction: Algorithms and connections Jieping Ye Department of Computer Science and Engineering Evolutionary Functional Genomics Center The Biodesign Institute Arizona
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
The Wondrous World of fmri statistics
Outline The Wondrous World of fmri statistics FMRI data and Statistics course, Leiden, 11-3-2008 The General Linear Model Overview of fmri data analysis steps fmri timeseries Modeling effects of interest
Exact Inference for Gaussian Process Regression in case of Big Data with the Cartesian Product Structure
Exact Inference for Gaussian Process Regression in case of Big Data with the Cartesian Product Structure Belyaev Mikhail 1,2,3, Burnaev Evgeny 1,2,3, Kapushev Yermek 1,2 1 Institute for Information Transmission
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
Lasso-based Spam Filtering with Chinese Emails
Journal of Computational Information Systems 8: 8 (2012) 3315 3322 Available at http://www.jofcis.com Lasso-based Spam Filtering with Chinese Emails Zunxiong LIU 1, Xianlong ZHANG 1,, Shujuan ZHENG 2 1
Yerevan State University, Yerevan, Armenia B.S., Applied Mathematics and Computer Science, 2004
Ani Eloyan Contact Information Department of Biostatistics Bloomberg School of Public Health Johns Hopkins University 615 North Wolfe Street, E3640 Baltimore, MD, 21205 Email: [email protected] Phone: 410-502-2988
U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540
U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540 ENTERPRISE FINANCIAL DISTRESS PREDICTION BASED ON BACKWARD PROPAGATION NEURAL NETWORK: AN EMPIRICAL STUDY ON THE CHINESE LISTED EQUIPMENT
Big Data, Statistics, and the Internet
Big Data, Statistics, and the Internet Steven L. Scott April, 4 Steve Scott (Google) Big Data, Statistics, and the Internet April, 4 / 39 Summary Big data live on more than one machine. Computing takes
Overview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models
Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear
203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
HT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
Cancer Biostatistics Workshop Science of Doing Science - Biostatistics
Cancer Biostatistics Workshop Science of Doing Science - Biostatistics Yu Shyr, PhD Jan. 18, 2008 Cancer Biostatistics Center Vanderbilt-Ingram Cancer Center [email protected] Aims Cancer Biostatistics
Network Analysis I & II
Network Analysis I & II Dani S. Bassett Department of Physics University of California Santa Barbara Outline Lecture One: 1. Complexity in the Human Brain from processes to patterns 2. Graph Theory and
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning
Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning SAMSI 10 May 2013 Outline Introduction to NMF Applications Motivations NMF as a middle step
Efficient variable selection in support vector machines via the alternating direction method of multipliers
Efficient variable selection in support vector machines via the alternating direction method of multipliers Gui-Bo Ye Yifei Chen Xiaohui Xie University of California, Irvine University of California, Irvine
A Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under Two-Level Models
A Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under Two-Level Models Grace Y. Yi 13, JNK Rao 2 and Haocheng Li 1 1. University of Waterloo, Waterloo, Canada
Statistical Learning THE TENTH EUGENE LUKACS SYMPOSIUM DEPARTMENT OF MATHEMATICS AND STATISTICS BOWLING GREEN STATE UNIVERSITY
Statistical Learning THE TENTH EUGENE LUKACS SYMPOSIUM DEPARTMENT OF MATHEMATICS AND STATISTICS BOWLING GREEN STATE UNIVERSITY MAY 14, 2016 The 10th Eugene Lukacs Symposium Program Schedule (All symposium
DATA SCIENCE Workshop November 12-13, 2015
DATA SCIENCE Workshop November 12-13, 2015 Paris-Dauphine University Place du Maréchal de Lattre de Tassigny, 75016 Paris DATA SCIENCE Workshop will be held at Paris-Dauphine university, on November 12
Iterative Solvers for Linear Systems
9th SimLab Course on Parallel Numerical Simulation, 4.10 8.10.2010 Iterative Solvers for Linear Systems Bernhard Gatzhammer Chair of Scientific Computing in Computer Science Technische Universität München
Statistical issues in the analysis of microarray data
Statistical issues in the analysis of microarray data Daniel Gerhard Institute of Biostatistics Leibniz University of Hannover ESNATS Summerschool, Zermatt D. Gerhard (LUH) Analysis of microarray data
Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015
Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Lecture: MWF: 1:00-1:50pm, GEOLOGY 4645 Instructor: Mihai
Detection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
List of Ph.D. Courses
Research Methods Courses (5 courses/15 hours) List of Ph.D. Courses The research methods set consists of five courses (15 hours) that discuss the process of research and key methodological issues encountered
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues
Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the
How To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected].
Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected] Outline Motivation why do we need GPUs? Past - how was GPU programming
Effective Linear Discriminant Analysis for High Dimensional, Low Sample Size Data
Effective Linear Discriant Analysis for High Dimensional, Low Sample Size Data Zhihua Qiao, Lan Zhou and Jianhua Z. Huang Abstract In the so-called high dimensional, low sample size (HDLSS) settings, LDA
Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
Machine Learning in FX Carry Basket Prediction
Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
Examples. David Ruppert. April 25, 2009. Cornell University. Statistics for Financial Engineering: Some R. Examples. David Ruppert.
Cornell University April 25, 2009 Outline 1 2 3 4 A little about myself BA and MA in mathematics PhD in statistics in 1977 taught in the statistics department at North Carolina for 10 years have been in
Treelets A Tool for Dimensionality Reduction and Multi-Scale Analysis of Unstructured Data
Treelets A Tool for Dimensionality Reduction and Multi-Scale Analysis of Unstructured Data Ann B Lee Department of Statistics Carnegie Mellon University Pittsburgh, PA 56 annlee@statcmuedu Abstract In
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
University of California San Francisco, CA, USA 4 Department of Veterans Affairs Medical Center, San Francisco, CA, USA
Disrupted Brain Connectivity in Alzheimer s Disease: Effects of Network Thresholding Madelaine Daianu 1, Emily L. Dennis 1, Neda Jahanshad 1, Talia M. Nir 1, Arthur W. Toga 1, Clifford R. Jack, Jr. 2,
Identifying Group-wise Consistent White Matter Landmarks via Novel Fiber Shape Descriptor
Identifying Group-wise Consistent White Matter Landmarks via Novel Fiber Shape Descriptor Hanbo Chen, Tuo Zhang, Tianming Liu the University of Georgia, US. Northwestern Polytechnical University, China.
Statistical Considerations in Magnetic Resonance Imaging of Brain Function
Statistical Considerations in Magnetic Resonance Imaging of Brain Function Brian D. Ripley Professor of Applied Statistics University of Oxford [email protected] http://www.stats.ox.ac.uk/ ripley Acknowledgements
runl I IUI%I/\L Magnetic Resonance Imaging
runl I IUI%I/\L Magnetic Resonance Imaging SECOND EDITION Scott A. HuetteS Brain Imaging and Analysis Center, Duke University Allen W. Song Brain Imaging and Analysis Center, Duke University Gregory McCarthy
the points are called control points approximating curve
Chapter 4 Spline Curves A spline curve is a mathematical representation for which it is easy to build an interface that will allow a user to design and control the shape of complex curves and surfaces.
http://www.springer.com/978-0-387-71392-2
Preface This book, like many other books, was delivered under tremendous inspiration and encouragement from my teachers, research collaborators, and students. My interest in longitudinal data analysis
Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall
Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
SVM-Based Approaches for Predictive Modeling of Survival Data
SVM-Based Approaches for Predictive Modeling of Survival Data Han-Tai Shiao and Vladimir Cherkassky Department of Electrical and Computer Engineering University of Minnesota, Twin Cities Minneapolis, Minnesota
Gaussian Processes in Machine Learning
Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany [email protected] WWW home page: http://www.tuebingen.mpg.de/ carl
Marketing Mix Modelling and Big Data P. M Cain
1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored
Accurate and robust image superresolution by neural processing of local image representations
Accurate and robust image superresolution by neural processing of local image representations Carlos Miravet 1,2 and Francisco B. Rodríguez 1 1 Grupo de Neurocomputación Biológica (GNB), Escuela Politécnica
Introduction to Machine Learning Using Python. Vikram Kamath
Introduction to Machine Learning Using Python Vikram Kamath Contents: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Introduction/Definition Where and Why ML is used Types of Learning Supervised Learning Linear Regression
Network Intrusion Detection using Semi Supervised Support Vector Machine
Network Intrusion Detection using Semi Supervised Support Vector Machine Jyoti Haweliya Department of Computer Engineering Institute of Engineering & Technology, Devi Ahilya University Indore, India ABSTRACT
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
BLOCKWISE SPARSE REGRESSION
Statistica Sinica 16(2006), 375-390 BLOCKWISE SPARSE REGRESSION Yuwon Kim, Jinseog Kim and Yongdai Kim Seoul National University Abstract: Yuan an Lin (2004) proposed the grouped LASSO, which achieves
Comparing Functional Data Analysis Approach and Nonparametric Mixed-Effects Modeling Approach for Longitudinal Data Analysis
Comparing Functional Data Analysis Approach and Nonparametric Mixed-Effects Modeling Approach for Longitudinal Data Analysis Hulin Wu, PhD, Professor (with Dr. Shuang Wu) Department of Biostatistics &
Modeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian Regressive
Journal of Information & Computational Science 12:9 (215) 3637 3646 June 1, 215 Available at http://www.joics.com Modeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian
Tensor Factorization for Multi-Relational Learning
Tensor Factorization for Multi-Relational Learning Maximilian Nickel 1 and Volker Tresp 2 1 Ludwig Maximilian University, Oettingenstr. 67, Munich, Germany [email protected] 2 Siemens AG, Corporate
So which is the best?
Manifold Learning Techniques: So which is the best? Todd Wittman Math 8600: Geometric Data Analysis Instructor: Gilad Lerman Spring 2005 Note: This presentation does not contain information on LTSA, which
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
The primary goal of this thesis was to understand how the spatial dependence of
5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
Factor analysis. Angela Montanari
Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number
Variance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers
Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications
