Model-free Functional Data Analysis MELODIC Multivariate Exploratory Linear Optimised Decomposition into Independent Components



MELODIC (Multivariate Exploratory Linear Optimised Decomposition into Independent Components):
- decomposes the data into a set of statistically independent spatial component maps and associated time courses
- can perform multi-subject / multi-session analysis
- is fully automated (including estimation of the number of components)
- performs inference on IC maps using alternative-hypothesis testing

The FMRI inferential path
The path from experiment to interpretation of the final results runs through physiology, analysis and MR physics, and each stage contributes variability in FMRI:
- Experiment: suboptimal event timing, inefficient design, etc.
- Physiology: secondary activation, ill-defined baseline, resting fluctuations, etc.
- Analysis: filtering & sampling artefacts, design misspecification, stats & thresholding issues, etc.
- MR physics: MR noise, field inhomogeneity, MR artefacts, etc.

Model-based (GLM) analysis
- model each measured time-series as a linear combination of signal and noise: data = design matrix × parameter estimates + residual noise
- if the design matrix does not capture every signal, we typically get wrong inferences!

Confirmatory vs. exploratory data analysis
- Confirmatory: "How well does my model fit the data?" The problem determines a model, the model drives the analysis of the data, and the results therefore depend on the model.
- Exploratory: "Is there anything interesting in the data?" The analysis of the data suggests a model and the results, and can give unexpected results.

EDA techniques
- try to explain/represent the data by calculating quantities that summarise it, or by extracting underlying hidden features that are "interesting"
- differ in what is considered interesting: features that are localised in time and/or space (clustering), that explain the observed data variance (PCA, FDA, FA), or that are maximally independent (ICA)
- are typically multivariate and linear

EDA techniques for FMRI
- are mostly multivariate
- often provide a multivariate linear decomposition: the FMRI data (time × space) is represented as a 2D matrix and decomposed into factor matrices (or "modes"): a (time × #maps) matrix of time-courses and a (#maps × space) matrix of spatial maps

What are components?
- components express the observed data as a linear combination of spatio-temporal processes
- techniques differ in the way the data is represented by components

PCA for FMRI
1. assemble all (de-meaned) data as a matrix
2. calculate the data covariance matrix
3. calculate the SVD
4. project the data onto the Eigenvectors

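The four-step PCA recipe above can be sketched in NumPy; this is a minimal illustration with toy data, and all names and sizes are ours, not MELODIC's:

```python
import numpy as np

# data: 2D matrix, rows = time points (scans), columns = voxels
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 500))            # toy "FMRI" data, 100 scans x 500 voxels

demeaned = data - data.mean(axis=0)               # 1. assemble de-meaned data as a matrix
cov = demeaned @ demeaned.T / (len(data) - 1)     # 2. (time x time) data covariance matrix
U, s, _ = np.linalg.svd(cov)                      # 3. SVD of the covariance matrix
n_pcs = 10
pcs = U[:, :n_pcs].T @ demeaned                   # 4. project data onto the first Eigenvectors
print(pcs.shape)                                  # (10, 500): #PCs x space
```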

Correlation vs. independence
- de-correlated signals can still be dependent
- higher-order statistics (beyond mean and variance) can reveal these dependencies (Stone et al. 2002)

PCA vs. ICA
- PCA finds projections of the maximum amount of variance in Gaussian data (uses 2nd-order statistics only)
- Independent Component Analysis (ICA) finds projections of maximal independence in non-Gaussian data (using higher-order statistics)

Spatial ICA for FMRI
- the FMRI data (time × space) is decomposed into a set of time-courses (time × #ICs) and a set of spatial maps (#ICs × space)
- imposing spatial independence, the data is decomposed into a set of spatially independent maps and a set of associated time-courses (McKeown et al. HBM 1998)
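The point that de-correlated signals can still be dependent is easy to show with a toy example of our own (not from the slides): y = x² is fully determined by x, yet their linear correlation is essentially zero, while a higher-order statistic exposes the dependency.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100_000)
y = x ** 2                     # fully determined by x, hence strongly dependent on it

# Linear correlation is ~0 by symmetry (cov(x, x^2) = E[x^3] = 0 for symmetric x)
corr = np.corrcoef(x, y)[0, 1]

# A 4th-order statistic reveals the dependency: E[x^2 y] != E[x^2] E[y]
hos = np.mean(x**2 * y) - np.mean(x**2) * np.mean(y)

print(abs(corr) < 0.05, hos > 0.05)   # True True
```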

PCA vs. ICA?
Simulated data (2 components with slightly different time-courses):
- PCA: time-courses orthogonal; spatial maps and time-courses wrong
- ICA: time-courses non-collinear; spatial maps and time-courses right

The overfitting problem
Standard (unconstrained) ICA fits a noise-free model to noisy observations. Compared with a GLM analysis this means:
- no control over signal vs. noise (non-interpretable results)
- statistical significance testing is not possible

Probabilistic ICA model
A statistical latent-variables model: we observe linear mixtures of hidden sources in the presence of Gaussian noise. Issues:
- model order selection: how many components?
- component estimation: how to find the ICs?
- inference: how to threshold the ICs?
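The latent-variables model can be written out explicitly; this follows the usual probabilistic ICA formulation (cf. Beckmann & Smith, IEEE TMI 2004), with the symbols chosen here for illustration:

```latex
% observations at voxel i: noisy linear mixture of q hidden sources
x_i = A s_i + \mu + \eta_i, \qquad \eta_i \sim \mathcal{N}(0, \sigma^2 \Sigma_i)
```

where x_i is the vector of measurements at voxel i across time, A is the mixing matrix whose columns contain the component time-courses, s_i holds the non-Gaussian spatial source values, and η_i is Gaussian noise. The three issues listed above (model order, estimation, inference) are all questions about this model.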

Model Order Selection
- the sample covariance matrix has a Wishart distribution, and we can calculate the empirical distribution function for the eigenvalues (Everson & Roberts, IEEE Trans. Sig. Proc. 48(7), 2000)
- compare the observed Eigenspectrum of the data covariance matrix with the theoretical Eigenspectrum from Gaussian noise, and use a Laplace approximation of the posterior probability of the model order
- under-fitting (e.g. 15 components): the amount of explained data variance is insufficient to obtain good estimates of the signals
- optimal fit (e.g. 33 components): the amount of explained data variance is sufficient to obtain good estimates of the signals while preventing further splits into spurious components
- over-fitting (e.g. 165 components): the inclusion of too many components leads to fragmentation of signal across multiple component maps, reducing the ability to identify the signals of interest

Variance Normalisation
- we might choose to ignore temporal autocorrelation in the EPI time-series, but we generally need to normalise by the voxel-wise variance
- (figure: estimated voxel-wise std. deviation, log-scale, from FMRI data obtained under a rest condition)
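Voxel-wise variance normalisation can be sketched as follows (an illustrative stand-alone example; the variable names and toy sizes are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
# toy data, 100 scans x 500 voxels, with strongly varying noise level per voxel
data = rng.standard_normal((100, 500)) * rng.uniform(0.5, 5.0, 500)

demeaned = data - data.mean(axis=0)
std = demeaned.std(axis=0, ddof=1)     # voxel-wise temporal standard deviation
normalised = demeaned / std            # each voxel time-series now has unit variance

print(np.allclose(normalised.std(axis=0, ddof=1), 1.0))   # True
```

Without this step, the decomposition is dominated by whichever voxels happen to have the largest raw temporal variance, which is the tissue-type bias discussed next.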

PCA tissue-type bias
- (figure: voxel-wise temporal standard deviations (log) vs. frequency of voxels appearing in the major PCA space)
- without normalisation, PCA is biased towards tissue exhibiting strong temporal variation

ICA estimation
- need to find an unmixing matrix such that the dependency between the estimated sources is minimised
- need (i) a contrast (objective/cost) function that measures statistical independence to drive the unmixing, and (ii) an optimisation technique:
  - kurtosis or cumulants & gradient descent (JADE)
  - maximum entropy & gradient descent (Infomax)
  - neg-entropy & fixed-point iteration (FastICA)

Non-Gaussianity
- (figure: mixing of sources into mixtures)

Non-Gaussianity
- mixing non-Gaussian sources produces more Gaussian-shaped mixtures

ICA estimation
- random mixing results in more Gaussian-shaped PDFs (Central Limit Theorem)
- conversely: if an unmixing matrix produces less Gaussian-shaped PDFs, this is unlikely to be a random result
- we therefore measure non-Gaussianity; neg-entropy can be used as a measure of non-Gaussianity (Hyvärinen & Oja 1997)

Thresholding
- classical null-hypothesis testing is invalid: after estimation, the spatial maps are a projection of the data onto the unmixing matrix
- the data is assumed to be a linear combination of signals and noise, so the distribution of the estimated spatial maps is a mixture distribution!
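The FastICA idea (neg-entropy contrast & fixed-point iteration) can be sketched for a single component; this is a toy two-source illustration under our own assumptions, not the MELODIC implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two non-Gaussian sources, linearly mixed
s = np.vstack([np.sign(rng.standard_normal(20_000)),   # binary (sub-Gaussian) source
               rng.laplace(size=20_000)])              # heavy-tailed (super-Gaussian) source
x = np.array([[1.0, 0.6], [0.4, 1.0]]) @ s

# Whiten: zero mean, identity covariance (PCA pre-step)
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
z = (E / np.sqrt(d)) @ E.T @ x

# One-unit fixed-point iteration with the tanh contrast:
#   w <- E[z g(w'z)] - E[g'(w'z)] w, then normalise
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(200):
    wz = w @ z
    g, g_prime = np.tanh(wz), 1.0 - np.tanh(wz) ** 2
    w_new = (z * g).mean(axis=1) - g_prime.mean() * w
    w_new /= np.linalg.norm(w_new)
    converged = abs(abs(w_new @ w) - 1.0) < 1e-10
    w = w_new
    if converged:
        break

# The estimate should align with one of the true sources (up to sign/scale)
est = w @ z
corrs = [abs(np.corrcoef(est, src)[0, 1]) for src in s]
print(round(max(corrs), 2))
```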

Alternative Hypothesis Test
- use a Gaussian/Gamma mixture model fitted to the histogram of intensity values (using EM)

Full PICA model (Beckmann and Smith, IEEE TMI 2004)

Probabilistic ICA
- unlike GLM analysis and standard (unconstrained) ICA, probabilistic ICA is designed to address the overfitting problem: it tries to avoid the generation of spurious results
- high spatial sensitivity and specificity
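The EM fit behind the mixture-model thresholding can be sketched in a simplified form. The slides use a Gaussian/Gamma mixture; for brevity this toy example fits a two-Gaussian mixture (Gaussian "noise" + Gaussian "signal") with hand-rolled EM, with all names and parameters chosen by us:

```python
import numpy as np

rng = np.random.default_rng(4)
# toy intensity histogram: mostly background, some "activated" voxels
z = np.concatenate([rng.normal(0.0, 1.0, 9000),    # noise component
                    rng.normal(4.0, 1.0, 1000)])   # signal component

pi, mu, sd = np.array([0.5, 0.5]), np.array([0.0, 2.0]), np.array([1.0, 1.0])
for _ in range(100):
    # E-step: responsibility of each component for each intensity value
    pdf = np.exp(-0.5 * ((z[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    r = pi * pdf
    r /= r.sum(axis=1, keepdims=True)
    # M-step: update mixing proportions, means and standard deviations
    n = r.sum(axis=0)
    pi, mu = n / len(z), (r * z[:, None]).sum(axis=0) / n
    sd = np.sqrt((r * (z[:, None] - mu) ** 2).sum(axis=0) / n)

# Threshold: call a voxel "active" when the signal component's posterior exceeds 0.5
active = r[:, 1] > 0.5
print(mu.round(1), pi.round(2))
```

The same machinery applies with a Gamma density for the signal class; the posterior-probability threshold is what replaces the invalid null-hypothesis test.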

Applications
EDA techniques can be useful to:
- investigate the BOLD response
- estimate artefacts in the data
- find areas of activation which respond in a non-standard way
- analyse data for which no model of the BOLD response is available

Investigating the BOLD response
- (figure: estimated signal time course vs. standard HRF model)

Artefact detection
- FMRI data contain a variety of source processes
- artefactual sources typically have unknown spatial and temporal extent and cannot easily be modelled accurately
- exploratory techniques do not require a priori knowledge of time-courses or spatial maps
- example: slice drop-outs

Further artefact examples: gradient instability, EPI ghosting, high-frequency noise.

Further artefact examples: head motion, field inhomogeneity, eye-related artefacts, wrap-around.

Structured Noise and the GLM
Structured noise appears:
- in the GLM residuals, where it inflates the variance estimates (more false negatives)
- in the parameter estimates (more false positives and/or false negatives)
In either case it leads to suboptimal estimates and wrong inference!

Structured noise and GLM Z-stats bias
- correlations of the noise time courses with typical FMRI regressors can cause a shift in the histogram of the Z-statistics
- thresholded maps will then have the wrong false-positive rate

Denoising FMRI
- example: left vs. right hand finger tapping (LEFT − RIGHT and LEFT contrasts, before and after denoising)

Apparent variability
- McGonigle et al.: 33 sessions under a motor paradigm
- de-noising the data by regressing out the noise reduces the apparent session variability
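"Regressing out" noise amounts to an ordinary least-squares fit of the noise time-courses followed by subtraction of the fitted part. A minimal sketch with synthetic data (in practice the noise time-courses would come from ICA components; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(200)
noise_tc = np.sin(2 * np.pi * t / 7.0)[:, None]            # one structured-noise time-course
signal = rng.standard_normal((200, 300))                   # "clean" data, time x voxels
data = signal + noise_tc @ rng.standard_normal((1, 300))   # noise added with voxel-wise weights

# OLS fit of the noise regressor at every voxel, then subtract the fitted part
beta, *_ = np.linalg.lstsq(noise_tc, data, rcond=None)
denoised = data - noise_tc @ beta

# The de-noised data no longer projects onto the noise time-course
residual_proj = np.abs(noise_tc.T @ denoised).max()
print(residual_proj < 1e-8)   # True (the projection removes the regressor exactly)
```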

Example of a non-standard temporal response
- use the EPI readout gradient noise to evoke auditory responses in patients before cochlear implant surgery (Bartsch et al. HBM 2004)

PICA on resting data
- perform ICA on null data and compare the spatial maps between subjects/scans
- ICA maps depict spatially localised and temporally coherent signal changes
- example: ICA maps for 1 subject at 3 different sessions

Multi-Subject ICA
- analogous to multi-level GLMs: first level within session, higher level between subjects (group ← subject 1, subject 2, ..., subject N)

Different ICA models
- Single-session ICA: each ICA component comprises a spatial map & a time-course
- Multi-session or multi-subject ICA, concatenation approach: each ICA component comprises a spatial map & a time-course (which can be split up into subject-specific chunks)
- Multi-session or multi-subject ICA, Tensor-ICA approach: each ICA component comprises a spatial map, a session-long time-course & a subject-strength plot

Different ICA models
- the concatenation approach is good when each subject has a DIFFERENT time-series, e.g. resting-state FMRI
- the Tensor-ICA approach is good when each subject has the SAME time-series, e.g. activation FMRI

ICA Group analysis
- extend single-session ICA to higher dimensions: the FMRI data (#subjects × time × space) is decomposed into time-courses (time × #ICs), subject modes (#subjects × #ICs) and spatial maps (#ICs × space); a tri-linear rather than bi-linear model

Tensor-ICA multi-session example
- 10 sessions under a motor paradigm (right index finger tapping), compared with group-level GLM results (mixed effects) (McGonigle et al. NI 2000)
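The concatenation layout is easiest to see at the level of array shapes; a toy sketch (sizes and names are illustrative, and the "mixing matrix" here is a random stand-in for real ICA output):

```python
import numpy as np

n_subjects, n_time, n_voxels, n_ics = 4, 50, 300, 10
rng = np.random.default_rng(6)

# Stack every subject's (time x voxels) data along the time axis
subject_data = [rng.standard_normal((n_time, n_voxels)) for _ in range(n_subjects)]
concat = np.concatenate(subject_data, axis=0)
print(concat.shape)                      # (200, 300): (#subjects*time) x voxels

# Group ICA factors this into a group mixing matrix and group spatial maps:
#   concat ~ mixing ((#subjects*time) x #ICs) @ maps (#ICs x voxels)
# and the mixing matrix splits back into subject-specific time-course chunks:
mixing = rng.standard_normal((n_subjects * n_time, n_ics))   # stand-in for ICA output
chunks = np.split(mixing, n_subjects, axis=0)
print(len(chunks), chunks[0].shape)      # 4 (50, 10)
```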

Tensor-ICA multi-session example
- tensor-ICA map of the activation; a component capturing the estimated de-activation; a component capturing residual stimulus-correlated motion
- the tensor-PICA map and the per-session Z-stats map (session 9) are shown alongside the group-GLM (mixed-effects) Z-stats map

Tensor-ICA multi-subject example
- reaction-time task (index vs. sequential vs. random finger movement); r > 0.84 for the regression fit of the GLM model against the estimated time course
- a GLM on the associated time-course shows significant (p < 0.01) differences: I < S < R
- Tensor-ICA simultaneously estimates processes in the temporal, spatial & session/subject domains
- full mixed-effects model: can use parametric stats on the estimated time courses and session/subject modes
- robust statistics: can deal with outlier processes

ICA-based methodology for multi-subject RSN analysis
- why not just run ICA on each subject separately?
  - correspondence problem (of RSNs across subjects)
  - different splittings, sometimes caused by small changes in the data (naughty ICA!)
- instead: start with a group-average ICA, but then we need to relate the group maps back to the individual subjects

Methodology for multi-subject RSN analysis, 1: Concat-ICA
- concatenate all subjects' data temporally (actually: group-based PCA reduction on each subject, then concatenation, then PCA reduction), then run ICA
- more appropriate than Tensor-ICA (for RSNs)
- layout: the concatenated data ((#subjects × time) × voxels) factors into a group mixing matrix ((#subjects × time) × #components) and group ICA maps (#components × voxels)

Methodology for multi-subject RSN analysis, 2: Dual Regression
For each subject separately:
- take all Concat-ICA spatial maps and use them with the subject's data in a GLM; this gives subject-specific time-courses
- take these time-courses and use them with the subject's data in a GLM; this gives subject-specific spatial maps (which look similar to the group maps but are subject-specific)
- can now do voxel-wise testing across subjects, separately for each original group-ICA map
- can choose to look at strength-and-shape differences
In matrix terms: regress the spatial ICs into each subject's 4D data to find subject-specific time-courses, then regress these back into the 4D data to find the subject-specific spatial ICs associated with the group ICs.
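The two regression stages can be sketched for one subject with least squares (an illustrative toy, with random data standing in for the group maps and subject 4D data):

```python
import numpy as np

rng = np.random.default_rng(7)
n_time, n_voxels, n_ics = 80, 400, 5

group_maps = rng.standard_normal((n_ics, n_voxels))   # stand-in for group ICA maps
data = rng.standard_normal((n_time, n_voxels))        # one subject's (time x voxels) data

# Stage 1: spatial regression -- group maps as spatial regressors,
# fitted to each time point; gives subject-specific time-courses
tcs, *_ = np.linalg.lstsq(group_maps.T, data.T, rcond=None)
tcs = tcs.T                                           # (time x #components)

# Stage 2: temporal regression -- those time-courses as temporal regressors,
# fitted to each voxel; gives subject-specific spatial maps
subj_maps, *_ = np.linalg.lstsq(tcs, data, rcond=None)

print(tcs.shape, subj_maps.shape)                     # (80, 5) (5, 400)
```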

Methodology for multi-subject RSN analysis, 2: Dual Regression (continued)
- collect the subject-specific maps and perform a voxel-wise test (e.g. a randomisation test on a GLM): the collected components (#subjects × voxels) are modelled with a group design matrix (#subjects × #EVs) to give, for example, a group-difference map for each original group-ICA map
- can choose to look at strength-and-shape differences

1. Separate-subject ICAs, e.g. SOG-ICA in BrainVoyager (Esposito, NeuroImage, 2005)
- separate ICA for each subject; robustness can be added via multiple runs ("ICASSO")
- cluster components across subjects to achieve robust matching of any given RSN across subjects
- Good: keeps the benefits of single-subject ICA, i.e. better modelling/ignoring of structured noise in the data
- Bad: correspondence problem, in particular different splittings in different subjects caused by even small changes in the data; PCA bias!

2. Group-ICA and Back-Projection, e.g. GICA in GIFT (Calhoun, HBM, 2001)
- separate PCA (dimensionality reduction) for each subject
- concatenate the PCA output across subjects and run a group ICA
- back-project (invert) the ICA results to get individual subject maps
- Good: no correspondence problem
- Bad: loses the benefits of single-subject ICA; PCA bias

3. Group-ICA and Dual-Regression, e.g. MELODIC + dual regression in FSL (Beckmann, OHBM, 2009)
- group-average PCA (dimensionality reduction)
- project each subject onto the reduced group-average PCA space
- concatenate the PCA output across subjects and run a group ICA
- regress the group-ICA maps onto the individual subject datasets to get individual subject maps
- Good: no correspondence problem, no group/PCA bias
- Bad: loses the benefits of single-subject ICA

That's all, folks!