Model-free Functional Data Analysis MELODIC Multivariate Exploratory Linear Optimised Decomposition into Independent Components
MELODIC:
- decomposes the data into a set of statistically independent spatial component maps and associated time courses
- can perform multi-subject / multi-session analysis
- is fully automated (incl. estimation of the number of components)
- performs inference on the IC maps using alternative-hypothesis testing

The FMRI inferential path
Experiment -> Physiology -> MR Physics -> Analysis -> Interpretation of final results

Variability in FMRI
- Experiment: suboptimal event timing, inefficient design, etc.
- Physiology: secondary activation, ill-defined baseline, resting fluctuations, etc.
- Analysis: filtering & sampling artefacts, design misspecification, stats & thresholding issues, etc.
- MR Physics: MR noise, field inhomogeneity, MR artefacts, etc.
Model-based (GLM) analysis
- model each measured time-series as a linear combination of signal and noise
- if the design matrix does not capture every signal, we typically get wrong inferences!

Confirmatory vs. exploratory data analysis
- Confirmatory: "How well does my model fit the data?" (Problem -> Model -> Analysis -> Results); the results depend on the model
- Exploratory: "Is there anything interesting in the data?" (Problem -> Data -> Analysis -> Results, with the model emerging from the data); can give unexpected results

EDA techniques
- try to explain/represent the data by calculating quantities that summarise it, i.e. by extracting underlying hidden features that are "interesting"
- differ in what is considered interesting: features localised in time and/or space (clustering), features that explain observed data variance (PCA, FDA, FA), or features that are maximally independent (ICA)
- are typically multivariate and linear
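The GLM idea above can be sketched numerically. This is a generic least-squares illustration on synthetic data; the regressor shape, amplitudes and sizes are made up for illustration, not FSL's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 120

# design matrix: one task regressor plus an intercept column
task = np.sin(np.linspace(0, 6 * np.pi, n))
X = np.column_stack([task, np.ones(n)])

# measured time-series = linear combination of the regressors + noise
y = 2.0 * task + 5.0 + 0.5 * rng.standard_normal(n)

# least-squares estimate of the regression weights
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta   # everything the design matrix did not capture
```

If the design matrix omits a real signal, that signal ends up in `residuals` instead of the parameter estimates, which is exactly the failure mode warned about above.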
EDA techniques for FMRI
- are mostly multivariate
- often provide a multivariate linear decomposition: the data are represented as a 2D (time x space) matrix and decomposed into factor matrices (or modes), i.e. a set of time courses and associated spatial maps

What are components?
- components express the observed data as a linear combination of spatio-temporal processes
- techniques differ in the way the data are represented by components

PCA for FMRI
1. assemble all (de-meaned) data as a matrix
2. calculate the data covariance matrix
3. calculate the SVD
4. project the data onto the Eigenvectors
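The four PCA steps can be sketched directly on synthetic data (illustrative sizes; the SVD of the de-meaned data matrix is used in place of an explicit covariance eigendecomposition, which yields the same Eigenvectors):

```python
import numpy as np

rng = np.random.default_rng(0)
# toy "FMRI" data: 100 time points x 500 voxels (sizes are arbitrary)
X = rng.standard_normal((100, 500))

# 1. assemble the de-meaned data as a matrix
Xc = X - X.mean(axis=0)

# 2./3. the SVD of the de-meaned data yields the PCA decomposition:
#       the Eigenvectors of the covariance matrix are the right singular vectors
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# 4. project the data onto the first k Eigenvectors
k = 10
pc_timecourses = Xc @ Vt[:k].T        # time x #PCs

# fraction of data variance explained by each component
explained = s**2 / np.sum(s**2)
```

The projected timecourses are mutually orthogonal, which is exactly the constraint that the PCA-vs-ICA comparison below shows to be too strong for FMRI.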
Correlation vs. independence
- de-correlated signals can still be dependent
- higher-order statistics (beyond mean and variance) can reveal these dependencies (Stone et al.)

PCA vs. ICA
- PCA finds the projections of maximum variance in Gaussian data (uses 2nd-order statistics only)
- Independent Component Analysis (ICA) finds the projections of maximal independence in non-Gaussian data (using higher-order statistics)

Spatial ICA for FMRI
- imposes spatial independence: the data are decomposed into a set of spatially independent maps and a set of associated time-courses (McKeown et al., HBM)
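The correlation-vs-independence distinction is easy to demonstrate numerically. Below, `y` is completely determined by `x`, yet their correlation is near zero; only a higher-order statistic exposes the dependence (a standard textbook construction, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100_000)
y = x**2                      # fully determined by x, hence strongly dependent

# second-order statistics see (almost) nothing
corr = np.corrcoef(x, y)[0, 1]          # ~0, because E[x^3] = 0 by symmetry

# a fourth-order cross-statistic reveals the dependence
hoc = np.mean(x**2 * y) - np.mean(x**2) * np.mean(y)   # E[x^4] - E[x^2]^2 > 0
```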
PCA vs. ICA?
Simulated data (2 components with slightly different timecourses):
- PCA: timecourses forced to be orthogonal; spatial maps and timecourses wrong
- ICA: timecourses merely non-co-linear; spatial maps and timecourses right

The overfitting problem
- fitting a noise-free model (standard, unconstrained ICA) to noisy observations gives no control over signal vs. noise (non-interpretable results), and statistical significance testing is not possible

Probabilistic ICA model
- a statistical latent-variables model: we observe linear mixtures of hidden sources in the presence of Gaussian noise
- issues: model order selection (how many components?), component estimation (how to find the ICs?), inference (how to threshold the ICs?)
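A crude sketch of the model-order question: compare the observed Eigenspectrum against a null Eigenspectrum from pure Gaussian noise of the same dimensions. This is a toy simulation only; the approach described in the slides uses the Wishart distribution of the sample covariance and a Laplace approximation to the model-order posterior:

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_v, q = 200, 2000, 5          # time points, voxels, true sources

# data = q strong low-rank signal components + isotropic Gaussian noise
S = rng.standard_normal((n_t, q))
A = rng.standard_normal((q, n_v))
X = 20.0 * (S @ A) / np.sqrt(n_v) + rng.standard_normal((n_t, n_v))

eig = np.linalg.eigvalsh(np.cov(X))[::-1]      # observed Eigenspectrum

# null Eigenspectrum from pure Gaussian noise of the same size
N = rng.standard_normal((n_t, n_v))
null = np.linalg.eigvalsh(np.cov(N))[::-1]

# crude order estimate: eigenvalues clearly above the null spectrum's edge
order = int(np.sum(eig > 1.5 * null[0]))
```

With a strong planted signal this recovers the true number of sources; with weaker signals the under-/over-fitting trade-off described below becomes visible.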
Model order selection
- the sample covariance matrix has a Wishart distribution, so we can calculate the empirical distribution function for the eigenvalues (Everson & Roberts, IEEE Trans. Sig. Proc. 48(7), 2000)
- compare the observed Eigenspectrum of the data covariance matrix against the theoretical Eigenspectrum from Gaussian noise; a Laplace approximation gives the posterior probability of the model order
- under-fitting (e.g. 15 components): the amount of explained data variance is insufficient to obtain good estimates of the signals
- optimal fit (e.g. 33 components): the amount of explained data variance is sufficient to obtain good estimates of the signals while preventing further splits into spurious components
- over-fitting (e.g. 165 components): including too many components fragments the signal across multiple component maps, reducing the ability to identify the signals of interest

Variance normalisation
- we might choose to ignore temporal autocorrelation in the EPI time-series, but we generally need to normalise by the voxel-wise variance
- (figure: estimated voxel-wise std. deviation (log-scale) from FMRI data obtained under a rest condition)
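The need for voxel-wise variance normalisation can be seen in a toy simulation: without it, the leading PC loads almost entirely on the high-variance "tissue" (an illustrative construction with made-up sizes, not real data):

```python
import numpy as np

rng = np.random.default_rng(7)
# two "tissue types": 300 quiet voxels, 100 voxels with 5x the temporal std.
X = np.concatenate([rng.standard_normal((100, 300)),
                    5.0 * rng.standard_normal((100, 100))], axis=1)

Xc = X - X.mean(axis=0)
X_norm = Xc / Xc.std(axis=0)          # voxel-wise variance normalisation

_, _, Vt_raw = np.linalg.svd(Xc, full_matrices=False)
_, _, Vt_norm = np.linalg.svd(X_norm, full_matrices=False)

# fraction of the first PC's (unit-norm) loading on the high-variance voxels
frac_raw = np.sum(Vt_raw[0, 300:]**2)     # close to 1: biased
frac_norm = np.sum(Vt_norm[0, 300:]**2)   # close to 100/400: unbiased
```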
PCA tissue-type bias
- (figure: voxel-wise temporal standard deviations (log) vs. frequency of voxels appearing in the major PCA space)
- without normalisation, PCA is biased towards tissue exhibiting strong temporal variation

ICA estimation
- we need to find an unmixing matrix such that the dependency between the estimated sources is minimised
- we need (i) a contrast (objective/cost) function that measures statistical independence to drive the unmixing, and (ii) an optimisation technique:
  - kurtosis or cumulants & gradient descent (JADE)
  - maximum entropy & gradient descent (Infomax)
  - neg-entropy & fixed-point iteration (FastICA)

Non-Gaussianity
- (figure: non-Gaussian sources become more Gaussian-shaped mixtures under mixing)
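The central-limit intuition behind these contrast functions can be checked with excess kurtosis as a simple non-Gaussianity measure (neg-entropy plays this role in FastICA; the particular sources and mixing weights below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def excess_kurtosis(v):
    """Fourth standardised moment minus 3; zero for a Gaussian."""
    v = (v - v.mean()) / v.std()
    return np.mean(v**4) - 3.0

s1 = rng.laplace(size=100_000)       # super-Gaussian source (kurtosis > 0)
s2 = rng.uniform(-1, 1, 100_000)     # sub-Gaussian source (kurtosis < 0)

mix = 0.6 * s1 + 0.8 * s2            # a fixed mixture of the two sources

# mixing pushes the PDF towards a Gaussian: |excess kurtosis| shrinks
k_source, k_mix = excess_kurtosis(s1), excess_kurtosis(mix)
```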
ICA estimation via non-Gaussianity
- random mixing results in more Gaussian-shaped PDFs (Central Limit Theorem)
- conversely: if an unmixing matrix produces less Gaussian-shaped PDFs, this is unlikely to be a random result
- so we measure non-Gaussianity, e.g. using neg-entropy (Hyvärinen & Oja, 1997)

Thresholding
- classical null-hypothesis testing is invalid: after estimation, the spatial maps are a projection of the data onto the unmixing matrix
- the data are assumed to be a linear combination of signals and noise, so the distribution of the estimated spatial maps is a mixture distribution!
Alternative hypothesis test
- use a Gaussian/Gamma mixture model fitted to the histogram of intensity values (using EM)

Full PICA model (Beckmann & Smith, IEEE TMI 2004)
- probabilistic ICA sits between GLM analysis and standard (unconstrained) ICA and is designed to address the overfitting problem: it tries to avoid the generation of spurious results and achieves high spatial sensitivity and specificity
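A simplified version of the mixture-model idea, using EM for a two-component Gaussian mixture on a synthetic "IC map". The actual PICA model fits a Gaussian null class plus Gamma activation classes, so this is only a structural sketch of the E- and M-steps and the posterior-probability threshold:

```python
import numpy as np

rng = np.random.default_rng(4)
# synthetic "IC map": background noise + a small active population
null_part = rng.normal(0.0, 1.0, 9000)
active = rng.normal(4.0, 1.0, 1000)
z = np.concatenate([null_part, active])

# EM for a 2-component Gaussian mixture (PICA itself uses Gaussian/Gamma)
pi, mu, sd = np.array([0.5, 0.5]), np.array([0.0, 3.0]), np.array([1.0, 1.0])
for _ in range(50):
    # E-step: posterior responsibility of each component per voxel
    like = np.stack([pi[k] / (sd[k] * np.sqrt(2 * np.pi))
                     * np.exp(-0.5 * ((z - mu[k]) / sd[k])**2) for k in range(2)])
    resp = like / like.sum(axis=0)
    # M-step: update the mixture parameters
    pi = resp.mean(axis=1)
    mu = (resp * z).sum(axis=1) / resp.sum(axis=1)
    sd = np.sqrt((resp * (z - mu[:, None])**2).sum(axis=1) / resp.sum(axis=1))

# "activation" = voxels whose posterior for the non-null class exceeds 0.5
active_mask = resp[1] > 0.5
```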
Applications
EDA techniques can be useful to:
- investigate the BOLD response
- estimate artefacts in the data
- find areas of activation which respond in a non-standard way
- analyse data for which no model of the BOLD response is available

Investigating the BOLD response
- (figure: estimated signal time course vs. standard HRF model)
Artefact detection
- FMRI data contain a variety of source processes
- artifactual sources typically have unknown spatial and temporal extent and cannot easily be modelled accurately
- exploratory techniques do not require a priori knowledge of time-courses and spatial maps
- example: slice drop-outs
Further artefact examples: gradient instability, EPI ghosting, high-frequency noise.
Further artefact examples: head motion, field inhomogeneity, eye-related artefacts, wrap-around.
Structured noise and the GLM
- structured noise appears in the GLM residuals, inflating the variance estimates (more false negatives), and in the parameter estimates (more false positives and/or false negatives)
- in either case it leads to suboptimal estimates and wrong inference!
- correlations of the noise time courses with typical FMRI regressors can cause a shift in the histogram of the Z-statistics, so thresholded maps will have the wrong false-positive rate
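The effect of stimulus-correlated structured noise on GLM estimates can be demonstrated in a few lines (synthetic boxcar design; the correlation strength and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
t = np.arange(n)
task = ((t // 20) % 2).astype(float)          # boxcar task regressor

# structured noise partially correlated with the task
# (e.g. stimulus-correlated motion)
noise_struct = 0.8 * task + rng.standard_normal(n)

y_clean = 1.0 * task + rng.standard_normal(n)
y_noisy = y_clean + noise_struct

X = np.column_stack([task, np.ones(n)])
beta_clean = np.linalg.lstsq(X, y_clean, rcond=None)[0]
beta_noisy = np.linalg.lstsq(X, y_noisy, rcond=None)[0]
# the task weight absorbs the correlated noise -> inflated effect size
# (false positives); anticorrelated noise would bias it the other way
```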
Denoising FMRI
- example: left vs. right hand finger tapping (LEFT - RIGHT and LEFT contrasts, before and after denoising)
- apparent variability (McGonigle et al.: 33 sessions under a motor paradigm): de-noising the data by regressing out the noise reduces the apparent session variability
Example of a non-standard temporal response
- use the EPI readout gradient noise to evoke auditory responses in patients before cochlear implant surgery (Bartsch et al., HBM 2004)

PICA on resting data
- perform ICA on null data and compare the spatial maps between subjects/scans
- ICA maps depict spatially localised and temporally coherent signal changes
- example: ICA maps for one subject at three different sessions
Multi-subject ICA
- multi-level GLMs: first level within session (subject 1, subject 2, ..., subject N); higher level between subjects

Different ICA models
- Single-session ICA: each ICA component comprises a spatial map & a timecourse
- Multi-session or multi-subject ICA, concatenation approach: each ICA component comprises a spatial map & a timecourse (that can be split up into subject-specific chunks)
- Multi-session or multi-subject ICA, Tensor-ICA approach: each ICA component comprises a spatial map, a session-long timecourse & a subject-strength plot
When to use which ICA model
- concatenation approach: good when each subject has a DIFFERENT timeseries, e.g. resting-state FMRI
- Tensor-ICA approach: good when each subject has the SAME timeseries, e.g. activation FMRI

ICA group analysis
- extend the single-session (bi-linear) ICA model to higher dimensions: a tri-linear (Tensor-ICA) decomposition of the time x space x subject data into timecourses, spatial maps and subject modes

Tensor-ICA multi-session example
- 10 sessions under a motor paradigm (right index finger tapping), compared with group-level GLM results (mixed effects) (McGonigle et al., NI)
Tensor-ICA multi-session example (continued)
- tensor-PICA map vs. Group-GLM (ME) Z-stats map (session 9)
- estimated de-activation
- residual stimulus-correlated motion
Tensor-ICA multi-subject example
- reaction-time task (index vs. sequential vs. random finger movement); regression fit of the GLM model against the estimated time course: r > 0.84
- a GLM on the associated timecourse shows significant (p < 0.01) differences: I < S < R
- Tensor-ICA simultaneously estimates processes in the temporal, spatial and session/subject domains; it is a full mixed-effects model, so parametric stats can be used on the estimated time courses and session/subject modes, and robust statistics can deal with outlier processes

MELODIC 3: ICA-based methodology for multi-subject RSN analysis
- why not just run ICA on each subject separately? correspondence problem (of RSNs across subjects); different splittings are sometimes caused by small changes in the data (naughty ICA!)
- instead, start with a group-average ICA; but then we need to relate the group maps back to the individual subjects
Methodology for multi-subject RSN analysis, step 1: Concat-ICA
- concatenate all subjects' data temporally (actually: group-based PCA reduction on each subject, then concatenation, then a further PCA reduction), then run ICA
- more appropriate than Tensor-ICA for RSNs
- yields a group mixing matrix and a set of group ICA maps

Methodology for multi-subject RSN analysis, step 2: dual regression
For each subject separately:
1. take all the Concat-ICA spatial maps and use them with the subject's 4D data in a GLM; this gives subject-specific timecourses
2. take these timecourses and use them with the subject's data in a GLM; this gives subject-specific spatial maps (they look similar to the group maps but are subject-specific)
- can now do voxelwise testing across subjects, separately for each original group ICA map
- can choose to look at strength-and-shape differences
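The two regression stages can be sketched with ordinary least squares via pseudoinverses. This uses synthetic group maps and data for one subject; real dual regression additionally handles demeaning, variance normalisation and nuisance regressors:

```python
import numpy as np

rng = np.random.default_rng(6)
n_t, n_v, n_c = 150, 400, 4          # time points, voxels, group components

group_maps = rng.standard_normal((n_c, n_v))     # stand-in for group ICA maps
true_tc = rng.standard_normal((n_t, n_c))        # subject's true timecourses
data = true_tc @ group_maps + 0.1 * rng.standard_normal((n_t, n_v))

# stage 1: spatial regression of the group maps into the subject's 4D data
#          -> subject-specific timecourses
tc = data @ np.linalg.pinv(group_maps)           # time x components

# stage 2: temporal regression of those timecourses back into the data
#          -> subject-specific spatial maps
maps = np.linalg.pinv(tc) @ data                 # components x voxels
```

The recovered `maps` resemble the group maps but reflect this subject's data; stacking them across subjects enables the voxel-wise group tests described next.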
Dual regression: group statistics
- collect the subject-specific maps and perform a voxel-wise test (e.g. a randomisation test on a GLM) for group differences

Comparison of multi-subject approaches
1. Separate-subject ICAs (e.g. SOG-ICA - BrainVoyager; Esposito, NeuroImage, 2005)
- separate ICA for each subject; robustness can be added via multiple runs ("ICASSO")
- components are clustered across subjects, to achieve robust matching of any given RSN across subjects
- good: keeps the benefits of single-subject ICA - better modelling/ignoring of structured noise in the data
- bad: correspondence problem, in particular different splittings in different subjects caused by even small changes in the data; PCA bias!

2. Group-ICA and back-projection (e.g. GICA - GIFT; Calhoun, HBM, 2001)
- separate PCA (dimensionality reduction) for each subject, then concatenate the PCA output across subjects and do a group-ICA
- back-project (invert) the ICA results to get individual subject maps
- good: no correspondence problem
- bad: loses the benefits of single-subject ICA; PCA bias
3. Group-ICA and dual regression (e.g. MELODIC + dual regression - FSL; Beckmann, OHBM, 2009)
- group-average PCA (dimensionality reduction); project each subject onto the reduced group-average PCA space, concatenate the PCA output across subjects and do a group-ICA
- regress the group-ICA maps onto the individual subject datasets to get individual subject maps
- good: no correspondence problem, no group/PCA bias
- bad: loses the benefits of single-subject ICA

That's all folks.
Multivariate Analysis. Overview
Multivariate Analysis Overview Introduction Multivariate thinking Body of thought processes that illuminate the interrelatedness between and within sets of variables. The essence of multivariate thinking
Multivariate Statistical Inference and Applications
Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim
Overview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models
Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear
HT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models
Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis
Computational Optical Imaging - Optique Numerique. -- Deconvolution --
Computational Optical Imaging - Optique Numerique -- Deconvolution -- Winter 2014 Ivo Ihrke Deconvolution Ivo Ihrke Outline Deconvolution Theory example 1D deconvolution Fourier method Algebraic method
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major
An introduction to OBJECTIVE ASSESSMENT OF IMAGE QUALITY. Harrison H. Barrett University of Arizona Tucson, AZ
An introduction to OBJECTIVE ASSESSMENT OF IMAGE QUALITY Harrison H. Barrett University of Arizona Tucson, AZ Outline! Approaches to image quality! Why not fidelity?! Basic premises of the task-based approach!
Why do statisticians "hate" us?
Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data
A Regime-Switching Model for Electricity Spot Prices. Gero Schindlmayr EnBW Trading GmbH [email protected]
A Regime-Switching Model for Electricity Spot Prices Gero Schindlmayr EnBW Trading GmbH [email protected] May 31, 25 A Regime-Switching Model for Electricity Spot Prices Abstract Electricity markets
STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION
STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION Adiel Ben-Shalom, Michael Werman School of Computer Science Hebrew University Jerusalem, Israel. {chopin,werman}@cs.huji.ac.il
Subspace Analysis and Optimization for AAM Based Face Alignment
Subspace Analysis and Optimization for AAM Based Face Alignment Ming Zhao Chun Chen College of Computer Science Zhejiang University Hangzhou, 310027, P.R.China [email protected] Stan Z. Li Microsoft
Comparison of two exploratory data analysis methods for fmri: fuzzy clustering vs. principal component analysis
Magnetic Resonance Imaging 18 (2000) 89 94 Technical Note Comparison of two exploratory data analysis methods for fmri: fuzzy clustering vs. principal component analysis R. Baumgartner, L. Ryner, W. Richter,
Week 1. Exploratory Data Analysis
Week 1 Exploratory Data Analysis Practicalities This course ST903 has students from both the MSc in Financial Mathematics and the MSc in Statistics. Two lectures and one seminar/tutorial per week. Exam
Exploratory data analysis for microarray data
Eploratory data analysis for microarray data Anja von Heydebreck Ma Planck Institute for Molecular Genetics, Dept. Computational Molecular Biology, Berlin, Germany [email protected] Visualization
CS 688 Pattern Recognition Lecture 4. Linear Models for Classification
CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(
Numerical Methods For Image Restoration
Numerical Methods For Image Restoration CIRAM Alessandro Lanza University of Bologna, Italy Faculty of Engineering CIRAM Outline 1. Image Restoration as an inverse problem 2. Image degradation models:
Linear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
MIMO CHANNEL CAPACITY
MIMO CHANNEL CAPACITY Ochi Laboratory Nguyen Dang Khoa (D1) 1 Contents Introduction Review of information theory Fixed MIMO channel Fading MIMO channel Summary and Conclusions 2 1. Introduction The use
Environmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class

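The four PCA-for-FMRI steps listed above (assemble the de-meaned data matrix, compute the covariance, decompose, project) can be sketched in a few lines of NumPy. This is a minimal illustration, not MELODIC itself: the `T`, `V` dimensions and the random matrix standing in for real FMRI data are placeholders, and the T × T form of the covariance is used because the number of time points is typically far smaller than the number of voxels.

```python
import numpy as np

# Hypothetical dimensions: T time points, V voxels.
# A random matrix stands in for a real (motion-corrected) FMRI data set.
rng = np.random.default_rng(0)
T, V = 100, 500
data = rng.standard_normal((T, V))

# 1. Assemble the de-meaned data as a T x V matrix
#    (remove each voxel's temporal mean).
X = data - data.mean(axis=0)

# 2. Calculate the data covariance matrix.
#    The T x T form is cheaper than V x V when T << V.
C = (X @ X.T) / (V - 1)

# 3. Eigendecomposition of the symmetric covariance
#    (equivalent to an SVD of X up to scaling).
eigvals, eigvecs = np.linalg.eigh(C)   # returned in ascending order
order = np.argsort(eigvals)[::-1]      # re-sort by variance, descending
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# 4. Project the data onto the eigenvectors: each row of `maps` is a
#    spatial eigenimage; the columns of `eigvecs` are the associated
#    time courses.
maps = eigvecs.T @ X                   # T x V matrix of spatial maps

# Fraction of variance explained by each component.
explained = eigvals / eigvals.sum()
```

Keeping only the first few rows of `maps` (those with the largest `explained` values) gives the usual low-rank summary of the data; the later ICA stage in MELODIC starts from such a PCA-reduced representation.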