Part II Multivariate data. Jiří Militky Computer assisted statistical modeling in the textile research
1 Part II Multivariate data Jiří Militky Computer assisted statistical modeling in the textile research
2 Experimental data peculiarities Small samples (rule of thumb: 100 points per feature) Non-normal distributions Presence of heterogeneities and outliers Data-oriented model creation Physical sense of data Uncertainty in model selection [Figure: concentration C = f(t) approaching equilibrium C_eq over time]
3 Style of analysis Data exploration Simplification of data structures Interactive model selection Interpretation of results Depth contours: a multidimensional analog of the median
4 Primary data All data are compiled into one matrix called X (n x m): each column of X represents one feature (variable); each row of X represents one object (i.e. an observation at one point in time, one person, one piece etc.). [Figure: DATA MATRIX (n x m) with objects as rows and features as columns]
5 Data transformation I Linear: centering x − µ, scaling x/σ, standardization (x − µ)/σ. Nonlinear: logarithmic transformation 1. Reduces the contribution of extremes. 2. Reduces right skewness of data. 3. Stabilizes variance (removes heteroskedasticity). Rank: values are replaced by their ranks.
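The linear and nonlinear transformations on this slide can be sketched in NumPy; the data and all names below are illustrative, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy right-skewed data, 200 objects x 3 features
X = rng.lognormal(mean=1.0, sigma=0.8, size=(200, 3))

mu = X.mean(axis=0)
sigma = X.std(axis=0)

X_centered = X - mu        # x - mu: shifts the origin only
X_scaled = X / sigma       # x / sigma: removes units
X_std = (X - mu) / sigma   # (x - mu) / sigma: standardization
X_log = np.log(X)          # nonlinear: reduces right skewness

# rank transformation: values replaced by their ranks within each column
X_rank = X.argsort(axis=0).argsort(axis=0) + 1
```

After standardization each column has zero mean and unit standard deviation, which is what makes the subsequent PCA on the correlation matrix equivalent to PCA on the covariance matrix of `X_std`.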
6 Linear transformation I Common PCA is based on column-centered data (covariance matrix). Standardization leads to the correlation matrix R. The differences are caused by different weighting. For centered data the columns of X are "weighted" according to their length (standard deviation in the original data). For standardized data the columns of X are "weighted" to unit length; the weights are then all the same. For features in various units it is suitable to use the correlation matrix.
7 Linear transformation II Centering removes the absolute term and thus reduces the number of variables. The configuration of the data is not changed; only the origin is shifted. Standardization removes dependence on units and removes heteroskedasticity. It has an influence on parameter estimates (weighted least squares). It is inappropriate for cases where some features are at the noise level (it inflates their importance).
8 Linear transformation III Centering. Standardization
9 Outline Basic problems due to multivariate data. Projections of correlated data. Dimension reduction techniques.
10 Dimensionality problem I [Figure: iris data — setosa, versicolor, virginica] A basic characteristic of multivariate data is its dimension (number of elements). High dimensions bring huge problems in statistical analysis. Variable reduction: variables often have variability at the noise level and can therefore be excluded from the data (they bring no information). There are also redundancies due to near linear dependencies between some variables or due to linkages arising from their physical essence. In both cases it is possible to replace the original set by a reduced number of uncorrelated new variables.
11 Dimensionality problem II Multivariate curse: the number of data points necessary for achieving a given precision of multivariate estimates is an exponential function of the number of variables. Empty space phenomenon: multivariate data are concentrated in the peripheral part of the variable space. Distance problem: the distance between objects is often weighted by the strength of the mutual links between variables.
12 Multivariate exploratory analysis I For n objects (points) m variables (features) are defined, expressed on a cardinal scale. The input data matrix X has dimension n x m. It is standard that n is higher than m. The value m defines the problem dimension (number of features).

X = ( x_11 ... x_1m )
    ( ...           )
    ( x_n1 ... x_nm )

Aims: (a) Assess object similarity or clustering tendency, (b) Retrieve outliers, or features of outliers, (c) Determine linear relations between features, (d) Prove assumptions about the data (normality, no correlation, homogeneity).
13 Multivariate exploratory analysis II Graphical displays for 2D or 3D representation of data. Identification of objects or features appearing to be outlying. Indication of structures in data such as heterogeneities, multiple groups, etc.
14 Multivariate exploratory analysis III Most methods for multivariate data exploration can be divided into the following categories: generalized scatter graphs, symbols, projections.
15 Profiles Each object x_i is characterized by m piecewise lines whose size is proportional to the corresponding value of feature x_ij, j = 1, ..., m. On the x-axis is the index of the features. The profile is created by joining the end points of the individual lines. It is suitable to scale the y-axis (values of features). Profiles are simple and easily interpretable. It is possible to identify outliers and groups of objects with similar behavior.
16 Chernoff faces Humans have specialized brain areas for face recognition. For d < 20 features can be mapped to face elements.
17 Latent variables I Scatter graphs in modified coordinates enable simpler interpretation and reduce distortion or artifacts. As suitable coordinates the latent variables are used. Typical latent variables are based on principal components (PCA). This method is useful for cases where the columns of the X matrix are highly correlated. [Figure: the first principal component points in the direction of maximum variability; the second principal component is orthogonal to it]
18 Latent variables II PCA combined with dynamic graphics (rotation of coordinates).
19 Multivariate geometry Data lie in a hypercube. For the diagonal vector v, starting from the center and ending in one corner, the angle between this vector and a selected axis e_i is given by the relation

cos θ_i = (v^T e_i) / (||v|| ||e_i||) = ± 1/sqrt(m)

For higher m this cosine approaches zero. The diagonal vectors are then nearly perpendicular to all axes. In scatter graphs, point clusters oriented in the diagonal directions are then projected to the origin and are not identifiable.
20 Volume concentration The volume concentration phenomenon is not visible from classical 2D or 3D geometry. The volume of a hypersphere with radius r in m-dimensional space is

V_k = π^(m/2) r^m / Γ(m/2 + 1)

The volume of a hypercube with edge length 2r (the hypersphere inscribed in the hypercube) is

V_h = 2^m r^m

The ratio of volumes is

V_k / V_h = π^(m/2) / (2^m Γ(m/2 + 1)) → 0 for m → ∞

The volume of the hypercube is then concentrated in the edges and the central part is nearly empty.
21 Volumes ratio Influence of the space dimension on the volume ratio of a hypercube and the inscribed hypersphere. From dimension m = 8 the volume of the sphere is negligible in comparison with the cube volume. Data in multidimensional space are concentrated in the periphery, and to cover the central part it would be necessary to have a huge number of points (objects).
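The sphere-to-cube volume ratio can be computed directly from the formulas on the previous slide; a minimal sketch:

```python
import math

def sphere_to_cube_ratio(m, r=1.0):
    """Ratio V_k / V_h of a radius-r hypersphere to its circumscribing
    hypercube (edge 2r) in m-dimensional space."""
    v_sphere = math.pi ** (m / 2) * r ** m / math.gamma(m / 2 + 1)
    v_cube = (2 * r) ** m
    return v_sphere / v_cube

ratios = {m: sphere_to_cube_ratio(m) for m in (2, 3, 8, 10)}
```

For m = 2 the ratio is π/4 ≈ 0.785, while by m = 8 it has already dropped below 0.02, illustrating why the central part of the cube is "nearly empty" in high dimensions.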
22 Multivariate normal distribution I Let us have a multivariate normal distribution with mean equal to zero and unit covariance matrix. The dependence of the multivariate normal distribution function value at the point x = (1, 1, ..., 1) on the dimension m is shown in the figure.
23 Multivariate normal distribution II In the central area the probability of occurrence of the random variable is very low in comparison with the tail area. The dependence of the standardized multivariate normal distribution function at the point x = (2, 2, ..., 2) is shown in the figure. The decrease of probability for higher m is clearly visible.
24 Multivariate normal distribution III It is well known that the sum of squares of the independent standardized normally distributed elements of a random vector x has a chi-squared distribution:

||x||² = Σ_{i=1}^{m} x_i²

Because the mean value is equal to zero, this norm is equal to the distance from the origin. The probability of occurrence of a multivariate normal random vector in a sphere with center at the origin and radius r is then

P(||x|| ≤ r) = P(χ²_m ≤ r²)
25 Multivariate normal distribution IV The dependence of the probability of occurrence of a multivariate normal random vector in a sphere with center at the origin and radius r = 3 on the dimension m is shown in the figure. A quick decrease towards zero is visible. Starting from m = 8 the occurrence of individual objects in the central area has small likelihood. For higher dimensions the majority of points will then be in the tail area. The tails play here the more important role: the paradox of dimensionality.
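The probability P(||x|| ≤ 3) = P(χ²_m ≤ 9) from the preceding slides can be checked by Monte Carlo simulation without any special libraries; the sample sizes and seed below are arbitrary choices:

```python
import numpy as np

def prob_in_sphere(m, r=3.0, n=200_000, seed=1):
    """Monte Carlo estimate of P(||x|| <= r) for x ~ N(0, I_m)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, m))
    return np.mean((x ** 2).sum(axis=1) <= r ** 2)

probs = {m: prob_in_sphere(m) for m in (2, 5, 8, 20)}
```

The estimates fall quickly with dimension (about 0.99 for m = 2 but under 0.05 for m = 20), matching the "paradox of dimensionality" described above.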
26 Data Projection I Usually, for 2D projection the first two PCs are used. The information from the last two PCs can be interesting as well. These projections preserve angles and distances between objects (points). On the other hand, there is no objective criterion for revealing the hidden structures in the data. The linear projections of multivariate data (projection pursuit) satisfy some criterion called the projection index IP(C_i). The projection vectors C_i maximizing IP(C_i) under the constraints C_i^T C_i = 1 are computed. The projection onto these vectors is then C_i^T X.
27 Linear transformation 2D vectors X in a unit circle with mean (1, 1); Y = A·X, A = 2 x 2 matrix:

( Y_1 )   ( a_11  a_12 )   ( X_1 )
( Y_2 ) = ( a_21  a_22 ) * ( X_2 )

The shape and the mean are changed: scaling (a_ii elements), rotation, mirror reflection. Distances between vectors are not invariant: ||Y^(1) − Y^(2)|| ≠ ||X^(1) − X^(2)||.
28 Data Projection II It is interesting that the index IP corresponding to principal components is

IP(C) = max(C_i^T S C_i) for C_i^T C_i = 1

S is the sample covariance matrix (it can be simply robustified). The C_i satisfying the maximum condition is the eigenvector of the matrix S having the i-th largest eigenvalue λ_i, i = 1, 2. C_1 and C_2 are orthogonal. The index IP(C) corresponds to the minimum over all projections C of the maximum of the logarithm of the likelihood function for normally distributed data N(c^T µ, c^T S c). For normally distributed samples the projection onto the first two principal components is then optimal.
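The statement that the maximizer of IP(C) = C^T S C under C^T C = 1 is the leading eigenvector of S can be verified numerically; the correlated toy data below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
# correlated 2D toy data: one latent direction plus noise
latent = rng.standard_normal((500, 1))
X = latent @ np.array([[2.0, 1.0]]) + 0.3 * rng.standard_normal((500, 2))

Xc = X - X.mean(axis=0)               # column centering
S = np.cov(Xc, rowvar=False)          # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]     # re-sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

C1 = eigvecs[:, 0]   # first projection vector, C1^T C1 = 1
ip = C1 @ S @ C1     # projection index IP(C1) = C1^T S C1 = lambda_1
```

Any other unit vector gives a smaller index value, so `C1` is indeed the direction of maximum variance.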
29 PCA limitations PCA leads to the vertical axis (no discrimination); projection pursuit (PP) leads to the horizontal axis (two clusters).
30 Data Projection III Frequently, the selection of clusters in the projection is the main goal. For these purposes the index is equal to the ratio between the mean inter-object distance D and the mean distance between nearest neighbors d. Some indexes are based on the pdf of the data in the projection, f_P(x). The estimator of f_P(x) is usually the kernel pdf estimator:

IP(C) = ∫ f_P²(x) dx

Differences from normality, expressed by the pdf φ(x), are included in the index

IP(C) = ∫ φ(x) [f_P(x) − φ(x)] dx
31 Nonlinear projection Sammon algorithm: projection from the original space to a reduced space (having nearly the same distances between objects). Let d*_ij be the distances between two objects in the original space and d_ij the corresponding distances in the reduced space. The target function E (to be minimized) has the form

E = (1 / Σ_{i<j} d*_ij) Σ_{i<j} (d*_ij − d_ij)² / d*_ij

The iterative Newton method or heuristically oriented algorithms are used.
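The Sammon stress E can be evaluated for any candidate embedding; this sketch computes only the target function (the iterative minimization itself is omitted), with illustrative data:

```python
import numpy as np

def pairwise_dists(Z):
    """Euclidean distances between all rows of Z (upper triangle, i < j)."""
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(len(Z), k=1)
    return D[iu]

def sammon_stress(X, Y):
    """E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij,
    where d*_ij are distances in the original space X and
    d_ij are distances in the reduced space Y."""
    d_star = pairwise_dists(X)
    d = pairwise_dists(Y)
    return (1.0 / d_star.sum()) * np.sum((d_star - d) ** 2 / d_star)

rng = np.random.default_rng(3)
X = rng.standard_normal((30, 5))
```

A perfect embedding has zero stress; dropping coordinates (e.g. keeping only the first two columns of X) distorts distances and yields positive stress, which a Sammon iteration would then try to reduce.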
32 Comparison of projections
33 PCA goals Selection of combinations of the original variables (latent variables) explaining the main part of the overall variance. Discovering data structures characterizing links between features of objects. Dimension reduction. Removal of noise and outliers. Creation of an optimal summary of features (first PC).
34 PC = principal component (latent variable) PC features: Linear combination of the original variables. Mutual orthogonality. Arrangement by importance. Rotation of the coordinate system. Optimal low-dimensional projection. Explanation of the maximal portion of the data variance. [Figure: rotation of the axes (x_1, x_2) to the principal axes (z_1, z_2)]
35 PCA utilization Dimensionality reduction Multivariate normality testing Data exploration Indicator of multivariate data Data projection Special regression model
36 First PC PC1 = y_1 explains the maximum of the original data variability. [Figure: first PC direction passing through the overall mean of the dataset]
37 Second PC PC2 = y_2 explains the maximum of the variability not included in y_1; y_2 is orthogonal to y_1. [Figure: second PC direction passing through the overall mean of the dataset]
38 Mathematical formalization x_C .. centered original data expressed in deviations from the means:

x_C = (x_1 − µ_1, x_2 − µ_2, ..., x_m − µ_m)^T

First PC: y_1 = V_1^T x_C. Second PC: y_2 = V_2^T x_C.

D(y_1) = D(V_1^T x_C) = E[(V_1^T x_C)(V_1^T x_C)^T] = V_1^T E(x_C x_C^T) V_1 = V_1^T C V_1

and similarly D(y_2) = V_2^T C V_2. C .. covariance matrix; V_1 .. vector of loadings for PC1; V_2 .. vector of loadings for PC2; y_j .. j-th PC; V_j .. factor loadings.
39 Properties of loadings I Normalization conditions and mutual orthogonality:

V_1^T V_1 = 1, V_2^T V_2 = 1 and V_1^T V_2 = 0

Computation of the loadings V_1, V_2, ..., V_m leads to maximization of the variances under the equality constraints. The solution shows that V_j is the eigenvector of the covariance matrix C corresponding to the j-th largest eigenvalue λ_j. Covariance matrix decomposition:

C = V Λ V^T

V is an (m x m) matrix whose columns are the loading vectors V_j; Λ is an (m x m) diagonal matrix whose diagonal elements are the covariance matrix eigenvalues λ_1 >= λ_2 >= ... >= λ_m.
40 Properties of loadings II The matrix V is orthogonal, i.e. V^T V = E. The variance D(y_j) = λ_j is equal to the j-th eigenvalue. The overall data variance is equal to the sum of the PC variances:

tr C = Σ_{i=1}^{m} λ_i

Relative contribution of the j-th PC to the explanation of the data variability:

P_j = λ_j / Σ_{i=1}^{m} λ_i
41 Properties of loadings III Covariances between the j-th PC and the vector of features x_C are

cov(x_C, y_j) = cov(x_C, V_j^T x_C) = E(x_C x_C^T) V_j = C V_j = λ_j V_j

The covariance between the i-th feature x_Ci and the j-th PC y_j is cov(x_Ci, y_j) = λ_j V_ji, where V_ji is the i-th element of the vector V_j. The correlation coefficient r(x_Ci, y_j) has the form

r(x_Ci, y_j) = λ_j V_ji / (σ_xi sqrt(λ_j)) = sqrt(λ_j) V_ji / σ_xi
42 Properties of loadings IV Replacement of the centered features x_C by the normalized features x_N:

x_N = ( (x_1 − µ_1)/σ_x1, (x_2 − µ_2)/σ_x2, ..., (x_m − µ_m)/σ_xm )

The correlation coefficient r(x_Ni, y_j) reduces to the form

cov(x_Ni, y_j) = r(x_Ni, y_j) = V*_ji sqrt(λ*_j)

V*_j and λ*_j are the eigenvectors and eigenvalues of the correlation matrix R.
43 Two features example I Two features x_1 and x_2, with covariance matrix C and correlation matrix R:

C = ( σ_1²  C_12 )      R = ( 1  r )
    ( C_12  σ_2² )          ( r  1 )

PCA for the correlation matrix. Condition for the eigenvalue computation:

det(R − l E) = det( 1−l  r ; r  1−l ) = 0

After rearrangement (1 − l)² − r² = 0, i.e.

l² − 2l + 1 − r² = 0
44 Two features example II Solution of the quadratic equation l² − 2l + 1 − r² = 0:

l_1 = [2 + sqrt(4 − 4(1 − r²))]/2 = 1 + r = λ_1
l_2 = [2 − sqrt(4 − 4(1 − r²))]/2 = 1 − r = λ_2

The eigenvectors are the solution of the homogeneous equations (R − λ_i E) V_i = 0, i = 1, 2.
45 Two features example III The normalized eigenvector V_1* (for λ_1 = 1 + r) has the form

V_1* = (1/sqrt(2), 1/sqrt(2))^T

The normalized eigenvector V_2* (for λ_2 = 1 − r) has the form

V_2* = (−1/sqrt(2), 1/sqrt(2))^T
46 Two features example IV First PC: y_1 = (1/sqrt(2))(z_1 + z_2). Second PC: y_2 = (1/sqrt(2))(z_2 − z_1), where

z_1 = (x_1 − E(x_1)) / sqrt(D(x_1)), z_2 = (x_2 − E(x_2)) / sqrt(D(x_2))

Application of the normalized features leads to independence of the PCs from the correlation in the original data. The coordinate system is rotated by cos α = 1/sqrt(2), i.e. by 45°.
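The two-feature example can be checked numerically: for any correlation r, the eigenvalues of R are 1 ± r and the eigenvectors are the fixed 45° directions. The value r = 0.6 below is an arbitrary choice:

```python
import numpy as np

r = 0.6
R = np.array([[1.0, r],
              [r, 1.0]])

eigvals, eigvecs = np.linalg.eigh(R)  # ascending: 1 - r, then 1 + r
lam2, lam1 = eigvals                  # lambda_2 = 1 - r, lambda_1 = 1 + r
v1 = eigvecs[:, 1]                    # eigenvector for lambda_1 = 1 + r
```

Changing `r` changes the eigenvalues but not the eigenvector directions (up to sign), which is exactly the independence of the PCs from the correlation stated above.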
47 Scores plot [Figure: PC1 vs PC2 scores, acute vs chronic groups] PC scores: values of the PCs for all objects,

T = X_C · V

Reconstructed data:

X_C = T · V^T,  (n x m) = (n x m)·(m x m)

Reduction of the PC number: selection of a few PCs (p); replacing the loading matrix V (m x m) by the reduced loading matrix V_f (m x p); computation of the reduced scores T_f = X_C · V_f.
48 Reduced PC I The percentage of variability explained by the j-th principal component is

P_j = (λ_j / Σ_{i=1}^{m} λ_i) · 100

In practice, it is often the case that although hundreds of variables are measured, the first few PCs explain almost all of the variability (information) in the data. Scree plot: bar diagram of the ordered eigenvalues λ_1 >= λ_2 >= ... >= λ_m. Often a gap is visible between the important and non-important PCs.
49 Reduced PC II [Figure: scree plot — eigenvalue vs component number] Due to the use of reduced loadings there are differences between the reconstructed data matrices. The centered matrix X_C is decomposed into the matrix of component scores T (n x k) and the loading matrix V_k^T (k x m), with an information loss, i.e. an error matrix O (n x m):

X_C = T · V_k^T + O
50 Bilinear regression model The model X_C = T · V_k^T + O has as parameters the scores T and the loadings V_k. The least-squares criterion

S(µ, y_i, V_k) = Σ_{i=1}^{n} (x_i − µ − V_k y_i)^T (x_i − µ − V_k y_i)

is minimized; equivalently, the minimization of the length of O, or of the distance dist(X_C − T V_k^T), can be realized. The results are the same as for maximization of variance. The reconstructed matrix is

X_Cr = T · V_k^T = X_C · V_k V_k^T

and the residual matrix is

O_r = X_C − X_Cr = X_C − T_f · V_k^T = X_C · (E − V_k · V_k^T)
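The truncated reconstruction and its residual matrix can be sketched as follows; the 3-feature toy data and the choice k = 2 are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 3)) @ np.array([[1.0, 0.5, 0.2],
                                              [0.0, 1.0, 0.4],
                                              [0.0, 0.0, 0.1]])
Xc = X - X.mean(axis=0)

C = np.cov(Xc, rowvar=False)
eigvals, V = np.linalg.eigh(C)
V = V[:, np.argsort(eigvals)[::-1]]  # loadings, columns sorted by eigenvalue

k = 2
Vk = V[:, :k]                        # reduced loading matrix (m x k)
T = Xc @ Vk                          # reduced scores (n x k)
Xcr = T @ Vk.T                       # reconstruction X_Cr = Xc Vk Vk^T
O = Xc @ (np.eye(3) - Vk @ Vk.T)     # residual matrix O_r = Xc (E - Vk Vk^T)
```

With all m components the reconstruction is exact, and the residual norm shrinks as more components are retained.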
51 PC interpretation I Original matrix X_C = (x_1, ..., x_m): n points in the m-dimensional feature space. Score matrix (t_1, ..., t_m): n points in the m- (or k-) dimensional space of the PCs. The j-th vector t_j of the PCs and the i-th vector x_Ci satisfy

t_j = Σ_{i=1}^{m} V_ji x_Ci,  x_Ci = Σ_{j=1}^{m} V_ji t_j

V_ji are elements of the matrix V_m (resp. V_k if only k components are used).
52 PC interpretation II In the feature space t_j is the weighted sum of the vectors x_Ci with weights V_ji. The length of this vector is

d(t_j) = sqrt(t_j^T t_j) = sqrt(V_j^T X_C^T X_C V_j) = sqrt(λ_j)

(recall cos α = x^T z / ((x^T x)(z^T z))^(1/2) for the angle between vectors x and z). The projection t_Pi of the vector x_Ci onto t_j is expressed as t_Pi = b t_j, where b is the slope:

t_j^T (x_Ci − b t_j) = 0 and b = t_j^T x_Ci / (t_j^T t_j) = t_j^T x_Ci / λ_j
53 PC interpretation III Because t_j^T t_k = 0 for j ≠ k (the vectors t_j are orthogonal), it is valid that

t_j^T x_Ci = t_j^T Σ_{k=1}^{m} V_ki t_k = V_ji λ_j

Therefore b = V_ji, and the projection vector t_Pi = V_ji t_j has length p_i = sqrt(t_Pi^T t_Pi) = |V_ji| sqrt(λ_j). The length of the t_j vector is then

d(t_j) = sqrt(Σ_{i=1}^{m} p_i²) = sqrt(Σ_{i=1}^{m} V_ji² λ_j) = sqrt(λ_j)
54 PC interpretation IV The contribution of each original variable to the length of the vector t_j is proportional to the square of V_ji. The length of this vector is proportional to the standard deviation of the corresponding PC. The variance explained by the j-th PC is composed of the contributions of the original features, and their importance is expressed by V_ji². A small V_ji means that the i-th original feature has a small contribution to the variability of the j-th PC and is not important. If the i-th row of the matrix V has all elements small, the i-th feature is not important for any PC.
55 Contribution plot A graph composed of m groups; in each group there are m columns. Each group corresponds to one PC and each column represents one feature. The heights of the columns are related to |V_ji| sqrt(λ_j). The heights of the columns in the first group are standardized so that their sum is 100 % (division by the sum of their lengths L_s). The same standardization (division by L_s) is used for the rest of the groups as well. It is then simple to investigate the influence of the features on the PCs.
56 Correlations Relations between the original features and the PCs are quantified by the correlation coefficients between x_Ci and t_j:

r_ij = cos α = t_j^T x_Ci / ( sqrt(x_Ci^T x_Ci) sqrt(t_j^T t_j) ) = V_ji sqrt(λ_j) / σ_i = p_i / σ_i

σ_i is the standard deviation of the i-th feature. By using normalized variables (replacement of the matrix S by the correlation matrix R) the correlation coefficients are equal directly to the partial projections, r_ij = p_i. A higher r_ij indicates a higher projection. It means that x_i is close to t_j and contributes markedly to the variance explained by the j-th PC.
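The formula r_ij = V_ji sqrt(λ_j)/σ_i can be verified against directly computed correlation coefficients; the toy data below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 3))
X[:, 1] += 0.8 * X[:, 0]          # introduce correlation between features
Xc = X - X.mean(axis=0)

C = np.cov(Xc, rowvar=False)
lam, V = np.linalg.eigh(C)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
T = Xc @ V                         # scores t_j

sigma = Xc.std(axis=0, ddof=1)
# r_ij = V_ji * sqrt(lambda_j) / sigma_i, for all i, j at once
r_formula = V * np.sqrt(lam)[None, :] / sigma[:, None]
# the same correlations computed directly from the samples
r_direct = np.array([[np.corrcoef(Xc[:, i], T[:, j])[0, 1]
                      for j in range(3)] for i in range(3)])
```

The two matrices agree exactly (not just approximately), because the identity cov(x_Ci, t_j) = λ_j V_ji holds algebraically for the in-sample covariance.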
57 PCA for correlated data I Simulated data arise from a 3D normal distribution with zero mean vector and correlation matrix

R = ( 1  r  r )
    ( r  1  r )
    ( r  r  1 )

N = 500 data points were generated for various values of the pairwise correlation coefficient r.
58 PCA for correlated data II All correlations are zero
59 PCA for correlated data III All correlations are 0.9
60 False correlations I Pairwise correlations: r_12 = H, r_23 = H, r_13 = H². Partial correlation coefficient r_13(2) = 0: the variable x_3 is then parasitic and does not contribute to the explanation of the variability of feature x_1. Multiple correlation coefficient R_1(2,3) = H; partial correlation r_12(3) = H / sqrt(1 + H²). For the simulation H = 0.9 was selected.
61 False correlations II Scree and contribution plots
62 Low paired correlations I Pairwise correlations: r_12 = H = 0.1, r_13 = 0, r_23 = 1 − H² = 0.99. Multiple correlation coefficient R_1(2,3) = 0.71; partial correlation coefficients r_12(3) = 0.71 and r_13(2) = −0.71. All variables are therefore important.
63 Low paired correlations II Scree and contribution plots. PCA is not able to fully replace the correlation analysis.
64 Distances in feature spaces Data vectors, m dimensions: X^T = (X_1, ..., X_m), Y^T = (Y_1, ..., Y_m). A distance (metric) function d satisfies

d(X, Y) >= 0,  d(X, Y) = d(Y, X),  d(X, Y) <= d(X, Z) + d(Z, Y)

Popular distance functions:

Minkowski distance (L_r metric): d(X, Y) = ( Σ_{k=1}^{m} |X_k − Y_k|^r )^(1/r)

Manhattan (city-block) distance (L_1 norm): d(X, Y) = Σ_{k=1}^{m} |X_k − Y_k|

Euclidean distance (L_2 norm): d(X, Y) = ( Σ_{k=1}^{m} (X_k − Y_k)² )^(1/2)
65 Basic metrics [Figure: Manhattan (city-block) vs Euclidean paths between two points in the (X_1, X_2) plane] Many different paths have identical Manhattan distance between two points: imagine that in 10D!
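The three distance functions from the previous slide can be written as one Minkowski routine with special cases; the vectors below are illustrative:

```python
import numpy as np

def minkowski(x, y, r):
    """L_r metric: (sum |x_k - y_k|^r)^(1/r)."""
    return np.sum(np.abs(x - y) ** r) ** (1.0 / r)

def manhattan(x, y):
    """L_1 norm (city-block distance)."""
    return minkowski(x, y, 1)

def euclidean(x, y):
    """L_2 norm."""
    return minkowski(x, y, 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])
z = np.array([0.0, 1.0, 1.0])
```

Here manhattan(x, y) = |1−4| + |2−0| + |3−3| = 5 and euclidean(x, y) = sqrt(9 + 4 + 0) = sqrt(13); both satisfy the symmetry and triangle-inequality conditions of a metric.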
66 Other metrics Ultrametric: d_ij <= max(d_ih, d_hj) replaces the triangle inequality d_ij <= d_ih + d_hj (i ≠ h ≠ j). Four-point additive condition: d_hi + d_jk <= max[(d_hj + d_ik), (d_hk + d_ij)] replaces d_ih <= d_ij + d_jh.
67 Invariant distances Euclidean distance is not invariant to linear transformations Y = A·X. Scaling of units has a strong influence on distances:

||Y^(1) − Y^(2)||² = (Y^(1) − Y^(2))^T (Y^(1) − Y^(2)) = (X^(1) − X^(2))^T A^T A (X^(1) − X^(2))

For orthonormal matrices A^T A = I (rigid rotations) the distance is preserved. Invariance requires standardization plus the covariance matrix: the Mahalanobis metric replaces A^T A by the inverse of the covariance matrix.
68 Distances Mahalanobis distance:

d_i² = (x_i − x_A)^T S^(−1) (x_i − x_A)

Euclidean distance:

d_i² = (x_i − x_A)^T (x_i − x_A)
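The Mahalanobis distance and its invariance to nonsingular linear transformations (discussed on the previous slide) can be sketched as follows; the data and the matrix A are illustrative:

```python
import numpy as np

def mahalanobis_sq(X, mean, S):
    """Squared Mahalanobis distances d_i^2 = (x_i - mean)^T S^{-1} (x_i - mean)."""
    diff = X - mean
    # solve S u = diff^T instead of forming S^{-1} explicitly
    return np.einsum('ij,ij->i', diff, np.linalg.solve(S, diff.T).T)

rng = np.random.default_rng(6)
X = rng.standard_normal((300, 3)) @ np.array([[1.0, 0.6, 0.0],
                                              [0.0, 1.0, 0.3],
                                              [0.0, 0.0, 1.0]])
xA = X.mean(axis=0)
S = np.cov(X, rowvar=False)
d2 = mahalanobis_sq(X, xA, S)

# invariance under a nonsingular linear transformation Y = X A
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 3.0]])
Y = X @ A
d2_Y = mahalanobis_sq(Y, Y.mean(axis=0), np.cov(Y, rowvar=False))
```

Unlike Euclidean distances, the Mahalanobis distances are exactly the same before and after the transformation, since A^T S A is absorbed by the inverse covariance.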
69 Outlying objects I Indication of outliers is sensitive to the presence of masking (outliers appear to be correct, due to covariance matrix augmentation) and swamping (correct values appear to be outliers, due to the presence of outlying points).
70 Outlying objects II As outliers those objects are identified for which

d_i² > c(p, N, α_N)

For the case of the multivariate normal distribution, c(p, N, α_N) is equal to a quantile of the chi-squared distribution:

c(p, N, α_N) = χ²_p(1 − α/N)
71 Outlying objects III For application of the Mahalanobis distance approach it is necessary to know clean estimators x_A and S. A robust estimator of the covariance matrix can be obtained in the following ways: M-estimators; S-estimators minimizing det C under constraints; estimators minimizing the volume of the confidence ellipsoid. EDA analysis requires visualization of outliers but not corruption of the projections.
72 Simple solution Evaluation of a clean subset of data: 1. Selection of a starting subset based on the Mahalanobis distance with trimming of suspicious data, or on the distance from the multivariate median. The result is a subset of the data with parameters x_AC, S_C. 2. Calculation of the residuals

d_i² = (x_i − x_AC)^T S_C^(−1) (x_i − x_AC)

3. Iterative modification of the clean subset so that it contains the points with residuals lower than c² χ²_α, where

h = (n + p + 1)/2,  c_1 = max(0, (h − r)/(h + r)),  c_2 = 1 + (p + 1)/(n − p) + 2/(n − 1 − 3p),  c = c_1 + c_2
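The clean-subset idea can be sketched as a simple iteration; this is a minimal illustration, not the exact algorithm above (the correction factors c_1, c_2 are omitted and a fixed cutoff of 9, i.e. d_i > 3, stands in for the chi-squared quantile):

```python
import numpy as np

def clean_subset_outliers(X, cutoff=9.0, n_iter=10):
    """Iteratively re-estimate mean/covariance from the current clean subset
    and flag points whose squared Mahalanobis distance exceeds `cutoff`."""
    mask = np.ones(len(X), dtype=bool)
    for _ in range(n_iter):
        mean = X[mask].mean(axis=0)
        S = np.cov(X[mask], rowvar=False)
        diff = X - mean
        d2 = np.einsum('ij,ij->i', diff, np.linalg.solve(S, diff.T).T)
        new_mask = d2 <= cutoff
        if (new_mask == mask).all():   # clean subset stabilized
            break
        mask = new_mask
    return ~mask  # True for flagged outliers

rng = np.random.default_rng(7)
inliers = rng.standard_normal((200, 2))
outliers = np.array([[10.0, 10.0], [12.0, -8.0], [-9.0, 11.0]])
X = np.vstack([inliers, outliers])
flagged = clean_subset_outliers(X)
```

Re-estimating the location and scatter from the trimmed subset reduces the masking effect that the planted outliers would otherwise have on the classical covariance estimate.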
73 PCA corrupted normal data [Figure: scores of yearly income vs savings in the plane of PC axes 1 and 2, with 99% and 99.9% tolerance ellipses; points outside the ellipses are exceptions to the rule]
More informationDescriptive Statistics
Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Exploring Data Lecture Notes for Chapter 3 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Topics Exploratory Data Analysis Summary Statistics Visualization What is data exploration?
More informationFactor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models
Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis
More informationFACTOR ANALYSIS. Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables.
FACTOR ANALYSIS Introduction Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables Both methods differ from regression in that they don t have
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationNonlinear Iterative Partial Least Squares Method
Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for
More informationGlencoe. correlated to SOUTH CAROLINA MATH CURRICULUM STANDARDS GRADE 6 3-3, 5-8 8-4, 8-7 1-6, 4-9
Glencoe correlated to SOUTH CAROLINA MATH CURRICULUM STANDARDS GRADE 6 STANDARDS 6-8 Number and Operations (NO) Standard I. Understand numbers, ways of representing numbers, relationships among numbers,
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationChapter 6. Orthogonality
6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be
More informationTowards Online Recognition of Handwritten Mathematics
Towards Online Recognition of Handwritten Mathematics Vadim Mazalov, joint work with Oleg Golubitsky and Stephen M. Watt Ontario Research Centre for Computer Algebra Department of Computer Science Western
More informationComparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
More informationHow to report the percentage of explained common variance in exploratory factor analysis
UNIVERSITAT ROVIRA I VIRGILI How to report the percentage of explained common variance in exploratory factor analysis Tarragona 2013 Please reference this document as: Lorenzo-Seva, U. (2013). How to report
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationPrinciple Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate
More informationCluster Analysis. Isabel M. Rodrigues. Lisboa, 2014. Instituto Superior Técnico
Instituto Superior Técnico Lisboa, 2014 Introduction: Cluster analysis What is? Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from
More informationEM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
More informationwith functions, expressions and equations which follow in units 3 and 4.
Grade 8 Overview View unit yearlong overview here The unit design was created in line with the areas of focus for grade 8 Mathematics as identified by the Common Core State Standards and the PARCC Model
More informationData Exploration and Preprocessing. Data Mining and Text Mining (UIC 583 @ Politecnico di Milano)
Data Exploration and Preprocessing Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
More informationImproving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
More informationEpipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R.
Epipolar Geometry We consider two perspective images of a scene as taken from a stereo pair of cameras (or equivalently, assume the scene is rigid and imaged with a single camera from two different locations).
More informationbusiness statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
More informationKEANSBURG SCHOOL DISTRICT KEANSBURG HIGH SCHOOL Mathematics Department. HSPA 10 Curriculum. September 2007
KEANSBURG HIGH SCHOOL Mathematics Department HSPA 10 Curriculum September 2007 Written by: Karen Egan Mathematics Supervisor: Ann Gagliardi 7 days Sample and Display Data (Chapter 1 pp. 4-47) Surveys and
More informationPennsylvania System of School Assessment
Pennsylvania System of School Assessment The Assessment Anchors, as defined by the Eligible Content, are organized into cohesive blueprints, each structured with a common labeling system that can be read
More informationDiagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationGoing Big in Data Dimensionality:
LUDWIG- MAXIMILIANS- UNIVERSITY MUNICH DEPARTMENT INSTITUTE FOR INFORMATICS DATABASE Going Big in Data Dimensionality: Challenges and Solutions for Mining High Dimensional Data Peer Kröger Lehrstuhl für
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationAlgebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year.
This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra
More informationHow To Solve The Cluster Algorithm
Cluster Algorithms Adriano Cruz adriano@nce.ufrj.br 28 de outubro de 2013 Adriano Cruz adriano@nce.ufrj.br () Cluster Algorithms 28 de outubro de 2013 1 / 80 Summary 1 K-Means Adriano Cruz adriano@nce.ufrj.br
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationMultidimensional data and factorial methods
Multidimensional data and factorial methods Bidimensional data x 5 4 3 4 X 3 6 X 3 5 4 3 3 3 4 5 6 x Cartesian plane Multidimensional data n X x x x n X x x x n X m x m x m x nm Factorial plane Interpretation
More informationApplied Linear Algebra I Review page 1
Applied Linear Algebra Review 1 I. Determinants A. Definition of a determinant 1. Using sum a. Permutations i. Sign of a permutation ii. Cycle 2. Uniqueness of the determinant function in terms of properties
More informationFitting Subject-specific Curves to Grouped Longitudinal Data
Fitting Subject-specific Curves to Grouped Longitudinal Data Djeundje, Viani Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh, EH14 4AS, UK E-mail: vad5@hw.ac.uk Currie,
More informationFactor Analysis. Factor Analysis
Factor Analysis Principal Components Analysis, e.g. of stock price movements, sometimes suggests that several variables may be responding to a small number of underlying forces. In the factor model, we
More informationNew York State Student Learning Objective: Regents Geometry
New York State Student Learning Objective: Regents Geometry All SLOs MUST include the following basic components: Population These are the students assigned to the course section(s) in this SLO all students
More informationThe Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression
The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression The SVD is the most generally applicable of the orthogonal-diagonal-orthogonal type matrix decompositions Every
More informationCorrelation key concepts:
CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)
More informationNeural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm michele.scarpiniti@uniroma1.it Rome, 29
More informationChapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
More informationRachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA
PROC FACTOR: How to Interpret the Output of a Real-World Example Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA ABSTRACT THE METHOD This paper summarizes a real-world example of a factor
More informationBindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8
Spaces and bases Week 3: Wednesday, Feb 8 I have two favorite vector spaces 1 : R n and the space P d of polynomials of degree at most d. For R n, we have a canonical basis: R n = span{e 1, e 2,..., e
More informationChapter 17. Orthogonal Matrices and Symmetries of Space
Chapter 17. Orthogonal Matrices and Symmetries of Space Take a random matrix, say 1 3 A = 4 5 6, 7 8 9 and compare the lengths of e 1 and Ae 1. The vector e 1 has length 1, while Ae 1 = (1, 4, 7) has length
More informationSteven M. Ho!and. Department of Geology, University of Georgia, Athens, GA 30602-2501
PRINCIPAL COMPONENTS ANALYSIS (PCA) Steven M. Ho!and Department of Geology, University of Georgia, Athens, GA 30602-2501 May 2008 Introduction Suppose we had measured two variables, length and width, and
More informationMeasurement with Ratios
Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical
More informationMAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A =
MAT 200, Midterm Exam Solution. (0 points total) a. (5 points) Compute the determinant of the matrix 2 2 0 A = 0 3 0 3 0 Answer: det A = 3. The most efficient way is to develop the determinant along the
More informationAlgebra 1 Course Information
Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through
More informationHigher Education Math Placement
Higher Education Math Placement Placement Assessment Problem Types 1. Whole Numbers, Fractions, and Decimals 1.1 Operations with Whole Numbers Addition with carry Subtraction with borrowing Multiplication
More informationPartial Least Squares (PLS) Regression.
Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component
More informationExploratory Factor Analysis
Exploratory Factor Analysis Definition Exploratory factor analysis (EFA) is a procedure for learning the extent to which k observed variables might measure m abstract variables, wherein m is less than
More informationCRLS Mathematics Department Algebra I Curriculum Map/Pacing Guide
Curriculum Map/Pacing Guide page 1 of 14 Quarter I start (CP & HN) 170 96 Unit 1: Number Sense and Operations 24 11 Totals Always Include 2 blocks for Review & Test Operating with Real Numbers: How are
More informationAPPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
More informationIris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode
Iris Sample Data Set Basic Visualization Techniques: Charts, Graphs and Maps CS598 Information Visualization Spring 2010 Many of the exploratory data techniques are illustrated with the Iris Plant data
More informationMULTIVARIATE DATA ANALYSIS WITH PCA, CA AND MS TORSTEN MADSEN 2007
MULTIVARIATE DATA ANALYSIS WITH PCA, CA AND MS TORSTEN MADSEN 2007 Archaeological material that we wish to analyse through formalised methods has to be described prior to analysis in a standardised, formalised
More informationUnderstanding and Applying Kalman Filtering
Understanding and Applying Kalman Filtering Lindsay Kleeman Department of Electrical and Computer Systems Engineering Monash University, Clayton 1 Introduction Objectives: 1. Provide a basic understanding
More information