PCA vs. Varimax rotation




Zhen Lin (zlin@smi.stanford.edu)

The goal of the rotation/transformation in PCA is to maximize the variance of the new SNP (eigenSNP) while minimizing the variance around the eigenSNP. The difference between the variances captured by successive eigenSNPs is therefore maximized. The constraint on the coefficients relating the original SNPs to the eigenSNPs, namely that Γ'ΛΓ is diagonal, is a mathematical convenience that makes the coefficients unique; however, it can complicate interpretation. (See the scatter plot (figure 1) of the coefficients (table 2) from the dataset (table 1), where each SNP is represented as a point in the first 2 dimensions of the eigenspace.)

The coefficients are easiest to interpret if each SNP is highly correlated with at most one eigenSNP, and if all the coefficients are either large or near zero, with few intermediate values. The SNPs then split into disjoint sets, each associated with one eigenSNP, though some SNPs may be left over.

To achieve this clear pattern of coefficients, we can rotate the axes defined by PCA in any direction. Within every pair of dimensions the relative locations of the points stay the same, but their actual coordinates change. The rotated solution spans the same geometric space as the original solution and explains the same amount of variance in the data; however, the difference between the variances captured by the rotated axes is no longer maximized.

Several analytical choices of rotation have been proposed in the past. One of them is the varimax method of orthogonal rotation. The varimax rotation criterion maximizes the sum of the variances of the squared coefficients within each eigenvector, and the rotated axes remain orthogonal. Figure 2 shows the rotated solution (table 3) after a varimax rotation. After the coordinate axes are rotated clockwise by about 10 degrees (the angle implied by the coefficients in tables 2 and 3), we obtain a clear pattern of SNPs corresponding to the rotated eigenSNPs. In this simple example the overall interpretation is the same whether or not we rotate the axes, but in more complicated situations rotation can help more.
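As a hedged restatement of the criterion described above, the raw (un-normalized) varimax objective can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the original: L is the p x k matrix of unrotated coefficients (p SNPs, k retained eigenSNPs), T is the orthogonal rotation matrix, and l*_ij is the rotated coefficient of SNP i on eigenSNP j.

```latex
% Orthogonal rotation of the coefficient matrix
\[
  L^{*} = L\,T, \qquad T^{\top} T = I
\]
% Raw varimax criterion: sum over eigenSNPs of the variance of the squared coefficients
\[
  V(L^{*}) = \sum_{j=1}^{k} \left[ \frac{1}{p} \sum_{i=1}^{p} \left( l^{*}_{ij} \right)^{4}
           - \left( \frac{1}{p} \sum_{i=1}^{p} \left( l^{*}_{ij} \right)^{2} \right)^{2} \right]
\]
```

Because T is orthogonal, the sum of squared coefficients of each SNP across the k retained eigenSNPs is unchanged by the rotation, which is why the rotated solution explains the same amount of variance.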

Tables:

        snp1  snp2  snp3  snp4
chr1    1     1     0     1
chr2    1     0     1     1
chr3    0     0     0     0
chr4    0     1     0     1
chr5    1     0     0     0

Table 1. A small dataset of 4 SNPs from 5 chromosomes.

Variables   E1           E2
Snp1        0.0978322    0.5690011
Snp2        0.647965    -0.40432
Snp3        0.1198195    0.6968811
Snp4        0.745972     0.164681

Table 2. Partial PCA results (unrotated).

Variables   E1           E2
Snp1        0.00012428   0.5773503
Snp2        0.70744623  -0.288827
Snp3        0.00015222   0.7071068
Snp4        0.70716891   0.2885229

Table 3. Rotated solution.
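The relationship between tables 2 and 3 can be checked numerically. The sketch below is only an illustration, not the author's procedure: it copies the tabulated coefficients, measures how far each SNP's direction in the (E1, E2) plane moves between the two tables, and compares the raw varimax criterion before and after. The helper names `direction` and `varimax_criterion` are hypothetical.

```r
# Coefficients copied from Table 2 (unrotated) and Table 3 (rotated).
snps  <- paste0("Snp", 1:4)
unrot <- matrix(c(0.0978322,   0.5690011,
                  0.647965,   -0.40432,
                  0.1198195,   0.6968811,
                  0.745972,    0.164681),
                ncol = 2, byrow = TRUE,
                dimnames = list(snps, c("E1", "E2")))
rot   <- matrix(c(0.00012428,  0.5773503,
                  0.70744623, -0.288827,
                  0.00015222,  0.7071068,
                  0.70716891,  0.2885229),
                ncol = 2, byrow = TRUE,
                dimnames = list(snps, c("E1", "E2")))

# Direction of each SNP in the (E1, E2) plane, in degrees (hypothetical helper).
direction <- function(m) atan2(m[, 2], m[, 1]) * 180 / pi
shift <- direction(rot) - direction(unrot)

# Squared length of each SNP's coefficient vector; an orthogonal rotation
# should leave it unchanged up to the rounding of the printed tables.
comm_unrot <- rowSums(unrot^2)
comm_rot   <- rowSums(rot^2)

# Raw varimax criterion: sum over columns of the variance of squared coefficients.
varimax_criterion <- function(m) {
  sq <- m^2
  sum(colMeans(sq^2) - colMeans(sq)^2)
}
crit_unrot <- varimax_criterion(unrot)
crit_rot   <- varimax_criterion(rot)

print(round(shift, 2))
print(round(cbind(comm_unrot, comm_rot), 4))
print(c(unrotated = crit_unrot, rotated = crit_rot))
```

Every SNP's direction shifts counterclockwise by a little under 10 degrees, which is the same thing as turning the axes clockwise by that angle. The squared lengths agree to about three decimal places, and the criterion is larger for the rotated coefficients.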

Figures:

Figure 1. Scatter plot of SNPs in the orthogonal space (unrotated).

Figure 2. Scatter plot of SNPs in the rotated orthogonal space.

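R source code:

    # Table 1 entered column by column (5 chromosomes x 4 SNPs)
    data <- c(1,1,0,0,1, 1,0,0,1,0, 0,1,0,0,0, 1,1,0,1,0)
    dim(data) <- c(5, 4)
    # PCA on the correlation matrix; keep the SNP coefficients (loadings)
    snploadings <- loadings(princomp(data, cor = TRUE))
    plot(snploadings[, 1:2])
    # Varimax rotation of the first two eigenSNPs
    rotated <- varimax(snploadings[, 1:2])$loadings
    plot(rotated)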
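The rotation itself can also be inspected, since `varimax()` in base R returns the rotation matrix as `rotmat` alongside the rotated loadings. The following is a minimal sketch under the same data; the variable names (`pc`, `L`, `vm`, `rotmat`, `angle_deg`) are illustrative, and reading an angle off `rotmat[1, 1]` assumes the matrix is a plain 2 x 2 rotation.

```r
# Same data and PCA as in the source code above.
data <- matrix(c(1,1,0,0,1, 1,0,0,1,0, 0,1,0,0,0, 1,1,0,1,0),
               nrow = 5, ncol = 4)
pc <- princomp(data, cor = TRUE)
L  <- unclass(loadings(pc))[, 1:2]   # plain numeric matrix of coefficients

vm      <- varimax(L)                # Kaiser normalization is on by default
rotmat  <- vm$rotmat                 # 2 x 2 orthogonal matrix
rotated <- unclass(vm$loadings)

# The rotation matrix is orthogonal ...
orthogonal <- isTRUE(all.equal(crossprod(rotmat), diag(2)))
# ... the rotated coefficients are the unrotated ones times that matrix ...
same_as_product <- isTRUE(all.equal(rotated, L %*% rotmat,
                                    check.attributes = FALSE))
# ... and the total sum of squared coefficients is unchanged.
same_total <- isTRUE(all.equal(sum(L^2), sum(rotated^2)))

# Size of the rotation in degrees (assumes rotmat is a pure 2-D rotation).
angle_deg <- acos(min(1, abs(rotmat[1, 1]))) * 180 / pi

print(rotmat)
print(angle_deg)
```

The signs of principal components are arbitrary, so the sign pattern of `rotmat` may differ between platforms even when the magnitude of the rotation is the same.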

An example of finding htSNPs from a small SNP dataset using the varimax rotation method.

We start with a SNP dataset:

             SNP1  SNP2  SNP3  SNP4  SNP5
Chromosome1  1     1     1     0     1
Chromosome2  1     1     0     0     0
Chromosome3  0     0     0     0     1
Chromosome4  0     0     1     1     0

PCA results, unrotated, are:

        e1       e2       e3       e4       e5
SNP1   -0.5532  -0.3854   0       -0.7385   0
SNP2   -0.5532  -0.3854   0        0.6155   0.4082
SNP3    0.2025  -0.5265  -0.7071   0.1231  -0.4082
SNP4    0.5532  -0.3854   0       -0.2132   0.7071
SNP5   -0.2025   0.5265  -0.7071  -0.1231   0.4082

PCA results upon varimax rotation (Mardia et al. 1979; Dunteman 1989) are:

        e1    e2       e3       e4    e5
SNP1    0     0        0       -1     0
SNP2   -1     0        0        0     0
SNP3    0    -0.7071  -0.7071   0     0
SNP4    0     0        0        0     1
SNP5    0     0.7071  -0.7071   0     0

For each SNP we compare the average absolute coefficient over the k retained eigenSNPs (Γ) with the average over the remaining (p - k) eigenSNPs (γ), and select the SNP if Γ > γ, which indicates that this SNP contributes mostly to the k eigenSNPs (Meng et al. 2003). Suppose k = 2; the htSNP selections are:

        Γ        γ        htSNP
SNP1    0        0.3333   N
SNP2    0.5      0        Y
SNP3    0.3536   0.2357   Y
SNP4    0        0.3333   N
SNP5    0.3536   0.2357   Y

References:

Dunteman GH (1989) Principal components analysis. Sage Publications, Newbury Park

Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic Press, London; New York

Meng Z, Zaykin DV, Xu CF, Wagner M, Ehm MG (2003) Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes. Am J Hum Genet 73:115-130
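The selection rule in the htSNP example (compare Γ with γ for k = 2) can be sketched in a few lines of R. This is an illustration rather than the published procedure: it assumes the averages are taken over absolute coefficients, which is the reading that reproduces the Y/N selections in the example, and the names `big_gamma`, `small_gamma`, and `selected` are hypothetical.

```r
# Varimax-rotated coefficients from the htSNP example (rows: SNPs, cols: eigenSNPs).
rotated <- matrix(c( 0,  0,       0,      -1, 0,
                    -1,  0,       0,       0, 0,
                     0, -0.7071, -0.7071,  0, 0,
                     0,  0,       0,       0, 1,
                     0,  0.7071, -0.7071,  0, 0),
                  nrow = 5, byrow = TRUE,
                  dimnames = list(paste0("SNP", 1:5), paste0("e", 1:5)))

k <- 2               # number of retained eigenSNPs
p <- ncol(rotated)   # total number of eigenSNPs

# Assumption: averages are over absolute coefficients.
big_gamma   <- rowMeans(abs(rotated[, 1:k, drop = FALSE]))        # Gamma
small_gamma <- rowMeans(abs(rotated[, (k + 1):p, drop = FALSE]))  # gamma

# A SNP is selected as an htSNP when Gamma exceeds gamma.
selected <- big_gamma > small_gamma

print(data.frame(Gamma = big_gamma,
                 gamma = small_gamma,
                 htSNP = ifelse(selected, "Y", "N")))
```

With k = 2 this selects SNP2, SNP3, and SNP5. Changing `k` shows how the selection responds to the number of retained eigenSNPs.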