Statistical Analysis (1-way ANOVA)




Contents at a glance

I. Definition and Applications
II. Before Performing 1-way ANOVA - A Checklist
III. Overview of the Statistical Analysis (1-way tests) window
IV. 1-way ANOVA test
    a. Null hypothesis
    b. Number of Genes Analyzed
    c. Test Options
    d. Recommendations
    e. P-value
V. Multiple Testing Corrections
    a. Options
    b. Recommendations
VI. Post Hoc Tests
    a. Options
VII. Interpreting the Results
    a. Results of 1-way ANOVA without Post Hoc test applied
    b. Results of 1-way ANOVA with Post Hoc test applied
VIII. Viewing P-values Generated
IX. Most frequently asked questions and answers
X. References

I. Definition and Applications

One-way analysis of variance (ANOVA) tests allow you to determine if one given factor, such as drug treatment, has a significant effect on gene expression behavior across any of the groups under study. A significant p-value resulting from a 1-way ANOVA test would indicate that a gene is differentially expressed in at least one of the groups analyzed. If there are more than two groups being analyzed, however, the 1-way ANOVA does not specifically indicate which pair of groups exhibits statistical differences. Post Hoc tests can be applied in this specific situation to determine which specific pair/pairs are differentially expressed. This document will provide the necessary information for you to perform these analyses within GeneSpring.

II. Before Performing 1-way ANOVA - A Checklist

1. Do you have replicates for the experimental groups that you are about to compare? Statistical tests that compare one group to another, such as Student's t-test/ANOVA, need variance and means for each group. Without replicates, the variance for each group cannot be computed using standard methods. However, variance for experimental groups without replicates can be computed by applying the GeneSpring Cross-Gene Error Model. If no replicates are available, apply the Error Model based on Deviation from 1 before proceeding. Please refer to the GeneSpring user manual, online tech notes, webinars, or cross-gene error model features sheet to learn more about the Cross-Gene Error Model.
2. Have you filtered out genes whose measurements are mostly unreliable?
3. Have you defined one parameter in the Experiment Parameters window indicating which sample belongs to which group?
4. If you plan to use a parametric test, have you changed the analysis mode to Log of Ratio in the Experiment Interpretation window? Parametric tests assume that means of the populations under study are normally distributed (Gaussian distribution). Interpreting your data in log mode will make data more Normal/Gaussian than ratio mode.

It is mandatory that you either have replicates or apply the cross-gene error model if no replicates are available, in order to perform 1-way ANOVA for groups under study. It is also recommended (though not mandatory) that your statistical analysis be performed on a set of reliable genes, instead of all genes, on the chips.
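Two items on the checklist, log-transforming ratios and having replicates, can be illustrated outside GeneSpring with a minimal sketch. The group names, the ratio values, and the choice of base 2 are assumptions made for illustration; this is not GeneSpring's own implementation.

```python
import math

def log2_ratios(ratios):
    """Convert expression ratios to log2 scale (base 2 is an assumption)."""
    return [math.log2(r) for r in ratios]

def within_group_df(groups):
    """Within-group degrees of freedom: total samples minus number of groups."""
    n_samples = sum(len(values) for values in groups.values())
    return n_samples - len(groups)

# Hypothetical ratios for one gene in two groups with three replicates each.
ratios = {"control": [0.5, 1.0, 2.0], "drug_a": [2.0, 4.0, 8.0]}
logged = {name: log2_ratios(values) for name, values in ratios.items()}

# In log mode a 2-fold decrease and a 2-fold increase are symmetric around 0.
print(logged["control"])
print(within_group_df(ratios))

# One sample per group leaves no degrees of freedom to estimate variance.
no_replicates = {"control": [1.0], "drug_a": [4.0], "drug_b": [0.5]}
print(within_group_df(no_replicates))
```

With one sample per group the within-group degrees of freedom drop to zero, which is why replicates (or an error model that supplies variance estimates) are required before the test can run.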

III. Overview of the Statistical Analysis (1-way ANOVA tests) window

1. Go to the Tools toolbar and select Statistical Analysis (ANOVA).
2. In the resulting window, select the 1-Way Tests tab.

Figure 1: Statistical Analysis (ANOVA) 1-way tests window

Choose Gene List: Select the gene list containing the set of genes you would like to analyze. Statistical tests will be performed only on genes in the selected gene list. Again, it is recommended that the "all genes" gene list not be used. Instead, use a list of genes that has been filtered to remove genes with measurements mostly in the noise range or mostly flagged Absent.

Choose Experiment: Choose the experiment and its proper interpretation to analyze. If you are using parametric tests, then your experiment interpretation should be in log-of-ratio mode.

Parameter to Test: Select the parameter and the underlying groups to compare. In the example shown above, the parameter Drug Agent was selected to compare the effect of different drug agents on Sprague-Dawley rats. If you would like to compare only selected conditions for this parameter, open the Select Groups Manually window, and uncheck the conditions that you would like to ignore. Only groups that are checked will be analyzed.

Test Type: Select the appropriate 1-way ANOVA test type. If you are using a parametric test, make sure your data has been log-transformed (by selecting log-of-ratio mode in the Experiment Interpretation window).

False Discovery Rate: Indicates the overall rate of false positives. The wording for this option, and its final effect on the number of false positives, changes according to the multiple testing correction selected in the option below.

Multiple Testing Correction: This test option is not required for analysis, but it will allow you to keep the overall rate of false positives low.

Post Hoc Tests: This test option is also not required for analysis, but selecting this option will allow you to determine which pair(s) among the groups under study have expression means that are statistically different.

IV. General background on 1-way ANOVA test

a. Null Hypothesis: The hypothesis for each gene is that there is no difference in the mean gene expression intensities in the groups tested. In other words, the gene will have equal means across every group. Example of a specific null hypothesis: There is no difference in the mean gene expression intensities for the bcl-2 gene across all rat groups treated with different drug agents.

b. Number of genes analyzed: All genes in the selected gene list will be analyzed. If there are 10,000 genes on your gene list (assuming you have all required measurements for each of the genes), then there are 10,000 separate analyses being performed and each gene will have a separate p-value.

c. Test Options:

Options                                        Specific test used (analyzing 2 groups)   Specific test used (analyzing more than 2 groups)
Parametric (variances equal)                   Student's t-test                          ANOVA
Parametric (variances not equal)               Welch t-test                              Welch ANOVA
Parametric (use all available error estimate)  Welch t-test using error model variances  Welch ANOVA using error model variances
Nonparametric                                  Wilcoxon-Mann-Whitney test                Kruskal-Wallis test

d. Recommendations:

The Welch test (variances not assumed equal) is recommended for most cases. This is set as the default.
The parametric test, use all available error estimate, is similar to the Welch test but has better variance estimates. To use this option, the Cross-Gene Error Model needs to be activated in the Experiment Interpretation window.
Student's t-test/ANOVA (variances assumed equal) should be used if very few replicates are available, or if some groups being analyzed do not have replicates.
The nonparametric test makes the fewest assumptions about your data but should be used only when there are more than 5 replicates per group.

e. P-value: Indicates the probability of obtaining, by chance alone, a difference between the group means at least as large as the one observed. The lower the p-value, the more significant the difference between the groups.
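To make the "one separate analysis per gene" idea concrete, here is a minimal sketch of the classic equal-variance one-way ANOVA F statistic, computed gene by gene. The gene names and values are hypothetical, and this is the textbook formula rather than GeneSpring's code. Turning F into a p-value additionally requires the F distribution with the returned degrees of freedom, which is left to a statistics library.

```python
import math

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for one gene.

    groups: list of lists, one list of log-ratio values per experimental group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    if df_between < 1 or df_within < 1:
        raise ValueError("need at least 2 groups and replicates within groups")

    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    if ms_within == 0:
        # Identical replicates in every group: F is undefined or infinite.
        return (math.inf if ms_between > 0 else math.nan), df_between, df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical gene list: each gene has three groups of three replicates.
gene_values = {
    "gene_1": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "gene_2": [[1.0, 2.0, 3.0], [1.5, 2.0, 2.5], [1.0, 2.0, 3.0]],
}

# One independent test per gene, as described in item b above.
results = {gene: one_way_anova_f(groups) for gene, groups in gene_values.items()}
for gene, (f_stat, df_b, df_w) in results.items():
    print(gene, round(f_stat, 3), df_b, df_w)
```

In this toy data gene_1 has well-separated group means and a large F, while gene_2 has identical group means and F equal to zero.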

V. Multiple Testing Corrections (MTC)

When testing a set of genes for statistical significance across various groups, some of the genes may be falsely considered as statistically significant. If 10,000 genes are tested for differential expression between groups, with a significance p-value cutoff of 0.05, then the expected number of genes identified as significant by chance alone, even if there is no true differential expression, is 500 genes:

Possible false positives = (# of genes) x (p-value cutoff)
10,000 x 0.05 = 500 genes

The purpose of a multiple testing correction is to keep the overall error rate/false positives to less than the user-specified p-value cutoff, even if thousands of genes are being analyzed.

a. Options

Test Type                        Type of Error Control    Genes identified by chance after MTC
Bonferroni                       Family-wise error rate   If testing 10,000 genes with a p-cutoff of 0.05, expect 0.05 genes to be significant by chance
Bonferroni step-down (Holm)      Family-wise error rate   Same as above
Westfall and Young Permutation   Family-wise error rate   Same as above
Benjamini and Hochberg           False Discovery Rate     If testing 10,000 genes with a p-cutoff of 0.05, the genes possibly identified by chance are 5% of the genes that passed the restriction (considered statistically significant)

b. Recommendations

The recommended correction for multiple testing is the Benjamini and Hochberg False Discovery Rate procedure. This procedure is the least stringent of all the methods mentioned above, but it provides a good balance between discovery of statistically significant differences in gene expression and protection against false positives (Type I error). The stringency of the MTC procedures mentioned increases as the number of genes being tested (genes on the selected gene list) increases.
The following example illustrates this situation:

If:
number of genes on gene list = 10,000
p-value cutoff = 0.05
p-value for Gene A without MTC = 0.000006

If the Bonferroni multiple testing correction was applied to this analysis, then the p-value for Gene A with MTC equals 0.06:

P-value with MTC = 10,000 x 0.000006 = 0.06

Because 0.06 is above the 0.05 cutoff, Gene A no longer passes once the correction is applied. It is therefore recommended that you perform statistical analysis on a list of genes that has been filtered for unreliable genes, since the multiple testing corrections are directly affected by the number of genes on your gene list. For a more comprehensive discussion on multiple testing, see the Multiple Testing Corrections Features Sheet, refer to the user manual, or attend our Statistics workshop.
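The arithmetic behind these corrections can be sketched in a few lines. The functions below use the standard textbook definitions of the Bonferroni, Holm step-down, and Benjamini-Hochberg adjusted p-values. The function names are hypothetical and the code is an illustration, not GeneSpring's implementation. The Westfall and Young permutation method is omitted because it requires resampling the underlying data.

```python
def expected_false_positives(n_genes, p_cutoff):
    """Genes expected to pass by chance alone when no correction is applied."""
    return n_genes * p_cutoff

def bonferroni(pvals):
    """Multiply every p-value by the number of tests (capped at 1)."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]

def holm(pvals):
    """Bonferroni step-down: the smallest p is multiplied by n, the next by n-1, ..."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (n - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

def benjamini_hochberg(pvals):
    """False discovery rate: each p is scaled by n / rank, then made monotone."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# The Gene A example: 10,000 genes, raw p-value 0.000006.
gene_list = [0.000006] + [0.5] * 9999
print(expected_false_positives(len(gene_list), 0.05))
print(bonferroni(gene_list)[0])

# A small hypothetical list makes the differences between methods visible.
pvals = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(pvals))
print(holm(pvals))
print(benjamini_hochberg(pvals))
```

On the small list, the Benjamini-Hochberg adjusted values are never larger than the Holm values, which in turn are never larger than the Bonferroni values. This matches the ordering of stringency described in the recommendations.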

VI. Post Hoc Tests

1-way ANOVA determines whether a gene is differentially expressed in any of the conditions tested. However, it does not indicate which specific group pair(s) are the ones where statistical differences occur. A Post Hoc test can be used in conjunction with ANOVA to determine which specific group pair(s) are statistically different from each other.

a. Options:

Test Name: Tukey
How it works: All means for each condition are ranked in order of magnitude; the group with the lowest mean gets a ranking of 1. The pairwise differences between means, starting with the largest mean compared to the smallest mean, are tabulated between each group pair and divided by the standard error. This value, q, is compared to a Studentized range critical value. If q is larger than the critical value, then the expression between that group pair is considered to be statistically different.

Test Name: Student-Newman-Keuls (SNK)
How it works: This test is similar to the Tukey test, except with regard to how the critical value is determined. All q's in Tukey's test are compared to the same critical value determined for that experiment, whereas each q in the SNK test is compared to a different critical value, one that depends on how many ranked means the pair spans. This makes the SNK test slightly less conservative than the Tukey test.

** There are nonparametric and parametric versions of the Tukey and Student-Newman-Keuls tests. GeneSpring will apply the correct option based on whether a parametric or nonparametric ANOVA test was chosen.

VII. Interpreting the Results

a. Results from 1-way ANOVA without Post Hoc test applied

Figure 2 below shows an example of a 1-way ANOVA result without a Post Hoc test applied. The Notes section indicates what setting was used for this analysis and the percentage of genes that could have been identified by chance. The genes in this gene list were found to have measurements considered statistically different across at least one group pair. You cannot tell which exact group was differentially expressed from this analysis.

Figure 2: 1-way ANOVA result
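As a rough illustration of the q statistic described in Section VI, the sketch below ranks group means and divides each pairwise difference by a standard error. It assumes equal group sizes and a pooled within-group variance, which is the textbook parametric form. The group names and values are hypothetical, and the comparison against Studentized range critical values is not included because it requires a table or a statistics library.

```python
import math
from itertools import combinations

def pairwise_q(groups):
    """Return {(lower_mean_group, higher_mean_group): q} for every group pair.

    Assumes equal group sizes and a pooled within-group variance.
    """
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()
    k = len(groups)
    df_within = k * n - k
    if df_within < 1:
        raise ValueError("replicates are required to estimate the standard error")

    means = {name: sum(values) / n for name, values in groups.items()}
    ss_within = sum(
        (x - means[name]) ** 2 for name, values in groups.items() for x in values
    )
    std_err = math.sqrt((ss_within / df_within) / n)
    if std_err == 0:
        raise ValueError("zero within-group variance; q is undefined")

    # Rank groups by mean so each pair is reported as (lower, higher).
    ranked = sorted(groups, key=lambda name: means[name])
    return {
        (low, high): (means[high] - means[low]) / std_err
        for low, high in combinations(ranked, 2)
    }

# Hypothetical log-ratio values for one gene in three groups.
groups = {
    "vehicle": [1.0, 2.0, 3.0],
    "drug_a": [2.0, 3.0, 4.0],
    "drug_b": [5.0, 6.0, 7.0],
}
q_values = pairwise_q(groups)
for pair, q in q_values.items():
    print(pair, round(q, 3))
```

In the Tukey test every q above would be compared to one critical value, while in the SNK test the pair spanning all three ranked means would face a larger critical value than the adjacent pairs.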

b. Results of 1-way ANOVA with Post Hoc test applied

1-way ANOVA with a Post Hoc test applied will return the window shown in Figure 2, and also the windows shown in Figures 3 and 4 below.

Figure 3: 1-way ANOVA with Post Hoc test, Summary by Gene tab

This window lists all the genes considered differentially expressed by statistical criteria. Groups with the highest color differential have the most significant difference. Groups with the same color show no statistical difference for that gene. A group colored grey is considered to be unknown because the significance of its mean difference cannot be determined with confidence from the test used.

Figure 4: 1-way ANOVA with Post Hoc test, Summary by Groups tab

This window indicates the total number of genes that are statistically differentially expressed between the groups being compared in the matrix. Greater color saturation indicates greater difference (or similarity). The total number of genes analyzed is shown in the box colored grey. A gene list can be generated from each box, or from a combination of boxes, by highlighting the appropriate boxes and selecting Make List of Union or Make List of Intersection.

VIII. Viewing P-values Generated

The associated p-values for the genes on this gene list can be viewed in GeneSpring using the following methods:

1. Gene List Inspector: Double-click on the selected gene list to open up the Gene List Inspector window. P-values are shown under the P-value columns.
2. Ordered List: Select the gene list and go to View > Ordered List. Genes are displayed according to p-values: smallest p-values are on the left-hand side, highest p-values are on the right-hand side.
3. Export out of GeneSpring: Highlight the gene list, go to Edit > Copy > Copy Annotated Gene List, and select to export out Gene List Associated Values.

IX. Most frequently asked questions and answers

Q. Why do I get an error message saying I have no degrees of freedom (such as the message shown below)?

A. This error message indicates that there are no replicates in the groups being compared. The degree of freedom is a mathematical way of representing the number of replicates/samples. Zero degrees of freedom indicates there are no replicates, and thus a 1-way ANOVA CANNOT be performed. If no replicates are available, but you would still like to perform a statistical analysis, then the Cross-Gene Error Model needs to be activated and the Parametric test, use all available error estimate must be used. If you do have replicates but get this error message, then check your parameter to ascertain that it was set up correctly to indicate which samples are considered replicates. GeneSpring will not automatically know which samples are replicates unless specified correctly in the Experiment Parameter window and selected in the Parameter to Test field.

Q. Why do I get zero genes passing the restriction when I perform statistical analysis?

A. There can be several explanations for this observation:
i. Analysis criteria might be too stringent (low p-value cutoff and conservative multiple testing correction).
ii. Not enough replicates in each group, resulting in insufficient power to detect real differences between groups under study.
iii. Biologically, there may not be differential gene expression.

X. References

Zar, J. (1999) Biostatistical Analysis (4th ed.). Upper Saddle River, NJ: Prentice Hall.