Implementing Multiple Comparisons on Pearson Chi-square Test for an R×C Contingency Table in SAS
Man Jin, Forest Research Institute; Binhuan Wang, New York University School of Medicine

ABSTRACT
This paper illustrates a permutation method for implementing multiple comparisons on Pearson's Chi-square test for an R×C contingency table, using the SAS procedure FREQ and a newly developed SAS macro called CHISQ_MC. This method is analogous to the Tukey-type multiple comparison method for one-way analysis of variance.

INTRODUCTION
A common statistical analysis for count data is to compare the frequency distributions of a multinomial outcome of interest for two or more groups. Assume that there are R groups and that the categorical outcome variable has C nominal categories. The data for this analysis can be displayed in a contingency table with R rows and C columns. In SAS, the analysis can be performed using the FREQ procedure with /CHISQ or /Pearson option. The output provides the result of Pearson's Chi-square test of the null hypothesis that the frequency distributions are the same across the R groups. If the Chi-square test results in a p-value smaller than some significance level, say 0.05, then there is evidence of a difference between at least two of the groups.

When only two groups are being compared (a 2×C table), it is easy to conclude that the frequency distributions differ between the two groups. However, when there are three or more groups, there is no follow-up test available in SAS to directly identify the significant differences between pairs of groups. In this paper, we propose a Tukey-type multiple comparison method for R×C tables based on permutation.
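As a minimal sketch of the overall test described above, the Pearson Chi-square test on grouped data could be requested as follows. The data set name melanoma and the variable names type, site and count are taken from the macro call shown later in this paper; the layout (one row per group-by-outcome cell) is an assumption.

```sas
/* Overall Pearson Chi-square test on grouped (count) data.            */
/* Assumes a data set MELANOMA with one row per Type-by-Site cell and  */
/* the cell frequency stored in COUNT.                                 */
proc freq data=melanoma;
   tables type*site / chisq;   /* CHISQ requests the Pearson Chi-square test */
   weight count;               /* each row represents COUNT subjects         */
run;
```

A small p-value from this step only says that at least two groups differ; it does not say which pairs.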
The method is analogous to the Tukey-type multiple comparison method for one-way analysis of variance (ANOVA; Tukey, 1953, available through the TUKEY option in PROC GLM), and to methods for R×2 contingency tables (Elliott and Reisch, 2006, SAS macro CompProp; Brown and Fears, 1981, option FISHER in PROC MULTTEST).

As pointed out by the SAS Usage Note "Performing multiple comparisons or tests on subtables following a significant Pearson chi-square", there are no multiple comparison methods directly available in PROC FREQ, but there are a number of approaches to consider, including the following three. First, correspondence analysis is often used as a way to visualize the non-independence in a table. However, no statistical tests of comparisons are performed. Second, one can use the WHERE statement to conduct Pearson's Chi-square test on each pair of groups and then use PROC MULTTEST to adjust the pairwise p-values. However, there is no SAS macro that implements these steps directly, and the available adjustments are less powerful than the Tukey-type adjustment. Third, multiple comparisons can be done by fitting a Poisson loglinear model with PROC GENMOD and then constructing custom CONTRAST statements to test pairwise comparisons. However, this method is model based and is less powerful than the Tukey-type adjustment.

METHOD
To illustrate the newly proposed method, we consider the same example used in the aforementioned SAS Usage Note. The Melanoma data are printed in Table 1, where Type is the group variable (R=4) and Site is the outcome variable (C=3). The significant Chi-square test, produced by the CHISQ option in PROC FREQ, indicates lack of independence between TYPE and SITE.

First, we develop a macro called PAIRWISE_CHISQ, which is used as a subroutine in the main macro CHISQ_MC, to calculate Chi-square test statistics and p-values for all pairs of groups. Because there are 4 groups, there are 4×3/2 = 6 pairs and consequently 6 comparisons.
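To make one of these six comparisons concrete, the sketch below tests a single pair of groups with a WHERE statement and then adjusts a set of raw p-values with PROC MULTTEST, in the spirit of the second approach from the usage note. It assumes the melanoma data set with variables type, site and count; the data set pairwise_p and its variable p_value are hypothetical names for a table holding one raw p-value per pair, and are not part of the paper.

```sas
/* Pearson Chi-square test restricted to one pair of groups (a 2 x C subtable) */
proc freq data=melanoma;
   where type in ("Nodular", "Superficial");
   tables type*site / chisq;
   weight count;
run;

/* Adjust a collection of raw pairwise p-values for multiplicity.          */
/* PAIRWISE_P and P_VALUE are hypothetical names used for illustration.    */
proc multtest inpvalues(p_value)=pairwise_p bonferroni holm;
run;
```

Repeating the first step by hand for every pair is what the PAIRWISE_CHISQ macro automates.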
The output of PAIRWISE_CHISQ is shown in Table 2, where the p-values are unadjusted. The most conservative multiple comparison method is the Bonferroni correction (e.g., Abdi, 2007), which uses the significance level 0.05/6 ≈ 0.0083 instead of 0.05 to claim significance, leading to a familywise error rate (FWER) of at most α = 0.05.

Obs  Type           site          Count
  1  Hutchinson's   Head             22
  2  Hutchinson's   Trunk             2
  3  Hutchinson's   Extremities      10
  4  Superficial    Head             16
  5  Superficial    Trunk            54
  6  Superficial    Extremities
  7  Nodular        Head             19
  8  Nodular        Trunk            33
  9  Nodular        Extremities
 10  Indeterminate  Head
 11  Indeterminate  Trunk
 12  Indeterminate  Extremities      28

Table 1: Melanoma Data. Overall Chi-square Test Statistic=65.81 (df=6; p<0.0001).

Obs  PAIR_COMPARED                  CHI_STAT  DF  P_VALUE
  1  Hutchinson's vs Indeterminate
  2  Hutchinson's vs Nodular
  3  Hutchinson's vs Superficial
  4  Indeterminate vs Nodular
  5  Indeterminate vs Superficial
  6  Nodular vs Superficial

Table 2: Output of Macro PAIRWISE_CHISQ. There are 6 Comparisons.

To describe the Tukey-type adjustment, denote the pairwise Chi-square test statistics as $\chi^2_1, \ldots, \chi^2_m$, where m is the number of pairs. Under the null hypothesis that there is no difference across groups, each $\chi^2_i$ approximately follows a Chi-square distribution with C-1 degrees of freedom. Motivated by the idea of Tukey's honest significant difference (HSD) test, we consider the distribution of $Q = \max_{1 \le i \le m} \chi^2_i$ under the null hypothesis. Denote the upper α×100% percentile of the null distribution of Q as $q_\alpha$ and use it as the critical value; the familywise error rate is then controlled at α. In ANOVA, the individual test statistic is a t-test statistic and Q follows the studentized range distribution under the null hypothesis. In the current case, however, the individual test statistic is a Chi-square test statistic and the null distribution of Q is unknown.

Here we develop a permutation procedure to approximate the null distribution of Q. After we ungroup the grouped data Melanoma in Table 1, we randomly permute the group assignment. For each random permutation j, we execute the macro PAIRWISE_CHISQ, obtain the corresponding pairwise Chi-square test statistics as in Table 2, and calculate the maximum, say $Q_j$, of these Chi-square test statistics.
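A single permutation of the kind just described might look like the sketch below. It follows the same ungroup-and-shuffle idea but is not the authors' macro code: it assumes the melanoma data set (variables type, site, count), uses RAND('uniform') with CALL STREAMINIT rather than the RANUNI function used in the macro listing, and the data set names ungrouped, group_key, permuted and permuted_counts are illustrative.

```sas
/* Expand the grouped counts to one row per subject */
data ungrouped;
   set melanoma;
   do i = 1 to count;
      output;
   end;
   keep type site;
run;

/* Attach a random sort key to the group labels only */
data group_key;
   if _n_ = 1 then call streaminit(100);   /* seed chosen arbitrarily */
   set ungrouped(keep=type);
   u = rand('uniform');
run;

/* Sorting by the random key shuffles the group labels */
proc sort data=group_key;
   by u;
run;

/* One-to-one merge (no BY statement): shuffled groups meet the original outcomes */
data permuted;
   merge group_key(drop=u) ungrouped(keep=site);
run;

/* Regroup the permuted data into cell counts (variable COUNT) */
proc freq data=permuted noprint;
   tables type*site / out=permuted_counts;
run;
```

Because only the group labels are shuffled, the row and column totals of the table stay fixed, which is what the null hypothesis of no group difference implies.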
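If we repeat such random permutations many times, say J = 1000, we can approximate the p-value for $\chi^2_i$ by $\frac{1}{J}\sum_{j=1}^{J} I(Q_j > \chi^2_i)$, where $\chi^2_i$ is the observed Chi-square statistic for pair i, $Q_j$ is the maximum pairwise Chi-square statistic from permutation j, and I(·) is the indicator function. We call such p-values Tukey-type adjusted p-values.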
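The proportion above can be sketched with PROC SQL as an alternative to the PROC IML step in the macro listing later in this paper. The sketch assumes that the data sets ChiTest_original (one row per pair, with pair_compared and chi_stat) and ChiMax_all (one row per permutation, with chi_max) are present in the WORK library, as they would be after the macro has run; the output name p_tukey_sketch is hypothetical.

```sas
/* For each observed pairwise statistic, compute the share of permutation */
/* maxima that exceed it. In SAS SQL a comparison evaluates to 1 or 0,    */
/* so MEAN() of the comparison is the required proportion.                */
proc sql;
   create table p_tukey_sketch as
   select o.pair_compared,
          o.chi_stat,
          mean(m.chi_max > o.chi_stat) as p_tukey
   from ChiTest_original as o, ChiMax_all as m
   group by o.pair_compared, o.chi_stat;
quit;
```

The cross join pairs every observed statistic with every permutation maximum, so the table grows as (number of pairs) × J rows before aggregation; for the sizes discussed here that is small.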
These steps are written into our main macro CHISQ_MC. This macro has 6 arguments: (1) grouped dataset; (2) group variable; (3) outcome variable; (4) frequency variable; (5) number of permutations; and (6) seed for random permutations. Table 3 is obtained by running:

%Chisq_MC(data=melanoma, group=type, outcome=site, count=count, num_perm=5000, seed=100);

In output such as Table 3, the original pairwise p-values are shown in the 5th column. The Bonferroni-adjusted p-values, which are the original p-values multiplied by the number of pairs (and truncated at 1), are shown in the 6th column. The Tukey-type adjusted p-values, which are based on random permutation, are shown in the last column. The seed argument is included in the macro for reproducibility. Both the Bonferroni-adjusted p-values and the Tukey-type adjusted p-values can be compared with some α, say α = 0.05, to control the FWER at α.

Obs  PAIR_COMPARED                  CHI_STAT  DF  P_VALUE  P_BONFERRONI  P_TUKEY
  1  Hutchinson's vs Indeterminate
  2  Hutchinson's vs Nodular
  3  Hutchinson's vs Superficial
  4  Indeterminate vs Nodular
  5  Indeterminate vs Superficial
  6  Nodular vs Superficial

Table 3: Final Output of Macro Chisq_MC. The Last Column is Based on 5,000 Permutations.

MACROS
The code for the macro PAIRWISE_CHISQ:

%macro Pairwise_Chisq(data=, group=, outcome=, count=);
   /* list the distinct group levels and give each a numeric ID */
   proc freq data=&data noprint;
      table &group / out=out_group;
   run;
   data groupid;
      set out_group;
      GID=_n_;
   run;
   /* store the number of groups in macro variable num_group */
   data _null_;
      set groupid nobs=nobs;
      call symput('num_group', compress(put(nobs, 11.)));
      stop;
   run;
   /* empty data set that accumulates the pairwise Chi-square statistics */
   data ChiTest;
      stop;
   run;
   /* for each pair of groups, obtain the Chi-square statistic */
   %Do group1=1 %to &num_group-1;
      %Do group2=&group1+1 %to &num_group;
         /* look up the labels of the two groups being compared */
         data group_pair;
            set groupid;
            if GID EQ &group1 then call symput('name_group1', &group);
            if GID EQ &group2 then call symput('name_group2', &group);
         run;
         proc freq data=&data noprint;
            output out=ctout PCHI;
            tables &group*&outcome / chisq cellchi2 nopercent;
            where &group in ("&name_group1", "&name_group2");
            weight &count;
         run;
         /* label the result with the pair of groups compared */
         data CTout1;
            set CTout;
            PAIR_COMPARED=trim("&name_group1") || " vs " || "&name_group2";
         run;
         data ChiTest;
            set ChiTest CTout1;
         run;
      %End;
   %End;
   /* rename variables for final results */
   data ChiTest;
      set ChiTest;
      CHI_STAT=_PCHI_;
      DF=DF_PCHI;
      P_VALUE=P_PCHI;
      keep pair_compared Chi_stat DF p_value;
   run;
%mend Pairwise_Chisq;

The code for macro CHISQ_MC:

%macro Chisq_MC(data=, group=, outcome=, count=, num_perm=, seed=);
   options nosource nonotes;
   /* convert grouped data to ungrouped data */
   data data_ungrouped;
      set &data;
      keep &group &outcome;
      do i=1 to &count;
         output;
      end;
   run;
   /* execute submacro pairwise_chisq on the original data */
   %pairwise_chisq(data=&data, group=&group, outcome=&outcome, count=&count);
   /* save the pairwise Chi-square tests on the original data */
   data ChiTest_original;
      set ChiTest;
   run;
   /* empty data set that stores the max Chi-square statistic from each permutation */
   data ChiMax_all;
      stop;
   run;
   /* repeat permutations */
   %Do perm=1 %to &num_perm;
      /* start to permute the group assignment */
      data outcome_var;
         set data_ungrouped;
         keep &outcome;
      run;
      data group_var;
         set data_ungrouped;
         random_number=ranuni(&seed);
         call symput('seed', &seed+1);   /* use a new seed for the next permutation */
         keep &group random_number;
      run;
      /* randomly permute the group assignment */
      proc sort data=group_var;
         by random_number;
      run;
      /* obtain a randomly permuted data set */
      data data_permuted;
         merge group_var outcome_var;
      run;
      /* group the ungrouped permuted data */
      proc freq data=data_permuted noprint;
         table &group*&outcome / out=data_permuted_grouped;
      run;
      /* execute the submacro pairwise_chisq on the permuted data */
      %pairwise_chisq(data=data_permuted_grouped, group=&group, outcome=&outcome, count=count);
      /* save the pairwise Chi-square statistics on the permuted data */
      data ChiTest_perm;
         set ChiTest;
      run;
      /* calculate the max of the pairwise Chi-square statistics on the permuted data */
      proc iml;
         use ChiTest_perm;
         read all;
         Chi_max=max(Chi_stat);
         create ChiMax_perm var {Chi_max};
         append;
      quit;
      data ChiMax_all;
         set ChiMax_all ChiMax_perm;
      run;
   %End; /* end of all permutations */
   proc iml;
      use ChiTest_original;
      read all;
      use ChiMax_all;
      read all;
      n_pair=nrow(p_value); /* number of pairs compared */
      /* initial values for Bonferroni and Tukey adjusted p-values */
      p_bonferroni=p_value;
      p_tukey=p_value;
      /* for each pair calculate adjusted p-values */
      do pair=1 to n_pair;
         p_bonferroni[pair]=min(p_value[pair]*n_pair,1);
         p_tukey[pair]=sum(chi_max>chi_stat[pair])/nrow(chi_max);
      end;
      create p_adjusted var{p_bonferroni p_tukey};
      append;
   quit;
   /* THIS IS FINAL OUTPUT */
   data output_final;
      merge ChiTest_original p_adjusted;
   run;
   /* PRINT FINAL OUTPUT */
   proc print data=output_final;
      title "RESULTS of MACRO Chisq_MC";
   run;
   options source notes;
%mend Chisq_MC;

CONCLUSION
The proposed multiple comparison method for an R×C contingency table analysis provides a post hoc test when the overall Chi-square test is significant. The proposed macro CHISQ_MC makes the interpretation of results easier and clearer. The proposed method can also be applied to arbitrary comparisons other than pairwise ones, and to test statistics other than the Chi-square statistic.

REFERENCES
Abdi, H. (2007). Bonferroni and Šidák corrections for multiple comparisons. In N.J. Salkind (ed.), Encyclopedia of Measurement and Statistics. Thousand Oaks, CA: Sage.
Brown, C.C. and Fears, T.R. (1981). Exact significance levels for multiple binomial testing with application to carcinogenicity screens. Biometrics 37:
Elliott, A.C. and Reisch, J.S. (2006). Implementing a multiple comparison test for proportions in a 2xC crosstabulation in SAS. Proceedings of the SAS User's Group International 31, San Francisco, CA, March 26-29, Paper
Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports 50:
Šidák, Z. (1967). Rectangular confidence regions for the means of multivariate normal distributions.
Journal of the American Statistical Association 62:
Tukey, J.W. (1953). The problem of multiple comparisons. Dittoed manuscript of 396 pages, Department of Statistics, Princeton University.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Man Jin
Forest Research Institute
Harborside Financial Center, PLAZA 5
Jersey City, NJ
Phone: (201)
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
An introduction to using Microsoft Excel for quantitative data analysis
Contents An introduction to using Microsoft Excel for quantitative data analysis 1 Introduction... 1 2 Why use Excel?... 2 3 Quantitative data analysis tools in Excel... 3 4 Entering your data... 6 5 Preparing
Paper PO06. Randomization in Clinical Trial Studies
Paper PO06 Randomization in Clinical Trial Studies David Shen, WCI, Inc. Zaizai Lu, AstraZeneca Pharmaceuticals ABSTRACT Randomization is of central importance in clinical trials. It prevents selection
Research Methods & Experimental Design
Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and
Final Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.
1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis
ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups
ANOVA ANOVA Analysis of Variance Chapter 6 A procedure for comparing more than two groups independent variable: smoking status non-smoking one pack a day > two packs a day dependent variable: number of
Simulating Chi-Square Test Using Excel
Simulating Chi-Square Test Using Excel Leslie Chandrakantha John Jay College of Criminal Justice of CUNY Mathematics and Computer Science Department 524 West 59 th Street, New York, NY 10019 [email protected]
Friedman's Two-way Analysis of Variance by Ranks -- Analysis of k-within-group Data with a Quantitative Response Variable
Friedman's Two-way Analysis of Variance by Ranks -- Analysis of k-within-group Data with a Quantitative Response Variable Application: This statistic has two applications that can appear very different,
XPost: Excel Workbooks for the Post-estimation Interpretation of Regression Models for Categorical Dependent Variables
XPost: Excel Workbooks for the Post-estimation Interpretation of Regression Models for Categorical Dependent Variables Contents Simon Cheng [email protected] php.indiana.edu/~hscheng/ J. Scott Long [email protected]
Categorical Variables
Categorical Variables Variables which record a response as a set of categories are termed categorical. Such variables fall into three classifications: Nominal, Ordinal, and Interval. Nominal variables
Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) 90. 35 (d) 20 (e) 25 (f) 80. Totals/Marginal 98 37 35 170
Work Sheet 2: Calculating a Chi Square Table 1: Substance Abuse Level by ation Total/Marginal 63 (a) 17 (b) 10 (c) 90 35 (d) 20 (e) 25 (f) 80 Totals/Marginal 98 37 35 170 Step 1: Label Your Table. Label
Statistical Functions in Excel
Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.
Introduction to Analysis of Variance (ANOVA) Limitations of the t-test
Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only
Salary. Cumulative Frequency
HW01 Answering the Right Question with the Right PROC Carrie Mariner, Afton-Royal Training & Consulting, Richmond, VA ABSTRACT When your boss comes to you and says "I need this report by tomorrow!" do
Applying Customer Attitudinal Segmentation to Improve Marketing Campaigns Wenhong Wang, Deluxe Corporation Mark Antiel, Deluxe Corporation
Applying Customer Attitudinal Segmentation to Improve Marketing Campaigns Wenhong Wang, Deluxe Corporation Mark Antiel, Deluxe Corporation ABSTRACT Customer segmentation is fundamental for successful marketing
Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)
1 Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) I. Authors should report effect sizes in the manuscript and tables when reporting
