# Analysing Questionnaires using Minitab (for SPSS queries contact -)

Size: px
Start display at page:

Transcription

1 Analysing Questionnaires using Minitab (for SPSS queries contact -) Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections: Questions to identify definite groups that each subject falls into, e.g. male/female, age range, educational background, etc Questions to describe respondents initial/background opinions, knowledge, e.g. How often do you watch TV shows like CSI less than once a month, once a month, once a week, more than once a week? Questions related to specific issues to be assessed, e.g. How strong do you think this evidence is? Are these two fingerprints a match? As a first example, we have the following extract from a data set Grouping Background Responses Subject G1 G B1 B B3 R1 R R3 D1 D 1 M F M F Levels: M/F /1 0/1 Data Set 1 In this data set there are 100 records corresponding to 100 respondents or subjects. Factual data to group the respondents: G1 is nominal data that could describe a binary group, e.g male/female G is ordinal data that could describe a progressive group, e.g. age range Background questions to describe initial opinions, knowledge: B1 (1 or ), B (1 to 3), B3 (1 to 3) Response questions: R1, R scale data on Likert scales 1 to 4 R3 scale data on Likert scale 1 to 10 D1, D logical/digital data (D1, D are actually derived from values of R1, R greater than or less than.5) A general, but very important point, is that the number of conclusions that can be drawn and the power of any tests depend critically on the amount of data collected. It is important to get as many responses as possible. Designing a Likert scale response question Many questions in a questionnaire invite the respondent to choose a response on a symmetrical Likert scale.

3 1. Using boxplots to explore raw data Boxplots are an excellent way to explore your own data and to present raw data in your report. 10 Boxplot of R3 In the example, the boxplot of response R3, grouped separately for M and F responses to G1, suggests that there is no difference in the median value, but is there a difference in the spread or variance? Is the outlier significant? R F G1 M. Using histograms to check data distributions A histogram can be useful when you have several possible response levels to a question. It shows how the replies are distributed. The example here shows a uni-modal response, with one main peak. Frequency Histogram of R3 Sometimes you may see a bi-modal response with two different groups giving two different responses with two clear peaks R Is there a difference in the average responses, R1, R, R3 for the two values each of G1, B1? Using t-test R1 and R3 shows significant differences in the mean responses for the two B1 groups with p = and p = respectively. Mann-Whitney is preferable for data without normal distribution (Minitab needs data in two columns) giving p = and p = for the same tests as the t-tests above. 4. Is there a difference in the average responses R1, R, R3 for the three values each of G, B, B3 etc? Using One-Way ANOVA to test whether the mean of R1 changes for different values of G. Source DF SS MS F P G Error Total Individual 95% CIs For Mean Based on Pooled StDev Level N Mean StDev ( * ) ( * ) ( * ) Pooled StDev = Mean of R1 is different for levels 1 and from G. Using Kruskal-Wallis also shows differences in R1 for different levels of G, with p = 0.017

4 5. Are there any interactions between possible factors, i.e. does one factor have a different effect, depending on the value of another factor? Using GLM to test whether R1 is affect by G, B1 or an interaction between them Analysis of Variance for R1, using Adjusted SS for Tests Source DF Seq SS Adj SS Adj MS F P G B G*B Error Total These results give R1 dependent on B1 but not now significant for G (compare with 1-way ANOVA above) and shows no significance for any interaction between them. Note: If you try to include too many factors or interactions, the available data is spread out and reduces the power of the individual analyses making a not significant result more likely. 6. Is there a difference in the variances of, R1, R, R3 for the two values each of G1, B1? For example using the variances test for R3 for the two groups in G1 Method DF1 DF Statistic P-Value F Test (normal) Levene's Test (any continuous) which is not significant for the F-test, but Levene s test suggests that there is a difference in the variance of F responses compared to those of M (see boxplot in 1. above) 7. Is there a difference in the proportion of 1 responses between D1 and D? Using the proportions test: Fisher's exact test: P-Value = 0.66 There is no significant difference - see 9. below for the measure of agreement. 8. Do the values of one answer change in the same way as those of another answer? For example: You might have good reason, from prior knowledge, to believe that the values of R1 change in a similar way to the values of the respondents group B1. A test for correlation between R1 and B1 produces significant correlation with p = However be aware that correlation is not the same as cause and effect. It is possible that B1 does not directly influence R1 (or vice versa), but both might be influenced by a third factor (see next example below). Alternatively, without any prior knowledge you might look randomly for any possible correlations between all of the answer. A test for bivariate correlations between all of the answers gives: Correlations: G, B1, B, B3, R1, R, R3 G B1 B B3 R1 R B B B R R R Cell Contents: Pearson correlation P-Value

5 In the above table the p-value is the lower value in the pair of values given for each combination of variables check that the p-value for R1 and B1 is again given as Bonferroni correction: You must be very careful with the above data because the chance of seeing a false significance (i.e. p < 0.05 just by chance) has increased considerably because you have calculated many (n = 1) p- values. Where you are randomly looking for possible significant results amongst several p-values, the Bonferroni correction gives a new 95% confidence critical value: Critical value = 0.05/n where n is the number of p-values calculated. In this case the critical value would be 0.05/1 ~ B1 and B3, R1 and B3, R1 and R show significant correlation, but not R1 and B1! It is often possible that two variables might show correlation just because they are both correlated with the same third variable: e.g. R1 is correlated with B3 and B3 is correlated with B1, so it is likely that R1 and B1 show some correlation (see previous example above). SPSS can perform partial correlations where is possible to compensate for a third correlated variable when calculating the correlation between two variables. 9. How good is the match, or agreement, between D1 and D A test for correlation between D1 and D gives p < , but it is sometimes important to know how strong the agreement is between D1 and D. For example, D1 and D might be two assessments of the match of a fingerprint using different assessment protocols. Using Attribute Agreement Analysis for D1, D Between Appraisers Assessment Agreement # Inspected # Matched Percent 95% CI (77.63, 9.13) Cohen's Kappa Statistics Response Kappa SE Kappa Z P(vs > 0) K > 0.7 suggests a good match There is a variety of other measures available to measure of the strength of association between questionnaire answers, which are grouped in the following table by the types of variables involved: nominal, ordinal and interval. They are also divided into two types: symmetrical and directional. In a directional (or asymmetric) association we measure how a knowledge of one (independent) variable can be used to predict the variation of the other (dependent) variable. In a symmetric association, there is no sense of direction and we measure only the extent to which the two variables vary in similar ways. For further information on these statistics, contact

6 Variable pairs Symmetric measures Directional measures Nominal / nominal Phi, φ Lambda, λ Cramer s V Kappa, κ Ordinal / ordinal Gamma, Γ Somers d Kendall s tau-b, τ Spearman s rho, ρ Coefficient of concordance Interval / interval Pearson s coefficient, r Linear regression Nominal / interval Eta, η Measures of association between questionnaire answers 10. Is the distribution of answers to one question related to (associated with) the way in which subjects answer another question? For example: Is the choice that subjects make for D1 related to their answers to B3? Using Cross Tabulation to count the numbers of respondents who fall into the 6 categories defined by levels of D1 multiplied by 3 levels of B3, together with a chi-squared test for association: Tabulated statistics: D1, B3 Rows: D1 Columns: B3 1 3 All All Cell Contents: Count Expected count Pearson Chi-Square = 8.199, DF =, P-Value = Likelihood Ratio Chi-Square = 8.4, DF =, P-Value = This shows that subjects with B3=1 are more likely to choose D1=1 than those with B3= Is an association between two answers dependent on the level of a third (or 4 th ) answer? For example: Is an association between answers (as for D1 and B3 above) dependent on the group (e.g. G) of the subject? Using cross tabulations and chi-squared, layered by group G: Tabulated statistics: D1, B3, G Results for G = 1 Rows: D1 Columns: B3 1 3 All All Cell Contents: Count Expected count Pearson Chi-Square = 0.311, DF =, P-Value = Likelihood Ratio Chi-Square = 0.311, DF =, P-Value = * NOTE * cells with expected counts less than 5

7 Results for G = Rows: D1 Columns: B3 1 3 All All Cell Contents: Count Expected count Pearson Chi-Square =.683, DF =, P-Value = 0.61 Likelihood Ratio Chi-Square =.588, DF =, P-Value = 0.74 * NOTE * 3 cells with expected counts less than 5 Results for G = 3 Rows: D1 Columns: B3 1 3 All All Cell Contents: Count Expected count Pearson Chi-Square = 7.778, DF =, P-Value = 0.00 Likelihood Ratio Chi-Square = 9.96, DF =, P-Value = * NOTE * 5 cells with expected counts less than 5 It is only group G = 3 that shows a significant association (using likelihood chi-squared) between D1 and B3 with the p-value less than 0.05/3 (using Bonferroni correction for 3 derived p-values see Bonferroni correction below) Note that as you try to get more information, the data is spread more thinly and a number of expected counts fall below 5. You either need to restrict the categories that you create by the different levels in your answers, or you need to get more responses to the questionnaire. 1. What is the difference between finding an association using chi-squared or a correlation between different answers? In most cases the choice of analysis is quite clear: Correlation is used for testing for a linear relationship between two x-y variables, and cannot be used with nominal data. Chi-squared is used for testing for differences in the spread of numbers of data values (frequencies) in different categories that can be described by nominal data. However, it is also useful to illustrate the differences by considering a situation where it could be possible to use either correlation or chi-squared analysis Two questions, Q1 and Q, both have three possible answer levels, 1,, 3, and counting the results of 76 responses might give the two possible sets of results as Data A and Data B in the table below: (For example the number 8 in the top middle of Data A shows that 8 respondents recorded 3 for Q and for Q1)

8 Q / Q1 1 3 Q / Q1 1 3 Data A Data B Correlation: p < p = 0.13 Chi-squared: p = p = The only difference between data sets A and B is that the responses, and 3 for Q1 have been reversed. If we perform a correlation analysis for Data A, we get p < , showing a high degree of correlation. We can see this in the data; a respondent who gives a high value for Q1 is also likely to give a high value for Q and vice versa. We can also imagine fitting a best fit straight line going diagonally from bottom left to top right through the most data points. However, in Data B, the cells with the most data points no longer follow a straight line; for example, a respondent who gives a 3 for Q1 is more likely to give a than a 3 for Q. If we perform a correlation analysis for Data B, we get p = 0.13, which says that there is not enough evidence to claim that there is significant correlation. This agrees with our visual picture. If we perform a chi-squared test we get the same significant differences in distribution with p = for both data sets. The chi-squared test treats the answers as different nominal categories without any specific order, and consequently a change of order makes no difference to the result. Chi-squared just tests for a difference in distribution of frequency values, whereas correlation looks specifically for a continuing change from one level to another. 13. Is it possible to model relationships between the different variables? Use Stepwise regression to see if it is possible to predict values of R1 from the data in B1, B, B3 Stepwise Regression: R1 versus G1, B1, B, B3 Response is R1 on 3 predictors, with N = 100 Step 1 Constant B T-Value P-Value B T-Value 1.87 P-Value Which gives an equation for R1: R1 = *B *B1 R1 depends on both B3 and B1 in agreement with correlations.

9 Use Stepwise regression to see if it is possible to predict values of R3 as a function of other variables. Stepwise Regression: R3 versus G1, B1, B, B3, R1, R Response is R3 on 6 predictors, with N = 100 Step 1 Constant B T-Value P-Value B T-Value.83 P-Value Which gives an equation for R3: R3 = *B *B1 R3 depends on both B and B1 in agreement with correlations. 14 Is there any clustering of answers or subjects hidden within our data? We can use another data set to illustrate this final set of analytical techniques: Subject G1 G B1 R1 R R3 R4 1 M B F B M C F A Levels: M/F A/B/C 0/1 1 to 5 0 to 10 0 to 10 0 to 10 Data Set Data set is a set of 0 responses, with two grouping questions, one background question and four response questions with the possible answer levels shown. This data set has been created to illustrate the use of cluster and principal component analysis. 14a Is there are similarities between the answers to different questions? The dendrogram in the diagram is based on correlation and shows the similarity between different variables ( questionnaire answers) Dendrogram Average Linkage, Correlation Coefficient Distance R3 and R4 are very similar to each other and show some similarity to R1. There is very little similarity with the other variables. Similarity R3, R4 and R1 form a variables cluster G3 R3 R1 Variables R R4

10 14b Is there any clustering of subjects showing similar sets of answers? The dendrogram in the diagram is based on shows the similarity between different subjects responding to the questionnaire Dendrogram Average Linkage, Euclidean Distance The clustering is not strong, but does show a weak cluster with subjects, 1, 1, 16, 17, 18 and 19. Similarity We look for a long tail leading to several branches near the baseline Observations c Can we model the clustering of subjects? Using Principal Component Analysis and Factor Analysis it is possible to describe the multiple responses using just two main components or factors. Every subject is then plotted on a two-dimensional plot using these two components or factors: Score Plot of R1,..., R4 Score Plot of R1,..., R4 Second Component Second Factor First Component First Factor Principal Component Analysis Factor Analysis with Quartimax rotation Both diagrams show our cluster of subjects, 1, 1, 16, 17, 18 and 19, grouped together in the top right hand quadrants. 14d Using Cluster analysis techniques It is important to realise that: The cluster analysis techniques described above are typically used to explore relationships, looking for hidden groupings within large amounts of data. This can then help the planning of future research. These techniques work best with large amounts of data. Note that in our example, there are responses over the full 1-5 and 0-10 ranges which gives the analysis enough information to be able to distinguish some groupings. Although it is probable that cluster analysis will not be particularly appropriate for your project, it might be useful for you to demonstrate that you are aware of the technique even if you do not end up with startling discoveries!

### SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

### The Statistics Tutor s Quick Guide to

statstutor community project encouraging academics to share statistics support resources All stcp resources are released under a Creative Commons licence The Statistics Tutor s Quick Guide to Stcp-marshallowen-7

### An introduction to IBM SPSS Statistics

An introduction to IBM SPSS Statistics Contents 1 Introduction... 1 2 Entering your data... 2 3 Preparing your data for analysis... 10 4 Exploring your data: univariate analysis... 14 5 Generating descriptive

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### SPSS Explore procedure

SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### January 26, 2009 The Faculty Center for Teaching and Learning

THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i

### Statistics for Sports Medicine

Statistics for Sports Medicine Suzanne Hecht, MD University of Minnesota (suzanne.hecht@gmail.com) Fellow s Research Conference July 2012: Philadelphia GOALS Try not to bore you to death!! Try to teach

### SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

### SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011

SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011 Statistical techniques to be covered Explore relationships among variables Correlation Regression/Multiple regression Logistic regression Factor analysis

### Data Analysis for Marketing Research - Using SPSS

North South University, School of Business MKT 63 Marketing Research Instructor: Mahmood Hussain, PhD Data Analysis for Marketing Research - Using SPSS Introduction In this part of the class, we will learn

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

### We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Do

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### Data Analysis, Research Study Design and the IRB

Minding the p-values p and Quartiles: Data Analysis, Research Study Design and the IRB Don Allensworth-Davies, MSc Research Manager, Data Coordinating Center Boston University School of Public Health IRB

### SPSS Guide How-to, Tips, Tricks & Statistical Techniques

SPSS Guide How-to, Tips, Tricks & Statistical Techniques Support for the course Research Methodology for IB Also useful for your BSc or MSc thesis March 2014 Dr. Marijke Leliveld Jacob Wiebenga, MSc CONTENT

### Nonparametric Statistics

Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics

### UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

### TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002

### Association Between Variables

Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

### Linear Models in STATA and ANOVA

Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### Data analysis process

Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis

### Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R.

ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R. 1. Motivation. Likert items are used to measure respondents attitudes to a particular question or statement. One must recall

### Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### Mathematics within the Psychology Curriculum

Mathematics within the Psychology Curriculum Statistical Theory and Data Handling Statistical theory and data handling as studied on the GCSE Mathematics syllabus You may have learnt about statistics and

### Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### Descriptive Analysis

Research Methods William G. Zikmund Basic Data Analysis: Descriptive Statistics Descriptive Analysis The transformation of raw data into a form that will make them easy to understand and interpret; rearranging,

### Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

### Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

### Comparing Means in Two Populations

Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### THE KRUSKAL WALLLIS TEST

THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON

### Simple Linear Regression, Scatterplots, and Bivariate Correlation

1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

### Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

### Using SPSS, Chapter 2: Descriptive Statistics

1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

### CHAPTER 14 ORDINAL MEASURES OF CORRELATION: SPEARMAN'S RHO AND GAMMA

CHAPTER 14 ORDINAL MEASURES OF CORRELATION: SPEARMAN'S RHO AND GAMMA Chapter 13 introduced the concept of correlation statistics and explained the use of Pearson's Correlation Coefficient when working

### QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

### Chapter 12. Data Collection Measurements and Analysis

Chapter 12 186 Data Collection Measurements and Analysis In planning the collection of data it is necessary first to carefully consider what information is needed to answer the research question. A list

### 2 Sample t-test (unequal sample sizes and unequal variances)

Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

### MULTIPLE REGRESSION WITH CATEGORICAL DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

### SPSS TUTORIAL & EXERCISE BOOK

UNIVERSITY OF MISKOLC Faculty of Economics Institute of Business Information and Methods Department of Business Statistics and Economic Forecasting PETRA PETROVICS SPSS TUTORIAL & EXERCISE BOOK FOR BUSINESS

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### SPSS Modules Features Statistics Premium

SPSS Modules Features Statistics Premium Core System Functionality (included in every license) Data access and management Data Prep features: Define Variable properties tool; copy data properties tool,

### Post-hoc comparisons & two-way analysis of variance. Two-way ANOVA, II. Post-hoc testing for main effects. Post-hoc testing 9.

Two-way ANOVA, II Post-hoc comparisons & two-way analysis of variance 9.7 4/9/4 Post-hoc testing As before, you can perform post-hoc tests whenever there s a significant F But don t bother if it s a main

### Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

### Categorical Data Analysis

Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods

### One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

### Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

### Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

The t-test Outline Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test - Dependent (related) groups t-test - Independent (unrelated) groups t-test Comparing means Correlation

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

### 4. Multiple Regression in Practice

30 Multiple Regression in Practice 4. Multiple Regression in Practice The preceding chapters have helped define the broad principles on which regression analysis is based. What features one should look

### Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

1 Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) I. Authors should report effect sizes in the manuscript and tables when reporting

### The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### An introduction to using Microsoft Excel for quantitative data analysis

Contents An introduction to using Microsoft Excel for quantitative data analysis 1 Introduction... 1 2 Why use Excel?... 2 3 Quantitative data analysis tools in Excel... 3 4 Entering your data... 6 5 Preparing

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

### Analysis of Data. Organizing Data Files in SPSS. Descriptive Statistics

Analysis of Data Claudia J. Stanny PSY 67 Research Design Organizing Data Files in SPSS All data for one subject entered on the same line Identification data Between-subjects manipulations: variable to

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### The correlation coefficient

The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

### Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

### DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

DATA ANALYSIS QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University Quantitative Research What is Statistics? Statistics (as a subject) is the science

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Chapter 2 Probability Topics SPSS T tests

Chapter 2 Probability Topics SPSS T tests Data file used: gss.sav In the lecture about chapter 2, only the One-Sample T test has been explained. In this handout, we also give the SPSS methods to perform

### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### Instructions for SPSS 21

1 Instructions for SPSS 21 1 Introduction... 2 1.1 Opening the SPSS program... 2 1.2 General... 2 2 Data inputting and processing... 2 2.1 Manual input and data processing... 2 2.2 Saving data... 3 2.3

### Lesson 4 Measures of Central Tendency

Outline Measures of a distribution s shape -modality and skewness -the normal distribution Measures of central tendency -mean, median, and mode Skewness and Central Tendency Lesson 4 Measures of Central

### HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

### Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.