A Full Analysis Example: Multiple Correlations and Partial Correlations
New Dataset: Confidence. This dataset contains the confidence scores of 41 employees, taken some years ago, on 4 facets of confidence (Physical, Appearance, Emotional, and Problem Solving), along with each employee's gender and citizenship status.
Example problem 1: Analyze the correlation between physical confidence and appearance confidence. The first question we should ask: is Pearson correlation appropriate? Four requirements for correlation: 1. A straight-line relationship. 2. Interval data. 3. Random sampling (we will need to assume this). 4. Normally distributed characteristics.
Check for normality in each of the histograms. (Graphs → Legacy Dialogs → Histogram)
The appearance variable is close enough to normal, although it has more mass at the upper and lower ends than it should. The physical variable has a negative skew, so that could be a problem.
There are at least two values that are far below the mean for physical confidence. We should investigate them further. Graphs → Legacy Dialogs → Boxplot. Use 'Summaries of separate variables', and under Options choose 'Exclude cases variable by variable'.
Boxplots identify outliers. From the boxplot we find that cases 31 and 37 are the outliers in physical confidence. Looking at the data directly, we find that neither of these cases even has a value for appearance.
The two outliers in physical have no measured value for appearance. That means they will have no effect on a correlation between physical and appearance. Correlation can only consider cases where there are values for both variables (a point needs both an X and a Y to exist).
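This "both an X and a Y" rule is called pairwise deletion, and it is what SPSS does by default. A minimal pure-Python sketch with made-up scores (the numbers below are illustrative, not from the actual Confidence dataset) shows how cases missing either value drop out before the correlation is computed:

```python
import math

def pearson_r(pairs):
    """Pearson correlation over complete (x, y) pairs."""
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores; None marks a missing value, as in the real dataset.
physical   = [4.0, 3.5, 1.0, 4.5, 3.0, 1.2]
appearance = [3.8, 3.9, None, 4.1, 3.2, None]

# Pairwise deletion: keep only cases with values for BOTH variables,
# so the two low-physical "outliers" (missing appearance) drop out.
complete = [(x, y) for x, y in zip(physical, appearance)
            if x is not None and y is not None]

print(len(complete))                    # 4 cases survive out of 6
print(round(pearson_r(complete), 3))    # 0.867
```

Note that the two extreme physical values never reach the calculation at all, which is exactly why they cannot affect this particular correlation.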
Next, we look at the scatterplot. Graphs → Legacy Dialogs → Scatter/Dot. There are no obvious signs of non-linear trends, but there doesn't seem to be any strong trend at all. Correlation is an appropriate measure, but it won't be strong.
We run the correlation to find it and see if it's significant at alpha = .05. Analyze → Correlate → Bivariate. Sig. (2-tailed) is .039, so the correlation is significant at alpha = .05. (Had we chosen the .01 level, this would not be the case.)
We could also run a t-test by hand to verify the significance level we found (r = .373, n = 31): t = r√(n−2) / √(1−r²) ≈ 2.16. From a t table with 29 df, t* = 2.045 at the 0.05 level and t* = 2.756 at the 0.01 level, so the correlation is significant at 0.05 but not at 0.01.
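The hand calculation is quick to check with a short script. The critical values in the comments are standard two-tailed t-table values at 29 df:

```python
import math

r, n = 0.373, 31
df = n - 2   # 29 degrees of freedom for a correlation t-test

# t statistic for testing H0: rho = 0
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
print(round(t_stat, 3))   # 2.165

# Two-tailed critical values from a t table at 29 df:
#   2.045 at the 0.05 level, 2.756 at the 0.01 level.
print(t_stat > 2.045)     # True  -> significant at alpha = 0.05
print(t_stat > 2.756)     # False -> not significant at alpha = 0.01
```

This agrees with the SPSS output: p = .039 sits between .01 and .05, so we reject at the .05 level only.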
Let's not sully this moment with a bad pun or something.
A correlation matrix is a table that shows the correlation between each pair of variables.

             Physical   Appearance
Physical     1.000      .373
Appearance   .373       1.000

In this case, Physical is correlated with Appearance with r = .373. Likewise, Appearance is correlated with Physical with r = .373. Also, every variable correlates with itself with r = 1.000.
SPSS takes it a little farther by making a matrix of correlation coefficients, significance values, and sample sizes. The confidences are significantly correlated, and there are 31 entries for each pair (not 41, because real data has blanks).
However, if we go to the correlations menu and select more than two variables of interest:
We get a 4×4 correlation matrix instead! What's better than two variables? FOUR VARIABLES!
Cutting away all the sample size and significance material, I find:

                 Phys.   Appear.   Emot.    Pr.Solve.
Physical          1      .373*     .430**   .730**
Appearance               1         .483**   .527**
Emotional                          1        .540**
Problem Solving                             1

* significant at the 0.05 level
** significant at the 0.01 level

There is a positive correlation between every facet. That means that when any one facet of confidence increases, so do all the others.
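A correlation matrix like this is just the Pearson r computed for every pair of variables. A pure-Python sketch with made-up scores for three facets (NOT the real Confidence data) shows the structure, including the symmetry and the 1s on the diagonal:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for three facets (illustrative only):
data = {
    "Physical":  [4.0, 3.5, 2.0, 4.5, 3.0],
    "Emotional": [3.8, 3.0, 2.5, 4.2, 3.1],
    "ProbSolve": [4.1, 3.6, 2.2, 4.4, 2.9],
}
names = list(data)

# Build the full matrix: r(a, b) == r(b, a), and r(a, a) == 1.
matrix = {a: {b: round(pearson_r(data[a], data[b]), 3) for b in names}
          for a in names}
for a in names:
    print(a, matrix[a])
```

SPSS fills in the same grid, then decorates each cell with its significance and sample size.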
Multiple correlations are useful as a first-look search for connections between variables, and for seeing broad trends in the data. If only a few variables were connected to each other, this would help us identify which ones without having to look at all 6 pairs individually.
Pitfalls of multiple correlations: 1. Multiple testing. With 4 variables, there are 6 correlations being tested for significance. At alpha = 0.05, there's a 26.5% chance that at least one correlation will show as significant even if there are no real correlations at all. With 5 variables, there are 10 tests and a 40.1% chance of falsely rejecting at least one null. With 6 variables, there are 15 tests and a 53.7% chance of falsely rejecting at least one null (again assuming no real correlations).
You don't need to know how to handle multiple testing problems in this class. However, be cautious when dealing with many variables, and be suspicious of correlations that are significant, but just barely. Example: the weakest correlation here is physical with appearance, at r = .373. That correlation being significant could be a fluke.
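The percentages quoted above come from 1 − (1 − α)^k, where k is the number of pairwise tests. This is a sketch that assumes the tests are independent, which is only approximately true here since the correlations share variables:

```python
def familywise_error(n_vars, alpha=0.05):
    """Chance of at least one false positive among all pairwise tests,
    assuming truly uncorrelated variables and independent tests."""
    n_tests = n_vars * (n_vars - 1) // 2   # number of variable pairs
    return n_tests, 1 - (1 - alpha) ** n_tests

for k in (4, 5, 6):
    n_tests, p = familywise_error(k)
    print(k, n_tests, round(p, 3))
# 4 variables ->  6 tests, 0.265
# 5 variables -> 10 tests, 0.401
# 6 variables -> 15 tests, 0.537
```

The number of tests grows quadratically with the number of variables, which is why large correlation matrices almost always contain a few "significant" flukes.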
2. Diagnostics don't get easier. Doing correlations as a matrix lets you do the math of many correlations much faster than checking them one at a time. However, the diagnostic checks like histograms, scatterplots, and residual plots don't get any faster. Any correlation we're interested in (even one not showing as significant) still needs checks for normality and linearity before use in research.
One big advantage of correlating multiple variables is that we can isolate the connections between different variables where they might not be obvious otherwise.
Example: is there really a correlation between appearance confidence and problem solving confidence SPECIFICALLY, or are they both attached to the same general confidence?
Ponder that over a Mandarin Duck.
To isolate a correlation between two variables from a third variable, we want to look only at the part of that correlation that's really between those two and not the third. We want the partial correlation. Example: ice cream sales increase when murder rates increase. These two variables have nothing logical to do with each other; however, both increase when it's hot out.
This is the simple correlation between these two variables. We want the relationship between murder and ice cream WITHOUT the confounding variable of heat.
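SPSS will compute this for us, but the first-order partial correlation also has a simple closed form: r_xy·z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)). A sketch with hypothetical correlation values (not the actual murderice.csv numbers) shows how a strong simple correlation can collapse once the confounder is controlled for:

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical simple correlations in the spirit of the example:
# murder and ice cream look strongly related, but both track heat.
r_murder_ice  = 0.80   # murder vs. ice cream (simple)
r_murder_heat = 0.85   # murder vs. heat
r_ice_heat    = 0.90   # ice cream vs. heat

print(round(partial_r(r_murder_ice, r_murder_heat, r_ice_heat), 3))
# 0.152 -- far below the simple 0.80
```

With these made-up numbers, almost the entire apparent murder/ice-cream relationship was the shared dependence on heat.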
In the dataset murderice.csv, we can run a partial correlation and find out. First, a simple correlation reveals very significant correlations between everything.
But how much of that connection is truly between murder and ice cream? Analyze → Correlate → Partial
From here, put the two variables of interest in the Variables box (you can put more than two if you wish). Put the confounding variable in the 'Controlling for' slot.
The partial correlation between ice cream and murder is much lower than the simple correlation. It appears that heat (or something common to all three) was a major factor in both. In fact, the correlation is no longer significant (we fail to reject the null that there is no correlation).
Also note: SPSS tells us in the output table that heat is a control variable, so we know from the output that this is a partial correlation (hint, hint). We're using three degrees of freedom, one for each variable involved, so the df is 57 even though n is 60 (for interest).
Key observation: the partial correlation will be less than the simple correlation if both variables of interest are correlated with the confounding variable in the same direction. Here, both murder and ice cream are correlated with heat positively, so the partial correlation removes that common positive relationship between murder and ice cream. Removing a positive relationship makes the correlation less positive.
Likewise, if the correlations to the confounding variable are in opposing directions, then the partial correlation will be higher than the simple correlation. If we're only considering positive correlations, this means a confounding variable could be hiding or masking a correlation between two variables rather than creating a false one.
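The masking effect is easy to see by plugging opposing-sign correlations into the first-order partial correlation formula, r_xy·z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)). The numbers here are hypothetical, chosen only to illustrate the direction of the effect:

```python
import math

# Hypothetical values: x and y barely correlate on the surface (r = .10),
# but their correlations with the confounder z point in OPPOSITE directions.
r_xy, r_xz, r_yz = 0.10, 0.60, -0.60

masked = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(round(masked, 3))   # 0.719 -- much larger than the simple 0.10
```

Subtracting a negative product (0.60 × −0.60 = −0.36) raises the numerator, so controlling for the confounder reveals a relationship the simple correlation had hidden.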
Example: Confidence. Consider the correlations between the types of confidence. Do the correlations among the other three still show after we control for problem solving confidence? Simple correlations:

                 Phys.   Appear.   Emot.    Pr.Solve.
Physical          1      .373*     .430**   .730**
Appearance               1         .483**   .527**
Emotional                          1        .540**
Problem Solving                             1
The correlation between physical and anything is removed entirely. (That means that knowing problem solving confidence tells you as much about an employee's physical confidence as knowing all three other facets.)
With heat behind murder and ice cream, we had outside, non-mathematical information to support the claim that heat was driving the other two variables. It could just as easily have been something we didn't measure, like the proportion of elderly residents in an area (retirees often migrate south for the winter). In the case of the facets of confidence, we don't have any reason why problem solving confidence would be the common thread. The partial correlations shrink to nothing because, after accounting for problem solving, the other variables aren't giving much extra information.
If we control for emotional confidence, we see there's still a connection between problem solving and physical when emotional is taken out of the picture.
Interestingly, controlling for appearance produces the same result. The facets all have a common thread and so increase together, but the real connection is between problem solving and physical confidence. Without partial correlation, we would never have caught this.