Correlation and Model Fit

Prof. Jacob M. Montgomery
Quantitative Political Methodology (L32 363)
Lecture 19 (QPM 2016): R, R-squared and F
November 9, 2016

Some class business
- Poster projects: poster files are due on 12/7 at 10am (29 days from now).
- The problem set is long.
- This lecture is pretty abstract, with a lot of equations. PLEASE ask questions.

Overview
Last time:
- Inference with regression
- A bit on interpreting regression output
This time:
- A quick reversion to correlation ($r$)
- Understanding the rest of the stuff on R output
- RMSE and model fit ($r^2$)
- F-tests

Alternative measure of correlation
Pearson's r (standardized slope):
$$S_Y = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n-1}}, \qquad S_X = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n-1}}$$
$$r = \left(\frac{S_X}{S_Y}\right)\hat{\beta}$$

Reminder: These are the main parameters
$$\hat{\beta} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$
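As a quick numerical check of the formulas for $\hat{\beta}$, $\hat{\alpha}$ and $r$, here is a minimal sketch in plain Python. The five (x, y) pairs are made-up toy numbers rather than course data, and the function names `ols_fit`, `sample_sd` and `pearson_r_from_slope` are illustrative choices, not anything from the lecture.

```python
from math import sqrt

def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) from the closed-form least-squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

def sample_sd(v):
    """Sample standard deviation with the n - 1 denominator."""
    n = len(v)
    v_bar = sum(v) / n
    return sqrt(sum((vi - v_bar) ** 2 for vi in v) / (n - 1))

def pearson_r_from_slope(x, y):
    """Pearson's r computed as the standardized slope (S_X / S_Y) * beta_hat."""
    _, beta_hat = ols_fit(x, y)
    return (sample_sd(x) / sample_sd(y)) * beta_hat

# Toy data (assumed for illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

alpha_hat, beta_hat = ols_fit(x, y)
r = pearson_r_from_slope(x, y)
print(alpha_hat, beta_hat, r)
```

For these toy numbers the slope is 0.6 and the intercept is 2.2; rescaling the slope by $S_X / S_Y$ gives an $r$ of roughly 0.77.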

How good is our model? Thinking about variance
Unconditional variance: estimate of the total variance in the population
$$S^2 = \hat{\sigma}_Y^2 = \frac{\sum (Y_i - \bar{Y})^2}{n-1}, \qquad S = \hat{\sigma}_Y = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n-1}}$$
Sum of squared error: a measure of the spread around the line
$$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - \hat{\alpha} - \hat{\beta}X_i)^2$$
Conditional variance: estimate of the variance around the line in the population
$$\hat{\sigma}^2 = \frac{SSE}{n-2} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}, \qquad \hat{\sigma} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}}$$
$\hat{\sigma}^2$ is sometimes called the mean squared error (MSE), and $\hat{\sigma}$ is the root mean squared error (RMSE), also called the residual standard error (in R) or the standard error of the estimate (in SPSS).
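The quantities on this slide can be computed side by side. The sketch below is a hedged illustration on made-up toy data; the helper name `variance_summary` is hypothetical, and the coefficients 2.2 and 0.6 are assumed to be the least-squares fit for these five points.

```python
from math import sqrt

def variance_summary(x, y, alpha_hat, beta_hat):
    """Compare spread around the mean (S^2) with spread around the line (MSE)."""
    n = len(y)
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    return {
        "S2": tss / (n - 1),          # unconditional variance
        "SSE": sse,                   # sum of squared error
        "MSE": sse / (n - 2),         # conditional variance
        "RMSE": sqrt(sse / (n - 2)),  # residual standard error
    }

# Toy data (assumed for illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
alpha_hat, beta_hat = 2.2, 0.6  # least-squares fit to the toy data above

out = variance_summary(x, y, alpha_hat, beta_hat)
print(out)
```

Here the conditional variance (0.8) is smaller than the unconditional variance (1.5), which is the pattern a useful line should produce.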

A bit of visualization
[Figure, three panels: "Histogram of Y" (frequency of Y); "Regression of X and Y" (Y against X); "Histogram of Y when X=2" (frequency of Y[X == 2]).]
A really, really good line will have small conditional variance.

How good is our model? Thinking about variance
Hold onto some basic ideas:
- $\sum (Y_i - \bar{Y})^2$ = Total Sum of Squares
- Unconditional variance: $S^2 = \frac{\sum (Y_i - \bar{Y})^2}{n-1}$
- $\sum (Y_i - \hat{Y}_i)^2$ = Sum of Squared Error
- Conditional variance: $\hat{\sigma}^2 = \frac{SSE}{n-2} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}$
We are going to say that IF we have a really good model, $\hat{\sigma}^2$ should be a lot smaller than $S^2$.

Regression between presidential vote outcomes and election outcomes
[Figure: two-party presidential vote share plotted against Q2 GDP growth rate.]

We draw the line that minimizes SSE
Residuals: $e_i = (Y_i - \hat{Y}_i) = (Y_i - \hat{\alpha} - \hat{\beta}X_i)$
[Figure: "Residuals for presidential regression", incumbent party share of vote plotted against 2nd quarter GDP growth.]

How good is our model? $r^2$
Define some preliminary terms:
$$TSS = \sum (Y_i - \bar{Y})^2, \qquad SSE = \sum (Y_i - \hat{Y}_i)^2$$
R-squared:
$$r^2 = \frac{\text{Explained Variance}}{\text{Total Variance}} = \frac{\text{Total Variance} - \text{Unexplained Variance}}{\text{Total Variance}} = \frac{TSS - SSE}{TSS}$$
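The definition $r^2 = (TSS - SSE)/TSS$ translates directly into a few lines of code. This is a minimal sketch on made-up toy data; the function name `r_squared` and the fitted values are assumptions for illustration (the fitted values come from the line $\hat{Y} = 2.2 + 0.6X$ evaluated at X = 1, ..., 5).

```python
def r_squared(y, y_hat):
    """r^2 = (TSS - SSE) / TSS, given observed values and fitted values."""
    n = len(y)
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return (tss - sse) / tss

# Toy data (assumed for illustration only).
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # fitted values from the line 2.2 + 0.6 * x

r2_fit = r_squared(y, y_hat)          # a real but imperfect fit
r2_perfect = r_squared(y, y)          # SSE = 0
r2_mean_only = r_squared(y, [4.0] * 5)  # predicting the mean everywhere: SSE = TSS

print(r2_fit, r2_perfect, r2_mean_only)
```

The two extreme cases bracket the statistic: a perfect fit gives 1 and a model that only predicts the mean gives 0, while the toy line lands at about 0.6.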

Some notes on $r^2$
R-squared: $r^2 = \frac{TSS - SSE}{TSS}$
- SSE = 0 (perfect fit) $\Rightarrow r^2 = 1$
- SSE = TSS (no fit) $\Rightarrow r^2 = 0$
- Does not depend on units of measurement.
- Sometimes called the coefficient of determination.
- It does not penalize for model complexity. Often adjusted R-squared is used.
- In R output this is labeled "Multiple R-squared".
- Why do we use it?
  - Gives us an overall impression of how well our model is doing.
  - We can informally compare models.
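Two of these notes are easy to check numerically: that $r^2$ does not change when the units change, and how an adjustment for model complexity works. The sketch below uses made-up toy data. The slides do not give a formula for adjusted R-squared, so the one used here, $1 - (1 - r^2)(n-1)/(n-p-1)$, is the standard textbook version and should be read as an assumption about what is meant; the function names are illustrative.

```python
def r_squared(x, y):
    """r^2 = (TSS - SSE) / TSS for a one-predictor least-squares fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
            / sum((a - x_bar) ** 2 for a in x))
    alpha = y_bar - beta * x_bar
    tss = sum((b - y_bar) ** 2 for b in y)
    sse = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    return (tss - sse) / tss

def adjusted_r_squared(r2, n, p):
    """Standard adjustment (assumed, not from the slides): penalize p covariates."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Toy data (assumed for illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2 = r_squared(x, y)
# Change units: multiply x by 10 and y by 100 (e.g. proportions to percentages).
r2_rescaled = r_squared([10 * a for a in x], [100 * b for b in y])
adj = adjusted_r_squared(r2, n=len(x), p=1)

print(r2, r2_rescaled, adj)
```

The rescaled fit returns the same $r^2$, while the adjusted value is lower than the raw one because it charges the model for the covariate it uses.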

R output
[Numeric values were not preserved in this transcript.]

    Call:
    lm(formula = vote ~ q2gdp, data = Abram)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-15 ***
    q2gdp                                        *
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 15 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: on 1 and 15 DF, p-value:

Example: Explaining congressional elections

A real quick primer on the F-statistic for regression
Before, we said that $r^2$ is intuitively $r^2 = \frac{\text{Explained Variance}}{\text{Total Variance}}$. It makes sense then that $(1 - r^2)$ is the proportion of variance we haven't explained.
It turns out that:
F-statistic for regression:
$$F = \frac{r^2 / p}{(1 - r^2) / [n - (p + 1)]}$$
Here $p$ is the number of covariates (gdp, incumbent, etc.), and $n$ is the number of observations. Under the null hypothesis, this will be distributed according to the F-distribution with $df_1 = p$ and $df_2 = n - (p + 1)$.
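The formula is simple enough to evaluate by hand, but a short sketch makes the degrees of freedom explicit. The function names are illustrative, and the $r^2$ value of 0.6 with n = 5 is a made-up toy case. The second call uses the degrees of freedom printed in the R output for the vote model ("on 1 and 15 DF"), which with one covariate implies n = 17.

```python
def f_degrees_of_freedom(n, p):
    """Return (df1, df2) = (p, n - (p + 1)) for the regression F-test."""
    return p, n - (p + 1)

def f_statistic(r2, n, p):
    """F = (r^2 / p) / ((1 - r^2) / (n - (p + 1)))."""
    df1, df2 = f_degrees_of_freedom(n, p)
    return (r2 / df1) / ((1 - r2) / df2)

# Toy case (assumed for illustration only): r^2 = 0.6, 5 observations, 1 covariate.
f_toy = f_statistic(0.6, n=5, p=1)

# Degrees of freedom for a one-covariate model with 17 observations,
# matching the "1 and 15 DF" line in the R output.
df_vote = f_degrees_of_freedom(n=17, p=1)

print(f_toy, df_vote)
```

Turning F into a p-value requires the F-distribution's tail probability, which R reports on the same line; this sketch stops at the statistic itself.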

Example F-distributions
And now you understand (almost) everything on a regression table.

Hypothesis testing with F-statistics

Interpreting the F-test
Is our model any good?
- This is a formalized way of asking whether our model is any good.
- We compare the amount of variance explained by the regression to the amount unexplained.
- This is essentially a comparison of the following two models:
  $H_0: Y_i = \alpha + \epsilon_i$
  $H_a: Y_i = \alpha + X_i\beta + \epsilon_i$
- This is more useful in multivariate regression:
  $H_0: Y_i = \alpha + \epsilon_i$
  $H_a: Y_i = \alpha + X_{i1}\beta_1 + X_{i2}\beta_2 + X_{i3}\beta_3 + \epsilon_i$
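To see why comparing these two models is the same as using the $r^2$ formula for F, note that under $H_0$ the best prediction for every observation is $\bar{Y}$, so the null model's sum of squared error equals TSS. The labels $SSE_0$ and $SSE_a$ are introduced here for the two models and are not from the slides; this is a sketch of the algebra rather than a full derivation.

```latex
\begin{align*}
F &= \frac{(SSE_0 - SSE_a)/p}{SSE_a/[n-(p+1)]}, \qquad SSE_0 = TSS \\
  &= \frac{(TSS - SSE_a)/p}{SSE_a/[n-(p+1)]} \\
  &= \frac{\left(\dfrac{TSS - SSE_a}{TSS}\right)/p}{\left(\dfrac{SSE_a}{TSS}\right)/[n-(p+1)]}
   = \frac{r^2/p}{(1-r^2)/[n-(p+1)]}
\end{align*}
```

Dividing numerator and denominator by TSS is the only step needed; a large F means the alternative model removes much more squared error per covariate than it leaves per remaining degree of freedom.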

R output
[Numeric values were not preserved in this transcript.]

    Call:
    lm(formula = vote ~ q2gdp, data = Abram)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-15 ***
    q2gdp                                        *
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 15 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: on 1 and 15 DF, p-value:

Posters: General guidelines
- If you gather data, no sensitive questions.
- No time series data.
- You need to have a plan to deal with endogeneity.

The first thing you need to do...
1. Think of a research question!
   - This step should always come first. You, as the researcher, need to formulate some sort of question that you want answered.
   - You should care about this question and its answer! Not in an "I want a good grade" way, but in an "I really care about the answer to this question" kind of way.
   - If you don't care about the question itself, then the project will be miserable to complete. Make it easier on yourself and research something you're actually interested in.

Once you have a question...
1. Once you have settled on a question, the next step is to come up with a hypothesis pertaining to that question.
2. The hypothesis needs to be testable with the tools that you have learned (or will learn) in this course. This generally precludes giant questions:
   - Why do Americans vote?
   - What makes people make environmentally friendly choices?
   - Does President Obama have an American birth certificate?
3. However, smaller questions are interesting too!
   - Do yard signs make people more likely to vote?
   - Can door-knocking campaigns lead to higher levels of recycling?
   - Are racial attitudes related to beliefs about whether President Obama is a citizen?

Once you have a question...
1. This hypothesis does not have to ultimately be supported. Most hypotheses are wrong, especially interesting ones.

Once you have a question...
1. Your question must be testable with data, meaning that you have to find a proper data source for your research project.
2. If you cannot find data that helps answer your question, then, for the purposes of this class, the question is unanswerable. If this applies to your research question, then it's time to find a new one.
3. Fortunately, the internet is full of data sources. If you can think of a data set you may need, there is a good chance that someone has collected it.

Finding Data
1. There are many data sources. You can use any data source you want for your project, as long as the data provides evidence for an answer to your question.
   - Do you wonder if your data is appropriate? Ask yourself whether the data you have collected would be able to convince someone else of your answer. If there is any doubt, you should search for more data.
   - While having data is better than not having data, data sets differ in quality. Some data is very credible, while other data should certainly be scrutinized.

Good Data Sources
1. For political science data, the best place to begin looking for data is [link not preserved in this transcript].
   - This website is a new project by Harvard's Institute for Quantitative Social Sciences that is meant to direct undergraduate students on how to do a quantitative social science research project.
   - The link above is to the data section, which contains a database of credible data sources. If you want to do something in the political science realm, this is the place to begin looking for the data set that best suits your question.

Specific Data Sources
1. For many questions regarding American political attitudes and behavior, the American National Election Study (ANES) is the best survey to use.
   - The survey is done every two years, before and after an election.
   - Asks many sociological and political questions.
   - Attempts to capture behavior through a large set of questions.
   - The survey has a relatively large sample size. However, not all respondents are asked the same battery of questions.

Specific Data Sources
1. For questions of a more sociological nature, and questions that the ANES just doesn't seem to work well for, the General Social Survey (GSS) may suit your needs better.
   - The GSS is done every one or two years and has a very large sample of respondents.
   - Asks many sociological questions and questions about attitudes towards issues.
   - Only samples respondents from the United States.

Other Data Sources
1. For questions that could be categorized in the realm of international politics, you can use the Correlates of War data sets and their various offshoots. Another good source is the V-Dem project [link not preserved in this transcript].
2. In general, any data set that you find a link for on the IQSS website is going to be credible and rather informative. Start early and spend a good amount of time finding the right data set.
3. Many scholars post their data sets on their websites. Many journals archive related data sets on their websites. If you find a good paper written in the last 5 years, its data might just be online waiting for you.

Other Data Sources
1. Does your question require an obscure data source that cannot be found on the IQSS website? Then searching the ICPSR data set archives might be the best option for you.
   - ICPSR is an initiative based at the University of Michigan that houses a huge database of political science data sets.
   - Data sets contained in this archive range from Iowa Census data from 1908 to the results of every U.S. House election from the 1940s to the present.
   - This database is less user friendly than the IQSS student one, but if you can't find what you need on that website, then this may be your next best option.

Other Data Sources
1. The IQSS also has another, albeit less user friendly, database of data sources: [link not preserved in this transcript].
   - Like the ICPSR website, there is a very large set of data sets here.
   - However, it is less user friendly than the IQSS student website. Finding the right data on this website may take a long time.

Non-Political Science Data
1. The question for your project does not have to be political science related! You can choose to do a project on any question that you are interested in answering.
   - If you're interested in sports, then the following website is a database of sports-related data sources: [link not preserved in this transcript].
   - Another database of data sets can be found at http://rss.acs.unt.edu/rdoc/library/ecdat/html/00index.html. These data sets, which are actually part of an R package, vary in subject from tobacco budgets to crime in North Carolina to heating and cooling system choices.
   - Before you use any of these data sets, though, make sure that there is a valid codebook which contains information about the survey method, sampling procedures, etc.

Other Options
1. Are none of these data sources good enough for you? Then maybe you should collect your own data.
   - Remember, we discussed sampling schemes at the very beginning of the course, so we expect you to tell us why your sampling scheme is good if you choose to collect your own data.
   - Keep in mind that you need a decent sample size.
   - If you can pull it off, you can often get much better data to answer your question of interest.
   - However, if you choose to collect your own data, start early and talk to your group, the TAs, etc. The more opinions about your project you get, the better it will be.

Rules
1. Above all else, if you choose to collect your own data, it has to be appropriate!
   - In research, we have a board that approves human research projects, called the IRB. If you think your data is questionable by appropriateness standards, read the IRB website to see if it would meet their standards.
   - In short, this eliminates surveys involving minors, the homeless population, prisoners, the mentally handicapped, and any other population that is in a position that might cause them to feel obligated to take part in your survey.
   - In addition to these requirements, your data cannot involve questions about alcohol or drug usage.
   - Use your common sense!

How to get a good grade.
1. Once you have your data and have done all of the statistical analysis that you can possibly do, you and your team are tasked with making a poster telling everybody about the awesome work you have done.
   - Your findings may seem minor to you, but they are probably much more interesting than you give yourself credit for. The data is new to everyone else, so they will probably find it very interesting.
2. The best way to make sure that your poster gets a good grade from the QPM team is to follow the instructions on the last page of your syllabus!
   - The rubric for grading your poster is very descriptive and thorough.
   - The examples on Blackboard all scored an A based on this rubric.

Some notes about the posters
1. We need documented R code for everything you do to make your poster! And this R code should be full of comments! Lots and lots of comments!
   - Lots of comments eliminate guesswork if we aren't sure what you are doing in your R code. Less guesswork almost certainly leads to a better grade.
2. Spend a good amount of time on the statistical analysis as well as the poster presentation. We like well-thought-out projects as well as well-presented projects. Both of these contribute to your overall grade.
3. Finally, don't wait until the last minute for this project. This project is a pretty significant portion of your grade and should be treated as such. Waiting until the last minute to do this project will surely lead to a poor grade.


Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

Exploratory Data Analysis. Psychology 3256

Exploratory Data Analysis. Psychology 3256 Exploratory Data Analysis Psychology 3256 1 Introduction If you are going to find out anything about a data set you must first understand the data Basically getting a feel for you numbers Easier to find

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

ANOVA. February 12, 2015

ANOVA. February 12, 2015 ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

More information

Independent samples t-test. Dr. Tom Pierce Radford University

Independent samples t-test. Dr. Tom Pierce Radford University Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING Interpreting Interaction Effects; Interaction Effects and Centering Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Models with interaction effects

More information

Lets suppose we rolled a six-sided die 150 times and recorded the number of times each outcome (1-6) occured. The data is

Lets suppose we rolled a six-sided die 150 times and recorded the number of times each outcome (1-6) occured. The data is In this lab we will look at how R can eliminate most of the annoying calculations involved in (a) using Chi-Squared tests to check for homogeneity in two-way tables of catagorical data and (b) computing

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

More information

Binary Logistic Regression

Binary Logistic Regression Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

More information

CREATIVE S SKETCHBOOK

CREATIVE S SKETCHBOOK Session Plan for Creative Directors CREATIVE S SKETCHBOOK THIS SKETCHBOOK BELONGS TO: @OfficialSYP 1 WELCOME YOUNG CREATIVE If you re reading this, it means you ve accepted the We-CTV challenge and are

More information

1 Descriptive statistics: mode, mean and median

1 Descriptive statistics: mode, mean and median 1 Descriptive statistics: mode, mean and median Statistics and Linguistic Applications Hale February 5, 2008 It s hard to understand data if you have to look at it all. Descriptive statistics are things

More information

Data Mining Introduction

Data Mining Introduction Data Mining Introduction Bob Stine Dept of Statistics, School University of Pennsylvania www-stat.wharton.upenn.edu/~stine What is data mining? An insult? Predictive modeling Large, wide data sets, often

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

REACHING YOUR GOALS. Session 4. Objectives. Time. Materials. Preparation. Procedure. wait4sex

REACHING YOUR GOALS. Session 4. Objectives. Time. Materials. Preparation. Procedure. wait4sex Session 4 REACHING YOUR GOALS Objectives At the completion of this session, youth will: 1. Define what is meant by short and long term goals, 2. Identify and describe personal goals to the group, and 3.

More information

AP STATISTICS REVIEW (YMS Chapters 1-8)

AP STATISTICS REVIEW (YMS Chapters 1-8) AP STATISTICS REVIEW (YMS Chapters 1-8) Exploring Data (Chapter 1) Categorical Data nominal scale, names e.g. male/female or eye color or breeds of dogs Quantitative Data rational scale (can +,,, with

More information

T O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Statistical Functions in Excel

Statistical Functions in Excel Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance

Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance 14 November 2007 1 Confidence intervals and hypothesis testing for linear regression Just as there was

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

MULTIPLE REGRESSION WITH CATEGORICAL DATA

MULTIPLE REGRESSION WITH CATEGORICAL DATA DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

More information

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

STAT 350 Practice Final Exam Solution (Spring 2015)

STAT 350 Practice Final Exam Solution (Spring 2015) PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

More information

Pushes and Pulls. TCAPS Created June 2010 by J. McCain

Pushes and Pulls. TCAPS Created June 2010 by J. McCain Pushes and Pulls K i n d e r g a r t e n S c i e n c e TCAPS Created June 2010 by J. McCain Table of Contents Science GLCEs incorporated in this Unit............... 2-3 Materials List.......................................

More information

False. Model 2 is not a special case of Model 1, because Model 2 includes X5, which is not part of Model 1. What she ought to do is estimate

False. Model 2 is not a special case of Model 1, because Model 2 includes X5, which is not part of Model 1. What she ought to do is estimate Sociology 59 - Research Statistics I Final Exam Answer Key December 6, 00 Where appropriate, show your work - partial credit may be given. (On the other hand, don't waste a lot of time on excess verbiage.)

More information

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Mgmt 469. Model Specification: Choosing the Right Variables for the Right Hand Side

Mgmt 469. Model Specification: Choosing the Right Variables for the Right Hand Side Mgmt 469 Model Specification: Choosing the Right Variables for the Right Hand Side Even if you have only a handful of predictor variables to choose from, there are infinitely many ways to specify the right

More information

Augmented reality enhances learning at Manchester School of Medicine

Augmented reality enhances learning at Manchester School of Medicine Augmented reality enhances learning at Manchester School of Medicine Welcome to the Jisc podcast. The University of Manchester is taking a unique approach to prescription training for its medical students

More information

Multiple Regression: What Is It?

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

More information

E-mail Marketing for Martial Arts Schools:

E-mail Marketing for Martial Arts Schools: E-mail Marketing for Martial Arts Schools: Tips, Tricks, and Strategies That Will Send a Flood of New Students into Your School Practically Overnight! By Michael Parrella CEO of Full Contact Online Marketing

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

More information

SPSS Explore procedure

SPSS Explore procedure SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

More information

Comparing Nested Models

Comparing Nested Models Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

POL 204b: Research and Methodology

POL 204b: Research and Methodology POL 204b: Research and Methodology Winter 2010 T 9:00-12:00 SSB104 & 139 Professor Scott Desposato Office: 325 Social Sciences Building Office Hours: W 1:00-3:00 phone: 858-534-0198 email: swd@ucsd.edu

More information

ANALYSIS OF TREND CHAPTER 5

ANALYSIS OF TREND CHAPTER 5 ANALYSIS OF TREND CHAPTER 5 ERSH 8310 Lecture 7 September 13, 2007 Today s Class Analysis of trends Using contrasts to do something a bit more practical. Linear trends. Quadratic trends. Trends in SPSS.

More information

Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready

More information

UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

1.2 Investigations and Experiments

1.2 Investigations and Experiments Science is about figuring out cause and effect relationships. If we do something, what happens? If we make a ramp steeper, how much faster will a car roll down? This is an easy question. However, the process

More information

When to use Excel. When NOT to use Excel 9/24/2014

When to use Excel. When NOT to use Excel 9/24/2014 Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual

More information

CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 14 NONPARAMETRIC TESTS CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

Ep #19: Thought Management

Ep #19: Thought Management Full Episode Transcript With Your Host Brooke Castillo Welcome to The Life Coach School podcast, where it s all about real clients, real problems and real coaching. And now your host, Master Coach Instructor,

More information

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Simple Linear Regression, Scatterplots, and Bivariate Correlation 1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

More information

Joseph in Egypt. Genesis 39:2-3 the LORD was with Joseph and gave him success in everything he did.

Joseph in Egypt. Genesis 39:2-3 the LORD was with Joseph and gave him success in everything he did. Joseph in Egypt Teacher Pep Talk: Joseph s brothers had seen their chance to get rid of him and they did. They sold him into slavery in Egypt. But the LORD was with Joseph in Egypt and gave him success

More information

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

More information

EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION

EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY 5-10 hours of input weekly is enough to pick up a new language (Schiff & Myers, 1988). Dutch children spend 5.5 hours/day

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Violent crime total. Problem Set 1

Violent crime total. Problem Set 1 Problem Set 1 Note: this problem set is primarily intended to get you used to manipulating and presenting data using a spreadsheet program. While subsequent problem sets will be useful indicators of the

More information

This chapter discusses some of the basic concepts in inferential statistics.

This chapter discusses some of the basic concepts in inferential statistics. Research Skills for Psychology Majors: Everything You Need to Know to Get Started Inferential Statistics: Basic Concepts This chapter discusses some of the basic concepts in inferential statistics. Details

More information

Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

More information

Why Your Business Needs a Website: Ten Reasons. Contact Us: 727.542.3592 Info@intensiveonlinemarketers.com

Why Your Business Needs a Website: Ten Reasons. Contact Us: 727.542.3592 Info@intensiveonlinemarketers.com Why Your Business Needs a Website: Ten Reasons Contact Us: 727.542.3592 Info@intensiveonlinemarketers.com Reason 1: Does Your Competition Have a Website? As the owner of a small business, you understand

More information

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: What do the data look like? Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Your logbook. Choosing a topic

Your logbook. Choosing a topic This booklet contains information that will be used to complete a science fair project for the César Chávez science fair. It is designed to help participants to successfully complete a project. This booklet

More information