# Introduction to Quantitative Methods

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics Frequency Tables Measures of Central Tendencies Measures of Variability Summary of Central Tendencies and Variability Inferential Statistics More Definitions and Terms Comparing Two or More Groups Association and Correlation Explaining a Dependent Variable Assumptions List of Tables 1 Frequency Table Socioeconomic Class Crosstab of Music Preference and Age Summary of Univariate Statistics Comparing Group Means Association and Correlation Explaining a Dependent Variable

2 1 Definition of Key Terms 1. Unit of Analysis (also referred to as cases): The most elementary part of what is being studied or observed. Some examples include individuals, households, court cases, countries, states, firms, industries, etc. 2. Variables: Concepts, characteristics, or properties that can vary, or change, from one unit of analysis to another. Please note that all variables must vary, if there is no variation among the different cases then it is not a variable. Some examples of variables include gender, social class, education, age, level of public enforcement, type of bankruptcy, etc. (a) Dependent Variable DV: Variables whose change the researcher wishes to explain (b) Independent Variable IV: Variables that help explain the change in the dependent variable 3. Hypothesis: An empirical statement which seeks to test the relationship between at least two variables. For instance, As levels of public enforcement increases, levels of stock development also increases. This hypothesis has two variables: (1) public enforcement independent variable, and (2) stock development dependent variable. 4. Levels of Measuring Variables (a) Nomial: A nominal variable has qualitative categories that cannot be ranked in a meaningful way in terms of degree or magnitude. Examples of nominal variables include RACE, TYPE OF BANKRUPTCY, TYPE OF CORPORATION, NAME. All of these variables have qualitative categories that cannot be ordered in terms of magnitude or degree. This is the least powerful type of variable. 1 (b) Ordinal: An ordinal variable has qualitative categories that are ordered in terms of degree or magnitude. Examples of a nominal variable include CLASS or DEGREE OBTAINED. The variable DEGREE OBTAINED may include the following categories: 1 Alphabetizing the categories does not count as ordering the variable, because the ordering has to be in terms of degree or magnitude. 2

3 None, High School Diploma, College/University Degree, Masters, Advanced Degree (JD/PHD/MD). All of these categories are qualitative and are ordered in terms of the amount of education each individual has completed. (c) Interval/Ratio: An interval variable has quantitative values (or numbers). Some examples of interval variables include AGE (in years), NUMBER OF SHARES OUTSTANDING, and AMOUNT IN DEBT (in dollars). For all of these variables the response is going to be a number or value. This is most powerful type of variable because you can do the most with it statistically. Note that if a variable has qualitative categories that ARE ordered and there are numerical values assigned to each category which are also ordered, we can treat this variable like an interval level variable. An example would be questionnaire that asks respondents about their feelings towards President Obama s handling of the economy on a scale of 1 to 5 where (1=very bad job, 2=bad job, 3=neither bad nor good, 4=good job, and 5=very good job). The respondents are asked to choose a category that is ordered, but since it has ordered numbers attached to the categories, we can treat it as an interval level variable with some restrictions. 2 (d) Dichotomous/Dummy: A dichotomous variable is a variable with two (and only two) categories. These categories can be qualitative or quantitative values. 3 2 Descriptive Statistics Descriptive statistics are often used to describe variables. Descriptive statistics are performed by analyzing one variable at a time (univariate analysis). All researchers perform these descriptive statistics before beginning any type of data analysis. 2 One such restriction being the dependent variable in regression analysis. In order to perform regression (see section 3.4) your dependent variable must be a proper interval variable. 3 It is possible to convert nominal variables into numerous dichotomous/dummy variables. 3

4 2.1 Frequency Tables Frequency tables are a detailed description of the categories/values for one variable. A frequency table most often includes all of the following: 4 1. Absolute frequency (or just frequency): This tells you how many times a particular category in your variable occurs. This is a tally, count, or frequency of occurrence of each individual category/value in the table. 2. Relative frequency (or percent): This tells you the percentage of each category/value relative to the total number of cases. 3. Cumulative frequency: This is simply a cumulation of the relative frequency for each category/value. Table 1 provides an example of a frequency table for an ordinal variable (note it is ordinal because the categories are qualitative and ordered) named Socioeconomic Class. If there were numbers assigned to each category that were also ordered, we could treat this as an interval level variable. Table 1: Frequency Table Socioeconomic Class Socioeconomic Class Frequency Percent Cumm. Percent Upper % 7.14% Upper Middle % 28.57% Middle % 71.43% Lower Middle % 92.86% Lower % 100% Total % 4. Crosstabulations: This is also referred to as a grouped frequency table for two variables. A crosstab simply presents the absolute frequency broken down by categories of two or more variables. It is also possible to find percentages in these types of tables. For instance, using the 4 The stata command for frequency is fre or tab. Before you use the fre command you need to install it onto your computer, so you need to type the following command: ssc install fre, which will install the fre command onto your computer. For a frequency table of a variable named class, type either fre class or tab class 4

5 example below, we can find the percentage of young people that listen to music. 5 Table 2: Crosstab of Music Preference and Age AGE Preference Young Middle Age Old Music News-talk Sports Measures of Central Tendencies Measures of central tendencies provide the most occurring or middle value/category for each variable. There are three measures of central tendencies mode, median, and mean. See Table 3 for a summary of measures of central tendencies. 2.3 Measures of Variability Measures of variability is defined as the dispersion (or deviation) away from the mean for each variable. Measures of variability only exist for interval level variables. There are three measures of variability range, standard deviation, and variance. A discussion of each can be found below followed by a summary table (Table 3). 1. Range: The range is found by taking the highest value of a variable minus the lowest value of that variable. 2. Standard deviation: The standard deviation exists for all interval variables. It is the average distance of each value away from the sample mean. The larger the standard deviation, the farther away the values are from the mean; the smaller the standard deviation the closer, the values are to the mean. Suppose you passed out a questionnaire 5 The stata command for a crosstab is either tab or tab2. For a crosstab of two variables named age and preference, type the following command into stata: tab age preference 5

6 asking randomly selected individuals to rate President Obama s job performance on a scale from 1 to 10. You find that on average these individuals give the President a rating of 5.8, and this variable has a standard deviation of 1.2. This means that on average, each rating of the President is approximately 1.2 points away from 5.8 (the sample mean). 3. Variance: The variance is always going to the the standard deviation squared. The variance cannot be interpreted as meaning anything other than the standard deviation squared Summary of Central Tendencies and Variability Table 3: Summary of Univariate Statistics Univariate Variables Description Statistic Mode Nominal, Ordinal, and Interval most frequent category/value Median Ordinal and Interval category/value that lies in the middle Mean Interval value that represents the average Range Interval highest value minus lowest value Standard Interval on average how much each individual Deviation value is dispersed around the mean Variance Interval standard deviation squared 3 Inferential Statistics 3.1 More Definitions and Terms 1. Normal Curve An interval variable is said to be normally distributed if it has all of the following characteristics: (a) A bell shape curve. 6 In stata, the easiest way to find the mode is by looking at a frequency table and finding the value/category that occurs most frequently. The median, mean, standard deviation, and variance can be found by using the following command: sum var1 var2..., detail 6

7 (b) It is perfectly symmetrical. (c) All measures of central tendencies (mode, median, and mean) lie in the middle middle of the curve. These measures of central tendencies divide the curve in half (where 50% of the values lie to the left of the mean, and 50% lie to the right). (d) Approximately 95% of the values are found two standard deviations away from the mean (in both directions). Variables that are determined by nature are normally distributed (graphically they have a normal curve) such as age, weight, height, etc. It is important to understand what a normal curve looks like and its characteristics because almost all methods described below assume normality. If this assumption is violated (i.e. a variable is not normally distributed) it can have an effect on the statistical results (resulting in significance when in reality it is not significant, or not resulting in statistical significance when it is significant). If variables are not nor- 7

8 mally distributed, it is easy to make transformations, such as logging or taking the square root, in order to achieve normality Confidence Intervals: Confidence Intervals are used to estimate a range of the population based on some sample of any interval level variable. Confidence intervals are two numbers which represent the higher and lower limits of a statistic, coefficient, or paramater. Confidence intervals assume the interval level variable has a normal distribution, and uses the sample in order to find a range of the entire population. When dealing with confidence intervals, a confidence or an α level (see discussion below on statistical significance for explanation of these terms) must be specified. 3. Standard Error: Standard error is the estimated standard deviation, and the standard error squared is the estimated variance. Standard error plays a large role in testing for significance, and can drastically affect the outcome. For instance, large standard errors will cause variables to be insignificant, which may indicate an incorrect use of a statistical method or analysis. 4. Statistical Significance: Statistical significance represents the results of some statistical test that is being performed. The statistical test varies depending on the levels of measurement of the variables, and the objective of the research or hypothesis. There are numerous different tests but they all have some similarities and include all of the following: (a) One Null Hypothesis: The null hypothesis usually states there is no relationship between the variables being tested. The null hypothesis is already determined and based on the method being used. Most null hypotheses state that one statistic or number is equal to another statistic or number. This is usually displayed as: H 0 : a = b (b) One Alternative Hypothesis: 8 The alternative hypothesis usually states that the two or more variables are somehow related. 7 The best way to see if an interval variable is normal is with a histogram. A histogram is a graph which places the values of the interval variable on the X axis, and the frequency or density on the Y axis. In Stata, the command for a histogram is histogram var1, freq normal 8 This is also referred to as a research hypothesis. I refer to this as a research hypothesis or an alternative hypothesis 8

9 Like the null hypothesis, the alternative hypothesis is also already determined based on the method being used. The alternative hypothesis is the opposite of the null hypothesis and usually states that one statistic or number is not equal to another statistic or number. Alternative hypotheses can be displayed using one of the following forms: i. H a : a b ii. H a : a > b iii. H a : a < b (c) Outcome of Statistical Test: All tests will always have one of two outcomes: i. Reject H 0 and Accept H a, which means that the null is wrong and the alternative hypothesis is right. When this occurs, there is statistical significance, and a confidence or an α level is always specified (discussed below). ii. Accept H 0 and Reject H a, which means that the null is right and the alternative hypothesis is wrong. When this occurs, there is no statistical significance, and therefore no need to specify a confidence or an α level. (d) Confidence Level: With every test that is being performed, the researcher must specify a confidence level if and only if the H 0 is rejected and the H a is accepted. The confidence level literally translates into the percentage or probability of correctly rejecting the null hypothesis (or the level in which the outcome is correct). In social science and statistics, the most common confidence levels are AT LEAST 90%. This means that with each test, researchers want to be right at least 90% of the time. Usually researchers report confidence levels at 90%, 95%, 99%, or 99.9%, the higher the confidence level the better the results. The confidence level can also be displayed as a probability, where 90% confidence level translates into being right with a probability of (e) α level or p value: 9 In addition to specifying a confidence level when rejecting the H 0, an α level must also be specified. Similar to 9 These are just different names for the same thing, and can be seen in articles as statistical significance, α, p-value, or even Type I error. I use these terms interchangeably, but they all mean the same thing. 9

10 the confidence level, the α level can be displayed as a percentage or a probability. The α level represents the probability or percentage of rejecting the null when it should not have been rejected. In other words, the α level is the probability or percentage of making a mistake, and the lower the α level the better the results. A statistical significance (or α level) of 1% is better than a statistical significance of 5%. α (probability) = 1 probability confidence level α (percentage) = 100 percentage confidence level (f) What does a statistical test do: When performing any type of inferential statistics and any type of statistical testing, a value is generated based on the data (either a T, F, Z, or χ 2 ), and this value is being compared to some corresponding critical value 10 (T, F, Z, or χ 2 ) in order to determine statistical significance. 3.2 Comparing Two or More Groups The following methods are used if a researcher is interested in comparing differences in two or more group means, and determining whether the difference is statistically significant or a result of sampling error. This portion will provide a table of some of the differences between the methods, and a brief discussion of important details and clarification of each method. Table 4: Comparing Group Means Method Grouping Variable Mean Variable Stata Command Two sample T Dichotomous Interval ttest var1, by(var2) 11 Paired T Before and After Interval ttest var1==var2 One way ANOVA Nominal Interval oneway var1 var Two sample T Test Two sample t test compares means across TWO (and only two) groups. 10 These critical values can be found by looking at tables in the back of any statistics or research methods textbook. 11 In two sample t test var1 is the interval variable, and var2 is the grouping variable. Note the grouping variable MUST be a dummy with 2 categories 12 In ANOVA var1 is the interval variable, and var2 is the grouping variable 10

11 For instance, a researcher wishes to know if there is a difference in the amount of debt (in dollars) in Chapter 7 and Chapter 13 bankruptcies. The researcher is interested in two variables, (1) the amount of debt in dollars (interval), and (2) Chapter 7 or Chapter 13 (dummy). The null hypothesis for this example is that there is no difference in the amount of debt between Chapter 7 and Chapter 13 bankruptcies. In other words, the average amount of debt for Chapter 7 equals the average amount of debt for Chapter 13. The alternative hypothesis is that the average amount of debt is different for these two types of bankruptcies. This test assumes that the interval variable is normally distributed across both groups, and that the variances are equal across both groups. 2. Paired T Test Paired t (also referred to as matched t) test compares means across the SAME VARIABLE and the SAME CASES at two different times. Suppose you were interested in looking to see if neighborhood watch was effective. You had data on the number of calls to 911 in neighborhoods one week before the watch was implemented and one week after the watch started. The null hypothesis is that the neighborhood watch was ineffective (or that the number of 911 calls before the watch remained unchanged after the watch). Matched t test assumes that the interval variable is normally distributed at both times. 3. ANOVA ANOVA (Analysis of Variance) is similar to the two sample t test, but it compares means across MORE THAN TWO groups. A researcher would use ANOVA if s/he was interested in comparing the difference in the amount of debt (in dollars) between Chapters 7, 12, and 13. Now the researcher is interested in comparing three types of bankruptcies, and wants to test the null that there is no difference in the amount of debt for these three types of bankruptcies, or that these three types of bankruptcies has equal group means. The research hypothesis is that at least one type of bankruptcy is different from another in terms of the average amount in debt. Conducting ANOVA alone will not tell you which groups are different from each other, but only if there is a difference between groups. In order to see which specific groups are different you need to conduct a post-hoc test (this is usually done if and only if you find there is a difference between the groups). ANOVA 11

12 assumes that the interval variable is normally distributed. 3.3 Association and Correlation Tests of association only tell you if two or more variables are associated (if two variables tend to occur together). Correlations tell you not only if the variables are associated but also the direction and strength of the relationship. Correlations only range from -1 to 1. A correlation of 0 means that the variables are not related. A positive correlation indicates a positive relationship (an increase in one variable leads to an increase in another variable), while a negative correlation indicates a negative relationship (an increase in one variable leads to a decrease in another variable). The closer a correlation is to -1 or 1 the stronger the relationship between the variables. For instance a correlation of 0.01 to 0.3 indicates a weak positive relationship, while a correlation of to -.3 indicates a weak negative relationship. A correlation of 0.31 to 0.69 indicates a moderate positive relationship while a correlation of to indicates a moderate negative relationship. A correlation above 0.7 indicates a strong positive, and a correlation below -0.7 indicates a strong negative relationship. Table 5: Association and Correlation Method Type Level of Variables Stata Command Chi Square Association Nominal tab var1 var2, chi2 Fishers Exact Association Dichotomous tab var1 var2, exact Pearson Correlation Interval corr var1 var2 Spearman Correlation Ordinal spearman var1 var2 Statistical significance does not tell you the strength of the relationship but only how often the relationship between the two variables holds. For instance, association only tells you if the two variables are statistically significant, or how often these two variables are not independent (a statistical significance of 5% indicates that 95 out of 100 times these two variables tend to occur together). Correlation, on the other hand, tells you how often these two variables are related as well as how strong the relationship is between the two variables. A correlation of 0.8 that has a statistical significance of 5% indicates that 95 out of 100 times these two variables strongly and positively occur together. The null for all of the methods described in Table 5 is that the two variables are independent (or that they are not associated/correlated). 12

13 3.4 Explaining a Dependent Variable If you are looking to explain a variable using one or more variables then you want to use any of methods described below depending on the level of measurement of the dependent variable. Note that for all of the models described below, you can never have a nominal variable (with ore than two categories) as in independent variable. If you want to use a nominal level variable as an IV, then you must recode it into one or more dummy variables. Suppose you wanted the model to control for race. Since race is a nominal variable you cannot include it in your analysis as an IV, however, you could create dichotomous variables for all the categories of race and include these newly formed dichotomous variables. Table 6: Explaining a Dependent Variable Method Dependent Independent 13 Stata Variable Variable Command OLS Regression Interval 14 Dummy/Interval regress dv iv1 iv2 iv Logit/Probit Dummy Dummy/Interval logit dv iv1 iv2 iv3... Cumulative Logit Ordinal Dummy/Interval ologit dv iv1 iv2 iv3... Multinominal Logit Nominal Dummy/Interval mlogit dv iv1 iv2 iv3... Poisson Count 16 Dummy/Interval poisson dv iv1 iv2 iv3... The null hypothesis for all of these methods is that the independent variable does not have an effect on the dependent variable. This null hypothesis is performed for each independent variable You can also use ordinal level variables as independent level variables as long as the categories are ordered in terms of degree or magnitude and the numbers corresponding to the categories are also ordered. 14 The DV in regression analysis MUST be an interval variable. It cannot be an ordinal variable that is being treated as an interval variable, or a dichotomous variable being treated as an interval variable. 15 dv=your dependent variable, iv=your independent variable(s) 16 The dependent variable must not have any negative values and can be non-normal. 17 For all of these methods, the null hypothesis is that the coefficient (β) is equal to 0, and the alternative hypothesis is that the coefficient is a value no equal to zero. This literally means different things for the different methods, but can be interpreted the same way. For regression a coefficient equal to zero literally means that there is no linear change in the dependent variable as the independent variable changes, or that the slope is equal to 0. For the logit and poisson models, the null hypothesis literally means that the logit or the log of the odds is zero. 13

14 3.4.1 Assumptions All of these models have certain assumptions. If these assumptions are not met, it affects the way the results are interpreted and can lead to serious errors in the statistical tests. All of these models have some similar assumptions about the observations, independent, and dependent variables. All models assume that the observations are independent and are randomly sampled from the entire population of interest. 18 The models also assume that no important independent variables are omitted from the model, and that all variables are measured without error. All of these models are stating that there is some functional relationship between the independent and dependent variables. The exact functional relationship is will depend on the method you use to fit your data. For instance, if you are using OLS Regression, this will assume the functional relationship between the independent and dependent variables is linear or that it takes the form of a straight line. 18 Instead of handpicking cases that will ensure statistical significance. 14

### Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

### Research Variables. Measurement. Scales of Measurement. Chapter 4: Data & the Nature of Measurement

Chapter 4: Data & the Nature of Graziano, Raulin. Research Methods, a Process of Inquiry Presented by Dustin Adams Research Variables Variable Any characteristic that can take more than one form or value.

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### Quantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality

Quantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality 1 To help choose which type of quantitative data analysis to use either before

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### SOME NOTES ON STATISTICAL INTERPRETATION. Below I provide some basic notes on statistical interpretation for some selected procedures.

1 SOME NOTES ON STATISTICAL INTERPRETATION Below I provide some basic notes on statistical interpretation for some selected procedures. The information provided here is not exhaustive. There is more to

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Readings: Ha and Ha Textbook - Chapters 1 8 Appendix D & E (online) Plous - Chapters 10, 11, 12 and 14 Chapter 10: The Representativeness Heuristic Chapter 11: The Availability Heuristic Chapter 12: Probability

### Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Data Analysis: Describing Data - Descriptive Statistics

WHAT IT IS Return to Table of ontents Descriptive statistics include the numbers, tables, charts, and graphs used to describe, organize, summarize, and present raw data. Descriptive statistics are most

### Content DESCRIPTIVE STATISTICS. Data & Statistic. Statistics. Example: DATA VS. STATISTIC VS. STATISTICS

Content DESCRIPTIVE STATISTICS Dr Najib Majdi bin Yaacob MD, MPH, DrPH (Epidemiology) USM Unit of Biostatistics & Research Methodology School of Medical Sciences Universiti Sains Malaysia. Introduction

### 11/20/2014. Correlational research is used to describe the relationship between two or more naturally occurring variables.

Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

### LEARNING OBJECTIVES SCALES OF MEASUREMENT: A REVIEW SCALES OF MEASUREMENT: A REVIEW DESCRIBING RESULTS DESCRIBING RESULTS 8/14/2016

UNDERSTANDING RESEARCH RESULTS: DESCRIPTION AND CORRELATION LEARNING OBJECTIVES Contrast three ways of describing results: Comparing group percentages Correlating scores Comparing group means Describe

### Statistical tests for SPSS

Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

### Some Critical Information about SOME Statistical Tests and Measures of Correlation/Association

Some Critical Information about SOME Statistical Tests and Measures of Correlation/Association This information is adapted from and draws heavily on: Sheskin, David J. 2000. Handbook of Parametric and

### Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

### 10-3 Measures of Central Tendency and Variation

10-3 Measures of Central Tendency and Variation So far, we have discussed some graphical methods of data description. Now, we will investigate how statements of central tendency and variation can be used.

### Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

### Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

### The general form of the PROC MEANS statement is

Describing Your Data Using PROC MEANS PROC MEANS can be used to compute various univariate descriptive statistics for specified variables including the number of observations, mean, standard deviation,

### Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### Module 2 Project Maths Development Team Draft (Version 2)

5 Week Modular Course in Statistics & Probability Strand 1 Module 2 Analysing Data Numerically Measures of Central Tendency Mean Median Mode Measures of Spread Range Standard Deviation Inter-Quartile Range

### How to choose a statistical test. Francisco J. Candido dos Reis DGO-FMRP University of São Paulo

How to choose a statistical test Francisco J. Candido dos Reis DGO-FMRP University of São Paulo Choosing the right test One of the most common queries in stats support is Which analysis should I use There

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Simple Linear Regression in SPSS STAT 314

Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### Spearman s correlation

Spearman s correlation Introduction Before learning about Spearman s correllation it is important to understand Pearson s correlation which is a statistical measure of the strength of a linear relationship

### Semester 1 Statistics Short courses

Semester 1 Statistics Short courses Course: STAA0001 Basic Statistics Blackboard Site: STAA0001 Dates: Sat. March 12 th and Sat. April 30 th (9 am 5 pm) Assumed Knowledge: None Course Description Statistical

### STATISTICS FOR PSYCH MATH REVIEW GUIDE

STATISTICS FOR PSYCH MATH REVIEW GUIDE ORDER OF OPERATIONS Although remembering the order of operations as BEDMAS may seem simple, it is definitely worth reviewing in a new context such as statistics formulae.

### Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

### Association Between Variables

Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### Statistics revision. Dr. Inna Namestnikova. Statistics revision p. 1/8

Statistics revision Dr. Inna Namestnikova inna.namestnikova@brunel.ac.uk Statistics revision p. 1/8 Introduction Statistics is the science of collecting, analyzing and drawing conclusions from data. Statistics

### SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

### The correlation coefficient

The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### F. Farrokhyar, MPhil, PhD, PDoc

Learning objectives Descriptive Statistics F. Farrokhyar, MPhil, PhD, PDoc To recognize different types of variables To learn how to appropriately explore your data How to display data using graphs How

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Descriptive Statistics

Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

### 6.4 Normal Distribution

Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

### Chapter 15 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) 1. What is the median of the following set of scores? 18, 6, 12, 10, 14? a. 10 b. 14 c. 18 d. 12 2. Approximately

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### AP Statistics 1998 Scoring Guidelines

AP Statistics 1998 Scoring Guidelines These materials are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought from the Advanced Placement

### Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Bivariate Analysis. Comparisons of proportions: Chi Square Test (X 2 test) Variable 1. Variable 2 2 LEVELS >2 LEVELS CONTINUOUS

Bivariate Analysis Variable 1 2 LEVELS >2 LEVELS CONTINUOUS Variable 2 2 LEVELS X 2 chi square test >2 LEVELS X 2 chi square test CONTINUOUS t-test X 2 chi square test X 2 chi square test ANOVA (F-test)

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### SPSS Guide How-to, Tips, Tricks & Statistical Techniques

SPSS Guide How-to, Tips, Tricks & Statistical Techniques Support for the course Research Methodology for IB Also useful for your BSc or MSc thesis March 2014 Dr. Marijke Leliveld Jacob Wiebenga, MSc CONTENT

### Chapter 16 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 16 Multiple Choice Questions (The answers are provided after the last question.) 1. Which of the following symbols represents a population parameter? a. SD b. σ c. r d. 0 2. If you drew all possible

### Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### TRANSCRIPT: In this lecture, we will talk about both theoretical and applied concepts related to hypothesis testing.

This is Dr. Chumney. The focus of this lecture is hypothesis testing both what it is, how hypothesis tests are used, and how to conduct hypothesis tests. 1 In this lecture, we will talk about both theoretical

### An introduction to using Microsoft Excel for quantitative data analysis

Contents An introduction to using Microsoft Excel for quantitative data analysis 1 Introduction... 1 2 Why use Excel?... 2 3 Quantitative data analysis tools in Excel... 3 4 Entering your data... 6 5 Preparing

### Choosing the correct statistical test made easy

Classroom Choosing the correct statistical test made easy N Gunawardana Senior Lecturer in Community Medicine, Faculty of Medicine, University of Colombo Gone are the days where researchers had to perform

### Hypothesis Testing & Data Analysis. Statistics. Descriptive Statistics. What is the difference between descriptive and inferential statistics?

2 Hypothesis Testing & Data Analysis 5 What is the difference between descriptive and inferential statistics? Statistics 8 Tools to help us understand our data. Makes a complicated mess simple to understand.

### SPSS: Descriptive and Inferential Statistics. For Windows

For Windows August 2012 Table of Contents Section 1: Summarizing Data...3 1.1 Descriptive Statistics...3 Section 2: Inferential Statistics... 10 2.1 Chi-Square Test... 10 2.2 T tests... 11 2.3 Correlation...

### When to Use a Particular Statistical Test

When to Use a Particular Statistical Test Central Tendency Univariate Descriptive Mode the most commonly occurring value 6 people with ages 21, 22, 21, 23, 19, 21 - mode = 21 Median the center value the

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

### Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit

Chi-Square Tests 15 Chapter Chi-Square Test for Independence Chi-Square Tests for Goodness Uniform Goodness- Poisson Goodness- Goodness Test ECDF Tests (Optional) McGraw-Hill/Irwin Copyright 2009 by The

### MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

There are three kinds of people in the world those who are good at math and those who are not. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Positive Views The record of a month

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,

### DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

DATA ANALYSIS QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University Quantitative Research What is Statistics? Statistics (as a subject) is the science

### Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### An introduction to IBM SPSS Statistics

An introduction to IBM SPSS Statistics Contents 1 Introduction... 1 2 Entering your data... 2 3 Preparing your data for analysis... 10 4 Exploring your data: univariate analysis... 14 5 Generating descriptive

### Simple Predictive Analytics Curtis Seare

Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

### Minitab Guide. This packet contains: A Friendly Guide to Minitab. Minitab Step-By-Step

Minitab Guide This packet contains: A Friendly Guide to Minitab An introduction to Minitab; including basic Minitab functions, how to create sets of data, and how to create and edit graphs of different

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### Lecture - 32 Regression Modelling Using SPSS

Applied Multivariate Statistical Modelling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 32 Regression Modelling Using SPSS (Refer

### Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

### Statistics for Sports Medicine

Statistics for Sports Medicine Suzanne Hecht, MD University of Minnesota (suzanne.hecht@gmail.com) Fellow s Research Conference July 2012: Philadelphia GOALS Try not to bore you to death!! Try to teach

### Module 10: Data Analysis and Interpretation

IPDET Module 10: Data Analysis and Interpretation Intervention or Policy Subevaluations Qualitative vs. Quantitative Qualitative Quantitative Introduction Data Analysis Strategy Analyzing Qualitative Data

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation