Lecture Notes Module 1


Study Populations

A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific group of people. Some examples of human study populations are: all UCSC freshmen, all Arizona public school teachers, all spouses of Alzheimer's patients in Minnesota, and all preschool children in Chicago.

Measurement Properties

In addition to specifying the study population of interest, a psychologist will specify some attribute to measure. When studying human populations, the attribute of interest might be a specific type of academic ability, a personality trait, some particular behavior (e.g., hours of TV watching), an attitude, an interest, an opinion, or a physiological measure (e.g., heart rate, blood pressure, blood flow in specific parts of the brain, brain waves). The measurement of the attribute that the psychologist wants to examine is called the response variable.

To measure some attribute of a person's behavior is to assign a numerical value to that person. These measurements can have different properties. A ratio scale measurement has the following three properties: 1) a score of 0 represents a complete absence of the attribute being measured, 2) a ratio of any two scores correctly describes the ratio of attribute quantities, and 3) a difference between two scores correctly describes the difference in attribute quantities. Suppose a person's heart rate is measured. This measurement is a ratio scale measurement because a score of 0 beats per minute (bpm) represents a stopped heart and a heart rate of, say, 100 bpm is twice as fast as a heart rate of 50 bpm. In addition, the difference between two heart rates of, say, 50 and 60 bpm describes the same change in heart rate as the difference between 70 and 80 bpm.
With interval scale measurements, a score of 0 does not represent a complete absence of the attribute being measured and a ratio of two scores does not correctly describe the ratio of attribute quantities, but a difference between two interval scale scores correctly describes the difference in attribute quantities. Most measurements of psychological attributes are not ratio scale measurements but are assumed to be interval scale measurements. For instance, the Beck Depression Inventory (BDI) is scored on a 0 to 63 scale with higher scores representing higher levels of depression. However, a BDI score of 0 does not indicate a complete absence of depression, nor does a BDI score of, say, 40 represent twice the amount of depression as a BDI score of 20. It is assumed that a difference between two BDI scores correctly describes the difference in depression levels, so that a person who initially obtained a BDI score of, say, 30 and then obtained a score of 20 after therapy is assumed to have the same level of improvement as a person who initially scored 25 on the BDI and dropped to 15 after therapy. Ratio and interval scale measurements will be referred to simply as quantitative scores.

Population Parameters

A population parameter is a single unknown numeric value that describes the measurements that could have been assigned to all N people in a specific study population. Psychologists would like to know the value of a particular population parameter because this information could be used to make an important decision or to advance knowledge in some area of research.

The population mean, denoted by the Greek letter μ (mu), is a population parameter that is frequently of interest to psychologists. Imagine every person in a study population of size N being assigned a quantitative score. A population mean (μ) is defined as

$\mu = \sum_{i=1}^{N} y_i / N$   (1.1)

where $y_i$ is a quantitative score for the i-th person in the study population. The summation notation $\sum_{i=1}^{N} y_i$ is a more compact way of writing $y_1 + y_2 + \cdots + y_N$.

Consider a study population of 2,450 elementary school teachers in a particular school district. Imagine giving a job burnout questionnaire (scored on a quantitative scale of 1 to 25) to all 2,450 teachers. The population mean job burnout score would be $\mu = (y_1 + y_2 + \cdots + y_{2450})/2450$, where $y_i$ is the burnout score for the i-th teacher.

Another important population parameter is the population standard deviation, which is defined as

$\sigma = \sqrt{\sum_{i=1}^{N} (y_i - \mu)^2 / N}$

and describes the variability of the quantitative measurements. Note that σ cannot be negative. Note also that if all N scores are identical (no variability), every $y_i$ value would equal μ and then σ would be zero.
The squared standard deviation ($\sigma^2$) occurs frequently in statistical formulas and is called the variance.
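As a small illustration of Equation 1.1 and the population standard deviation, the following Python sketch computes μ, σ, and σ² for a made-up population. The five burnout scores and the function names are assumptions chosen only for this example; they are not data from these notes.

```python
import math

def population_mean(scores):
    """Equation 1.1: the sum of all N scores divided by N."""
    return sum(scores) / len(scores)

def population_sd(scores):
    """Square root of the mean squared deviation from mu (divisor is N)."""
    mu = population_mean(scores)
    return math.sqrt(sum((y - mu) ** 2 for y in scores) / len(scores))

# Hypothetical burnout scores for a tiny study population of N = 5 teachers.
burnout = [12, 15, 9, 20, 14]

mu = population_mean(burnout)
sigma = population_sd(burnout)
variance = sigma ** 2

print(mu, round(sigma, 3), round(variance, 3))
```

If every score is identical, `population_sd` returns zero, which matches the remark that σ is zero when there is no variability.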

Normal (Gaussian) Curve

A histogram is a graph that visually describes a set of quantitative scores. A histogram is constructed by specifying several equal-length score intervals and counting the number of people who have scores that fall within each interval. An example of a histogram of scores on the Attention Deficit Checklist (ADC) for 4,810 young adults is shown below.

[Figure: histogram of ADC scores (horizontal axis: ADC score y; vertical axis: frequency) with a fitted normal curve; mean = 10.00, N = 4,810]

Scientists discovered decades ago that histograms for many different types of quantitative scores could be closely approximated by a certain type of symmetric bell-shaped curve called a normal (or Gaussian) curve. The histogram above includes a graph of a normal curve that closely approximates the shape of the histogram in this particular application.

If a set of quantitative scores is approximately normal, the scores will have the following characteristics: about half of the scores are above the mean and about half are below the mean; about 68% of the scores are within 1 standard deviation of the mean; about 95% of the scores are within 2 standard deviations of the mean; and almost all (99.7%) of the scores are within 3 standard deviations of the mean.
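The percentages listed above can be checked against an exact normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `proportion_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {proportion_within(k):.3f}")
# prints 0.683, 0.954, and 0.997 for k = 1, 2, 3
```

The same proportions hold for any mean and standard deviation, because they depend only on the distance from the mean measured in standard deviations.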

A normal distribution with a mean of 0 and a standard deviation of 1 is called a standard unit normal distribution. The symbol $z_{\alpha/2}$ will be used to denote the point on a standard unit normal distribution for which 100(1 − α)% of the distribution is between the values $-z_{\alpha/2}$ and $z_{\alpha/2}$. For instance, it can be shown that 95% of a standard unit normal distribution is between the values −1.96 and 1.96, and so $z_{\alpha/2} = 1.96$ for α = .05.

Random Samples and Parameter Estimates

In applications where the study population is large or the cost of measurement is high, the psychologist may not have the necessary resources to measure all N people in the study population. In these applications, the psychologist could take a random sample of n people from the study population of N people. In studies where random sampling is used, the study population is defined as the population from which the random sample was obtained.

A random sample of size n is selected in such a way that every possible sample of size n will have the same chance of being selected. Computer programs are typically used to obtain a random sample of size n from a study population of size N.

A population mean can be estimated from a random sample. The sample mean

$\hat{\mu} = \sum_{i=1}^{n} y_i / n$   (1.2)

is an estimate of μ (some statistics texts use $\bar{X}$ to denote the sample mean). A standard deviation can be estimated from a random sample. The sample standard deviation

$\hat{\sigma} = \sqrt{\sum_{i=1}^{n} (y_i - \hat{\mu})^2 / (n - 1)}$   (1.3)

is an estimate of σ (some statistics texts use s to denote the sample standard deviation). Squaring Equation 1.3 gives an estimate of the population variance.

Of course, psychologists want to know the exact value of μ, but they usually must settle for a sample estimate of μ because the study population size is too large or the measurement process is too costly. However, the sample mean by itself can be misleading because the error $\hat{\mu} - \mu$ will be positive or negative and the direction of the error will be unknown. In other words, the psychologist will not know if the sample mean has overestimated or underestimated the population mean. Furthermore, the magnitude of $\hat{\mu} - \mu$ will be unknown. Thus, the value of the sample mean might be too small or too large, and it might be close to or very different from the value of μ.
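The sampling and estimation steps above can be sketched in Python as follows. The simulated population (N = 5,000 scores with a mean near 50), the seed, and the sample size of 25 are assumptions made only for illustration; `random.sample` draws without replacement so that every possible sample of size n is equally likely.

```python
import math
import random
from statistics import NormalDist

def sample_mean(sample):
    """Equation 1.2: estimate of mu from a sample of size n."""
    return sum(sample) / len(sample)

def sample_sd(sample):
    """Equation 1.3: estimate of sigma (divisor is n - 1)."""
    m = sample_mean(sample)
    return math.sqrt(sum((y - m) ** 2 for y in sample) / (len(sample) - 1))

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical study population of N = 5,000 quantitative scores.
population = [rng.gauss(50, 10) for _ in range(5000)]

# Random sample of n = 25 drawn without replacement.
sample = rng.sample(population, 25)

mu_hat = sample_mean(sample)
sigma_hat = sample_sd(sample)

# Critical z-value for alpha = .05 (about 1.96).
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

print(round(mu_hat, 2), round(sigma_hat, 2), round(z_crit, 2))
```

Rerunning with a different seed gives a different sample mean, sometimes above and sometimes below the population mean, which is the uncertainty that the next sections quantify.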

Standard Error

The standard error of a parameter estimate numerically describes the accuracy of an estimate. A small value of the standard error indicates that the parameter estimate is likely to be close to the unknown population parameter value, while a large standard error value indicates that the parameter estimate could be very different from the study population parameter value. A standard error of a parameter estimate can be estimated from a random sample. The estimated standard error of $\hat{\mu}$ is

$SE_{\hat{\mu}} = \sqrt{\hat{\sigma}^2 / n}$.   (1.4)

From Equation 1.4 it is clear that increasing the sample size (n) will decrease the value of the standard error and increase the accuracy of the sample mean. Equation 1.4 also shows that variability in the quantitative scores affects the accuracy of the estimate of a population mean, with greater variability in scores leading to less accuracy in the sample mean.

Confidence Interval for μ

We can learn something about the unknown value of μ by using information from a random sample. By using an estimate of μ (Equation 1.2) and its standard error (Equation 1.4), which can be obtained from one random sample, it is possible to say something about the unknown value of μ in the form of a confidence interval. A confidence interval is a range of values that is believed to contain an unknown population parameter value with some specified degree of confidence. A 100(1 − α)% confidence interval for μ is

$\hat{\mu} \pm t_{\alpha/2;df} \, SE_{\hat{\mu}}$   (1.5)

where $t_{\alpha/2;df}$ is a two-sided critical t-value. The value of $t_{\alpha/2;df}$ can be found in a table of critical t-values given in most statistics texts. The symbol df refers to degrees of freedom and is equal to n − 1 in this type of application. The value 100(1 − α)% is the confidence level. In psychological studies, it is common to set α = .05 to give a 95% confidence level.

Example 1.1. The EPA estimates that lead in drinking water is responsible for more than 500,000 new cases of learning disabilities in children each year.
Lead-contaminated drinking water is most prevalent in older homes. A random sample of n = 10 homes was obtained from a listing of about 240,000 pre-1970 homes in the San Francisco area. Drinking water from the 10 homes was tested for lead (the test costs about $25 per house). The legal lead concentration limit for drinking water is 15 ppb. The measured lead concentrations (in ppb) for the 10 homes are denoted $y_1, \ldots, y_{10}$ (the individual values are not preserved in this transcription). The sample mean, sample variance, and standard error for this sample of 10 homes are

$\hat{\mu} = (y_1 + y_2 + \cdots + y_{10})/10 = 24.7$

$\hat{\sigma}^2 = [(y_1 - 24.7)^2 + (y_2 - 24.7)^2 + \cdots + (y_{10} - 24.7)^2]/(10 - 1) = 144.0$

$SE_{\hat{\mu}} = \sqrt{\hat{\sigma}^2 / n} = \sqrt{144.0/10} = 3.79$

With a sample size of 10 homes, df = n − 1 = 9 and $t_{.05/2;9} = 2.262$. The 95% lower confidence limit is 24.7 − 2.262(3.79) = 16.2 and the 95% upper confidence limit is 24.7 + 2.262(3.79) = 33.3. We can be 95% confident that the mean lead concentration in the drinking water of the 240,000 older homes is between 16.2 ppb and 33.3 ppb.

Properties of Confidence Intervals

There are two important properties of confidence intervals: increasing the sample size will tend to reduce the width of the confidence interval, and increasing the level of confidence (e.g., from 95% to 99%) will increase the width of the confidence interval. Increasing the level of confidence increases the proportion of all possible samples in which a confidence interval will capture the unknown population parameter value.

These properties are illustrated in an analysis of 50 different random samples of n = 30 from a study population of about 15,000 nurses who were all given an emotional exhaustion questionnaire; their mean score was 22.5. In this hypothetical example, we know that μ = 22.5, but in practice we will not be able to measure all members of the study population and we will estimate μ using the information contained in just one random sample.
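The confidence interval in Equation 1.5 can be computed directly from summary statistics. The following Python sketch reuses the rounded values reported in Example 1.1 (mean 24.7, variance 144.0, n = 10) and takes the critical t-value for df = 9 from a t table (2.262); the function name is an assumption made for this illustration. Because the inputs are rounded, the computed limits can differ from the reported ones in the last digit.

```python
import math

def mean_confidence_interval(mu_hat, variance_hat, n, t_crit):
    """Equation 1.5: estimate plus or minus the critical t-value times the standard error."""
    se = math.sqrt(variance_hat / n)  # Equation 1.4
    return mu_hat - t_crit * se, mu_hat + t_crit * se, se

# Summary values from Example 1.1; t_crit is the tabled value for df = 9, alpha = .05.
lower, upper, se = mean_confidence_interval(
    mu_hat=24.7, variance_hat=144.0, n=10, t_crit=2.262
)

print(round(se, 2), round(lower, 1), round(upper, 1))
```

Passing a larger n with the same variance shrinks the standard error and therefore the interval, which is the first property described in the section above.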

[Table: 95% confidence intervals for μ from 50 different random samples; not reproduced in this transcription]

The above table displays the results of 95% confidence intervals from 50 different random samples. Note that the 95% confidence intervals for μ failed to capture the actual population mean value of 22.5 in sample 7 and sample 34. The table below displays the results for 99% confidence intervals computed from the same 50 random samples. Note that these confidence intervals are wider (less precise) but all of them have captured the population mean value.

[Table: 99% confidence intervals for μ from the same 50 random samples; not reproduced in this transcription]

Choosing a Confidence Level

The American Psychological Association recommends using 95% confidence intervals. A 95% confidence interval represents a good compromise between the level of confidence and the confidence interval width, as shown in the following graph. Notice that the confidence interval width increases almost linearly up to a confidence level of about 95% and then the width increases dramatically with increasing confidence. Thus, small increases in the level of confidence beyond 95% lead to relatively large increases in the confidence interval width.

[Figure: confidence interval width (vertical axis) plotted against confidence level (horizontal axis)]
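The coverage and width properties described above can be explored with a small simulation. The Python sketch below is only an illustration under stated assumptions: the population of 15,000 scores is simulated (normal with a mean near 22.5 and an assumed standard deviation of 6.0, not the nurses' actual data), and the critical t-values for df = 29 (2.045 and 2.756) are taken from a t table.

```python
import math
import random

T_95, T_99 = 2.045, 2.756  # tabled two-sided critical t-values for df = 29

def ci(sample, t_crit):
    """Confidence interval for mu (Equation 1.5) from one sample."""
    n = len(sample)
    m = sum(sample) / n
    var = sum((y - m) ** 2 for y in sample) / (n - 1)
    half_width = t_crit * math.sqrt(var / n)
    return m - half_width, m + half_width

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Simulated stand-in for the study population (assumed values, see lead-in).
population = [rng.gauss(22.5, 6.0) for _ in range(15000)]
mu = sum(population) / len(population)

hits_95 = hits_99 = 0
for _ in range(50):
    sample = rng.sample(population, 30)
    lo95, hi95 = ci(sample, T_95)
    lo99, hi99 = ci(sample, T_99)
    hits_95 += lo95 <= mu <= hi95  # did the 95% interval capture mu?
    hits_99 += lo99 <= mu <= hi99  # did the 99% interval capture mu?

print(f"95% intervals capturing mu: {hits_95} of 50")
print(f"99% intervals capturing mu: {hits_99} of 50")
```

Because each 99% interval contains the matching 95% interval, the 99% count can never be smaller than the 95% count; the price is an interval that is about 2.756/2.045 ≈ 1.35 times as wide.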

Hypothesis Testing

In some applications, the psychologist simply needs to decide if the population parameter is greater than some value or less than some value. If the parameter is greater than some value, then one course of action will be taken; if the parameter is less than some value, then another course of action will be taken. The following notation is used to specify a set of hypotheses regarding μ:

H0: μ = h    H1: μ > h    H2: μ < h

where h is some number specified by the psychologist and H0 is called the null hypothesis. H1 and H2 are called the alternative hypotheses. In virtually all applications, H0 is known to be false (because it is extremely unlikely that μ will exactly equal h) and the psychologist's goal is to decide if H1 or H2 is true.

Consider the following example. If the mean job satisfaction score in a study population of employees is less than 5, then a company will increase year-end bonuses; otherwise, the standard bonus will be given. In this specific application, the set of hypotheses is shown below.

H0: μ = 5    H1: μ > 5    H2: μ < 5

A confidence interval for μ can be used to choose between H1: μ > h and H2: μ < h. If the upper limit of a 100(1 − α)% confidence interval is less than h, then H0 is rejected and H2 is accepted. If the lower limit of a 100(1 − α)% confidence interval is greater than h, then H0 is rejected and H1 is accepted. If the confidence interval includes h, then H0 cannot be rejected. This general hypothesis testing procedure is called a three-decision rule because one of the following three decisions will be made: 1) accept H1, 2) accept H2, or 3) fail to reject H0. A failure to reject H0 is called an inconclusive result.

A test of H0: μ = h is commonly referred to as a one-sample t-test and involves the computation of the test statistic $t = (\hat{\mu} - h)/SE_{\hat{\mu}}$. Statistical packages such as SPSS or R will compute the p-value that corresponds to the value of the test statistic. The p-value can be used to reject H0: μ = h. Specifically, H0 is rejected if the p-value is less than α (α is usually set equal to .05).
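The three-decision rule described above translates directly into code. In this Python sketch the function names and the example interval limits are assumptions made for illustration; only the decision logic and the test statistic come from the text.

```python
def three_decision_rule(lower, upper, h):
    """Choose among H1, H2, or an inconclusive result from a confidence interval."""
    if upper < h:
        return "accept H2: mu < h"
    if lower > h:
        return "accept H1: mu > h"
    return "inconclusive: fail to reject H0"

def t_statistic(mu_hat, h, se):
    """One-sample t statistic: (estimate - hypothesized value) / standard error."""
    return (mu_hat - h) / se

# Job satisfaction example with h = 5 and hypothetical 95% interval limits.
print(three_decision_rule(3.8, 4.6, h=5))  # interval entirely below 5
print(three_decision_rule(4.4, 5.3, h=5))  # interval includes 5
print(three_decision_rule(5.2, 6.1, h=5))  # interval entirely above 5
```

In the first case the company in the example would increase year-end bonuses; in the second case the result is inconclusive and no directional claim is made.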

The p-value is related to the sample size, with larger sample sizes leading to smaller p-values. With a sufficiently large sample size, the p-value for a test of H0: μ = h will be less than .05. It is a common practice to report the results of a statistical test as significant if the p-value is less than .05 and nonsignificant if the p-value is greater than .05. It is important to remember that a p-value of less than .05 (a significant result) simply indicates that the sample size was large enough to reject the null hypothesis (which is known to be false in virtually all applications) and does not indicate if the population mean is meaningfully different from the hypothesized value. Also, a p-value greater than .05 does not imply that H0 is true.

In a three-decision rule, a directional error occurs when H1: μ > h has been accepted but μ < h is true, or when H2: μ < h has been accepted but μ > h is true. The probability of making a directional error is at most α/2. For instance, if a 95% confidence interval is used to select H1 or H2, the probability of making a directional error is at most .025. Most social science journals require authors to use α = .05.

Power of a Hypothesis Test

In hypothesis testing applications, the goal is to reject H0: μ = h and then choose either H1: μ > h or H2: μ < h. The power of a test is the probability of rejecting H0. If the power of the test is low, then the probability of an inconclusive result will be high. The power of a test of H0: μ = h depends on the sample size, the absolute value of (μ − h)/σ (the standardized effect size), and the α level. Increasing the sample size will increase the power of the test, as illustrated below for α = .05 and a fixed value of (μ − h)/σ.

[Figure: power of the test plotted against sample size for α = .05; the effect-size value used in the graph is not preserved in this transcription]
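The dependence of power on sample size, effect size, and α can be sketched numerically. The Python function below uses a large-sample normal approximation rather than the exact t-based calculation, so it is an assumption-laden sketch that tends to overstate power somewhat for small n; the function name and the example values are chosen only for illustration.

```python
import math
from statistics import NormalDist

def approx_power(effect_size, n, alpha=0.05):
    """Approximate probability of rejecting H0: mu = h (normal approximation).

    effect_size is the standardized effect (mu - h) / sigma.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect_size) * math.sqrt(n)
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power rises with sample size for a fixed effect size and alpha.
for n in (10, 30, 60):
    print(n, round(approx_power(0.5, n), 2))

# Power falls as alpha is made smaller for a fixed n and effect size.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(approx_power(0.5, 30, alpha), 2))
```

With an effect size of zero the function returns α, which is the rejection rate of a test whose null hypothesis happens to be exactly true.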

Decreasing α will reduce the probability of a directional error but will also decrease the power of the test, as illustrated in the graph below for n = 30 and (μ − h)/σ = 0.5. Note that there is little loss in power for reductions in α down to about .05, with power decreasing more dramatically for α values below .05, which is why α = .05 is a recommended value.

[Figure: power of the test plotted against α for n = 30 and (μ − h)/σ = 0.5]

For a given sample size and α level, the power of the test increases as the absolute value of (μ − h)/σ increases, as illustrated in the graph below for n = 30 and α = .05.

[Figure: power of the test plotted against the standardized effect size for n = 30 and α = .05]

Interpreting a Confidence Interval

Consider a 95% confidence interval for μ. If a 95% confidence interval for μ was computed from every possible sample of size n in a given study population, about 95% of these confidence intervals would capture the unknown value of μ. With random sampling, we know that every possible sample of size n has the same chance of being selected. Knowing that a 95% confidence interval for μ will capture μ in about 95% of all possible samples, and knowing that the one sample the psychologist has used to compute the 95% confidence interval is a random sample, we can say that the probability is .95 (or we are 95% confident) that the computed confidence interval includes μ.

Another way to think about confidence intervals is to consider a test of H0: μ = h for many different values of h. For a given value of α, if H0 is tested for all possible values of h, a 100(1 − α)% confidence interval for μ is the set of all values of h for which H0 cannot be rejected. All values of h that are not included in the confidence interval are values for which H0 would have been rejected at the specified α level. For instance, if a 95% confidence interval for μ is [14.2, 18.5], then a test of H0: μ = h will not reject H0 if h is any value in the range 14.2 to 18.5 but will reject H0 for any value of h that is less than 14.2 or greater than 18.5.

Sample Size Planning

A narrow confidence interval for μ is desirable because it provides a more precise and informative description of μ than a wider confidence interval. It is possible to approximate the sample size that will give the desired width (upper limit minus lower limit) of a confidence interval with a desired level of confidence. The sample size needed to obtain a 100(1 − α)% confidence interval for μ having a desired width of w is approximately

$n = 4\tilde{\sigma}^2 (z_{\alpha/2}/w)^2$   (1.6)

where $\tilde{\sigma}^2$ is a planning value of the response variable variance and $z_{\alpha/2}$ is a two-sided critical z-value. Planning values are obtained from expert opinion, pilot studies, or previously published research. If the maximum and minimum possible values of the response variable scale are known, $[(\max - \min)/4]^2$ provides a crude planning value of the population variance.
Equation 1.6 shows that larger sample sizes are needed with narrower confidence interval widths, greater levels of confidence, and greater variability of the response variable. Round the result of Equation 1.6 up to the nearest integer.

Example 1.2. A psychologist wants to estimate the mean job satisfaction score for a population of 4,782 public school teachers. The psychologist plans to use a job satisfaction questionnaire (measured on a 1 to 10 scale) that has been used in previous studies. A review of the literature suggests that the variance of the job satisfaction scale is about 6.0. The psychologist would like the 95% confidence interval for μ (the mean job satisfaction score for all 4,782 teachers) to have a width of about 1.5. The required sample size is approximately $n = 4(6.0)(1.96/1.5)^2 = 40.98$, which rounds up to 41.
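Equation 1.6 and the crude planning value are easy to evaluate in code. The Python sketch below reproduces the inputs of Example 1.2 (planning variance 6.0, desired width 1.5, 95% confidence); the function names are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

def sample_size_for_width(planning_variance, width, alpha=0.05):
    """Equation 1.6, rounded up to the nearest integer."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(4 * planning_variance * (z_crit / width) ** 2)

def crude_planning_variance(min_score, max_score):
    """Crude planning value [(max - min)/4]^2 when only the scale range is known."""
    return ((max_score - min_score) / 4) ** 2

# Example 1.2: planning variance 6.0, desired 95% interval width 1.5.
print(sample_size_for_width(6.0, 1.5))          # 41

# Crude planning value for a 1 to 10 scale.
print(crude_planning_variance(1, 10))           # 5.0625

# Halving the desired width roughly quadruples the required sample size.
print(sample_size_for_width(6.0, 0.75))
```

The last line illustrates why very narrow intervals are expensive: the required n grows with the square of 1/w.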

Note that Equation 1.6 does not include the value of the study population size (N). Actually, the sample size requirement does depend on N according to the formula $n' = n(1 - n/N)$, where n is given by Equation 1.6 and $n'$ is the revised sample size requirement. In most applications, n will be a small fraction of N and then $n'$ will be about the same as n. For instance, if N = 3,000 and Equation 1.6 gives n = 40, then $n' = 40(1 - 40/3000) = 39.5$, which rounds up to 40.

Sampling in Two Stages

In applications where sample data can be collected in two stages, the confidence interval obtained in the first stage can be used to determine how many more participants should be sampled in the second stage. If the 100(1 − α)% confidence interval width from a first-stage sample of size n is $w_0$, then the number of participants that should be added to the original sample ($n^{+}$) in order to obtain a 100(1 − α)% confidence interval width of w is approximately

$n^{+} = [(w_0/w)^2 - 1]\,n$.   (1.7)

Example 1.3. In a study with 25 participants, the 95% confidence interval for μ had a width of 4.38. The psychologist suspects that the results of this study are unlikely to be published because the confidence interval is too wide. The psychologist would like to obtain a 95% confidence interval for μ that has a width of 2.0. To achieve this goal, the number of participants that should be added to the initial sample is $[(4.38/2.0)^2 - 1]25 = 94.9$, which rounds up to 95.

Target Population

The confidence interval for μ (Equation 1.5) provides information about the study population from which the random sample was taken. In most applications, the study population will be a small subset of some larger and more interesting population called the target population. For instance, a psychologist may take a random sample of 100 undergraduate students from a particular university directory consisting of 12,000 student names because the psychologist has easy access to this directory.
The results of Equation 1.5 will apply only to those 12,000 undergraduate students, but the psychologist is more interested in the value of μ for a target population of all young adults.

It might be possible for the psychologist to make a persuasive argument that the study population mean should be very similar to some target population mean. For instance, suppose the psychologist computed a confidence interval for the mean eye pupil diameter in a small room lit only by a 40-watt light bulb using a random sample from the 12,000 undergraduate students. The psychologist could argue convincingly that the mean eye pupil diameter in the study population of 12,000 undergraduates should be no different from the mean eye pupil diameter of all young adults. As an example where the study population mean would probably not be similar to some target population mean, suppose that the psychologist instead computed a confidence interval for the mean score on an abortion attitude scale using a sample of students from a Jesuit university. In this situation, the psychologist does not believe that the mean abortion attitude in the Jesuit study population is similar to the mean abortion attitude in a target population of all young adults.

Researchers in the physical and biological sciences seldom worry about the distinction between a study population and a target population because the parameter values for many physical or biological attributes (like the eye pupil diameter example) are much less likely to differ across different study populations, and consequently the study population parameter values are almost automatically assumed to generalize to some large target population. In contrast, psychologists who study complex human behavior that can vary considerably across different study populations need to be very cautious about how they interpret their confidence interval and hypothesis testing results. Psychologists should clearly describe the characteristics of the study population so that the statistical results are interpreted in a proper context.

Assumptions for Confidence Intervals and Tests

Confidence intervals and hypothesis tests for μ require three assumptions. One assumption, the random sampling assumption, requires the sample to be a random sample from the study population. A second assumption, the independence assumption, requires the responses from each participant in the sample to be independent of one another.
In other words, no participant in the study should influence the responses of any other participant in the study. A third assumption, the normality assumption, requires the quantitative scores in the study population to have an approximately normal distribution.

Confidence intervals and hypothesis tests for μ will be uninterpretable if the random sampling assumption has been violated. If the independence assumption has been violated, the true probability of a directional error can be greater than α/2, and the true confidence level can be less than 100(1 − α)%. Recall that the interpretation of a confidence interval for μ assumed that a 100(1 − α)% confidence interval would capture the unknown population mean in about 100(1 − α)% of all possible samples of a given size. However, when the independence assumption is violated, the percentage of samples in which a 100(1 − α)% confidence interval captures the population parameter can be far less than 100(1 − α)%, and the psychologist's confidence regarding the computed confidence interval result will be mistakenly too high.

Violating the normality assumption will have little effect on the confidence interval and test for μ unless the quantitative scores in the study population are extremely nonnormal and the sample size is small (n < 30). If the sample size is small and the study population quantitative scores are extremely nonnormal, the proportion of all possible 95% confidence intervals that would capture μ can be less than .95, and the psychologist's confidence regarding the computed confidence interval result will be mistakenly too high.

Assessing the Normality Assumption

Recall that the normal distribution is symmetric. If the quantitative scores in the sample exhibit a clear asymmetry, this would suggest a violation of the normality assumption. The asymmetry in a set of quantitative scores can be described using a coefficient of skewness. The skewness coefficient is equal to zero if the scores are perfectly symmetric, positive if the scores are skewed to the right, and negative if the scores are skewed to the left. SPSS and R provide a test of the null hypothesis that the population skewness coefficient is zero. If the p-value is less than .05, the psychologist may conclude that the normality assumption has been violated and that the population scores are skewed, but a p-value greater than .05 does not imply that the normality assumption has been satisfied.

The population distribution of quantitative scores can be nonnormal even if the distribution is symmetric. The coefficient of kurtosis describes the degree to which a distribution is more or less peaked or has shorter or thicker tails than a normal distribution.
SPSS and R provide a test of the null hypothesis that there is no kurtosis in the population distribution of scores. If the p-value is less than .05, the psychologist may conclude that the normality assumption has been violated and that the population scores have kurtosis, but a p-value greater than .05 does not imply that the normality assumption has been satisfied.

Data Transformations

A transformation of the quantitative scores can reduce skewness. When the quantitative score is a frequency count, such as the number of facts that can be recalled or the number of spelling errors in a writing sample, a square root transformation ($\sqrt{y_i}$) may reduce nonnormality. When the score is a time-to-event, such as the time required to solve a problem or a reaction time, a natural log transformation ($\ln(y_i)$) or a reciprocal transformation ($1/y_i$) may reduce nonnormality.

Example 1.4. A histogram of 80 highly skewed scores is shown below (left). A histogram of the log-transformed scores (right) is much more symmetric.

[Figure: histogram of 80 highly skewed scores (left) and histogram of the log-transformed scores (right)]

Although data transformations may reduce nonnormality, the mean of the transformed scores may then be difficult to interpret. However, in some applications the value of μ could be interpretable after a data transformation. For instance, if the response variable is measured in squared units, such as the brain surface area showing activity measured in squared centimeters, a square root transformation could be interpreted as the size of the activated area. Or if the response variable is reaction time measured in seconds, then a reciprocal transformation could be interpreted as responses per second.
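The effect of a log transformation on skewness can be illustrated with simulated data. In the Python sketch below, the 80 right-skewed "reaction times" are generated from a lognormal distribution (an assumption made for illustration, not the data of Example 1.4), and skewness is computed with the simple moment-based coefficient, which is one common definition; statistical packages may apply a small-sample adjustment.

```python
import math
import random

def skewness(scores):
    """Moment-based skewness: zero if symmetric, positive if right-skewed."""
    n = len(scores)
    m = sum(scores) / n
    m2 = sum((y - m) ** 2 for y in scores) / n  # second central moment
    m3 = sum((y - m) ** 3 for y in scores) / n  # third central moment
    return m3 / m2 ** 1.5

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical time-to-event scores: 80 right-skewed lognormal values.
times = [rng.lognormvariate(0.0, 0.8) for _ in range(80)]

# Natural log transformation of each score.
log_times = [math.log(t) for t in times]

print("skewness of raw scores:", round(skewness(times), 2))
print("skewness of log scores:", round(skewness(log_times), 2))
```

The raw scores show clear positive skewness, while the log-transformed scores have a coefficient much closer to zero; as noted above, the cost is that the mean is then expressed in log units.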

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

More information

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test. Neyman-Pearson lemma 9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means
