Appendix B Statistics in Psychological Research
- Amie Parks
- 7 years ago
Understanding and interpreting the results of psychological research depends on statistical analyses, which are methods for describing and drawing conclusions from data. The chapter on research in psychology introduced some terms and concepts associated with descriptive statistics (the numbers that psychologists use to describe and present their data) and with inferential statistics (the mathematical procedures used to draw conclusions from data and to make inferences about what they mean). Here, we present more details about these statistical analyses that will help you to evaluate research results.

Describing Data

To illustrate our discussion, consider a hypothetical experiment on the effects of incentives on performance. The experimenter presents a list of mathematics problems to two groups of participants. Each group must solve the problems within a fixed time, but for each correct answer, the low-incentive group is paid ten cents, whereas the high-incentive group gets one dollar. The hypothesis to be tested is the null hypothesis, the assertion that the independent variable manipulated by the experimenter will have no effect on the dependent variable measured by the experimenter. In this case, the null hypothesis is that the size of the incentive (the independent variable) will not affect performance on the mathematics task (the dependent variable).

Assume that the experimenter has gathered a representative sample of participants, assigned them randomly to the two groups, and done everything possible to avoid the confounds and other research problems discussed in the chapter on research in psychology. The experiment has been run, and the psychologist now has the data: a list of the number of correct answers given by each participant in each group. Now comes the first task of statistical analysis: describing the data in a way that makes them easy to understand.
null hypothesis: The assertion that the independent variable manipulated by the experimenter will have no effect on the dependent variable measured by the experimenter.
frequency histogram: A graphic presentation of data that consists of a set of bars, each of which represents how frequently different scores or values occur in a data set.
descriptive statistics: Numbers that summarize a set of research data.

The Frequency Histogram

The simplest way to describe the data is to draw up something like Table 1, in which all the numbers are simply listed. After examining the table, you might notice that the high-incentive group seems to have done better than the low-incentive group, but this is not immediately obvious. The difference might be even harder to see if more participants had been involved and if the scores included three-digit numbers. A picture is worth a thousand words, so a better way of presenting the same data is in a picture-like graphic known as a frequency histogram (see Figure 1).

Construction of a histogram is simple. First, divide the scale for measuring the dependent variable (in this case, the number of correct answers) into a number of categories, or bins. The bins in our example are 1-2, 3-4, 5-6, 7-8, and 9-10. Next, sort the raw data into the appropriate bin. (For example, the score of a participant who had 5 correct answers would go into the 5-6 bin, a score of 8 would go into the 7-8 bin, and so on.) Finally, for each bin, count the number of scores in that bin and draw a bar up to the height of that number on the vertical axis of a graph. The resulting set of bars makes up the frequency histogram.
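The three construction steps above (define bins, sort scores into them, count each bin) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the individual Table 1 scores are not reproduced in this text, so the two score lists are hypothetical values for thirteen participants per group, and the bin boundaries (1-2 through 9-10) are likewise assumed. The helper names `bin_counts` and `print_histogram` are made up for this sketch.

```python
from collections import Counter

# Hypothetical scores (assumption): thirteen participants per group.
low_incentive = [4, 6, 2, 7, 6, 8, 3, 5, 2, 3, 5, 9, 5]
high_incentive = [6, 4, 10, 10, 7, 10, 6, 7, 5, 9, 9, 3, 8]

# Step 1: divide the score scale into bins of width 2 (assumed boundaries).
BINS = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]


def bin_counts(scores):
    """Steps 2 and 3: sort each score into its bin and count per bin."""
    counts = Counter()
    for score in scores:
        for low, high in BINS:
            if low <= score <= high:
                counts[f"{low}-{high}"] += 1
                break
    # Return every bin, including empty ones, in scale order.
    return {f"{low}-{high}": counts[f"{low}-{high}"] for low, high in BINS}


def print_histogram(title, scores):
    """Draw one text bar per bin; bar length is the number of scores."""
    print(title)
    for label, count in bin_counts(scores).items():
        print(f"{label:>5} | {'#' * count}")


print_histogram("Low incentive", low_incentive)
print_histogram("High incentive", high_incentive)
```

With these assumed scores, the longest bar for the low-incentive group falls in the middle of the scale, while the longest bar for the high-incentive group falls in the top bin.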
TABLE 1 A Simple Data Set
Here are the test scores obtained by thirteen participants performing under low-incentive conditions and thirteen participants performing under high-incentive conditions.
[Columns: Low Incentive, High Incentive. The individual scores are not preserved in this transcription.]

FIGURE 1 Frequency Histograms
The height of each bar of a histogram represents the number of scores falling within each range of score values. The pattern formed by these bars gives a visual image of how research results are distributed.
[Two panels: Low incentive and High incentive. Horizontal axis: test score categories. Vertical axis: number of cases.]

Because we are interested in comparing the scores of two groups, there are separate histograms in Figure 1: one for the high-incentive group and one for the low-incentive group. Now the difference between groups that was difficult to see in Table 1 becomes clearly visible: High scores were more common among people in the high-incentive group than among people in the low-incentive group.

Histograms and other pictures of data are useful for visualizing and better understanding the shape of research results, but in order to analyze those results statistically, we need to use other ways of handling the data that make up these graphic presentations. For example, before we can tell whether two histograms are different statistically or just visually, the data they represent must be summarized using descriptive statistics.

Descriptive Statistics

The four basic categories of descriptive statistics (1) measure the number of observations made; (2) summarize the typical value of a set of data; (3) summarize the spread, or variability, in a set of data; and (4) express the correlation between two sets of data.

N

The easiest statistic to compute, abbreviated as N, simply describes the number of observations that make up the data set. In Table 1, for example, N = 13 for each group, or 26 for the entire data set.
Simple as it is, N plays a very important role in more sophisticated statistical analyses.

Measures of Central Tendency

It is apparent in the histograms in Figure 1 that there is a difference in the pattern of scores between the two groups. But how much of a difference? What is the typical value, the central tendency, that represents each group's performance? As described in the chapter on research in psychology, there are three measures that capture this typical value: the mode, the median, and the mean. Recall that the mode is the value or score that occurs most frequently in the data set. The median is the halfway point in a set of data: Half the scores fall above the median, half fall below it. The mean is the arithmetic average. To find the mean, add up the values of all the scores and divide that total by the number of scores.

Measures of Variability

The variability, or spread, or dispersion of a set of data is often just as important as its central tendency. This variability can be quantified by measures known as the range and the standard deviation.
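As a quick check on the definitions of mode, median, and mean given above, the following sketch computes all three with Python's standard `statistics` module. The score list is hypothetical (an assumption, since the individual Table 1 scores are not reproduced here); it was chosen to total 65 across thirteen participants, the total the text later reports for the low-incentive group.

```python
import statistics

# Hypothetical scores (assumption): thirteen values that sum to 65.
scores = [4, 6, 2, 7, 6, 8, 3, 5, 2, 3, 5, 9, 5]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # halfway point of the sorted scores
mean = statistics.mean(scores)      # sum of the scores divided by N

print(f"N = {len(scores)}, mode = {mode}, median = {median}, mean = {mean}")
```

For this particular list the three measures happen to coincide; for skewed data they generally do not.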
TABLE 2 Calculating the Standard Deviation
The standard deviation of a set of scores reflects the average degree to which those scores differ from the mean of the set.
[Columns: Raw Data; Difference from Mean (D); D^2. The individual rows are not preserved in this transcription.]
Mean = 20/5 = 4
$\sum D^2 = 34$
Standard deviation = $\sqrt{\sum D^2 / N} = \sqrt{34/5} \approx 2.6$
Note: $\sum$ means "the sum of."

As described in the chapter on research in psychology, the range is simply the difference between the highest and the lowest values in a data set. For the data in Table 1, the range for the low-incentive group is 9 - 2 = 7; for the high-incentive group, the range is [value not preserved in this transcription].

The standard deviation, or SD, measures the average difference between each score and the mean of the data set. To see how the standard deviation is calculated, consider the data in Table 2. The first step is to compute the mean of the set, in this case 20/5 = 4. Second, calculate the difference, or deviation (D), of each score from the mean by subtracting the mean from each score, as in column 2 of Table 2. Third, find the average of these deviations. Notice, though, that if you calculated this average by finding the arithmetic mean, you would sum the deviations and find that the negative deviations exactly balance the positive ones, resulting in a mean difference of 0. Obviously there is more than zero variation around the mean in the data set. So, instead of employing the arithmetic mean, you compute the standard deviation by first squaring the deviations (which, as shown in column 3 of Table 2, removes any negative values). You then add up these squared deviations, divide the total by N, and then take the square root of the result. These simple steps are outlined in more detail in Table 2.

range: A measure of variability that is the difference between the highest and the lowest values in a data set.
standard deviation (SD): A measure of variability that is the average difference between each score and the mean of the data set.
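The standard deviation steps described above can be traced in a short Python sketch. The five raw scores are hypothetical (an assumption): they were chosen only so that they reproduce the table's reported mean of 4 and sum of squared deviations of 34. Note that this procedure divides by N, as the text does, rather than by N - 1.

```python
import math

# Hypothetical raw scores (assumption), consistent with mean 4 and sum(D^2) = 34.
raw = [2, 2, 3, 4, 9]
n = len(raw)

# Step 1: the mean of the set.
mean = sum(raw) / n

# Step 2: the deviation (D) of each score from the mean.
deviations = [x - mean for x in raw]

# The plain average of the deviations is useless: they always sum to zero.
sum_of_deviations = sum(deviations)

# Step 3: square the deviations, sum them, divide by N, take the square root.
squared = [d ** 2 for d in deviations]
sum_of_squares = sum(squared)
sd = math.sqrt(sum_of_squares / n)

print(f"mean = {mean}, sum(D) = {sum_of_deviations}, "
      f"sum(D^2) = {sum_of_squares}, SD = {sd:.2f}")
```

The same divide-by-N result is what `statistics.pstdev` returns; `statistics.stdev` divides by N - 1 instead and would give a slightly larger value.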
normal distribution: A dispersion of scores such that the mean, median, and mode all have the same value. When a distribution has this property, the standard deviation can be used to describe how any particular score stands in relation to the rest of the distribution.

The Normal Distribution

Now that we have described histograms and reviewed some descriptive statistics, let's reexamine how these methods of representing research data relate to some of the concepts discussed elsewhere in the book.

In most subfields in psychology, when researchers collect many measurements and plot their data in histograms, the resulting pattern often resembles the one shown for the low-incentive group in Figure 1. That is, the majority of scores tend to fall in the middle of the distribution, with fewer and fewer scores occurring as one moves toward the extremes. As more and more data are collected, and as smaller and smaller bins are used (perhaps containing only one value each), histograms tend to smooth out until they resemble the bell-shaped curve known as the normal distribution, or normal curve. When a distribution of scores follows a truly normal curve, its mean, median, and mode all have the same value. Furthermore, if the curve is normal, we can use its standard deviation to describe how any particular score stands in relation to the rest of the distribution.

IQ scores provide an example. They are distributed in a normal curve, with a mean, median, and mode of 100 and an SD of 16, as shown in Figure 2. In such a distribution, half of the population will have an IQ above 100, and half will be below 100. The shape of the true normal curve is such that 68 percent of the area under it lies in a range within one standard deviation above and below the mean. In terms of IQ, this means that 68 percent of the population has an IQ somewhere between 84 (100 minus 16) and 116 (100 plus 16). Of the remaining 32 percent of the population, half falls more than 1 SD above the mean, and half falls more than 1 SD below the mean. Thus, 16 percent of the population has an IQ above 116, and 16 percent scores below 84.

FIGURE 2 The Normal Distribution
Many kinds of research data approximate the balanced, or symmetrical, shape of the normal curve, in which most scores fall toward the center of the range.
[Panel: the normal distribution of IQ. Horizontal axes: standard deviations and IQ. Marked regions: 68% of the scores and 95% of the scores.]

percentile score: A value that indicates the percentage of people or observations that fall below a given point in a normal distribution.
standard score: A value that indicates the distance, in standard deviations, between a given score and the mean of all the scores in a data set.

The normal curve is also the basis for percentiles. A percentile score indicates the percentage of people or observations that fall below a given score in a normal distribution. In Figure 2, for example, the mean score (which is also the median) lies at a point below which 50 percent of the scores fall. Thus the mean of a normal distribution is at the 50th percentile. What does this say about IQ? If you score 1 SD above the mean, your score is at a point above which only 16 percent of the population falls. This means that 84 percent of the population (100 percent minus 16 percent) must be below that score; so this IQ score is at the 84th percentile. A score at 2 SDs above the mean is at the 97.5th percentile, because only 2.5 percent of the scores are above it in a normal distribution.
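The link between standard scores and percentiles described above can be sketched with the normal cumulative distribution function, which Python exposes through `math.erf`. The function names `standard_score` and `percentile_from_z` are made up for this sketch; the IQ mean of 100 and SD of 16 come from the text. The exact normal-curve figures differ slightly from the rounded values in the text: roughly 68.3 percent of scores fall within 1 SD, a score 1 SD above the mean sits near the 84.1st percentile, and a score 2 SDs above sits near the 97.7th (the 97.5th percentile corresponds to about 1.96 SDs).

```python
import math

IQ_MEAN = 100  # from the text
IQ_SD = 16     # from the text


def standard_score(score, mean, sd):
    """Distance of a score from the mean, in standard deviations."""
    return (score - mean) / sd


def percentile_from_z(z):
    """Percentage of a normal distribution falling below standard score z."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


for iq in (84, 100, 116, 132):
    z = standard_score(iq, IQ_MEAN, IQ_SD)
    print(f"IQ {iq}: standard score {z:+.1f}, "
          f"percentile {percentile_from_z(z):.1f}")

# Share of scores within 1 SD and within 2 SDs of the mean.
within_1_sd = percentile_from_z(1) - percentile_from_z(-1)
within_2_sd = percentile_from_z(2) - percentile_from_z(-2)
print(f"within 1 SD: {within_1_sd:.1f}%, within 2 SDs: {within_2_sd:.1f}%")
```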
Scores may also be expressed in terms of their distance in standard deviations from the mean, producing what are called standard scores. A standard score of 1.5, for example, is 1.5 standard deviations from the mean.

Correlation

Histograms and measures of central tendency and variability describe certain characteristics of one dependent variable at a time. However, psychologists are often interested in describing the relationship between two variables. Measures of correlation are frequently used for this purpose. We discussed the interpretation of the correlation coefficient in the chapter on research in psychology; here we describe how to calculate it.

Recall that correlations are based on the relationship between two numbers that are associated with each participant or observation. The numbers might represent, say, a person's height and weight or the IQ scores of a parent and child. Table 3 contains this kind of data for four participants from our incentives study who took the test twice. (As you may recall from the chapter on cognitive abilities, the correlation between their scores would be a measure of test-retest reliability.)
The formula for computing the Pearson product-moment correlation, or r, is as follows:

$$ r = \frac{\sum (x - M_x)(y - M_y)}{\sqrt{\sum (x - M_x)^2 \, \sum (y - M_y)^2}} $$

where:
x = each score on variable 1 (in this case, test 1)
y = each score on variable 2 (in this case, test 2)
M_x = the mean of the scores on variable 1
M_y = the mean of the scores on variable 2

The main function of the denominator (bottom part) in this formula is to ensure that the coefficient ranges from -1.00 to +1.00, no matter how large or small the values of the variables being correlated. The action element of this formula is the numerator (or top part). It is the result of multiplying the amounts by which each of two observations (x and y) differ from the means of their respective distributions (M_x and M_y). Notice that, if the two variables go together (so that, if one score is large, the score it is paired with is also large, and if one is small, the other is also small), then both scores in each pair will tend to be above the mean of their distribution or both of them will tend to be below the mean of their distribution. When this is the case, $x - M_x$ and $y - M_y$ will both be positive, or they will both be negative. In either case, when you multiply one of them by the other, their product will always be positive, and the correlation coefficient will also be positive. If, on the other hand, the two variables go opposite to one another, such that, when one score in a pair is large, the other is small, one of them is likely to be smaller than the mean of its distribution, so that either $x - M_x$ or $y - M_y$ will have a negative sign, and the other will have a positive sign. Multiplying these differences together will always result in a product with a negative sign, and r will be negative as well.

Now compute the correlation coefficient for the data presented in Table 3. The first step (step a in the table) is to compute the mean (M) for each variable.
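The Pearson formula above translates almost term for term into code. The sketch below is one straightforward way to write it, not the only one; the function name `pearson_r` is made up for illustration, and the four score pairs are those listed for participants A through D in Table 3.

```python
import math

# Test 1 (x) and test 2 (y) scores for participants A-D, as listed in Table 3.
test1 = [1, 1, 4, 6]
test2 = [3, 3, 5, 5]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of the products of paired deviations from the means.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return numerator / math.sqrt(ss_x * ss_y)


r = pearson_r(test1, test2)
print(f"r = {r:.2f}")
```

Because every paired deviation in this data set has the same sign, each product in the numerator is positive and so is r.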
M_x turns out to be 3 and M_y is 4. Next, calculate the numerator by finding the differences between each x and y value and its respective mean and by multiplying them (as in step b of Table 3). Notice that, in this example, the differences in each pair have like signs, so the correlation coefficient will be positive. The next step is to calculate the terms in the denominator; in this case, as shown in steps c and d in Table 3, they have values of 18 and 4. Finally, place all the terms in the formula and carry out the arithmetic (step e). The result in this case is an r of +.94, a high and positive correlation suggesting that performances on repeated tests are very closely related. A participant doing well the first time is very likely to do well again; a person doing poorly at first will probably do no better the second time.

TABLE 3 Calculating the Correlation Coefficient
Though it appears complex, calculation of the correlation coefficient is quite simple. The resulting r reflects the degree to which two sets of scores tend to be related, or to co-vary.

Participant   Test 1 (x)   Test 2 (y)   (x - M_x)(y - M_y)  (b)
A             1            3            (1 - 3)(3 - 4) = (-2)(-1) = +2
B             1            3            (1 - 3)(3 - 4) = (-2)(-1) = +2
C             4            5            (4 - 3)(5 - 4) = (1)(1) = +1
D             6            5            (6 - 3)(5 - 4) = (3)(1) = +3

(a) $M_x = 3$, $M_y = 4$
(b) $\sum (x - M_x)(y - M_y) = 8$
(c) $\sum (x - M_x)^2 = 4 + 4 + 1 + 9 = 18$
(d) $\sum (y - M_y)^2 = 1 + 1 + 1 + 1 = 4$
(e) $r = \frac{\sum (x - M_x)(y - M_y)}{\sqrt{\sum (x - M_x)^2 \, \sum (y - M_y)^2}} = \frac{8}{\sqrt{18 \times 4}} = \frac{8}{\sqrt{72}} = +.94$

Inferential Statistics

The descriptive statistics from the incentives experiment tell the experimenter that the performances of the high- and low-incentive groups differ. But there is some uncertainty. Is the difference large enough to be important? Does it represent a stable effect or a fluke? The researcher would like to have some measure of confidence that the difference between groups is genuine and reflects the effect of incentives on mental tasks in the real world, rather than the effect of random or uncontrolled factors. One way of determining confidence would be to run the experiment again with a new group of participants. Confidence that incentives produced differences in performance would grow stronger if the same or a larger between-group difference occurs again. In reality, psychologists rarely have the opportunity to repeat, or replicate, their experiments in exactly the same way three or four times. But inferential statistics provide a measure of how likely it was that results came about by chance. They put a precise mathematical value on the confidence or probability that rerunning the same experiment would yield similar (or even stronger) results.

inferential statistics: A set of procedures that provides a measure of how likely it is that research results came about by chance.

Differences Between Means: The t Test

One of the most important tools of inferential statistics is the t test. It allows the researcher to ask how likely it is that the difference between two means occurred by chance rather than as a function of the effect of the independent variable.
When the t test or other inferential statistic says that the probability of chance effects is small enough (usually less than 5 percent), the results are said to be statistically significant. Conducting a t test of statistical significance requires the use of three descriptive statistics.

The first component of the t test is the size of the observed effect, the difference between the means. Recall that the mean is calculated by summing a group's scores and dividing that total by the number of scores. In the example shown in Table 1, the mean of the high-incentive group is 94/13, or 7.23, and the mean of the low-incentive group is 65/13, or 5. So the difference between the means of the high- and low-incentive groups is 7.23 - 5 = 2.23.

Second, we have to know the standard deviation of scores in each group. If the scores in a group are quite variable, the standard deviation will be large, indicating that chance may have played a large role in producing the results. The next replication of the study might generate a very different set of group scores. If the scores in a group are all very similar, however, the standard deviation will be small, which suggests that the same result would probably occur for that group if the study were repeated. In other words, the difference between groups is more likely to be significant when each group's standard deviation is small. If variability is high enough that the scores of two groups overlap, the mean difference, though large, may not be statistically significant. (In Table 1, for example, some people in the low-incentive group actually did better on the math test than some in the high-incentive group.)

Third, we need to take the sample size, N, into account. The larger the number of participants or observations, the more likely it is that an observed difference between means is significant.
This is so because, with larger samples, random factors within a group (the unusual performance of a few people who were sleepy or anxious or hostile, for example) are more likely to be canceled out by the majority, who better represent people in general. The same effect of sample size can be seen in coin tossing. If you toss a quarter five times, you might not be too surprised if heads comes up 80 percent of the time. If you get 80 percent heads after one hundred tosses, however, you might begin to suspect that this is probably not due to chance alone and that some other effect, perhaps some bias in the coin, is significant in producing the results. (For the same reason, even a relatively small correlation coefficient, between diet and grades, say, might be statistically significant if it was based on 50,000 students. As the number of participants increases, it becomes less likely that the correlation reflects the influence of a few oddball cases.)

To summarize, as the differences between the means get larger, as N increases, and as standard deviations get smaller, t increases. This increase in t raises the researcher's confidence in the significance of the difference between means.

Let's now calculate the t statistic and see how it is interpreted. The formula for t is:

$$ t = \frac{M_1 - M_2}{\sqrt{\left(\frac{(N_1 - 1)S_1^2 + (N_2 - 1)S_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}} $$

where:
M_1 = mean of group 1
M_2 = mean of group 2
N_1 = number of scores or observations for group 1
N_2 = number of scores or observations for group 2
S_1 = standard deviation of group 1 scores
S_2 = standard deviation of group 2 scores

Despite appearances, this formula is quite simple. In the numerator is the difference between the two group means; t will get larger as this difference gets larger. The denominator contains an estimate of the standard deviation of the differences between group means; in other words, it suggests how much the difference between group means would vary if the experiment were repeated many times. Because this estimate is in the denominator, the value of t will get smaller as the standard deviation of group differences gets larger. For the data in Table 1,

$$ t = \frac{M_1 - M_2}{\sqrt{\left(\frac{(N_1 - 1)S_1^2 + (N_2 - 1)S_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}} = \frac{7.23 - 5}{\sqrt{\left(\frac{(12)(5.09) + (12)(4.46)}{24}\right)\left(\frac{26}{169}\right)}} = \frac{2.23}{\sqrt{.735}} = 2.60 \text{ with 24 df.} $$

degrees of freedom (df): The total sample size or number of scores in a data set, less the number of experimental groups.

To determine what a particular t means, we must use the value of N and a special statistical table called, appropriately enough, the t table. We have reproduced part of the t table in Table 4. First, we have to find the computed values of t in the row corresponding to the degrees of freedom, or df, associated with the experiment. In this case, degrees of freedom are simply $N_1 + N_2 - 2$ (or two less than the total sample size or number of scores). Because our experiment had 13 participants per group, df = 13 + 13 - 2 = 24. In the row for 24 df in Table 4, you will find increasing values of t in each column. These columns correspond to decreasing p values, the probabilities that the difference between means occurred by chance. If an obtained t value is equal to or larger than one of the values in the t table (on the correct df line), then the difference between means that generated that t is said to be significant at the .10, .05, or .01 level of probability.
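The t calculation above can be reproduced from summary statistics alone. The sketch below plugs in the values given in the worked example: means of 94/13 (about 7.23) and 65/13 (5), thirteen scores per group, and the figures 5.09 and 4.46. Treating 5.09 and 4.46 as the squared standard deviations of the high- and low-incentive groups is an assumption of this sketch, and the function name `t_statistic` is made up for illustration.

```python
import math


def t_statistic(mean1, mean2, var1, var2, n1, n2):
    """t for two independent groups, from means, squared SDs, and group sizes."""
    # Pooled estimate of variability, weighted by each group's N - 1.
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    # Scaling term that shrinks the denominator as the groups get larger.
    scale = (n1 + n2) / (n1 * n2)
    return (mean1 - mean2) / math.sqrt(pooled * scale)


# Summary values from the worked example (5.09 and 4.46 assumed to be S squared).
t = t_statistic(mean1=7.23, mean2=5.0, var1=5.09, var2=4.46, n1=13, n2=13)
df = 13 + 13 - 2  # degrees of freedom: total N minus the number of groups

print(f"t = {t:.2f} with {df} df")
```

Changing the inputs shows the three influences the text summarizes: t grows with a larger mean difference, with larger groups, and with smaller standard deviations.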
TABLE 4 The t Table
This table allows the researcher to determine whether an obtained t value is statistically significant. If the t value is larger than the one in the appropriate row in the .05 column, the difference between means that generated that t score is usually considered statistically significant.
[Columns: df; p value of .10 (10%), .05 (5%), and .01 (1%). The table entries are not preserved in this transcription.]

Suppose, for example, that an obtained t (with 19 df) was 2.00. Looking along the 19 df row, you find that 2.00 is larger than the value in the .05 column. This allows you to say that the probability that the difference between means occurred by chance was no greater than .05, or 5 in 100. If the t had been less than the value in the .05 column, the probability of a chance result would have been greater than .05. As noted earlier, when an obtained t is not large enough to exceed t table values at the .05 level, at least, it is not usually considered statistically significant.

The t value from our experiment was 2.60, with 24 df. Because 2.60 is greater than all the values in the 24 df row, the difference between the high- and low-incentive groups would have occurred by chance less than 1 time in 100. In other words, the difference is statistically significant.

Beyond the t Test

Many experiments in psychology are considerably more complex than simple comparisons between two groups. They often involve three or more experimental and control groups. Some experiments also include more than one independent variable. For example, suppose we had been interested not only in the effect of incentive size on performance but also in the effect of problem difficulty. We might then create six groups whose members would perform easy, moderate, or difficult problems and would receive either low or high incentives. In an experiment like this, the results might be due to the size of the incentive, the difficulty of the problems, or the combined effects (known as the interaction) of the two.
Analyzing the size and source of these effects is typically accomplished through procedures known as analysis of variance. The details of analysis of variance are beyond the scope of this book. For now, note that the statistical significance of each effect is influenced by the size of the differences between means, by standard deviations, and by sample size in much the same way as we described for the t test. For more detailed information about how analysis of variance and other inferential statistics are used to understand and interpret the results of psychological research, consider taking courses in research methods and statistical or quantitative methods.
SUMMARY

Psychological research generates large quantities of data. Statistics are methods for describing and drawing conclusions from data.

Describing Data

Researchers often test the null hypothesis, which is the assertion that the independent variable will have no effect on the dependent variable.

The Frequency Histogram

Graphic representations such as frequency histograms provide visual descriptions of data, making the data easier to understand.

Descriptive Statistics

Numbers that summarize a set of data are called descriptive statistics. The easiest statistic to compute is N, which gives the number of observations made. A set of scores can be described by two other types of descriptive statistics: a measure of central tendency, which describes the typical value of a set of data, and a measure of variability. Measures of central tendency include the mean, median, and mode; variability is typically measured by the range and by the standard deviation. Sets of data often follow a normal distribution, which means that most scores fall in the middle of the range, with fewer and fewer scores occurring as one moves toward the extremes. In a truly normal distribution the mean, median, and mode are identical. When a set of data shows a normal distribution, a data point can be cited in terms of a percentile score, which indicates the percentage of people or observations falling below a certain score, and in terms of standard scores, which indicate the distance, in standard deviations, between any score and the mean of the distribution. Another type of descriptive statistic, a correlation coefficient, is used to measure the correlation between sets of scores.

Inferential Statistics

Researchers use inferential statistics to quantify the probability that conducting the same experiment again would yield similar results.

Differences Between Means: The t Test

One inferential statistic, the t test, assesses the likelihood that differences between two means occurred by chance or reflect the impact of an independent variable. Performing a t test requires using the difference between the means of two sets of data, the standard deviation of scores in each set, and the number of observations or participants. Interpreting a t test requires that degrees of freedom also be taken into account. When the t test indicates that the experimental results had a low probability of occurring by chance, the results are said to be statistically significant.

Beyond the t Test

When more than two groups must be compared, researchers typically rely on analysis of variance in order to interpret the results of an experiment.
More informationMEASURES OF VARIATION
NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are
More informationMBA 611 STATISTICS AND QUANTITATIVE METHODS
MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain
More information2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationBiostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY
Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to
More informationStatistics. Measurement. Scales of Measurement 7/18/2012
Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationThe right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median
CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box
More information4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"
Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses
More informationData Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools
Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationChapter 2: Descriptive Statistics
Chapter 2: Descriptive Statistics **This chapter corresponds to chapters 2 ( Means to an End ) and 3 ( Vive la Difference ) of your book. What it is: Descriptive statistics are values that describe the
More informationSTATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI
STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members
More informationMeasurement with Ratios
Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical
More informationIntroduction to Statistics for Psychology. Quantitative Methods for Human Sciences
Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html
More informationPie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.
Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of
More information6 3 The Standard Normal Distribution
290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since
More informationStandard Deviation Estimator
CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of
More informationThis chapter discusses some of the basic concepts in inferential statistics.
Research Skills for Psychology Majors: Everything You Need to Know to Get Started Inferential Statistics: Basic Concepts This chapter discusses some of the basic concepts in inferential statistics. Details
More informationDescriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion
Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research
More informationChapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs
Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)
More informationContent Sheet 7-1: Overview of Quality Control for Quantitative Tests
Content Sheet 7-1: Overview of Quality Control for Quantitative Tests Role in quality management system Quality Control (QC) is a component of process control, and is a major element of the quality management
More informationUnit 9 Describing Relationships in Scatter Plots and Line Graphs
Unit 9 Describing Relationships in Scatter Plots and Line Graphs Objectives: To construct and interpret a scatter plot or line graph for two quantitative variables To recognize linear relationships, non-linear
More informationLesson 4 Measures of Central Tendency
Outline Measures of a distribution s shape -modality and skewness -the normal distribution Measures of central tendency -mean, median, and mode Skewness and Central Tendency Lesson 4 Measures of Central
More informationWeek 3&4: Z tables and the Sampling Distribution of X
Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal
More informationDESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationCOMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk
COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution
More informationWeek 4: Standard Error and Confidence Intervals
Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.
More informationNorthumberland Knowledge
Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about
More informationDATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
More informationUNDERSTANDING THE TWO-WAY ANOVA
UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables
More information8. THE NORMAL DISTRIBUTION
8. THE NORMAL DISTRIBUTION The normal distribution with mean μ and variance σ 2 has the following density function: The normal distribution is sometimes called a Gaussian Distribution, after its inventor,
More informationThe Normal Distribution
Chapter 6 The Normal Distribution 6.1 The Normal Distribution 1 6.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Recognize the normal probability distribution
More informationChapter 3 RANDOM VARIATE GENERATION
Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.
More informationExercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
More informationAnalyzing and interpreting data Evaluation resources from Wilder Research
Wilder Research Analyzing and interpreting data Evaluation resources from Wilder Research Once data are collected, the next step is to analyze the data. A plan for analyzing your data should be developed
More informationEight things you need to know about interpreting correlations:
Research Skills One, Correlation interpretation, Graham Hole v.1.0. Page 1 Eight things you need to know about interpreting correlations: A correlation coefficient is a single number that represents the
More informationBasic Concepts in Research and Data Analysis
Basic Concepts in Research and Data Analysis Introduction: A Common Language for Researchers...2 Steps to Follow When Conducting Research...3 The Research Question... 3 The Hypothesis... 4 Defining the
More information1.6 The Order of Operations
1.6 The Order of Operations Contents: Operations Grouping Symbols The Order of Operations Exponents and Negative Numbers Negative Square Roots Square Root of a Negative Number Order of Operations and Negative
More informationTesting Research and Statistical Hypotheses
Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you
More informationLAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.
More information4. Continuous Random Variables, the Pareto and Normal Distributions
4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random
More informationStatistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationStudy Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
More informationCA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction
CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous
More informationExploratory data analysis (Chapter 2) Fall 2011
Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,
More informationDescriptive Statistics
Descriptive Statistics Suppose following data have been collected (heights of 99 five-year-old boys) 117.9 11.2 112.9 115.9 18. 14.6 17.1 117.9 111.8 16.3 111. 1.4 112.1 19.2 11. 15.4 99.4 11.1 13.3 16.9
More informationNormal distribution. ) 2 /2σ. 2π σ
Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a
More informationBNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I
BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential
More informationExpression. Variable Equation Polynomial Monomial Add. Area. Volume Surface Space Length Width. Probability. Chance Random Likely Possibility Odds
Isosceles Triangle Congruent Leg Side Expression Equation Polynomial Monomial Radical Square Root Check Times Itself Function Relation One Domain Range Area Volume Surface Space Length Width Quantitative
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationSection 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationStatistical tests for SPSS
Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly
More informationHow To Write A Data Analysis
Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction
More informationIntroduction; Descriptive & Univariate Statistics
Introduction; Descriptive & Univariate Statistics I. KEY COCEPTS A. Population. Definitions:. The entire set of members in a group. EXAMPLES: All U.S. citizens; all otre Dame Students. 2. All values of
More informationHISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS
Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS
More informationChapter 4. Probability and Probability Distributions
Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the
More informationAP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics
Ms. Foglia Date AP: LAB 8: THE CHI-SQUARE TEST Probability, Random Chance, and Genetics Why do we study random chance and probability at the beginning of a unit on genetics? Genetics is the study of inheritance,
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationProjects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
More informationTwo-sample inference: Continuous data
Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As
More informationWHAT IS A JOURNAL CLUB?
WHAT IS A JOURNAL CLUB? With its September 2002 issue, the American Journal of Critical Care debuts a new feature, the AJCC Journal Club. Each issue of the journal will now feature an AJCC Journal Club
More informationCurriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different
More informationLecture 2: Descriptive Statistics and Exploratory Data Analysis
Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals
More informationChapter 2 Statistical Foundations: Descriptive Statistics
Chapter 2 Statistical Foundations: Descriptive Statistics 20 Chapter 2 Statistical Foundations: Descriptive Statistics Presented in this chapter is a discussion of the types of data and the use of frequency
More informationAMS 5 CHANCE VARIABILITY
AMS 5 CHANCE VARIABILITY The Law of Averages When tossing a fair coin the chances of tails and heads are the same: 50% and 50%. So if the coin is tossed a large number of times, the number of heads and
More informationData Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
More informationMeasurement & Data Analysis. On the importance of math & measurement. Steps Involved in Doing Scientific Research. Measurement
Measurement & Data Analysis Overview of Measurement. Variability & Measurement Error.. Descriptive vs. Inferential Statistics. Descriptive Statistics. Distributions. Standardized Scores. Graphing Data.
More informationLesson 1: Comparison of Population Means Part c: Comparison of Two- Means
Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
More informationStatistics courses often teach the two-sample t-test, linear regression, and analysis of variance
2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample
More informationMathematical goals. Starting points. Materials required. Time needed
Level S6 of challenge: B/C S6 Interpreting frequency graphs, cumulative cumulative frequency frequency graphs, graphs, box and box whisker and plots whisker plots Mathematical goals Starting points Materials
More informationModule 4: Data Exploration
Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive
More informationIntroduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.
Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative
More informationFrequency Distributions
Descriptive Statistics Dr. Tom Pierce Department of Psychology Radford University Descriptive statistics comprise a collection of techniques for better understanding what the people in a group look like
More informationCharacteristics of Binomial Distributions
Lesson2 Characteristics of Binomial Distributions In the last lesson, you constructed several binomial distributions, observed their shapes, and estimated their means and standard deviations. In Investigation
More informationNormality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com
More informationClass 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More information