Statistics Review PSY379
Basic concepts
- Measurement scales
- Populations vs. samples
- Continuous vs. discrete variables
- Independent vs. dependent variables
- Descriptive vs. inferential statistics
Common analyses
- Independent t-test
- Paired t-test
- One-way ANOVA
- Two-way ANOVA
- Regression
Measurement Scales
Before we can examine a variable statistically, we must first observe, or measure, the variable. Measurement is the assignment of numbers to observations based on certain rules; it attempts to transform attributes into numbers. How much stress counts as high vs. low? How fast is fast vs. slow learning of a maze? How good is good vs. bad memory?
Measurement Scales
Non-metric (or qualitative)
- Nominal scale (categories): numbers indicate a difference in kind; no order information (e.g., ethnicity, gender, ID numbers). Say men are assigned 0 and women are assigned 1; this doesn't mean 1 is better than 0.
- Ordinal scale (orders): numbers represent rank orderings; distances between ranks are not equal (e.g., grades, rank orderings on a survey)
Measurement Scales
Metric (or quantitative)
- Interval scale: equal intervals, arbitrary zero; ratios have no meaning (e.g., temperature in degrees F: 50°F − 30°F = 120°F − 100°F, but 60°F is not twice as hot as 30°F, i.e., 60°F ≠ 2 × 30°F)
- Ratio scale: equal intervals, absolute zero; equal ratios are equivalent (e.g., weight, height)
Populations vs. Samples
- Population: all members of a specific group. Parameter: a measure (e.g., mean, variance) computed for the population.
- Sample: a finite subset of a predefined population. Statistic: a measure (e.g., mean, variance) computed for the sample.
Continuous vs. Discrete
- Discrete variable: a measure that can take on distinct values but not intermediate values (e.g., number of children -- it is either 1 or 2, but not 1.2). The most common form of discrete variable is based on counting.
- Continuous variable: measures are approximations of the exact value; it is not possible to obtain the exact measure of a continuous variable, because there are always infinitely smaller gradations of measurement (e.g., height: we can say someone is 72 inches tall, but this really approximates a value between 71.5 and 72.5 inches).
Independent vs. Dependent
- Independent variable: one manipulated by the experimenter, or the observed variable thought to cause or predict the dependent variable. In the relation Y = f(X), X is the independent variable because the value of X does not depend on the value of Y.
- Dependent variable: one thought to result from the independent variable. In the relation Y = f(X), Y is the dependent variable because the value it takes on depends on the value of X.
Descriptive Statistics
- Descriptive statistics (a.k.a. summary statistics)
- Primarily concerned with the summary and description of a collection of data
- Serve to reduce a large and unmanageable set of data to a smaller and more interpretable set of information
Descriptive Statistics
Frequency distribution & histogram: a function that summarizes the assignment of individual observations to measurement classes. Can be constructed regardless of whether the scale is nominal, ordinal, interval, or ratio, as long as each and every observation goes into one and only one class.
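A minimal sketch of building a frequency distribution in Python, using a hypothetical set of letter grades (an ordinal scale); the data are made up for illustration:

```python
from collections import Counter

# Hypothetical sample of letter grades (an ordinal scale)
grades = ["B", "A", "C", "B", "B", "A", "D", "C", "B", "A"]

# Frequency distribution: each observation falls into exactly one class
freq = Counter(grades)
for grade in sorted(freq):
    print(f"{grade}: {freq[grade]}")  # A: 3, B: 4, C: 2, D: 1
```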
Descriptive Statistics
One goal in statistics is to compare distributions of data, one data distribution with another. This is easier if each distribution can be summarized in one or two numbers. Central tendency & variability: what is the descriptive central number, and how much do individual scores vary from that number?
Descriptive Statistics
Measures of central tendency
- Mean: typical/average score; sensitive to extreme scores
- Median: middlemost score; useful for skewed distributions
- Mode: most common or frequent score
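To make the mean's sensitivity to extreme scores concrete, here is a short Python sketch with a hypothetical set of scores containing one outlier:

```python
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 30]  # hypothetical scores with one extreme value

print(statistics.mean(scores))    # 7.125 -- pulled upward by the outlier (30)
print(statistics.median(scores))  # 4.5   -- middlemost score, robust to the outlier
print(statistics.mode(scores))    # 5     -- most frequent score
```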
Descriptive Statistics
[Figure: histogram of IQ scores (x-axis: IQ scores, 78-114; y-axis: frequency, 0-100)]
Descriptive Statistics
Measures of variability
- Variance (dispersion or spread): degree of spread in a variable X
- Standard deviation (SD): a measure of variability in the original metric units of X; the square root of the variance
Variance
$S^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1}$
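A quick Python check of the formula on a small hypothetical data set; the standard library's statistics module gives the same n − 1 (sample) versions:

```python
import statistics

x = [4, 8, 6, 5, 7]  # hypothetical scores

# Sample variance: sum of squared deviations from the mean, divided by n - 1
mean = sum(x) / len(x)
variance = sum((xi - mean) ** 2 for xi in x) / (len(x) - 1)
sd = variance ** 0.5

print(variance)                # 2.5
print(sd)                      # ~1.58
print(statistics.variance(x))  # library equivalent: 2.5
print(statistics.stdev(x))     # library equivalent: ~1.58
```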
[Figure: four histograms of IQ-score distributions with different spreads, illustrating differences in variability]
Other Measures
- Skewness is a measure of symmetry, or more precisely, the lack of symmetry.
- Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. Data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak.
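A brief sketch of computing both measures with SciPy (assuming scipy is available), on simulated right-skewed data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.exponential(size=1000)  # hypothetical right-skewed data

print(stats.skew(data))      # > 0: right (positive) skew
print(stats.kurtosis(data))  # excess kurtosis; ~0 for a normal distribution
```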
Pop Quiz!
- Variance is the average of the squared differences between data points and the mean. Why are the differences squared?
- Standard deviation is the square root of the variance. Why is the variance square-rooted?
Inferential Statistics
- A formalized method for solving a class of problems: inferring the properties of a large set of data from examination of a small set of data
- The goal is to predict or estimate characteristics of a population based on information obtained from a sample drawn from that population
Inferential Statistics
We want to know about the population; we have the sample to work with.
[Diagram: Population --(random selection)--> Sample; Statistic (sample mean) --(inference)--> Parameter (population mean)]
Normal distribution
- ~68% of data fall within 1 SD of the mean
- ~95% of data fall within 2 SD of the mean
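These percentages can be checked by simulation; a minimal NumPy sketch using hypothetical IQ-like scores (mean 100, SD 15):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=100, scale=15, size=100_000)  # simulated IQ-like scores

print(np.mean(np.abs(x - 100) < 15))  # ~0.68: within 1 SD of the mean
print(np.mean(np.abs(x - 100) < 30))  # ~0.95: within 2 SD of the mean
```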
Poisson distribution
[Figure: Poisson distribution with a low mean]
Mostly, nothing happens (lots of zeros).
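A small NumPy sketch of the "lots of zeros" point, drawing hypothetical counts from a Poisson distribution with a low mean:

```python
import numpy as np

rng = np.random.default_rng(1)
counts = rng.poisson(lam=0.5, size=10_000)  # low mean: mostly nothing happens

print(np.bincount(counts))  # frequency of 0, 1, 2, ... events; zeros dominate
```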
Basic concepts
- Measurement scales
- Populations vs. samples
- Continuous vs. discrete variables
- Independent vs. dependent variables
- Descriptive vs. inferential statistics
Common analyses
- Independent t-test
- Paired t-test
- One-way ANOVA
- Two-way ANOVA
- Regression
Hypothesis testing
1. Assume the null hypothesis (H0) (e.g., the two sets of samples come from the same population)
2. Construct the alternative hypothesis (H1) (e.g., the two sets of samples do not come from the same population)
3. Calculate the test statistic
4. Decide on the rejection region for the null hypothesis (e.g., 95% confidence in rejecting the null hypothesis, i.e., alpha = .05)
Hypotheses
- Null (H0): no effect of our experimental treatment; the status quo
- Alternative (H1): there is an effect
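A minimal sketch of the four steps using SciPy's independent-samples t-test; the two samples here are simulated stand-ins for real data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical samples; H0: both come from the same population
a = rng.normal(50, 10, size=30)
b = rng.normal(55, 10, size=30)

t_stat, p_value = stats.ttest_ind(a, b)  # step 3: calculate the test statistic

alpha = 0.05  # step 4: rejection region (95% confidence)
if p_value < alpha:
    print("Reject H0: the samples likely come from different populations")
else:
    print("Fail to reject H0")
```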
T-tests
- One-sample t-test: compare a group to a known value (e.g., comparing the IQ of a specific group to the known average of 100)
- Paired-samples t-test: compare one group at two points in time (e.g., comparing pretest and posttest scores)
- Independent-samples t-test: compare two groups to each other
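All three variants are available in SciPy; a sketch with hypothetical data (the pre/post scores reuse the paired-t example that appears below):

```python
from scipy import stats

iq = [102, 98, 110, 95, 105]    # hypothetical group IQs
pre = [18, 21, 16, 22, 19]      # pre-test scores from the example below
post = [22, 25, 17, 24, 16]     # matching post-test scores
group_a = [12, 15, 11, 14, 13]  # two hypothetical independent groups
group_b = [16, 18, 17, 15, 19]

print(stats.ttest_1samp(iq, popmean=100))  # one-sample: group vs. known value
print(stats.ttest_rel(pre, post))          # paired: same subjects measured twice
print(stats.ttest_ind(group_a, group_b))   # independent: two separate groups
```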
Paired t-test
More examples:
- Before-and-after observations on the same subjects (e.g., students' diagnostic test results before and after a particular module or course)
- A comparison of two different methods of measurement, or two different treatments, where the measurements or treatments are applied to the same subjects (e.g., blood pressure measurements)
Paired t-test
1. Calculate the difference between the two observations on each pair, making sure you distinguish between positive and negative differences.
2. Calculate statistics (mean, SD, etc.) for these difference scores.
3. Calculate the t-statistic (T). Under the null hypothesis, this statistic follows a t-distribution with n − 1 degrees of freedom (n = sample size, i.e., the number of pairs).
4. Use tables of the t-distribution to compare your value for T to the t(n−1) distribution.
Paired t-test
Example: Suppose a sample of n students were given a diagnostic test before studying a particular subject and then again after completing it.

Student   Pre-test   Post-test   Difference
1         18         22          +4
2         21         25          +4
3         16         17          +1
4         22         24          +2
5         19         16          -3
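Working the steps above on the slide's data; a minimal Python sketch (SciPy is used only for the p-value at the end):

```python
from math import sqrt
from scipy import stats

pre = [18, 21, 16, 22, 19]   # data from the slide's example
post = [22, 25, 17, 24, 16]

# Step 1: difference for each pair, keeping the sign
d = [b - a for a, b in zip(pre, post)]  # [4, 4, 1, 2, -3]

# Step 2: mean and SD of the difference scores
n = len(d)
mean_d = sum(d) / n                                         # 1.6
sd_d = sqrt(sum((di - mean_d) ** 2 for di in d) / (n - 1))  # ~2.88

# Step 3: t-statistic with n - 1 = 4 degrees of freedom
t = mean_d / (sd_d / sqrt(n))  # ~1.24

# Step 4: compare to the t-distribution
p = 2 * stats.t.sf(abs(t), df=n - 1)  # two-tailed p, well above .05
print(t, p)
```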
Independent t-test
Question: Do two samples come from different populations?
[Diagram: if NO, the data form one population (H0); if YES, the data form two distinct populations, A and B]
Independent t-test
Depends on whether the difference between samples is much greater than the difference within samples.
[Diagram: distributions A and B, where the between-sample difference >> the within-sample spread]
Degrees of freedom (df)
df = (number of independent observations) − (number of restraints)
or
df = (number of independent observations) − (number of population estimates)
df = a(n − 1), where a = number of different groups and n = number of observations per group (i.e., sample size)
Independent t-test
How many degrees of freedom when sample sizes are different? df = (n1 − 1) + (n2 − 1)
T-tables

          two-tailed alpha: .20     .10     .05
df        one-tailed alpha: .10     .05     .025
1                           3.078   6.314   12.706
2                           1.886   2.920   4.303
3                           1.638   2.353   3.182
4                           1.533   2.132   2.776
infinity                    1.282   1.645   1.960

Two samples, each n = 3, with a t-statistic of 2.50: significantly different?
No! df = (3 − 1) + (3 − 1) = 4, and the two-tailed critical value at alpha = .05 is 2.776, so t = 2.50 falls short of significance.
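The same lookup can be done in code; a sketch with SciPy's t-distribution, where the quantile function gives the critical value the table shows:

```python
from scipy import stats

# Two samples, each n = 3: df = (3 - 1) + (3 - 1) = 4
critical = stats.t.ppf(1 - 0.05 / 2, df=4)  # two-tailed, alpha = .05

print(critical)         # ~2.776, matching the table row for df = 4
print(2.50 > critical)  # False: t = 2.50 is not significant
```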
One-way (one-factor) ANOVA
General form of the t-test; can have more than 2 samples.
H0: all samples are the same
H1: at least one sample is different
One-way (one-factor) ANOVA
General form of the t-test; can have more than 2 samples.
[Diagram: data from groups A, B, C. H0: one population {A B C}; H1: at least one group separate, e.g., {AB}{C}, {A}{BC}, {AC}{B}]
One-way (one-factor) ANOVA
Just like the t-test, compares differences between samples to differences within samples.
t-test statistic: t = (difference between means) / (standard error within sample)
ANOVA statistic: F = MS(between groups) / MS(within groups)
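A minimal sketch of a one-way ANOVA in SciPy, with three hypothetical groups:

```python
from scipy import stats

# Hypothetical scores for three groups
a = [4, 5, 6, 5, 4]
b = [7, 8, 6, 7, 8]
c = [5, 6, 5, 4, 6]

f_stat, p_value = stats.f_oneway(a, b, c)
print(f_stat, p_value)  # large F: between-group MS >> within-group MS
```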
ANOVA table

Source                       df      SS      MS             F             p
Treatment (between groups)   df(X)   SS(X)   SS(X)/df(X)    MS(X)/MS(E)   look up!
Error (within groups)        df(E)   SS(E)   SS(E)/df(E)
Total                        df(T)   SS(T)
Suppose there are 3 groups of treatment (i.e., one factor with three levels), and there are 5 observations per group. At alpha = 0.05, the critical value is F(2,12) = 3.89.

Source                       df   SS    MS     F      p
Treatment (between groups)   2    69    34.5   11.8   ?
Error (within groups)        12   35    2.92
Total                        14   104
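Filling in the missing p-value from the table above; a short SciPy sketch using the F-distribution:

```python
from scipy import stats

f = 34.5 / 2.92                   # MS(treatment) / MS(error) ~ 11.8
p = stats.f.sf(f, dfn=2, dfd=12)  # upper-tail probability of F(2, 12)
print(p)  # far below alpha = .05, so the treatment effect is significant

print(stats.f.ppf(0.95, dfn=2, dfd=12))  # critical value ~3.89, as on the slide
```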
Two-way ANOVA
Just like one-way ANOVA, except it subdivides the treatment SS into:
- Treatment 1
- Treatment 2
- Interaction between 1 & 2
Two-way ANOVA
Suppose there are two groups of treatment 1 and two groups of treatment 2, and there are 10 observations in each group:
- Treatment 1 (2 levels, so df = 1)
- Treatment 2 (2 levels, so df = 1)
- Treatment 1 x Treatment 2 interaction (1 df x 1 df = 1 df)
- Error: df = k(n − 1) = 4(10 − 1) = 36, where k = number of cells
Source                      df   SS          MS           F
Treatment 1                 1    SS(T1)      MS(T1)       MS(T1)/MS(E)
Treatment 2                 1    SS(T2)      MS(T2)       MS(T2)/MS(E)
Treatment 1 x Treatment 2   1    SS(T1xT2)   MS(T1xT2)    MS(T1xT2)/MS(E)
Error (within groups)       36   SS(E)       MS(E)
Total                       39   SS(T)
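A sketch of the same 2 x 2 layout using statsmodels (assumed available); the data are simulated, so the SS and F values are arbitrary, but the df column matches the table above:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)

# Hypothetical 2 x 2 design with 10 observations per cell (n = 40)
df = pd.DataFrame({
    "t1": np.repeat(["a1", "a2"], 20),
    "t2": np.tile(np.repeat(["b1", "b2"], 10), 2),
})
df["score"] = rng.normal(50, 5, size=40)

# Model with both main effects and the interaction
model = ols("score ~ C(t1) * C(t2)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # df: 1, 1, 1 for effects; 36 for residual
```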
Interactions
No interaction: the combination of treatments gives an additive effect.
[Figure: plot of cell means across T1 levels 1 and 2, with parallel lines for the two T2 levels]
Interactions
Interaction: the combination of treatments gives a non-additive effect. On a plot of cell means: anything not parallel!
How to report
Independent t-test (example): "There was no overall difference in performance on control RAT items between younger and older adults, Ms = 0.39 and 0.32, respectively, t(18) = 1.34, p > .05."
How to report
ANOVA (or F-test) (example): "Reading time (in seconds) on the control story was compared to the mean reading time for the four stories with distraction using a 2 (Age: young and old) x 2 (Story Type: without and with distraction) ANOVA with age as a between-subject variable and story type as a within-subject variable. Older adults were slower overall than younger adults, Ms = 66.45 and 51.33, respectively, F(1, 18) = 18.94, p < .01; the stories with distraction took longer to read than the stories without distraction, Ms = 79.83 and 37.95, respectively, F(1, 18) = 202.44, p < .01; and, in replication of the earlier work, the slowdown between the stories with and without distraction was greater for older than for younger adults, F(1, 18) = 7.43, p < .05."