Measurement & Data Analysis

Overview of Measurement
Variability & Measurement Error
Descriptive vs. Inferential Statistics
Descriptive Statistics
Distributions
Standardized Scores
Graphing Data
Assignment #5: Dealing with Data in Excel

On the importance of math & measurement

No human investigation can be called real science if it cannot be demonstrated mathematically. ~ Leonardo da Vinci

The stature of a science is commonly measured by the degree to which it makes use of mathematics. ~ S. S. Stevens

When you can measure what you are talking about, and express it in numbers, you truly know something about it. Knowledge without measurement or numbers is meager and unsatisfactory. ~ Lord Kelvin

The trouble with measurement is its seeming simplicity. ~ anonymous

Steps Involved in Doing Scientific Research

Question > Hypothesis > Research Design > Measurement > Analysis > Interpretation > Reporting

Measurement

Measurement refers to the assignment of numbers to people, behaviors, objects, events, etc. When a number is assigned, it refers to a "unit" of some underlying property (e.g., miles, hours, correct responses, aggressive behaviors, eye-blinks, etc.).

Accurate, objective measurement is the cornerstone of all science, including psychology.
Variability

Refers to how measurements vary across individuals, situations, and time.

Performance on Psych Exam
1. Todd = 92%
2. Sarah = 88%
3. Kelly = 82%
4. Sam = 78%
5. Maria = 62%

GRE Performance in December
1. Sam = 1320
2. Sarah = 1280
3. Maria = 1150
4. Todd = 1090
5. Kelly = 990

GRE Performance in January
1. Sam = 1300
2. Todd = 1290
3. Sarah = 1270
4. Maria = 1180
5. Kelly = 1000

Psychology is the study of behavioral, cognitive, and emotional variability. Research questions are questions about variability. Statistics were designed to describe and account for variability.

Measurement Error

Observed Score = True Score + Measurement Error

Observed Score (Psych Exam)
1. Todd = 92%
2. Sarah = 88%
3. Kelly = 82%
4. Sam = 78%
5. Maria = 62%

"True Score" (True knowledge)
1. Sarah = 95%
2. Todd = 89%
3. Kelly = 85%
4. Maria = 82%
5. Sam = 75%

"Measurement Error" (sources)
Stable Attributes
Transient States
Situational Factors
Measure Features
Scoring Errors

Scales of Measurement

Scales of measurement refer to the meaning & interpretation of the numbers that are used to measure something. They are critical for determining the math operations that can be performed on the numbers, as well as the kind of statistics that one can use to analyze them.

There are four different scales of measurement, each of which emphasizes a specific property:

Scale       Property
Nominal     Identity
Ordinal     Magnitude
Interval    Equal Intervals
Ratio       True Zero Point
[NOIR]      [I M EZ]
Nominal. Used to place people or behaviors into two or more categories based on their identity (e.g., Male = 1, Female = 2). Values are not used in calculations. Studies using nominal scales often involve simply counting up the number of particular people or behaviors. Examples of variables measured with a nominal scale include Gender, Race, Age (Young/Old), Club Membership, Zip Code, etc.

Examples of empirical questions employing nominal scales:
Are men or women more likely to help a stranded motorist?
Are Psych majors or Computer-Science majors more likely to experiment with drugs?

Ordinal. Used to rank order things (from best to worst, likely to unlikely, amount of agreement, etc.) based on their magnitude (strength). No assumption is made about the relative distance (spacing) between rankings; numeric comparisons are limited to > & <. In both cases below the class ranks are identical, even though the spacing between GPAs is very different.

Class Rank (Ordinal)    Case 1              Case 2
1                       Dan's GPA = 3.2     Dan's GPA = 4.0
2                       Sue's GPA = 3.1     Sue's GPA = 3.0
3                       Bob's GPA = 3.0     Bob's GPA = 2.8

Examples of empirical questions employing ordinal scales:
Do football teams with new coaches tend to move up or down in end-of-the-year team standings?
On a Likert scale, how do older adults rank movies that differ in the amount of violence they contain?

Interval. Similar to an ordinal scale (numbers reflect magnitudes) but includes the idea of equal intervals between numbers. Values can be used in calculations (means, SDs, etc.) but there is no true zero amount (cf. temperature in °C or °F). Many common tests in psychology are based on interval scales (tests of intelligence, anxiety, shyness, etc.).

Examples of empirical questions employing interval scales:
Does Prozac reduce depression as measured by the Beck Depression Inventory?
Does growing up in an enriched environment increase intelligence as measured by the WAIS?
Does marijuana use increase neuroticism as measured by the Five-Factor Personality Test?
Ratio. Similar to an interval scale (numbers reflect magnitudes with equal intervals) but includes a true zero point as well. Typically found in studies measuring physical properties (height, weight, time) as well as tests that measure the number, proportion, or % of correct or incorrect responses.

Examples of empirical questions employing ratio scales:
Does alcohol slow reaction time?
Does distraction during studying decrease exam performance?
Does the volume of tissue in the frontal lobes decrease with age?

Descriptive vs. Inferential Statistics

Descriptive Statistics. Describe a set of data using measures of central tendency (mean, mode, median), dispersion (range, variance, standard deviation), and association (correlation). Focused on a sample taken from a larger population.

Inferential Statistics. Allow one to make inferences about a set of data using a variety of statistical tools (Chi-square, t-test, ANOVA, ANCOVA, etc.). Such inferences are based on the mathematical relations between a sample and the population from which it was drawn. Although performed on samples, the focus is on generalizing to the population from which the samples were drawn.

Descriptive Statistics: Central Tendency

Measures of central tendency reduce a large set of numbers to just one that captures the "center" of the distribution.
"What was the average score?"
"What was the typical level of test performance?"

Ten test scores: 60 68 68 68 74 78 85 94 96 99

Mean (M or X̄ for sample, µ for population) = Σ(scores) / (number of scores) = 79
Median = the middle score (such that 1/2 fall above it, and 1/2 fall below it) = 76
Mode = the most common (frequent) score = 68
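
As a quick check on the three values above, here is a minimal Python sketch that recomputes them from the ten test scores. It uses only the standard-library statistics module; the variable names are illustrative and not part of the course materials.

```python
from statistics import mean, median, mode

# The ten test scores from the slide.
scores = [60, 68, 68, 68, 74, 78, 85, 94, 96, 99]

# Mean: sum of the scores divided by the number of scores.
mean_score = mean(scores)

# Median: with an even number of scores, the average of the
# two middle scores (here 74 and 78).
median_score = median(scores)

# Mode: the most frequently occurring score.
mode_score = mode(scores)

print("Mean:", mean_score)
print("Median:", median_score)
print("Mode:", mode_score)
```

Note that the three measures disagree here (79, 76, 68), a reminder that the "center" of a distribution depends on which measure is used.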
Descriptive Statistics: Dispersion

Measures of dispersion describe how much variability or "spread" there is in a set of numbers.

Ten test scores: 60 68 68 68 74 78 85 94 96 99

Range = the distance from the lowest to the highest score = 39 [60 to 99]

Variance (S² for sample, σ² for population) = Σ(score - mean)² / (number of scores) = 170.0
= The average squared deviation from the mean.

Standard Deviation (S or SD for sample, σ for pop.) = √(variance) (the square root of the variance) = 13.0
= Roughly, the average deviation from the mean.
= How much, on average, does each value differ from the mean?

Variance helps describe data, esp. when means are the same.
[Figure: Target Shooting under various conditions]

Descriptive Statistics: Association

Measures of association (correlation) describe how two variables are related to one another.
[Figure: example scatterplots with r = +.90, r = -.95, and r = .00]

Characterized by a value (the "correlation coefficient") that ranges from -1.0 to +1.0. The most common correlation coefficient in psychology is the Pearson product-moment coefficient (r).
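
Returning to the ten test scores from the dispersion slide, the sketch below works through the range, variance, and standard deviation step by step. It is a minimal illustration, assuming (as the slide's value of 170.0 implies) that the sum of squared deviations is divided by the number of scores; the variable names are illustrative.

```python
from math import sqrt

# The ten test scores from the slide.
scores = [60, 68, 68, 68, 74, 78, 85, 94, 96, 99]
n = len(scores)

# Range: highest score minus lowest score (99 - 60).
score_range = max(scores) - min(scores)

# Mean, needed for the deviations below.
mean_score = sum(scores) / n

# Variance: the average squared deviation from the mean.
# Assumption: divide by n, which reproduces the slide's 170.0.
squared_deviations = [(x - mean_score) ** 2 for x in scores]
variance = sum(squared_deviations) / n

# Standard deviation: the square root of the variance.
sd = sqrt(variance)

print("Range:", score_range)
print("Variance:", variance)
print("SD:", round(sd, 1))
```

Many packages instead divide by n - 1 when estimating a population value from a sample (in Excel, VAR.S and STDEV.S divide by n - 1, while VAR.P and STDEV.P divide by n), so results can differ slightly from the values computed here.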
Distributions

"Distributions" refer to how the values of a variable are spread out across a group of individuals or events.

Frequency Distribution: How often a value occurs across individuals or events.

Histogram: A bar graph of a frequency distribution. It allows one to easily see Central Tendency (or lack thereof) & Dispersion.
[Figures: A Unimodal Distribution; A Bimodal Distribution]

Frequency distributions are often shown as line graphs, which are created by drawing a continuous line through the top/center of each bar.

Distributions come in many shapes. One important aspect of shape is how "skewed" the distribution is.

One of the most important distributions in statistics is the Normal (Gaussian) Distribution (aka the Bell Curve).

Many phenomena in the life sciences & psychology are normally distributed (height, weight, shoe size, intelligence, etc.).

Normal distributions are symmetrical around their center, allowing one to easily describe a large amount of data.
A Normal Distribution can be completely described by two parameters: µ and σ².

Normal distributions can be tall & skinny (thereby indicating low variance) or short & flat (thereby indicating high variance).
[Figure: The Normal (Gaussian) Distribution, low variance vs. high variance]

A key point in statistics: If an empirical distribution resembles the normal distribution, then the normal can be used to draw conclusions about the empirical data.

Standardized Scores

Allow one to make comparisons among tests based on different units of measurement (e.g., reaction time & % correct).

The most common standardized score is the Z score.

Z = (score - mean) / SD

A set of Z scores has a mean of 0 and an SD of 1.

[Figure: The effect of age on four measures of cognition having different units of measurement.]
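
To make the Z formula concrete, the sketch below standardizes the ten test scores used in the descriptive statistics slides (60 68 68 68 74 78 85 94 96 99) and confirms that the resulting Z scores have a mean of 0 and an SD of 1. It is a minimal illustration, not the course's prescribed method: the helper name z_scores is hypothetical, and the SD is assumed to be the divide-by-N version used in those slides.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw scores to Z scores: (score - mean) / SD.

    Assumption: SD is the divide-by-N standard deviation (pstdev).
    """
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

# The ten test scores from the descriptive statistics slides.
scores = [60, 68, 68, 68, 74, 78, 85, 94, 96, 99]
z = z_scores(scores)

# Each raw score is now expressed in SD units from the mean.
for raw, standardized in zip(scores, z):
    print(raw, round(standardized, 2))

# Up to floating-point rounding, the mean is 0 and the SD is 1.
print("Mean of Z scores:", round(mean(z), 6))
print("SD of Z scores:", round(pstdev(z), 6))
```

Because Z scores are unit-free, the same function could be applied to reaction times in milliseconds and to % correct, and the two sets of Z scores could then be compared directly.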
Graphing Data

In General
X-axis (aka the abscissa); horizontal; usually used to depict the IV.
Y-axis (aka the ordinate); vertical; usually used to depict the DV.

Bar Graphs. Used to show categorical variables.
[Figure: bar graph, "Cigarette Smoking as a function of Different Psychological Disorders"; X-axis: Psychological Disorder (Anxiety, Depression, Schizophrenia); Y-axis: 0 to 60; data labels: 33, 38, 56]

Line Graphs. Used to show continuous variables.
[Figure: line graph, "Highway Accidents as a function of Blood Alcohol Content (BAC)"; X-axis: Blood Alcohol Content (0.00, 0.05, 0.10); Y-axis: Number of accidents (0 to 50)]

Scatterplot (aka Scatter Diagram). Used to depict the relation between two variables.

Today's Lab Activity

Excel Tutorial: Covering Basic & Advanced Excel functions as well as Formulas and Charts.

Assignment #5: Due Next Thursday

Organize, Format, Analyze, Graph, & Interpret data in Excel. To do this correctly, you will need to closely follow the posted instructions for Assignment 5.

Assignment 5 is posted on the web, along with the data you will be analyzing, and examples of the two pages you will be submitting.

For this assignment, you will be (a) sending an Excel file to the TA; and (b) turning in 2* pages from your Excel file to me (your data & t-tests, and your graph).

Mail your Excel file to MethodsTA@yahoo.com & turn in your hardcopies to me.