Psychology 312: Lecture 6 Scales of Measurement
Slide #1 Scales of Measurement Reliability, validity, and scales of measurement.
In this lecture we will discuss scales of measurement.
Slide #2 Outline Reliability and validity Types of scales of measurement How to determine appropriateness of measure.
In doing so we will review the issues of reliability and validity. We will address the common types of scales of measurement used in social science research. Finally, we will talk about how best to determine the appropriateness of a particular measure.
Slide #3 Measurement Allows us to test our hypotheses. Relates to our operational definitions. o How we translate our variable/concept into something that we can measure and record.
The issue of measurement is central to good experimental design, because measurement allows us to formally test our hypothesis. This relates back to the idea of the operational definition, a term that we introduced in the previous lecture. You will recall that the operational definition allows us to translate our variable or concept of interest into something that we can actually measure and record. In the case of the independent variable, we must define that variable in order to determine how to manipulate it across the conditions in our experiment. In the case of the dependent variable, the operational definition tells us how we plan to measure and record that variable during the experiment.
Slide #4 Measurement Should be reliable and valid. o Reliable Consistent Shows same effect repeatedly. o Valid
Reflects the variable you say it measures (construct validity).
When making decisions about measurement we want to be sure that we choose a measurement strategy that is both reliable and valid. By reliable I mean that our measurement should be consistent, or that it should show the same effect repeatedly. In terms of validity, our measurement is valid if it in fact reflects the variable we say it reflects. Said another way, it has high construct validity. It may be helpful to think about a concrete example. Imagine that you have a bathroom scale. Your scale would be reliable if, upon stepping on the scale five times in a row, it consistently produced the same weight. The scale would be valid if the weight that it shows is a true and accurate reflection of your actual weight.
Slide #5 Measurement Note. A measurement can be reliable but not valid. But a measure cannot be valid without also being reliable.
Using that same example, you should note that a measurement can be reliable without being valid. If the bathroom scale produced the same reading five times in a row, but that reading was not an accurate reflection of your actual weight, then your scale would be reliable but not valid. In contrast, a measurement cannot be valid without also being reliable. If the bathroom scale produced five different readings, there would be no way of knowing which of those five readings, if any, was an accurate reflection of your true weight.
Slide #6 Measurement The internal validity of a study depends upon both the reliability and the validity of the measures within that study.
Putting all of that together, you can see that the internal validity of the study depends upon both the reliability and the validity of the measures within that study. If our ultimate goal is to maximize internal validity, we must take particular care in selecting measures that we are confident are both reliable and valid.
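To make the bathroom-scale example concrete, here is a minimal sketch in Python. The true weight and the two sets of readings are hypothetical numbers chosen only for illustration, and the helper names `spread` and `bias` are assumptions of this sketch, not standard psychometric terms. The spread of repeated readings stands in for reliability, and the distance of the average reading from the true weight stands in for validity.

```python
from statistics import mean, pstdev

# Hypothetical true weight in pounds (an assumption for illustration).
TRUE_WEIGHT = 150.0


def spread(readings):
    """Standard deviation of repeated readings; 0 means perfectly consistent."""
    return pstdev(readings)


def bias(readings, true_value):
    """Average reading minus the true value; 0 means accurate on average."""
    return mean(readings) - true_value


# Five hypothetical readings from a scale that is consistent but 5 lb too high.
consistent_but_off = [155.0, 155.0, 155.0, 155.0, 155.0]

# Five hypothetical readings from a scale that gives a different answer each time.
inconsistent = [142.0, 158.0, 149.0, 161.0, 145.0]

for label, readings in (
    ("consistent but off", consistent_but_off),
    ("inconsistent", inconsistent),
):
    print(label, "spread:", spread(readings), "bias:", bias(readings, TRUE_WEIGHT))
```

The first scale has no spread at all, so it is reliable, yet every reading misses the true weight, so it is not valid. The second scale happens to average close to the true weight, but because no single reading can be trusted, we would not call it valid either.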
Slide #7 Scales of Measurement Scaling defines the rules we use to transform our observations into numbers. Determines how we: o Display the data o Analyze the data o Interpret the data
Let's turn now to the issue of scales of measurement. Scaling defines the rules we use to transform our observations into numbers. In doing so, scaling determines how we will display our data, how we will analyze our data, and ultimately what we will be able to interpret from our data.
Slide #8 Scaling Example: Let's say we're interested in emotional response to particular visual stimuli.
To explore the issue of scaling, let's look at a specific example. Imagine that we are interested in evaluating the emotional response to particular visual stimuli. I once had a colleague who was interested in this particular question. In her laboratory, participants were asked to view a range of visual stimuli. As they did so, she would measure their emotional response.
Slide #9 Scaling Example: How are we going to operationally define/measure emotional response? Changes in... o Facial expression? o EEG activity? o Heart rate? o Self-report? o Some observable behavioral measure?
So in this type of experiment, how might one go about operationally defining or measuring an emotional response? Notice that we could do so in a number of different ways. We could potentially measure changes in facial expression, or perhaps EEG activity, or heart rate, or perhaps use some self-report measure, or finally use some other observable behavioral measure. All of these could potentially serve as ways of defining or measuring emotional response. Which one we choose would determine things like how we could display our data, how we could analyze our data, and finally how we could interpret our data.
Slide #10 Scaling Example: Then, we must decide what to record about the response: o Type? o Frequency? o Duration? o Latency? (i.e., time from presentation of stimulus to onset of first response) o Strength/Intensity? o Some combination of above? These all reflect scaling (numerical code)
Once we have decided which particular changes we are going to use as a measure of emotional response, we also have to determine what specifically about those changes we plan to record. For instance, will it be the type of change, the frequency of the change, the duration, the latency, the strength or intensity, or some combination of these? All of these relate back to the issue of scaling, because in every instance we are specifying how we plan to take our observations and translate them into a numeric code.
Slide #11 Scales of Measurement Types of scales of measurement: o Nominal o Ordinal o Interval o Ratio
In social science research there are several common types of scales of measurement. They include nominal, ordinal, interval, and ratio. We will walk through each of these in turn.
Slide #12 Nominal Scale Numbers represent categories or labels. Differences btw categories are qualitative (of kind), not quantitative (of degree, magnitude).
By definition, nominal data are data coded to represent categories or labels. The difference between the categories in this instance is qualitative, meaning of kind, rather than quantitative, meaning of degree or magnitude.
Slide #13 Nominal Scale Examples: o Code for sex: Female = 1; male = 2 o Code for emotional response to images: grimace = 1, gag = 2, look away = 3, close eyes = 4
Common examples of nominal data include the following. For instance, we could use a nominal scale to code for sex. In our experiment all females could be coded with a one and all males could be coded with a two. In addition, if we wanted to code emotional response to the images we might use something like the following. If our participant grimaces while viewing an image we would assign a one, a gag response would receive a two, a look away a three, and finally if they close their eyes they receive a four.
Slide #14
Nominal Scale Cannot add, subtract, multiply, or divide nominal data. Can only calculate frequency or percentage in each category.
It is important to note that nominal data cannot be added, subtracted, multiplied, or divided. This is because these mathematical manipulations do not make sense for categorical data. You can, however, use nominal data to calculate things like the frequency or percentage in each category. So for instance we could report the percentage of females in our experiment vs. the percentage of males. Notice, then, that nominal data are largely descriptive in nature.
Slide #15 Ordinal Scale Provides categories AND rank. Numbers reflect some degree of quantitative difference (ranking). BUT, differences between values are not necessarily equal.
If the goal of your research goes beyond description, you will need to move to at least an ordinal scale. Ordinal data provide both categories and rank. The numbers in this type of scale reflect some degree of quantitative difference, that is, a ranking. You should note, however, that in this type of scale the differences between those values are not necessarily equal and cannot be assumed to be equal.
Slide #16 Ordinal Scale Examples: o Rank images in terms of disgust, 1 = most disgusting 2 = moderately disgusting 3 = least disgusting Note: the difference between a score of 1 and 2 may be large, but the difference between a score of 2 and 3 may be small.
An example of ordinal data in the present experiment might be something like the following. Imagine that we show participants three images and ask them to rank those images in terms of their disgust, using one for the most disgusting image and three for the least disgusting. These are ordinal data, because we are asking participants to categorize three images, but the categorization has an implied ranking.
Notice, however, that for a particular participant one image may be very disgusting while the other two are not particularly distressing. In this case the difference between a score of one and two may be quite large, but the difference between a score of two and three might be very small.
Slide #17 Ordinal Scale
Examples: o Scales on a questionnaire: 1 = strongly agree 2 = agree 3 = neutral 4 = disagree 5 = strongly disagree
Here is another example of ordinal data that is very common in social science research. We frequently use scales like this one on questionnaires, where we ask a participant a series of questions and for each one ask them to indicate whether they strongly agree, agree, are neutral, disagree, or strongly disagree with the statement. This is also an example of ordinal data, because we are asking them to categorize their answer, but the underlying scale implies a ranking.
Slide #18 Interval Scale Values are related by a single, underlying quantitative dimension with equal intervals between the scale values. o Can be positive values, zero, & negative values. BUT, no absolute or true zero point representing total lack or absence of value.
The third type of scaling is the interval scale. With interval data, as with ordinal data, the values are related by a single, underlying quantitative dimension. However, unlike ordinal data, interval data include or assume equal intervals between the scale values. On an interval scale you can potentially have positive values, a zero value, and negative values. However, keep in mind that in this type of scale there is no absolute or true zero point, that is, a point that actually represents total lack or absence of a value.
Slide #19 Interval Scale Examples: o Fahrenheit temperature: Different values represent different levels of heat (nominal). Higher values represent greater heat (ordinal). Differences between values represent equal intervals. BUT, zero does not represent the absence of heat.
One of the best examples of interval scaling is the Fahrenheit scale. When we measure temperature in Fahrenheit, we assume that different values represent different levels of heat. In addition, higher values represent greater heat, and the differences between those values represent equal intervals.
However, on this scale zero does not represent the total absence of heat.
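One consequence of having no true zero can be sketched in a few lines of Python. The temperatures below are hypothetical values picked for illustration; the only outside fact used is the standard Fahrenheit-to-Celsius conversion. The sketch shows that equal intervals stay equal when the same temperatures are re-expressed in Celsius, while the ratio between two readings changes completely, which suggests that the ratio was never meaningful to begin with.

```python
def fahrenheit_to_celsius(deg_f):
    """Standard conversion from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Three hypothetical readings, equally spaced in Fahrenheit.
cool_f, mild_f, warm_f = 40.0, 60.0, 80.0
cool_c, mild_c, warm_c = (fahrenheit_to_celsius(t) for t in (cool_f, mild_f, warm_f))

# Intervals: equal steps in Fahrenheit remain equal steps in Celsius.
step_low_c = mild_c - cool_c
step_high_c = warm_c - mild_c
print("Celsius steps:", step_low_c, step_high_c)

# Ratios: 80 is "twice" 40 in Fahrenheit, but the same two temperatures
# give a very different ratio once expressed in Celsius.
ratio_f = warm_f / cool_f
ratio_c = warm_c / cool_c
print("Fahrenheit ratio:", ratio_f)
print("Celsius ratio:", ratio_c)
```

Because the zero point on each scale is arbitrary, moving it changes every ratio while leaving the differences between values intact.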
Slide #20 Interval Scale Examples: o IQ test: Different values represent different levels of intelligence (nominal). Higher values represent greater intelligence (ordinal). BUT, zero does not represent the absence of intelligence.
An example that is perhaps more relevant to social science would be the IQ test. On an IQ test, different values represent different levels of intelligence, and higher values represent greater intelligence. At the same time, we assume that the differences between scores on an intelligence test represent equal intervals. However, a zero on these types of tests does not represent the total absence of intelligence.
Slide #21 Interval Scale Can do mathematical operations. o add, multiply, subtract, divide, etc. But, cannot make claims based on relative magnitudes! o E.g., 4 deg. F not twice as hot as 2 deg. F o E.g., 50 IQ not twice as intelligent as 25 IQ
One of the advantages of interval data is that they allow for several mathematical operations. So for instance you can add, multiply, subtract, and divide these types of data. One of their limitations, however, is that we cannot use interval data to make claims based on relative magnitudes. So for instance it is not accurate to say that four degrees Fahrenheit is twice as hot as two degrees Fahrenheit. Likewise, it is not accurate to say that an IQ score of fifty is twice as intelligent as an IQ score of twenty-five.
Slide #22 Ratio Scale Everything interval data has with the difference being that there is an absolute zero. o Zero is the lowest value o True 0 represents lack of/absence of a value.
Our last type of scaling is the ratio scale. In ratio data we have everything that interval data has, with the difference being that there is also a true or absolute zero in this type of scale. We assume in this case that zero is the lowest value. Said another way, a true zero represents the complete lack or absence of a value.
Slide #23 Ratio Scale
Examples: o Food consumption o Reaction time o Accuracy
Examples of ratio data would include food consumption, reaction time, and accuracy. In all three cases a value of zero would represent the complete absence of the variable.
Slide #24 Ratio Scale Can do mathematical operations. o Add, multiply, subtract, divide, etc. Can make relationship statements based on ratios. o Can say, value 4 is twice as much as value 2.
Ratio data are often attractive because, like interval data, they are amenable to many mathematical operations. For instance, you can add, multiply, subtract, and divide these types of data. In addition, using this type of scale we are able to make relationship statements based on ratios. So for instance it is accurate to say on a ratio scale that a value of four is twice as much as a value of two.
Slide #25 How to Determine Appropriateness of Measure? Consider your hypothesis. Consider the operational definition(s) of your variables. o What measure makes the most sense? o What instrument/device will you use? Questionnaire? Stop watch?
As we have walked through our discussion of scales of measurement, you may have been asking yourself, how do I determine the appropriateness of a particular measure? It is an excellent question, and one that we have to consider before we close this discussion. When trying to determine the best way to measure the variables in your experiment, you should start by going back to your original hypothesis, because the nature of this hypothesis will probably point you to the appropriate measurement. For instance, if the intention of your experiment is merely descriptive in nature, then nominal data may fit the bill, but if you want to say something about predictive power or causality you will likely have to step up to ordinal or ratio data. You will also want to consider the operational definition of your variables. For instance, you might ask yourself what measure makes the most sense in this case.
Does it make the most sense to measure emotional response based on facial expression or might I be more interested in measuring something like heart rate? Might that be a more valid measure of emotional response?
This question will also be related to the instrument that you plan to use to take those measurements. So for instance, if you wanted to measure emotional response using self-report, then a questionnaire would be the most appropriate device, but if you wanted to measure the duration of a facial expression you are probably going to need a stop watch.
Slide #26 How to Determine Appropriateness of Measure? Then ask o Is it a reliable measure? o Is it a valid measure?
Finally, before settling on a measurement, you will want to be sure to ask yourself whether it is a reliable measure in this instance and whether it is a valid measure. You want to be sure, of course, that your measurements at the end of the day are both reliable and valid, thus ensuring high internal validity in your experiment and making you confident that, when you interpret your results, you can reasonably conclude that any change in your dependent variable was truly a reflection of the independent variable.
Slide #27 Next Lecture That concludes this lecture. Next we will discuss Descriptive Statistics.
That concludes this lecture. Next we will discuss Descriptive Statistics.