Appendix B
Grouped Frequency Distributions and Central Tendency

OBJECTIVES FOR APPENDIX B

After studying the text and working the problems, you should be able to:

1. Use four conventions for constructing grouped frequency distributions
2. Arrange raw data into a grouped frequency distribution
3. Find the mean, median, and mode of a grouped frequency distribution

(In writing this appendix, I assumed that you have studied Chapter 2, Frequency Distributions and Graphs, and the central tendency sections of Chapter 3.)

As you know from Chapter 2, converting a batch of raw scores into a simple frequency distribution brings order out of apparent chaos. For some distributions, even more order can be obtained if the raw scores are arranged into a grouped frequency distribution. The order becomes even more apparent when grouped frequency distributions are graphed. In addition to grouping and graphing, this appendix covers the calculation of the mean, median, and mode of grouped frequency distributions.

Grouped frequency distributions are used when the range of scores is too large for a simple frequency distribution. How large is too large? A rule of thumb is that grouped frequency distributions are appropriate when the range of scores is greater than 20. At times, however, ignoring this rule of thumb produces an improved analysis.

Grouped Frequency Distributions

As you may recall from Chapter 2, the only difference between simple frequency distributions and grouped frequency distributions is that grouped frequency distributions
have class intervals in the place of scores. Each class interval in a grouped frequency distribution covers the same number of scores. The number of scores in the interval is symbolized i (interval size).

Establishing Class Intervals

There are no hard-and-fast rules for establishing class intervals. The ones that follow are used by many researchers, but some computer programs do not follow them.

1. The number of class intervals. The number of class intervals should be 10 to 20. On the one hand, with fewer than 10 intervals, the extreme scores in the data are not as apparent because they are clustered with more frequently occurring scores. On the other hand, more than 20 class intervals often make it difficult to see the shape of the distribution.

2. The size of i. If i is odd, the midpoint of the class interval will be a whole number, and whole numbers look better on graphs than decimal numbers. Three and five often work well as interval sizes. You may find that i = 2 is needed if you are to have 10 to 20 class intervals. If an i of 5 produces more than 20 class intervals, data groupers usually jump to an i of 10 or some multiple of 10. An interval size of 25 is popular.

3. The lower limit of a class interval. Begin each class interval with a multiple of i. For example, if the lowest score is 5 and i = 3 (as happened with the Satisfaction With Life Scale (SWLS) scores in Table 2.4), the first class interval should be 3–5. An exception to this convention occurs when i = 5. When the interval size is 5, it is usually better to use a multiple of 5 as the midpoint because multiples of 5 are easier to read on graphs.

4. The order of the intervals. The largest scores go at the top of the table. (This is a convention not followed by some computer programs.)

Converting Unorganized Scores into a Grouped Frequency Distribution

With the conventions for establishing class intervals in mind, here are the steps for converting unorganized data into a grouped frequency distribution.
As an example, I will use the raw data in Table 2.1 and describe converting it into Table 2.4.

1. Find the highest and lowest scores. In Table 2.1, the highest score is 35 and the lowest score is 5.

2. Find the range of the scores by subtracting the lowest score from the highest score (35 - 5 = 30).

3. Determine i by a trial-and-error procedure. Remember that there are to be 10 to 20 class intervals and that the interval size should be convenient (3, 5, 10, or a multiple of 10). Dividing the range by a potential i value gives the approximate number of class intervals. Dividing the range, 30, by 3 gives 10, which is a recommended number of class intervals.

4. Establish the lowest interval. Begin the interval with a multiple of i, which may or may not be an actual raw score. End the interval so that it contains
i scores (but not necessarily i frequencies). For Table 2.4, the lowest interval is 3–5. (Note that 3 is not an actual score but is a multiple of i.) Each interval above the lowest one begins with a multiple of i. Continue building the class intervals.

5. With the class intervals written, underline each score (Table 2.1) and put a tally mark beside its class interval (Table 2.4).

6. As a check on your work, add up the frequency column. The sum should be N, the number of scores in the unorganized data.

PROBLEMS

*B.1. A sociology professor was deciding what statistics to present in her introduction to sociology classes. She developed a test that covered concepts such as the median, graphs, standard deviation, and correlation. She tested one class of 50 students, and on the basis of the results, planned a course syllabus for that class and the other six intro sections. Arrange the data into an appropriate rough-draft frequency distribution.

0 56 48 13 30 39 5 41 5 44 7 36 54 46 59 4 17 63 50 4 31 19 38 10 43 31 34 3 15 47 40 36 5 31 53 4 31 41 49 1 6 35 8 37 5 33 7 38 34

*B.2. The measurements that follow are weights in pounds of a sample of college men in one study. Arrange them into a grouped frequency distribution. If these data are skewed, tell the direction of the skew.

164 158 156 148 180 176 171 150 15 155 161 168 148 175 154 155 149 149 151 160 157 158 161 167 15 168 151 157 150 154 189

Central Tendency of Grouped Frequency Distributions

Mean

Finding the mean of a grouped frequency distribution involves the same arithmetic as that for a simple frequency distribution. Setting up the problem, however, requires one additional step. Look at Table B.1, which has four columns (compared to the three in Table 3.3). For a grouped frequency distribution, the midpoint of the interval represents all the scores in the interval. Thus, multiplying the midpoint by its f value includes all the scores in that interval.
As you can see at the bottom of Table B.1, summing the fX column gives $\Sigma fX$, which, when divided by N, yields the mean.
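The arithmetic just described can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the midpoints and frequencies are those of Table B.1, and the function name grouped_mean is a hypothetical label chosen for this example.

```python
# Midpoints (X) and frequencies (f) of the class intervals in Table B.1,
# listed with the largest scores first, as in the table.
midpoints = [34, 31, 28, 25, 22, 19, 16, 13, 10, 7, 4]
frequencies = [5, 11, 23, 24, 14, 8, 5, 3, 5, 0, 2]


def grouped_mean(midpoints, frequencies):
    """Return (N, sum of fX, mean) for a grouped frequency distribution.

    Each midpoint stands in for every score in its interval, so the
    product f * X approximates the sum of the scores in that interval.
    (grouped_mean is a hypothetical helper name, not from the text.)
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution contains no scores")
    sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    return n, sum_fx, sum_fx / n


n, sum_fx, mean = grouped_mean(midpoints, frequencies)
print("N =", n)
print("sum of fX =", sum_fx)
print("mean =", mean)
```

Because the midpoint replaces the actual scores, the result is an approximation of the mean of the raw data rather than an exact value.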
TABLE B.1 A grouped frequency distribution of Satisfaction With Life Scale scores with i = 3

SWLS scores         Midpoint
(class interval)    (X)          f        fX
33–35               34           5       170
30–32               31          11       341
27–29               28          23       644
24–26               25          24       600
21–23               22          14       308
18–20               19           8       152
15–17               16           5        80
12–14               13           3        39
 9–11               10           5        50
 6–8                 7           0         0
 3–5                 4           2         8
                            N = 100   $\Sigma fX$ = 2392

In terms of a formula,

$\mu \text{ or } \bar{X} = \dfrac{\Sigma fX}{N}$

For Table B.1,

$\mu \text{ or } \bar{X} = \dfrac{\Sigma fX}{N} = \dfrac{2392}{100} = 23.92$

Note that the mean of the grouped data is 23.92 but the mean of the simple frequency distribution is 24.00. The mean of grouped scores is often different, but seldom is this difference of any consequence.

Median

Finding the median of a grouped distribution is almost the same as finding the median of a simple frequency distribution. Of course, you are looking for a point that has as many frequencies above it as below it. To locate the median, use the formula

$\text{Median location} = \dfrac{N + 1}{2}$

For the data in Table B.1,

$\text{Median location} = \dfrac{N + 1}{2} = \dfrac{100 + 1}{2} = 50.5$

As before, look for a point with 50 frequencies above it and 50 frequencies below it. Adding frequencies from the bottom of the distribution, you find that there are 37 scores below the interval 24–26 and 24 scores in that interval. The 50.5th score is in the
interval 24–26. The midpoint of the interval is the median. For the grouped SWLS scores in Table B.1, the median is 25.

Thus, to find the median of a grouped frequency distribution, locate the class interval that is the location of the middle score. The midpoint of that interval is the median.

Mode

The mode is the midpoint of the interval that has the highest frequency. In Table B.1 the highest frequency count is 24. The interval with 24 scores is 24–26. The midpoint of that interval, 25, is the mode.

PROBLEMS

B.3. Find the mean, median, and mode of the grouped frequency distribution you constructed from the statistics questionnaire data (problem B.1).

B.4. Find the mean, median, and mode of the weight data in problem B.2.
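As a check on hand calculations, the median and mode procedures described in this appendix can be sketched in Python. This is a minimal illustration under stated assumptions: the intervals and frequencies are those of Table B.1, the function names are hypothetical, and the handling of ties and of a median location that falls exactly between two intervals is an assumption of this sketch, because the text does not address those cases.

```python
# Class intervals (lower limit, upper limit) and frequencies from Table B.1,
# listed with the largest scores first, as in the table.
intervals = [(33, 35), (30, 32), (27, 29), (24, 26), (21, 23), (18, 20),
             (15, 17), (12, 14), (9, 11), (6, 8), (3, 5)]
frequencies = [5, 11, 23, 24, 14, 8, 5, 3, 5, 0, 2]


def midpoint(interval):
    """Midpoint of a class interval given as (lower limit, upper limit)."""
    lower, upper = interval
    return (lower + upper) / 2


def grouped_median(intervals, frequencies):
    """Midpoint of the interval that contains the median location (N + 1) / 2.

    Frequencies are accumulated from the bottom of the distribution.
    Assumption of this sketch: if the location falls exactly between two
    intervals, the higher of the two intervals is used.
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution contains no scores")
    location = (n + 1) / 2
    cumulative = 0
    # The table lists the largest scores first, so walk it in reverse.
    for interval, f in zip(reversed(intervals), reversed(frequencies)):
        cumulative += f
        if cumulative >= location:
            return midpoint(interval)


def grouped_mode(intervals, frequencies):
    """Midpoint of the interval with the highest frequency.

    Assumption of this sketch: with tied frequencies, the interval that
    appears first in the table is returned.
    """
    highest = max(frequencies)
    return midpoint(intervals[frequencies.index(highest)])


median = grouped_median(intervals, frequencies)
mode = grouped_mode(intervals, frequencies)
print("median =", median)
print("mode =", mode)
```

The same functions can be applied to the distributions you construct for problems B.1 and B.2 by replacing the two lists with your own intervals and frequencies.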