RRN 782

Content Validation and Cut-Score Evaluation for Accuplacer's Sentence Skills Meaning Assessment

April 2007

Prepared by Keith Wurtz
Date: 20070425, Revised: 20070507
SS-ContentValidity-and-CutScores.doc

Introduction

The purpose of this paper is to present the results of a content analysis and cut-score validation of Accuplacer's Sentence Skills Meaning (SSM) test at Chaffey College. As a result of the requirement to re-evaluate the content and cut-scores of assessments every five years, as well as the recent changes in the curriculum of English courses, the SSM test at Chaffey was re-evaluated by the Language Arts faculty. The first section of this paper examines the content review of the SSM and the second section examines the cut-score validation process.

Executive Summary

Content Validity (see Table 2)

ENGL-1A
- 52% of English faculty indicated that the items on the assessment were important or critical to ENGL-1A for the successful acquisition of the skills taught in the course
- 84% of English faculty indicated that the items on the assessment were moderately important, important, or critical for the successful acquisition of the skills taught in the course

ENGL-450
- 48% of English faculty indicated that the items on the assessment were important or critical to ENGL-450 for the successful acquisition of the skills taught in the course
- 85% of English faculty indicated that the items on the assessment were moderately important, important, or critical for the successful acquisition of the skills taught in the course

ENGL-550
- 44% of English faculty indicated that the items on the assessment were important or critical to ENGL-550 for the successful acquisition of the skills taught in the course
- 81% of English faculty indicated that the items on the assessment were moderately important, important, or critical for the successful acquisition of the skills taught in the course

ENGL-500
- 39% of English faculty indicated that the items on the assessment were important or critical to ENGL-500 for the successful acquisition of the skills taught in the course
- 64% of English faculty indicated that the items on the assessment were moderately important, important, or critical for the successful acquisition of the skills taught in the course

Cut-Score Validation

Among the new methodologies that were explored, the research suggests that cut-score ranges derived from the faculty-derived averages result in the best likelihood of student success (see Table 7).

Course      Current      Average
ENGL-500    22.5-37.4    38.1-48.5
ENGL-550    37.5-70.4    48.6-74.5
ENGL-450    70.5-94.4    74.6-91.5
ENGL-1A     94.5-120     91.6-120

Conclusion

If faculty agree on the content analysis and cut-scores, the next step would involve using only the cut-score ranges to place students into courses for one year. At the end of that year, the impact of those cut-score ranges on English course success for those who follow the placement recommendation will be examined. Moreover, after one year sufficient data would exist to identify the educational background measures (i.e., "multiple measures") that increase the probability that students will successfully complete the course. To capture students for the 2007-2008 academic year, it is recommended that the placement rules be changed as of May 1st to include the new cut-score ranges agreed upon by faculty.

After meeting with the English Department, the faculty agreed on the following:

1. Based on the results presented in Table 7 and discussion among the English faculty, the following cut-scores were established:
   a. 20-38.1 See a Counselor
   b. 38.1-48.5 ENGL-500
   c. 48.6-74.5 ENGL-550
   d. 74.6-94.5 ENGL-450
   e. 94.5-120 ENGL-1A
2. The English faculty also agreed to change the cut-scores on May 1st, 2007 and to use the Fall 2007 and Spring 2008 semesters to gather data on the cut-score changes as well as the educational background measures.
3. The results of the cut-score changes and the effect of the background measures on predicting success will be evaluated in Summer 2008.
4. As a result of the research in Summer 2008, new placement rules will be generated that incorporate both the score on the Sentence Skills Meaning test and the student's educational background measures to place students.
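
The cut-scores in item 1 can be read as a simple lookup from an SSM score to a placement. The Python sketch below is illustrative only: the names `LOWER_BOUNDS`, `PLACEMENTS`, and `place` are hypothetical, and because two boundaries in the list are shared (38.1 and 94.5), the sketch assumes that a score equal to a shared boundary is placed in the higher band. The faculty agreement does not state how such ties are handled.

```python
from bisect import bisect_right

# Lower bound of each placement band, taken from item 1 of the agreement.
# Assumption (not stated in the report): a score that falls exactly on a
# shared boundary (38.1 or 94.5) is placed in the higher band.
LOWER_BOUNDS = [20.0, 38.1, 48.6, 74.6, 94.5]
PLACEMENTS = ["See a Counselor", "ENGL-500", "ENGL-550", "ENGL-450", "ENGL-1A"]
MIN_SCORE, MAX_SCORE = 20.0, 120.0


def place(score: float) -> str:
    """Return the placement for an SSM score between 20 and 120."""
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"SSM score must be between {MIN_SCORE} and {MAX_SCORE}")
    # bisect_right counts how many lower bounds are <= score; the last of
    # those bounds identifies the band the score belongs to.
    return PLACEMENTS[bisect_right(LOWER_BOUNDS, score) - 1]


if __name__ == "__main__":
    for example_score in (25.0, 38.1, 60.0, 80.0, 94.5, 120.0):
        print(example_score, "->", place(example_score))
```

Storing only the lower bounds avoids having to decide what happens to a score that falls between the printed upper bound of one band and the lower bound of the next (for example, between 48.5 and 48.6).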

Content-Related Validity Evidence

During the Spring 2007 semester, the English faculty have been actively involved in an evaluation of English courses and the skills associated with each course (see Appendix A). The skill sequence identifies the skills that students are expected to have acquired upon successful completion of the course (i.e., a grade of A, B, C, or CR). Within the core English course sequence, the exit skills for one course are the required entrance skills for the next course in the sequence. Using the English skill sequence template as a reference guide, the English faculty reviewed Accuplacer's Sentence Skills Meaning (SSM) Test to determine whether the assessment instrument continues to accurately evaluate the skills and knowledge associated with ENGL-500, 550, 450, and 1A.

Following the faculty review process, a content validity study was conducted for Accuplacer's SSM Test. Content validity uses expert judgment to assess the appropriateness of a test for making placement recommendations for a sequence of courses (Cohen & Swerdlik, 2005). Equally important, content validity identifies the items on an assessment that reflect the construct of interest (Ding & Hershberger, 2002). To measure whether a test or assessment has content validity, experts rate how well each item reflects the skill being measured (Cohen & Swerdlik; Ding & Hershberger).

Twenty-two faculty members of the English Department were asked to rate the importance of the academic skill or knowledge measured by each test item for the successful acquisition of the skills taught in each course. Not all 22 faculty evaluated every course; the faculty were asked to evaluate the courses with which they were most familiar or experienced. The English instructors
independently completed the Content Review Form (see Appendix B). A scale of 1-5 was used as follows:

5  Critical
4  Important
3  Moderately Important
2  Of Slight Importance
1  Not Relevant

Multivariate outliers were examined using Mahalanobis distances (Mertler & Vannatta, 2005) [1]. The linear regression procedure in SPSS was used to calculate the Mahalanobis distance for all of the ratings in the analysis. None of the cases were identified as outliers.

[1] Outliers are identified using chi-square values that are significant at p < .001. With 1 degree of freedom the chi-square criterion was 10.81.

In addition to examining outliers, responses were also examined for respondents who gave every item the same rating. Any respondent whose ratings were all the same number (e.g., a "2") was excluded from the analysis, because identical responses indicate the possibility that the respondent did not understand the directions for completing the review. As a result, one case in ENGL-500 and one case in ENGL-550 were excluded from the analysis.

Content Validity Results

ENGL-1A. Twelve English faculty took the SSM Test, completing the test as if they were a student who possessed the required skills to enter ENGL-1A. The mean rating for the test as a whole was 3.48 (i.e., moderately important to important), indicating support for the test's content validity with the ENGL-1A curriculum at Chaffey
Community College (see Table 1). In addition, 52% of the English faculty indicated that the items were important or critical and 84% indicated that the items were moderately important, important, or critical for the successful acquisition of the skills taught in the course (see Table 2).

ENGL-450. Seventeen English faculty took the SSM Test, completing the test as if they were a student who possessed the required skills to enter ENGL-450. The mean rating for the test as a whole was 3.46 (i.e., moderately important to important), indicating support for the test's content validity with the ENGL-450 curriculum at Chaffey Community College (see Table 1). In addition, 48% of the English faculty indicated that the items were important or critical and 85% indicated that the items were moderately important, important, or critical for the successful acquisition of the skills taught in the course (see Table 2).

ENGL-550. Nine English faculty took the SSM Test, completing the test as if they were a student who possessed the required skills to enter ENGL-550. The mean rating for the test as a whole was 3.43 (i.e., moderately important to important), indicating support for the test's content validity with the ENGL-550 curriculum at Chaffey Community College (see Table 1). In addition, 44% of the English faculty indicated that the items were important or critical and 81% indicated that the items were moderately important, important, or critical for the successful acquisition of the skills taught in the course (see Table 2).

ENGL-500. Four English faculty took the SSM Test, completing the test as if they were a student who possessed the required skills to enter ENGL-500. The mean rating for the test as a whole was 3.00 (i.e., moderately important), indicating
strong support for the test's content validity with the ENGL-500 curriculum at Chaffey Community College (see Table 1). In addition, 39% of the English faculty indicated that the items were important or critical and 64% indicated that the items were moderately important, important, or critical for the successful acquisition of the skills taught in the course (see Table 2).

Table 1
Average Faculty SSM Test Importance Ratings for ENGL-1A, 450, 550, and 500

English    Average Faculty Importance Rating
Faculty    (one value per course rated, in the column order ENGL-1A, ENGL-450, ENGL-550, ENGL-500)
1          3.25  3.55
2          4.35  4.15
3          2.85  2.10
4          2.45  2.75
5          3.85  3.45
6          2.60  2.75
7          3.40  3.40
8          3.30  4.25  4.55
9          3.70  3.15
10         3.35  3.70
11         3.00  2.85
12         3.70  3.70
13         3.80  3.90
14         4.00  3.90  3.65
15         2.60  1.40
16*
17         4.05  3.85
18         3.95  4.00
19         3.10  2.85
20         3.20
21         3.40  3.20
22         4.50

           ENGL-1A    ENGL-450    ENGL-550    ENGL-500
N          12         17          9           4
Minimum    2.60       2.45        2.10        1.40
Maximum    4.05       4.50        4.55        4.15
Mean       3.48       3.46        3.43        3.00
SD         .454       .581        .777        1.212

Note. SD is the standard deviation.
*Instructor rated every question the same and the data were excluded from the calculations.
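
The two percentages quoted for each course combine rating categories from the response counts reported in Table 2. A minimal Python sketch of that arithmetic follows, assuming the percentages are rounded to whole numbers; the names `COUNTS`, `percent_at_or_above`, and `SUMMARY` are illustrative and not part of the original analysis.

```python
# Response counts per course as reported in Table 2, ordered from
# rating 1 ("Not Relevant") to rating 5 ("Critical").
COUNTS = {
    "ENGL-1A": [3, 35, 77, 93, 32],
    "ENGL-450": [4, 47, 124, 113, 51],
    "ENGL-550": [0, 34, 67, 47, 32],
    "ENGL-500": [14, 15, 20, 19, 12],
}


def percent_at_or_above(counts, rating):
    """Percentage of responses with the given rating (1-5) or higher."""
    return 100.0 * sum(counts[rating - 1:]) / sum(counts)


# For each course: (% important or critical, % moderately important or higher),
# rounded to whole percentages as in the text.
SUMMARY = {
    course: (
        round(percent_at_or_above(counts, 4)),
        round(percent_at_or_above(counts, 3)),
    )
    for course, counts in COUNTS.items()
}

if __name__ == "__main__":
    for course, (important, moderate) in SUMMARY.items():
        print(f"{course}: {important}% important or critical, "
              f"{moderate}% moderately important or higher")
```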

Table 2
Number of Responses by Importance Ranking and Course

                        ENGL-1A        ENGL-450       ENGL-550       ENGL-500
Importance Rankings     #     %        #     %        #     %        #     %
Not Relevant            3     1.3      4     1.2      0     0.0      14    17.5
Slight Importance       35    14.6     47    13.9     34    18.9     15    18.8
Moderately Important    77    32.1     124   36.6     67    37.2     20    25.0
Important               93    38.8     113   33.3     47    26.1     19    23.8
Critical                32    13.3     51    15.0     32    17.8     12    15.0
Total                   240   100.0    339   100.0    180   100.0    80    100.0

Methodology

Faculty Derived Cut-Scores

Twenty-two English faculty independently completed Accuplacer's SSM assessment test. The participating instructors were asked to answer each question as a student would upon entrance into the course. Equally important, instructors only engaged in the assessment process for the courses with which they were most familiar. This was done for ENGL-1A, 450, 550, and 500.

Outliers. Outliers for each SSM score by course were examined using SPSS's stem-and-leaf plot procedure (Mertler & Vannatta, 2005). As a result, two outliers were identified and excluded from all analyses: a score of 119 in ENGL-500 and a score of 118 in ENGL-550.

Normality. In testing for normality, nonnormality is indicated if the skewness exceeds 1.0 (Tabachnick & Fidell, 2007; Mertler & Vannatta, 2005), the ratio of the skewness to the standard error of the skewness exceeds a z score of 3.29 (p = .001; Tabachnick & Fidell), the Kolmogorov-Smirnov (K-S) test is significant at the .001 level (Mertler & Vannatta), or the Shapiro-Wilk (S-W) test is significant at the
.001 level. The results in Tables 3 and 4 indicate that the faculty-derived SSM test scores are approximately normally distributed. None of the skewness statistics exceeds 1.0 in absolute value, none of the skewness standard error ratios reaches 3.29, and only one of the K-S and S-W statistics (the S-W test for ENGL-450, p = .046) is statistically significant, and then only at the .05 level rather than at the .001 criterion.

Table 3
Skewness and Kurtosis Standard Error Ratios for Faculty-Derived SSM Scores by Course

              Skewness           Kurtosis
Course        Stat.    Ratio     Stat.    Ratio
ENGL-500      -.671    -.735     -2.15    -1.08
ENGL-550       .390     .612     -.906    -.735
ENGL-450       .276     .501     -1.48    -1.40
ENGL-1A       -.871    -1.37     -.328    -.266

Note. Ratios were calculated from the standard errors (SE) of the skewness and the kurtosis statistics. The ratio for each skewness and kurtosis statistic was calculated by dividing the statistic by the corresponding standard error (Tabachnick & Fidell, 2007).
*p < .001.

Table 4
K-S and S-W Statistics for Faculty-Derived SSM Scores by Course

              K-S                        S-W
Course        Statistic   df    p        Statistic   df    p
ENGL-500      .289        5     .200     .875        5     .288
ENGL-550      .188        12    .200     .903        12    .171
ENGL-450      .160        17    .200     .890        17    .046*
ENGL-1A       .170        12    .200     .883        12    .096

*p < .05.
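
The skewness ratios in Table 3 can be approximated from the skewness statistics and the sample sizes (the df values in Table 4). The Python sketch below assumes the conventional small-sample formula for the standard error of skewness, SE = sqrt(6n(n - 1) / ((n - 2)(n + 1)(n + 3))); the report does not state which formula was used, and the names `skewness_standard_error`, `skewness_ratio`, and `SCREEN` are illustrative.

```python
import math

Z_CRITICAL = 3.29  # z score corresponding to p = .001, as cited in the text

# (skewness statistic from Table 3, sample size from the df column of Table 4)
SKEWNESS = {
    "ENGL-500": (-0.671, 5),
    "ENGL-550": (0.390, 12),
    "ENGL-450": (0.276, 17),
    "ENGL-1A": (-0.871, 12),
}


def skewness_standard_error(n: int) -> float:
    """Conventional standard error of sample skewness (assumed formula)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def skewness_ratio(skewness: float, n: int) -> float:
    """Skewness statistic divided by its standard error."""
    return skewness / skewness_standard_error(n)


# For each course: (ratio, True if either skewness criterion flags nonnormality)
SCREEN = {}
for course, (stat, n) in SKEWNESS.items():
    ratio = skewness_ratio(stat, n)
    SCREEN[course] = (ratio, abs(stat) > 1.0 or abs(ratio) > Z_CRITICAL)

if __name__ == "__main__":
    for course, (ratio, flagged) in SCREEN.items():
        print(f"{course}: ratio = {ratio:.3f}, flagged = {flagged}")
```

Because the published skewness statistics are rounded to three decimals, the computed ratios can differ from Table 3 in the last digit.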

Results

Tables 5 and 6 present the results of the faculty-derived cut-scores. There are at least four possible methods for deriving the cut-scores: 1) use the current cut-scores; 2) base the cut-scores on the minimum and maximum faculty-derived assessment scores; 3) use the faculty-derived averages for the cut-scores; or 4) use the faculty-derived averages in combination with the standard error of the mean.

The minimum-maximum method takes the faculty-derived minimum score in ENGL-500 as the initial cut-score to place into ENGL-500. In addition, the maximum score for ENGL-500 is used as the cut-score point for ENGL-550. Combinations of minimum and maximum scores are used to generate the ranges for the other courses. The faculty-derived average method uses the average score for each course to generate the cut-score ranges. Finally, the standard error method subtracts the standard error from the mean and uses that number to generate the cut-score ranges. The standard error identifies how much the mean varies from sample to sample.

Table 5
Descriptive Statistics for Faculty-Derived SSM Test Scores for ENGL-500, 550, 450, and 1A

                  English Course
                  500     550     450     1A
N                 5       12      17      12
Minimum           28.9    28.9    41.2    49.7
Maximum           44.6    77.5    109.8   115.6
Mean              38.1    48.6    74.6    91.6
Standard Error    3.07    5.01    6.08    6.51
SD                6.9     17.4    25.1    22.6

Note. SD is the standard deviation. Standard Error is the standard error of the mean, a measure of how much the mean may vary from sample to sample.
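
The minimum-maximum and average methods described above can be sketched from the summary statistics in Table 5. The Python below is an illustration under stated assumptions rather than the procedure actually used: it assumes each range ends 0.1 point below the next course's lower bound and that the top range ends at 120, and the names `ranges_from_lower_bounds`, `AVERAGE_RANGES`, and `MIN_MAX_RANGES` are hypothetical.

```python
MAX_SCORE = 120.0  # top of the highest range (assumed from the reported ranges)
STEP = 0.1         # scores are reported to one decimal place

# Summary statistics from Table 5, in course order 500, 550, 450, 1A.
COURSES = ["ENGL-500", "ENGL-550", "ENGL-450", "ENGL-1A"]
MINIMUM = [28.9, 28.9, 41.2, 49.7]
MAXIMUM = [44.6, 77.5, 109.8, 115.6]
MEAN = [38.1, 48.6, 74.6, 91.6]


def ranges_from_lower_bounds(lower_bounds):
    """Turn ordered lower bounds into (low, high) ranges with no gaps."""
    ranges = []
    for i, low in enumerate(lower_bounds):
        if i + 1 < len(lower_bounds):
            # Each range ends one step below the next course's lower bound.
            high = round(lower_bounds[i + 1] - STEP, 1)
        else:
            high = MAX_SCORE
        ranges.append((low, high))
    return ranges


# Average method: each course's mean is the lower bound of its range.
AVERAGE_RANGES = ranges_from_lower_bounds(MEAN)

# Minimum-maximum method: start at the ENGL-500 minimum, then start each
# later course one step above the previous course's maximum.
MIN_MAX_LOWER = [MINIMUM[0]] + [round(m + STEP, 1) for m in MAXIMUM[:-1]]
MIN_MAX_RANGES = ranges_from_lower_bounds(MIN_MAX_LOWER)

if __name__ == "__main__":
    for course, avg, mm in zip(COURSES, AVERAGE_RANGES, MIN_MAX_RANGES):
        print(f"{course}: average {avg[0]}-{avg[1]}, min-max {mm[0]}-{mm[1]}")
```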

Table 6
Possible Cut-Score Ranges for ENGL-500, 550, 450, and 1A

Possible Cut-Score          English Course
Ranges                      500          550          450           1A
Current Cut-Score Range     22.5-37.4    37.5-70.4    70.5-94.4     94.5-120
Minimum & Maximum           28.9-44.6    44.7-77.5    77.6-109.8    109.9-120
Average                     38.1-48.5    48.6-74.5    74.6-91.5     91.6-120
Average (Standard Error)    35.0-43.5    43.6-68.4    68.5-84.9     85.0-120

Table 7 indicates that the cut-score ranges generated from the faculty-derived averages produce the highest success rates in ENGL-500 and ENGL-450, and success rates within one percentage point of the highest in ENGL-550 and ENGL-1A. These cut-score ranges would raise the cut-off point for placing into ENGL-500 from 22.5 to 38.1. As a result, students would need to earn 32% of the 120 possible points, rather than 19%, to be placed into ENGL-500. One of the other noticeable changes is that the cut-score for ENGL-1A would decrease from 94.5 to 91.6.
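
The 32% and 19% figures follow from dividing each ENGL-500 cut-off by the top of the score scale. A short Python sketch of that arithmetic, assuming a maximum score of 120 (the upper end of the reported ranges); the function name `percent_of_maximum` is illustrative.

```python
MAX_SCORE = 120  # upper end of the reported cut-score ranges (assumed maximum)


def percent_of_maximum(cut_score: float) -> float:
    """Cut-score expressed as a percentage of the maximum score."""
    return 100.0 * cut_score / MAX_SCORE


if __name__ == "__main__":
    for label, cut in (("current ENGL-500 cut-off", 22.5),
                       ("average-based ENGL-500 cut-off", 38.1)):
        print(f"{label}: {cut} is {percent_of_maximum(cut):.2f}% of {MAX_SCORE}")
```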

Table 7
2005-2006 Success in Each Derived Cut-Score Range for ENGL-500, 550, 450, and 1A

Possible Cut-Score Ranges    English Course    Range         #       N       %
Current                      500               22.5-37.4     48      119     40.3
                             550               37.5-70.4     384     656     58.5
                             450               70.5-94.4     922     1477    62.4
                             1A                94.5-120      740     1138    65.0
Min. and Max.                500               28.9-44.6     62      143     43.4
                             550               44.7-77.5     350     554     63.2*
                             450               77.6-109.8    787     1266    62.2
                             1A                109.9-120     168     259     64.9
Average                      500               38.1-48.5     18      28      64.3*
                             550               48.6-74.5     294     468     62.8
                             450               74.6-91.5     715     1130    63.3*
                             1A                91.6-120      814     1243    65.5
Average (Error)              500               35.0-43.5     21      44      47.7
                             550               43.6-68.4     309     494     62.6
                             450               68.5-84.9     525     863     60.8
                             1A                85.0-120      1007    1524    66.1*

Note. These scores include students with the 2005-2006 placement requirements and recommendations. The bolded scores (marked here with an asterisk) were the highest for that course. The derived cut-score ranges highlighted in green consistently have the highest success rates.
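
The success rates in Table 7 are the number of successful students (#) divided by the number of students in the range (N). The Python sketch below recomputes them from the reported counts and identifies the method with the highest rate for each course; the names `COUNTS`, `success_rate`, and `BEST_METHOD` are illustrative and not part of the original analysis.

```python
# (# successful, N in range) from Table 7, keyed by method and then course.
COUNTS = {
    "Current": {"ENGL-500": (48, 119), "ENGL-550": (384, 656),
                "ENGL-450": (922, 1477), "ENGL-1A": (740, 1138)},
    "Min. and Max.": {"ENGL-500": (62, 143), "ENGL-550": (350, 554),
                      "ENGL-450": (787, 1266), "ENGL-1A": (168, 259)},
    "Average": {"ENGL-500": (18, 28), "ENGL-550": (294, 468),
                "ENGL-450": (715, 1130), "ENGL-1A": (814, 1243)},
    "Average (Error)": {"ENGL-500": (21, 44), "ENGL-550": (309, 494),
                        "ENGL-450": (525, 863), "ENGL-1A": (1007, 1524)},
}
COURSES = ["ENGL-500", "ENGL-550", "ENGL-450", "ENGL-1A"]


def success_rate(method: str, course: str) -> float:
    """Percentage of students in the range who succeeded in the course."""
    successes, total = COUNTS[method][course]
    return 100.0 * successes / total


# Method with the highest success rate for each course.
BEST_METHOD = {
    course: max(COUNTS, key=lambda method: success_rate(method, course))
    for course in COURSES
}

if __name__ == "__main__":
    for course in COURSES:
        best = BEST_METHOD[course]
        print(f"{course}: {best} ({success_rate(best, course):.1f}%)")
```

Note that the ENGL-500 rate under the Average method rests on only 28 students, so it is less stable than the rates computed from the larger groups.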

References

Cohen, R. J., & Swerdlik, M. E. (2005). Psychological testing and assessment: An introduction to tests and measurement (6th ed.). Mountain View, CA: Mayfield Publishing.

Ding, C. S., & Hershberger, S. L. (2002). Assessing content validity and content equivalence using structural equation modeling. Structural Equation Modeling, 9, 283-297. Retrieved July 9, 2006 from the PsycINFO database.

Mertler, C. A., & Vannatta, R. A. (2005). Advanced multivariate statistical methods: Practical application and interpretation (3rd ed.). Glendale: Pyrczak Publishing.

Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston: Pearson Education.

Appendix A

ENGL-500 Preparation for College Writing
In this class students will learn to:
- Write a short paragraph
- Revise written assignments
- Apply rules of grammar, punctuation, mechanics, spelling and usage
- Define new words and use those words in written assignments
- Identify paragraph patterns
- Use technology for editing, revising, and proofreading

ENGL-550 Introduction to College Writing
In this class students will learn to:
- Recognize the underlying structure of an essay, article or textbook chapter
- Apply effective reading, time management, study, and test-taking strategies
- Identify the parts of speech, phrases and clauses
- Compose sentences in a variety of sentence patterns
- Apply rules of punctuation and capitalization

ENGL-450 Fundamentals of Composition
In this class students will learn to:
- Express thoughts in clear, effective, logical prose
- Recognize and formulate clear and specific topic sentences and develop these into unified and complete paragraphs
- Analyze the structure of various kinds of paragraph development and construct paragraphs in such patterns
- Identify and practice the coherency and rhetorical devices that make a paragraph rational, clear, and aesthetically sound

ENGL-1A Composition
In this class students will learn to:
- Compose logical, coherent, unified essays with minimal errors in grammar, punctuation, and spelling
- Explain the relationships between audience, tone, purpose, and levels of diction
- Analyze the structure of various kinds of essay development and construct essays in such patterns
- Produce a multi-source research paper

Appendix B

Chaffey College English Assessment for Content Review
Sentence Skills Assessment Test

Faculty's First Name:                Faculty's Last Name:                Date:    /    /

Instructions: Please rate each item's importance for each English course that you are reviewing using the scale below. Keep in mind that some of you may be reviewing one course and some may be reviewing multiple courses. For example, if you are only familiar with ENGL-1A you would review ENGL-1A only, if you are familiar with ENGL-500 and 450 you would review those two courses only, etc.

How important is the academic knowledge or skill measured by this item for successful acquisition of the skills taught in this course?

5 = Critical, 4 = Important, 3 = Moderately important, 2 = Of slight importance, 1 = Not relevant

If you have any questions about how to complete this form or take the assessment test, please call Keith Wurtz at 909-466-2899 or Jim Fillpot at 909-941-2760.

Test Item Number    ENGL-500     ENGL-550     ENGL-450     ENGL-1A
                    5 4 3 2 1    5 4 3 2 1    5 4 3 2 1    5 4 3 2 1
Test Item #1
Test Item #2
Test Item #3
Test Item #4
Test Item #5
Test Item #6
Test Item #7
Test Item #8
Test Item #9
Test Item #10
Test Item #11
Test Item #12
Test Item #13
Test Item #14
Test Item #15
Test Item #16
Test Item #17
Test Item #18
Test Item #19
Test Item #20

Thank you! Please return the form to the Institutional Research Office!

Prepared by Keith Wurtz
Date: 20070403
Content-Analysis-SS