RIT Scale Norms. For Use with Measures of Academic Progress. September, 2008
NWEA 2008 RIT Scale Norms. Copyright 2008 Northwest Evaluation Association. All rights reserved. No part of this manual may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from NWEA. Northwest Evaluation Association, 5885 SW Meadows Road, Suite 200, Lake Oswego, OR
Contents

Prologue
CHAPTER 1 Introduction
CHAPTER 2 Procedures for Developing the Norms
  Sample Development - General Criteria for Inclusion of Candidate Test Event Records and Test Histories
  Test Event Samples for Norms
  Development of Growth Norms
    Reading, language usage, and mathematics
    General science, science concepts and processes, and MAP-PG
CHAPTER 3 Norming Samples Characteristics
  Geographic characteristics
  Ethnic characteristics
CHAPTER 4 Grade Level Growth and Status Norms
  Grades 2 through 11 Standard MAP
  Grades Kindergarten and 1 MAP for Primary Grades
CHAPTER 5 Growth by RIT Level
CHAPTER 6 Achievement Growth: Observed and Modeled
  What do we gain (or lose) by modeling achievement growth rather than using observed difference scores?
  Conclusions
References
APPENDIX A RIT Score to Percentile Rank Conversion Tables for Reading, Language Usage, Mathematics, Upper Level Mathematics, General Science Topics, and Science Concepts and Processes - Beginning, Middle and End of Instructional Year
  Reading
  Language Usage
  Mathematics
  General Science
  Science Concepts and Processes
  Upper Level Mathematics
APPENDIX B Transition from 2005 to 2008 Norms
Tables

CHAPTER 2 Procedures for Developing the Norms
- Table 2.1: Target Cell Proportions for Reading, Language Usage, and Mathematics Status Norms Samples
- Table 2.2: Median, Mean, and Standard Deviation (SD) of Instructional Weeks by Total Number of Test Events in Student Test Histories - Grade 4 MATHEMATICS - Initial RIT

CHAPTER 3 Norming Samples Characteristics
- Table 3.1a: Candidate Test Record Pool Available for Beginning of Year Status Norms
- Table 3.1b: Candidate Test Record Pool Available for End of Year Status Norms
- Table 3.2a: Sample Sizes for Beginning of Year Status Norms by Subject and Grade Level
- Table 3.2b: Sample Sizes for End of Year Status Norms by Subject and Grade Level
- Table 3.3: Local School District Information from NCES Common Core of Data and from Status Norms Candidate Pool (Across Reading, Language Usage, and Mathematics) by District Locale Classification
- Table 3.4: Local Regular and Alternative School Information from NCES Common Core of Data and from Status Norms Samples (Across Reading, Language Usage, and Mathematics) by School Locale Classification
- Table 3.5: Title 1 Eligible Local Regular and Alternative Schools in the NCES Common Core of Data and in the Status Norms Candidate Pools (Across Reading, Language Usage, and Mathematics) by School Locale Classification
- Table 3.6: Free and Reduced Price Lunch Program Participation by Local Regular and Alternative Schools in the NCES Common Core of Data and in the Status Norms Candidate Pools (Across Reading, Language Usage, and Mathematics) by School Locale Classification
- Table 3.7: Number of Students in the Norming Sample by State by Grade and Across Subjects for Tests Taken in the Beginning of Year Time Frame of the School Year
- Table 3.8: Number of Students in the Norming Sample by State by Grade and Across Subjects for Tests Taken in the End of Year Time Frame of the School Year
- Table 3.9a: Sample Size (N), Sample Percentage of Ethnic Group Representation for READING and National School Age Percentages of Ethnic Group Representation by Grade Level
- Table 3.9b: Sample Size (N), Sample Percentage of Ethnic Group Representation for LANGUAGE USAGE and National School Age Percentages of Ethnic Group Representation by Grade Level
- Table 3.9c: Sample Size (N), Sample Percentage of Ethnic Group Representation for MATHEMATICS and National School Age Percentages of Ethnic Group Representation by Grade Level
- Table 3.9d: Sample Size (N), Sample Percentage of Ethnic Group Representation for GENERAL SCIENCE and National School Age Percentages of Ethnic Group Representation by Grade Level
- Table 3.9e: Sample Size (N), Sample Percentage of Ethnic Group Representation for SCIENCE CONCEPTS and PROCESSES and National School Age Percentages of Ethnic Group Representation by Grade Level

CHAPTER 4 Grade Level Growth and Status Norms
- Table 4.1a: READING - Modeled Growth Means, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), Numbers of Students (N), and Mean Observed Initial RIT for Four Instructional Time Periods
- Table 4.1b: READING - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level of Test Events Randomly Selected from Ethnicity and School Level Poverty Stratification Cells and Meeting the Beginning, Middle and End of Year Time Frame Criteria
- Table 4.1c: READING - Differences in Medians, Means and Numbers of Students Between Stratified Samples and the Student Candidate Pool as It Would Have Been Included Using the 2005 Norm Study Inclusion Criteria (Stratified Sample Minus Candidate Pool Values)
- Table 4.2a: LANGUAGE USAGE - Modeled Growth Means, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), Numbers of Students (N), and Mean Observed Initial RIT for Four Instructional Time Periods
- Table 4.2b: LANGUAGE USAGE - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level of Test Events Randomly Selected from Ethnicity and School Level Poverty Stratification Cells and Meeting the Beginning, Middle and End of Year Time Frame Criteria
- Table 4.2c: LANGUAGE USAGE - Differences in Medians, Means and Numbers of Students Between Stratified Samples and the Student Candidate Pool as It Would Have Been Included Using the 2005 Norm Study Inclusion Criteria (Stratified Sample Minus Candidate Pool Values)
- Table 4.3a: MATHEMATICS - Modeled Growth Means, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), Numbers of Students (N), and Mean Observed Initial RIT for Four Instructional Time Periods
- Table 4.3b: MATHEMATICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level of Test Events Randomly Selected from Ethnicity and School Level Poverty Stratification Cells and Meeting the Beginning, Middle and End of Year Time Frame Criteria
- Table 4.3c: MATHEMATICS - Differences in Medians, Means and Numbers of Students Between Stratified Samples and the Student Candidate Pool as It Would Have Been Included Using the 2005 Norm Study Inclusion Criteria (Stratified Sample Minus Candidate Pool Values)
- Table 4.4a: UPPER LEVEL MATHEMATICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) for NWEA End-of-Course MATHEMATICS Tests from Within the End-of-Year Time Frame Criterion
- Table 4.4b: UPPER LEVEL MATHEMATICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) for NWEA End-of-Course MATHEMATICS Tests Included Under Same Criteria Used for the 2005 Norming Study
- Table 4.5a: GENERAL SCIENCE TOPICS - Observed Growth Means, Standard Deviations (SD), Numbers of Students (N) in the Intact Group, Mean Initial RIT
- Table 4.5b: GENERAL SCIENCE TOPICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level and Part of the Year
- Table 4.6a: SCIENCE CONCEPTS AND PROCESSES - Observed Growth Means, Standard Deviations (SD), Numbers of Students (N) in the Intact Group, Mean Initial RIT
- Table 4.6b: SCIENCE CONCEPTS AND PROCESSES - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level and Part of the Year
- Table 4.7: Observed Fall to Spring Growth Means, Standard Deviations (SD), Numbers of Students (N) and Mean Initial RITs for Primary Grade Students
- Table 4.8: Primary Grade - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level and Part of the Year

CHAPTER 5 Growth by RIT Level
- Tables: READING Grades 2-10 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals
- Tables: LANGUAGE USAGE Grades 2-10 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals
- Tables: MATHEMATICS Grades 2-10 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals
- Tables: GENERAL SCIENCE TOPICS - Grades 2-10 Term-to-Term Mean Growth, Standard Deviations (SD) and Numbers of Students (N) by Starting RIT
- Tables: SCIENCE CONCEPTS and PROCESSES - Grades 2-10 Term-to-Term Mean Growth, Standard Deviations (SD) and Numbers of Students (N) by Starting RIT
Prologue

We were asked recently why the RIT Scale Norms are updated so often. Implied in the question was that five norming studies from 1996 through 2008 seems like a lot. Every three years is certainly a shorter norming study cycle than we observe for other major achievement tests. However, in view of how these norms are used, we believe that at least two factors argue for a three-year cycle.

First, the population of schools and school districts using NWEA assessments changes rapidly, both in number and composition. For example, in the 2002 study there were 322 districts from 24 states represented. This was roughly a third of the districts using NWEA assessments at that time. Three years later, 794 districts from 32 states were included in the study. While the number of districts was still about a third of all NWEA affiliated districts, the districts were more diverse in size and geography than the collection of districts from 2002. Conducting norming studies every three years allows such changes to be represented sooner, thus making the norms more representative of the population of NWEA affiliated districts.

Second, over a three-year period a large amount of data accumulates and can be merged with similar existing data. This affords opportunities to use supplemental sampling procedures to better reflect the makeup of the U.S. school-age population. The abundance of individual student test histories also allows growth norms to be created that are more informative, more stable, and able to confidently project student growth beyond a single academic year. Each three-year period seems to uncover at least one new procedure to improve the quality of norms-based references for NWEA assessments.

Like the previous three norming studies, this study expands the sample base and range of content of its predecessor and introduces a few new procedures.
It differs from the 2005 study in that:

- It includes data from 6,905 schools located in 1,123 districts in 42 states. This represents increases of 23%, 41% and 31% in the number of schools, districts and states, respectively.
- Its candidate pool of test records came from over 2.9 million students, compared to 2.3 million students in 2005.
- It is based entirely on results from Measures of Academic Progress (MAP), compared to 77% in 2005.
- It provides status and growth norms in reading and mathematics for early primary grades students (kindergarten through beginning of grade 2). The assessments administered for these norms, MAP for Primary Grades (MAP-PG), did not exist in 2005.
- It provides grade level growth norms as well as limited RIT point growth norms for general science topics and science concepts and processes. These were not part of the 2005 norms.
- It uses stratified sampling for grades 2-11 reading, language usage and mathematics status norms to mirror the U.S. school age population on the variables of ethnicity and school percentage of student eligibility for free and reduced price lunch.
- It incorporates information from school and district calendars to inform the estimates of achievement status and growth for reading, language usage, and mathematics in grades 2-11. Achievement growth in these areas for these grades is modeled as a function of instructional weeks.
- It provides the ability to set achievement growth projections (targets) and to evaluate growth in grades 2-11 reading, language usage, and mathematics based on the number of instructional weeks over time periods that can extend well beyond a single school year.

One thing that should be emphasized is that this norming study does not change the NWEA measurement scales in any way. The skills possessed by a student with a RIT score of 210 three, six, or even nine years ago are exactly the same as the skills possessed by a student achieving that score today. It is that constancy of the scales that allows us to measure growth, and to compare student performance across time. As you move to the new norms, neither how your students' achievement is measured nor how their growth is determined is changing. What is changing is simply the norms used for student comparisons. To make this transition as smooth as possible, we have prepared a description of the major differences between the 2005 study and this one that merit attention. This description appears in Appendix B.

Finally, this study required data - a great deal of data. These data were retrieved from the NWEA Growth Research Database (GRD).
However, we clearly recognize that it is the trust and commitment of NWEA educational partners that allows the GRD to grow and flourish, and to return the benefits of research findings to the larger educational community. We are very thankful for this trust and commitment and pledge to do our best to be worthy of them in the future.

Carl Hauser, PhD
G. Gage Kingsbury, PhD
September, 2008
CHAPTER 1 Introduction

Most NWEA assessments that provide estimates of student achievement share a common characteristic: each provides an estimate of a student's position on an underlying, unidimensional scale of achievement called a RIT scale. The exceptions are the skills tests associated with the NWEA primary grades assessments. While it is common to refer to "the RIT scale," it is important to understand that there are five separate RIT scales in use, one each for the content domains of reading, language usage, mathematics, general science topics, and science concepts and processes. Each scale was developed using modern measurement theory, in the form of item response theory and the Rasch model in particular. The scales were developed independent of grade level structure and therefore do not rely on student grade level for their meaning.

Each scale uses RITs as a common metric to convey the domain's continuum of content difficulty (from the most basic and simple to the very advanced and complex) as well as the student's position with respect to that content (i.e., achievement level). This allows scores to be interpreted in terms of instructional content: what the student grasps well, what the student is in the process of learning, and what the student has not yet reached. The grade level independence of RIT scales allows these determinations to be domain-centric; a particular score on one of the scales carries essentially the same meaning in terms of the student's status within the domain regardless of the student's grade level. Ingebo (1997) demonstrated how these scales each serve as a constant against which achievement status and growth can be judged. Taken in concert, these features establish a rich environment for the sensitive measurement of achievement status and growth along a continuum of instruction within a domain.

A fundamental question asked of any test result is, "What does it mean?"
Of course, the answer is not always as straightforward as the question itself. The meaning of a test score is defined by the real-world references to which the score is linked. Assessments based on RIT scales allow direct references to curricular content and content standards. They also allow the comparison of a student's position to predefined performance standards. While such references are often sufficient for launching instruction, schools and school districts may need to reference local student performance against a larger population of students. The NWEA RIT Scale Norms grew out of this need. By describing how students from many schools performed when measured on NWEA assessments, the norms give schools and districts reasonable ways to compare a single student, a school, or an entire district to a much larger, meaningful group: a norm sample. This document provides this type of information.
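As a sketch of how such norm references are used, the following converts a RIT score to a percentile rank under a normality assumption. The norm mean and SD are illustrative placeholders, not values from this study's tables, and `NORMS` is a hypothetical lookup structure.

```python
from statistics import NormalDist

# Hypothetical norm parameters (mean, SD) for illustration only --
# real values come from the status norms tables in this document.
NORMS = {("reading", 4, "beginning"): (199.0, 14.0)}

def percentile_rank(rit: float, subject: str, grade: int, term: str) -> int:
    """Convert a RIT score to a percentile rank, assuming the norm
    group's scores are approximately normally distributed."""
    mean, sd = NORMS[(subject, grade, term)]
    return round(100 * NormalDist(mean, sd).cdf(rit))

print(percentile_rank(210.0, "reading", 4, "beginning"))  # rank under these illustrative norms
```

A score at the norm group mean maps to the 50th percentile, and scores above the mean map to correspondingly higher ranks.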
Norming studies carried out by many test publishers often follow a common pattern. A single test, or perhaps several parallel forms of a single test, is administered to and scaled using samples of students from a selected population. A key outcome of these studies is often a set of auxiliary scores such as percentile ranks or normal curve equivalent (NCE) scores. When parallel forms are involved, these auxiliary scores apply to the reference test as well as its parallel versions. Norms developed in this manner might be referred to as test- or form-centric. That is, the test or test form and the scale on which performance is reported are inextricably linked; scale score references apply only to the reference test form and its associated parallel forms.

NWEA uses a somewhat different approach. All of the tests used in this study were Measures of Academic Progress (MAP) assessments. All tests were built by NWEA staff and are similar in length and content within a domain. However, unlike traditional tests, each student is administered a test with items chosen for that student as the test progresses. Individual tests are constructed by selecting items from banks of Rasch-calibrated items. As a student proceeds through a test, the difficulty of the items presented is adapted to the student's level of performance on all previous items. This has the effect of maximizing the information in the test score. Since each test item has been calibrated to the same scale using item response theory, scores from different tests in the same domain can be interpreted in the same manner; all test scores refer to the same underlying scale. Therefore, this document describes norms related to scales of measurement rather than norms related to an individual test or set of test forms. The resulting set of norms is independent of the specific content of any particular test.
Updates to test content can be made with the confidence that percentile ranks, and status and growth norms, will still be useful after the update. A discussion of measurement aspects of MAP assessments is included in the Technical Manual for the NWEA Measures of Academic Progress (NWEA, 2008).
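The adaptive, Rasch-based item selection described above can be illustrated with a minimal sketch. The selection rule here (pick the unused item whose calibrated difficulty is closest to the current ability estimate) is a simplification of the actual MAP algorithm, which also manages content balancing and other constraints; the function names are ours, not NWEA's.

```python
import math

def p_correct(theta: float, b: float) -> float:
    """Rasch model: probability of a correct response given
    ability theta and item difficulty b (both on the logit scale)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta: float, bank: list[float], used: set[int]) -> int:
    """Pick the unused item whose difficulty is closest to the current
    ability estimate -- the most informative item under the Rasch model."""
    return min((i for i in range(len(bank)) if i not in used),
               key=lambda i: abs(bank[i] - theta))

bank = [-1.0, 0.0, 1.0, 2.0]          # calibrated item difficulties
print(next_item(0.6, bank, set()))    # item closest to theta = 0.6
```

Because every item is calibrated to the same scale, the resulting score is interpretable regardless of which particular items were administered, which is the property the surrounding text relies on.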
CHAPTER 2 Procedures for Developing the Norms

Sample Development

General Criteria for Inclusion of Candidate Test Event Records and Test Histories

Grades 2 through 11 Standard MAP. Test event records from all NWEA MAP test events that had been administered as part of a school or school district sponsored testing program from fall 2002 through fall 2007 were considered as candidates for inclusion in the study. The content domains for these tests included reading, language usage, mathematics, science concepts and processes, and general science topics. All tests were built around a multi-goal structure that was based on a set of content standards. In most instances, the content standards of the state in which the test was to be administered were used to structure the test's content. Mathematics test content was partitioned into blocks bounded by the content specifications of multiple grades. The objectives and skills that defined these boundaries were those provided in the state's content specifications for grades 2 through 5 and for grades 6 and above. When a student's instruction fell outside this basic structure, other MAP tests were used (such as MAP for primary grades tests and upper level mathematics tests).

In order to be included in the candidate pool for the study, each test record from a standard MAP test was required to meet all of the following conditions. These conditions were imposed to ensure that test scores were valid and could be associated with segments of the U.S. school age population in terms of ethnic background and school level poverty. The conditions also ensured that the test events could be tied to the amount of instructional time that had elapsed from the beginning of the school year, or from the student's last test event prior to taking the test of interest.
The specific criteria required that each standard MAP test record included data to verify that the test event:

- Represented a complete, normally terminated test administration
- Was one for which at least 40% of the items were answered correctly (a fairly liberal lower limit for computerized adaptive tests)
- Came from a school for which the instructional calendar was known
- Came from a school which administered the MAP tests to at least 80% of its students who were in the same grade in the same testing time frame (term)
- Came from a school that was included in the National Center for Education Statistics 2005 Common Core of Data and had non-missing school level information concerning the number of students who were eligible to receive free and reduced price lunches
- Came from a school where the specific MAP test was not part of the school's first term of administering the test
- Was the more precise (lower standard error of measurement) of two test events in the same domain area administered within a three-week period of one another
- Included the student's ethnic code on the test event record

Early primary grades. MAP test events from the early primary grades test, MAP-PG, were included in the study using different criteria than those used for standard MAP tests. Different criteria were used for several reasons. Initially, there were many fewer test events available for inclusion; MAP-PG only had adequate numbers of test events from fall 2006 through spring 2007. Given this limited time frame, the necessity of district calendar information became moot. MAP-PG tests also involve different test administration procedures than standard MAP tests. Tests in each domain (reading and mathematics) are administered in two 30-item parts to accommodate younger students' attention levels. Due to the two-part nature of the tests, the procedures used for calculating the overall score for the domain on MAP-PG tests differ somewhat from those used for standard MAP tests. For a standard MAP test, domain scores can be made available immediately after the student has responded to the last item. For the MAP-PG tests, only a partial-domain score can be made available, since only half the goal areas in the domain are sampled in a single test. To compute the RIT score for the entire domain (referred to as the "combined score"), both partial-domain tests must be completed within 28 days of one another.
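The 28-day pairing constraint can be sketched as a simple date check; the function name and record layout here are illustrative, not NWEA's actual schema.

```python
from datetime import date

def can_combine(part1_date: date, part2_date: date, window_days: int = 28) -> bool:
    """Two partial-domain MAP-PG tests can only be combined into a
    single domain RIT when they fall within 28 days of one another."""
    return abs((part2_date - part1_date).days) <= window_days

print(can_combine(date(2006, 10, 2), date(2006, 10, 25)))  # 23 days apart -> True
```

Note that passing this check only makes a pair eligible for combining; the combined score itself comes from rescoring the joint response vectors, not from this date logic.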
A separate scoring process is then used to combine the response vectors from each partial test and score them as a single test. The resulting total score differs from a simple average of the two partial-domain scores, except in the special case when a student correctly answers exactly 50% of the items on each of the two partial-domain tests. Only entire-domain RIT scores are used in this study.

The conditions set for standard MAP test events were relaxed for MAP-PG tests. The criteria for including MAP-PG test events were simply that the event:

- Represented a complete (both parts with a single score), normally terminated test administration
- Was one for which at least 35% of the items were answered correctly
- Came from a school which administered the same test to at least 80% of its students who were in the same grade in the same testing time frame (term)
- Was the more precise (lower standard error of measurement) of two test events in the same domain area administered within a three-week period of one another

All MAP-PG test records that met these criteria from fall 2006, winter 2007, and spring 2007 were included in the study.

Test Event Samples for Norms

Reading, language usage, and mathematics from standard MAP, grades 2 through 11. Test event samples were created using a two-stage process. The first stage involved using school district calendar information to make estimates of the instructional time that had elapsed between the start of a school year and a test event, or between two test events. In the second stage, student ethnic background information and each school's reported level of free and reduced price lunch eligibility were used to stratify the samples in proportion to the U.S. school age population. This stratification was applied only to the creation of status norms. Growth norms used all test events that met the general criteria cited above and were one of at least three test events for the same student in the same content domain.

Estimates of instructional days. Instructional time was an essential element in the creation of both status norms and growth norms for students who were administered standard MAP in grades 2 through 11. The initial unit of time was an instructional day. Each district's unique calendar for the academic year served as an anchor. Individual district (or school) calendar information was extracted from data provided by Mountain Measurement, Inc. and the Rentrak Corporation. This information was most frequently collected through personal interviews of district level personnel. For each district, the first day of the school year, the last day of the school year, and each week day between these two days on which no instruction was provided were identified. Days that represented unanticipated instructional or non-instructional days (e.g., school closure for an emergency) were not captured. All schools in the district were assigned the resulting schedule unless they were identified as operating on a known alternative schedule; in those cases the alternative schedule was assigned to the school. The process for identifying alternative schedules was inconsistent and is a known source of variability in the instructional time data. While we suspect that the magnitude of this variability is comparatively small, a formal estimate of its true magnitude was not made, and it therefore represents a limitation of this study.

Each district's (or school's) base calendar data were used to extrapolate school calendars for preceding school years, back to the fall of 2002. This was carried out under the assumption that the pattern of instructional and non-instructional days within an academic year would be relatively stable from year to year. Instructional days were numbered sequentially by date for each district, beginning with the first instructional day in fall of the academic year and ending with the last instructional day in November 2007. The resulting vector of numbered instructional dates was referenced by test administration date to index the number of instructional days between test events at the individual student level. These intervals were converted to weeks to make interpretation easier.

Using the instructional days data, time frames for beginning of year, middle of year, and end of year tests were established. All test events that met the initial selection criteria and were from 2006 and 2007 were used. The variable of interest was the number of instructional days that had elapsed from the first day of the school year until the day the student took the test of interest. The mean and variance of this variable were calculated for each grade level in each content area and in each of the commonly used nominal testing seasons: fall, winter, and spring. Results of these calculations were used to establish the assessment time frame criteria. The centers of these time frames were roughly 20 days, 89 days, and 153 days from the beginning of the academic year of the student's school for the fall, winter, and spring terms, respectively. A 1.25 standard deviation band placed around these means resulted in approximately 79% of the test events in each season being included in an interval that ranged from 5.4 to 6.0 instructional weeks, depending on season, content area, and grade level. These time frames were considerably narrower than those used in previous NWEA norms studies, but were considered sufficiently wide to represent schools' most commonly used testing periods. Each test event included in the status norms fell into one of the three time frames; test events falling outside their criterion time frame (based on season, content area, and grade level) were excluded from consideration in the status norms.

Stratification.
The second part of creating the status norms samples used the National Center for Education Statistics (NCES) Common Core of Data Public School Universe (CCD-PSU) 2006 dataset. The CCD-PSU was queried across schools by grade level to determine: a) which third of the grade level distribution of student free and reduced price lunch eligibility each school-grade combination fell into; and b) the school's enrollment in each of five ethnic groups (Native American, African American, Asian, Hispanic, and European American [Caucasian]). The combination of these variables formed a 15-cell (3 FRL bands x 5 ethnic groups) matrix at each grade, 2 through 11. These matrices were used to determine the cell proportions of test events at each grade level that would comprise the sample for the status norms. The resulting cell proportions are given in Table 2.1. Each test event included in the status norms was included in one of these cells.
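The cell assignment just described can be sketched as follows; the band labels and function name are illustrative, with the FRL cut points following the bands in Table 2.1.

```python
# A sketch of the 15-cell (3 FRL bands x 5 ethnic groups) stratification.
FRL_BANDS = [(0, 33, "low"), (34, 66, "medium"), (67, 100, "high")]
ETHNIC_GROUPS = {"Native American", "Asian", "African American",
                 "Hispanic", "European American"}

def strat_cell(frl_percent: float, ethnicity: str) -> tuple[str, str]:
    """Map a test record to its (FRL band, ethnic group) cell."""
    if ethnicity not in ETHNIC_GROUPS:
        raise ValueError(f"unknown ethnic group: {ethnicity}")
    for lo, hi, label in FRL_BANDS:
        if lo <= frl_percent <= hi:
            return (label, ethnicity)
    raise ValueError("FRL percent out of range")

print(strat_cell(45.0, "Hispanic"))  # -> ('medium', 'Hispanic')
```

Each candidate test record lands in exactly one of the fifteen cells, which is what makes the proportional sampling in the next step possible.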
Table 2.1
Target Cell Proportions for Reading, Language Usage, and Mathematics Status Norms Samples (by grade)

Low Free/Reduced Lunch Percent (0-33%): Native American, Asian, African American, Hispanic, European American
Medium Free/Reduced Lunch Percent (34-66%): Native American, Asian, African American, Hispanic, European American
High Free/Reduced Lunch Percent (67-100%): Native American, Asian, African American, Hispanic, European American

Final sampling was carried out at each combination of school year part (i.e., beginning, middle, and end, replacing fall, winter, and spring, respectively), content area, and grade level. For each cell, j, in a grade, g, the ratio of the target proportion, p_t.j (from Table 2.1), to the corresponding observed proportion, p_obs.j (from the candidate pool of test events), was calculated. The cell with the maximum ratio across all cells in the grade was taken as the reference cell for the grade, j.ref. The observed number of test events, n_j.ref, in j.ref was divided by the reference cell's target proportion, p_t.j.ref, to determine the grade level sample size, N_g:

N_g = n_j.ref / p_t.j.ref

For example, at the beginning of the year in grade 2 reading, the maximum ratio across cells, 4.37, was found in the cell defined by Asian students in the high category of percentage of free and reduced lunch eligibility; the grade level sample size was therefore obtained by dividing that cell's observed count of test events by its target proportion.
The sample size for each cell in the grade, n_j, was then determined simply by n_j = p_t.j × N_g.

General science and science concepts and processes. Samples for general science and science concepts and processes were drawn from the fall and spring testing seasons of 2006 and 2007. Test events included were only required to meet the general criteria for inclusion listed above and the time frame criterion. All of these test events were used for status norms. Stratification based on ethnic background and school level percentage of free and reduced price lunch eligibility was not a part of the selection strategy for these status norms; the candidate pool size and structure could not support this procedure.

Development of Growth Norms

As was the case for the development of status norms, the structure of data available for samples varied according to how the test was delivered (standard MAP or MAP-PG) and the content area involved. Reading, language usage, and mathematics tests taken in standard MAP were the only test events that were part of individual student test histories extending back to fall 2002. General science and science concepts and processes test events extended back as far as the fall 2004 testing season. MAP-PG used only test events from the 2006-2007 school year. It is important to note that the unit of time used in estimating growth in the early primary grades and in science was the testing season, while it was instructional weeks for reading, language usage, and mathematics in grades 2 through 10.

Reading, language usage, and mathematics. Test events were included in samples for growth norms when they met all of the general criteria for inclusion in the study and were one of at least three test events in a student's test history in the same content domain. Samples were formed from these test histories at each grade level, where the grade level identifying the test history was always the grade level of the first test event in the test history or a part of the test history.
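Returning to the stratified sampling arithmetic above, the reference-cell and allocation steps can be sketched as follows; the cell labels and counts are made up for illustration, not values from Table 2.1.

```python
# Sketch of the grade level sample-size arithmetic: find the reference
# cell (maximum ratio of target to observed proportion), size the grade
# sample from it, then allocate per-cell sizes in proportion to targets.
def grade_sample_sizes(target_props: dict, observed_counts: dict) -> dict:
    total_obs = sum(observed_counts.values())
    ratios = {j: target_props[j] / (observed_counts[j] / total_obs)
              for j in target_props}
    j_ref = max(ratios, key=ratios.get)           # reference cell
    n_grade = observed_counts[j_ref] / target_props[j_ref]  # N_g
    return {j: round(target_props[j] * n_grade) for j in target_props}
```

For instance, with target proportions {"a": 0.2, "b": 0.8} and observed counts {"a": 100, "b": 900}, cell "a" is the reference cell, the grade sample size is 500, and the allocation is {"a": 100, "b": 400}. Using the most under-represented cell as the reference guarantees no cell's target allocation exceeds its available test events.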
Individual test histories were parceled out in two ways. First, a test event that took place in the first half of the school year was allowed to be the first test event in a test history represented at a grade level as long as at least two test events from the same student followed it. The same form of parceling was done for test events that took place in the second half of the school year. These two procedures resulted in a great deal of overlap among parts of test histories across the span of the content area. An example of how a portion of this overlap may look is provided in Figure 2.1, which shows four hypothetical test histories from students A through D; test events are represented as dots. Student A's history could be used in beginning-of-year and end-of-year grade
level growth norms for grades 4-7. Student B's history could only be used for the end-of-year grade level growth norms for grades 6 and 7. Similarly, the test histories for students C and D could only be used in beginning-of-year growth norms for grades 6 and 9, respectively. Stated slightly differently, the inclusion of a test history or a portion of a test history in grade level growth norms was based only on when in the school year (beginning or end) the first test event occurred and how many test events followed it; the particular academic year of the test event was irrelevant.

Figure 2.1. Hypothetical test histories (test events for students A through D plotted by instructional weeks across academic years and grade levels).

Growth estimates. Different methods were used to estimate growth; the method used was dictated by the data available. For reading, language usage, and mathematics test events that came from standard MAP tests, a growth modeling approach was used in order to take advantage of the longitudinal nature of the data. For general science and science concepts and processes, as well as for MAP-PG, the method used in the 2005 norming study was retained: the difference in observed RIT scores was used as the best available estimate of achievement growth.

For reading, language usage, and mathematics, achievement was modeled as a function of instructional time. Only simple, unconditional linear and quadratic multilevel growth models were used (Raudenbush & Bryk, 2002; Singer & Willett, 2003). These models took their customary forms, specifically:

Level 1:  Y_ti = π_0i + π_1i(W_ti) + e_ti
Level 2:  π_0i = β_00 + r_0i
          π_1i = β_10 + r_1i
for the linear model, and

Level 1:  Y_ti = π_0i + π_1i(W_ti) + π_2i(W_ti)² + e_ti
Level 2:  π_0i = β_00 + r_0i
          π_1i = β_10 + r_1i
          π_2i = β_20 + r_2i

for the quadratic model. In these models, Level 1 contains all test events, t, making up the test history for individual student i, as well as the measurement error, e_ti, the difference between the growth trajectory and the observed score for test event t. Each test history is nested in a student at Level 2. W_ti represents the number of instructional weeks that a test event, t, is from the beginning of the test history or test history segment being used. The π_0i parameter represents the initial level of achievement, and the remaining π's represent the change in Y that is associated with each unit change in W. The terms β_00, β_10, and β_20 are the means of the intercept and of the linear and quadratic change parameters, respectively. The r's are the random effects associated with a particular student in the estimation of the corresponding growth parameter, π.

Growth estimates were calculated for each grade level in a content area using both the linear and the quadratic models cited above. For these estimates at each grade, the first test event for a student that occurred in the target grade and all subsequent test events for the student were used. The same procedure was repeated using the last test event for a student that occurred in the target grade as the initial reference.

Growth estimates were also calculated for successive starting RIT values across the achievement range for each grade. This involved an iterative process within the grade level. Each iteration included the following steps:

1. Select all test histories (or partial histories) where the initial RIT was in a specific five-point range. The middle value in this range is taken as the reported value.

2. Calculate both the linear and the quadratic growth models (as defined above) on the selected test histories when the grade level was below 9; for grades 9 and 10, calculate only the linear model.

3.
For grade levels below 9, capture the parameter estimates for both models as well as the significance tests supporting each parameter's inclusion in the model.
4. For grade levels below 9, determine whether there is an instructional-week value, W, at which the two models predict the same score, Y_ti(L) = Y_ti(Q). If so, capture that value and assign it to the variable cross-over point (XOP).

5. Increment by 1 RIT the last five-point range used and repeat steps 1 through 4 until all test histories are exhausted or there are insufficient data for stable estimates.

A typical summary of steps 1 through 4 is depicted in Figure 2.2. The figure shows the linear and the quadratic growth models for more than 18,000 grade 4 students. All students began their 4th grade year at RIT levels of 200 through 204 (202 reported). All students had at least three test events in the portion of their test history that began at the beginning of their 4th grade year. The value for XOP was approximately 97 instructional weeks. Approximately 56% of the test records used to model growth in Figure 2.2 had 3 to 5 test events. The number of test events most commonly associated with XOP was 6-7. Test records that had test events occurring beyond XOP represented about 20% of the total. The instructional-weeks distribution by number of test occasions is shown in Table 2.2.

Figure 2.2. Typical summary of a single iteration in the model estimation process (grade 4 mathematics, initial RIT 202; growth plotted against instructional weeks).

The decision of which model to use for each initial RIT in the analyses for grades 2 through 8 was made in a straightforward manner. When the coefficient for the quadratic term had a test of significance of t > 2, the quadratic model was used; otherwise the linear
model was used. This decision led to modeled growth estimates that were more consistent with observed growth estimates (than linear models alone were) for instructional-week time periods corresponding to common term-to-term growth. When an XOP had been identified and W > XOP, the growth estimate was determined by averaging the corresponding estimates from the linear and the quadratic models; in Figure 2.2, this is represented as a dashed line. This averaging was done to mitigate the uncertainty of model selection given the increasingly sparse numbers of test events occurring further from the initial test event.

Table 2.2
Median, Mean, and Standard Deviation (SD) of Instructional Weeks by Total Number of Test Events in Student Test Histories - Grade 4 MATHEMATICS - Initial RIT 202

Columns: Test Events; Instructional Weeks (Median, Mean, SD); N Students; Percent Students.
[Table values not recoverable from this extraction.]

General science, science concepts and processes, and MAP-PG. The inclusion of test events was carried out following essentially the same procedures as for reading, language usage, and mathematics MAP tests, with two exceptions: 1) the testing-season structure of test events was used rather than the beginning-of-year vs. end-of-year structure, and 2) a test event for a student was only required to be followed by a second test event in the same domain, with at least one testing season separating the two events, rather than requiring the first test event in a test history or history segment to be followed by at least two additional test events.
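The model-selection and cross-over logic just described can be sketched as follows. In the study the coefficients come from fitted multilevel models; the values and helper names below are hypothetical illustrations only.

```python
import math

def predict(coefs, w):
    """Evaluate a growth trajectory (intercept, linear, [quadratic]) at
    w instructional weeks."""
    return sum(b * w ** k for k, b in enumerate(coefs))

def crossover_point(linear, quadratic, max_weeks=200):
    """Find XOP, the instructional-week value where the linear and
    quadratic trajectories predict the same RIT score. Setting the two
    polynomials equal reduces to a quadratic equation in w."""
    a = quadratic[2]
    b = quadratic[1] - linear[1]
    c = quadratic[0] - linear[0]
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return None
    roots = [(-b + s * math.sqrt(disc)) / (2 * a) for s in (1.0, -1.0)]
    # Ignore numerically-zero and out-of-range roots.
    valid = [r for r in roots if 1e-9 < r <= max_weeks]
    return min(valid) if valid else None

def growth_estimate(linear, quadratic, t_quadratic, w):
    """Selection rule from the text: use the quadratic model when its
    coefficient has t > 2, otherwise the linear model, and average the
    two models beyond the cross-over point."""
    if not abs(t_quadratic) > 2:
        return predict(linear, w) - linear[0]
    xop = crossover_point(linear, quadratic)
    if xop is not None and w > xop:
        rit = (predict(linear, w) + predict(quadratic, w)) / 2
    else:
        rit = predict(quadratic, w)
    return rit - quadratic[0]

# Hypothetical grade 4 mathematics coefficients, initial RIT 202:
lin = (202.0, 0.22)
quad = (202.0, 0.30, -0.0008)
print(round(crossover_point(lin, quad), 1))           # 100.0
print(round(growth_estimate(lin, quad, 3.5, 36), 2))  # 9.76
```

This mirrors only the shape of the procedure; the study's handling of grades 9 and 10 (linear model only) and its fitted coefficients come from the multilevel estimation, not from closed-form values like these.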
Growth estimates. For the two science areas, only the term in which a student took a test and the student's grade in that term were used to organize growth calculations; the year in which the student was in the grade was dropped from consideration. Therefore, if two students had spring grade 4 tests, one taken in spring 2005 and the other in spring 2007, and both students had test scores in the same content area from the previous fall, both would have been included in the fall-to-spring growth estimates. Fall-to-spring, fall-to-fall, spring-to-spring, and spring-to-fall growth estimates were calculated in both areas. For each comparison, the growth estimate was determined by subtracting the first observed score from the second observed score.

All growth estimates are based on intact sub-samples of students (students who had both test scores). As a result, it is reasonable to expect the grade level growth estimates to differ somewhat from differences in grade level mean scores from the total sample. The nature of these differences varies. As a step toward detailing these differences, tables reporting growth include the mean initial RIT score for students in each intact group involved in the comparison. These procedures were also used for the MAP-PG test events, except that all MAP-PG test events came from a single academic year.
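The intact-pair difference computation just described can be sketched as follows; the student identifiers and score values are hypothetical.

```python
# Sketch of the observed-difference growth computation for the science
# areas: pair each student's first and second scores (an "intact" pair),
# then report the group size, mean initial RIT, and mean growth.
# Identifiers and scores below are hypothetical.

def fall_to_spring_growth(fall_scores, spring_scores):
    """fall_scores / spring_scores: dicts mapping student id -> RIT.
    Returns (n, mean initial RIT, mean growth) over students with both
    scores; the second minus the first observed score is the growth."""
    both = sorted(set(fall_scores) & set(spring_scores))
    if not both:
        return 0, None, None
    n = len(both)
    mean_initial = sum(fall_scores[s] for s in both) / n
    mean_growth = sum(spring_scores[s] - fall_scores[s] for s in both) / n
    return n, mean_initial, mean_growth

fall = {"s1": 204, "s2": 198, "s3": 210, "s4": 201}
spring = {"s1": 210, "s2": 205, "s4": 206}   # s3 lacks a spring score
print(fall_to_spring_growth(fall, spring))   # (3, 201.0, 6.0)
```

Reporting the mean initial RIT of the intact group alongside the mean difference matches the tables described above.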
CHAPTER 3 Norming Samples Characteristics

The study utilized several samples of student test records. The largest samples for status norms were drawn from a pool of candidate records of reading, language usage, and mathematics tests administered in standard MAP to students in grades 2 through 11. This candidate pool consisted of 2,914,096 individual students from 6,905 schools in 1,123 districts located in 42 states. All test events in this pool met all the general criteria for inclusion in the study. The vast majority (93%) of these students tested in two or more content areas. Methods for assembling these samples were presented in the previous chapter.

Samples to estimate growth in reading, language usage, and mathematics from standard MAP tests were comprised of test events that met all the general criteria for inclusion; further sampling was not carried out. These samples included test records from 2,242,544 students who had taken a minimum of three tests in a content area. These students attended 6,752 of the 6,905 schools contributing to the status samples. Across all three content areas, these students completed 28,183,770 test events. The average number of test events in student test histories in each content area was about six in grades 2 through 5, five in grade 6, and four in grades 7 through 9. Nearly twice as many students had test histories in reading and mathematics as in language usage.

Alternative methods were used for creating samples for science and for primary grades reading and mathematics. Test records from students in grades 2 through 10 who took general science and science concepts and processes in the falls and springs of 2006 and 2007 were used if their test events met the general criteria for inclusion; further sampling was not carried out. Growth in science was estimated from the test events of 299,085 individual students from 1,348 schools in 346 districts located in 34 states.
Across the two content areas (general science and science concepts and processes), these students completed 1,677,988 test events. For MAP-PG, samples for status norms were formed from the candidate pool as described in the previous chapter; further sampling was not carried out. In primary grade reading and mathematics, fall-to-spring growth was estimated for 17,551 students in kindergarten and grade 1. These students came from 151 school districts located in 28 states. The sizes of the candidate pools at each grade level for all content areas are presented in Tables 3.1a and 3.1b for beginning-of-year tests and end-of-year tests, respectively. The corresponding sample sizes created at each grade level for all content areas appear in Tables 3.2a and 3.2b.
Table 3.1a
Candidate Test Record Pool Available for Beginning of Year Status Norms*

Columns: Grade (K-11); Reading; Language Usage; Mathematics; General Science; Science Concepts & Processes.
Totals (where legible): Reading 2,445,689; Language Usage 1,269,307.
[Remaining table values not recoverable from this extraction.]

* Candidate records were those that met all selection criteria, including being in the beginning of year time frame defined for the grade level and subject.

Table 3.1b
Candidate Test Record Pool Available for End of Year Status Norms*

Columns: Grade (K-11); Reading; Language Usage; Mathematics; General Science; Science Concepts & Processes.
Totals (where legible): Reading 2,414,670; Language Usage 1,324,798.
[Remaining table values not recoverable from this extraction.]

* Candidate records were those that met all selection criteria, including being in the end of year time frame defined for the grade level and subject.
Table 3.2a
Sample Sizes for Beginning of Year Status Norms by Subject and Grade Level

Columns: Grade (K-11); Reading; Language Usage; Mathematics; General Science; Science Concepts & Processes.
[Table values not recoverable from this extraction.]

Table 3.2b
Sample Sizes for End of Year Status Norms by Subject and Grade Level

Columns: Grade (K-11); Reading; Language Usage; Mathematics; General Science; Science Concepts & Processes.
[Table values not recoverable from this extraction.]
Tables 3.3 through 3.6 provide additional demographic information about the samples used for determining status norms for reading, language usage, and mathematics in grades 2 through 11. These tables are broken out by the organizational entity's (i.e., district or school) locale classification. They compare district- or school-level variables between the status norms samples and the National Center for Education Statistics Common Core of Data (NCES-CCD). The data presented for the status norms samples were compiled after the stratified sampling process.

Table 3.3 compares the NCES-CCD set and the status norms sample in terms of the numbers and percentages of districts present in each locale category; the average enrollment in each locale category is also presented. The table shows that the districts represented in the norming sample were generally close, in percentage terms, to the NCES-CCD set. Notable differences were the over-representation in the norms sample of districts in large and midsize cities, in the fringe of large cities, and in small towns; the norms sample under-represented school districts in rural areas. In terms of enrollment, the districts in the norms sample located in large cities had more than twice the average enrollment of corresponding districts in the NCES-CCD set. In all other locale classifications except midsize cities, the average enrollment of districts represented in the norming sample was slightly larger than that of districts in the NCES-CCD set.

Corresponding information at the school level is presented in Table 3.4, which shows that the percentage of schools represented in the norming sample in each locale classification is generally consistent with the corresponding percentage in the NCES-CCD set. The notable exception is the large city locale classification, where the percentage of schools in the norming sample was less than half the percentage of schools in the NCES-CCD set.
The proportional representations of Title 1 eligible schools in the norming sample and in the NCES-CCD set are provided in Table 3.5. Differences in percentages between locale classifications showed no consistent pattern in size or direction. Average enrollment in Title 1 eligible schools in the norming sample in large and midsize cities in metropolitan areas was generally lower than in Title 1 eligible schools in the NCES-CCD set; in other locale classifications, average enrollments were comparable.

Table 3.6 compares the percentages of Federal Free and Reduced Priced Lunch Program (FRPL) participation of schools represented in the norming study and of those in the NCES-CCD set, again by locale classification. With the exception of the urban fringe of midsize city classification, FRPL participation of schools represented in the norming sample was generally slightly lower than in the NCES-CCD set.
Table 3.3
Local School District Information from NCES Common Core of Data and from Status Norms Candidate Pool (Across Reading, Language Usage, and Mathematics) by District Locale Classification

Columns (for both the status norms sample and the NCES Common Core of Data set): number of districts; percentage of districts; average enrollment.
Locale classifications: Unknown; Large city (pop. >= 250,000); Midsize city (< 250,000, principal city of a metropolitan core); Urban fringe of a large city; Urban fringe of a midsize city; Large town; Small town; Rural, outside of metropolitan and micropolitan cores; Rural, inside a metropolitan core.
[Cell values not recoverable from this extraction.]

* Source: National Center for Education Statistics Common Core of Data - Local Education Agency Universe Dataset.
Table 3.4
Local Regular and Alternative School Information from NCES Common Core of Data and from Status Norms Samples (Across Reading, Language Usage, and Mathematics) by School Locale Classification

Columns (for both the status norms sample and the NCES Common Core of Data set): number of schools; percentage of schools; average enrollment.
Locale classifications: Unknown; Large city (pop. >= 250,000); Midsize city (< 250,000, principal city of a metropolitan core); Urban fringe of a large city; Urban fringe of a midsize city; Large town; Small town; Rural, outside of metropolitan and micropolitan cores; Rural, inside a metropolitan core.
[Cell values not recoverable from this extraction.]

* Source: National Center for Education Statistics Common Core of Data - Public School Universe Dataset.
Table 3.5
Title 1 Eligible Local Regular and Alternative Schools in the NCES Common Core of Data and in the Status Norms Candidate Pools (Across Reading, Language Usage, and Mathematics) by School Locale Classification

Columns (for both the status norms sample and the NCES Common Core of Data set): schools reporting; percentage Title 1; average enrollment of Title 1 schools.
Locale classifications: as in Tables 3.3 and 3.4.
All locales, schools reporting (where legible): status norms sample 6,761; NCES-CCD 93,354.
[Remaining cell values not recoverable from this extraction.]

* Source: National Center for Education Statistics Common Core of Data - Public School Universe Dataset.
Table 3.6
Free and Reduced Price Lunch Program Participation by Local Regular and Alternative Schools in the NCES Common Core of Data and in the Status Norms Candidate Pools (Across Reading, Language Usage, and Mathematics) by School Locale Classification

Columns (for both the status norms sample and the NCES Common Core of Data set): schools reporting; average percentage of students participating.
Locale classifications: as in Tables 3.3 and 3.4.
All locales, schools reporting (where legible): status norms sample 6,387; NCES-CCD 84,023.
[Remaining cell values not recoverable from this extraction.]

* Source: National Center for Education Statistics Common Core of Data - Public School Universe Dataset.

Geographic characteristics

Breakdowns of the states represented in the status norms are presented in Tables 3.7 and 3.8 for tests taken at the beginning and at the end of the school year, respectively. A total of 1,123 school districts are represented in these two tables. Western states (west of and including Montana, Wyoming, Colorado, and New Mexico) are home to 22% of these districts. Roughly 34% are in eastern states (east of and including Michigan, Indiana, Kentucky, Tennessee, and Mississippi). The remaining 44% are in the 13 states in the mid-section of the country. Figure 3.1 illustrates the geographic distribution of these districts.
Table 3.7
Number of Students in the Norming Sample by State, by Grade, and Across Subjects for Tests Taken in the Beginning of Year Time Frame of the School Year

Columns: State; student counts for grades 2-10; total; estimated % of state enrollment*; districts; schools.
States represented: AR, AZ, CA, CO, DE, FL, GA, IA, ID, IL, IN, KS, KY, MA, MD, ME, MI, MN, MO, MT, NC, ND, NE, NH, NJ, NM, NV, NY, OH, OK, OR, PA, SC, TN, TX, UT, VT, WA, WI, WY.
[Cell values not recoverable from this extraction.]

* Estimated from U.S. Department of Education, National Center for Education Statistics, Common Core of Data, Public School Universe Table, limited to a single school year.
Table 3.8
Number of Students in the Norming Sample by State, by Grade, and Across Subjects for Tests Taken in the End of Year Time Frame of the School Year

Columns: State; student counts for grades 2-10; total; estimated % of state enrollment*; districts; schools.
States represented: AR, AZ, CA, CO, CT, DE, FL, GA, IA, IL, IN, KS, KY, MA, MD, ME, MI, MN, MO, MT, NC, ND, NE, NH, NJ, NM, NV, NY, OH, OK, OR, PA, RI, SC, TN, TX, UT, VA, VT, WA, WI, WY.
[Cell values not recoverable from this extraction.]

* Estimated from U.S. Department of Education, National Center for Education Statistics, Common Core of Data, Public School Universe Table, limited to a single school year.
Figure 3.1. Geographic distribution of school districts in the status norms sample.

Ethnic characteristics

Samples for reading, language usage, and mathematics status norms for grades 2 through 11 all conformed to the stratification table (ethnic category by school-level category of free and reduced price lunch eligibility) provided in Table 2.1. The largest percentage difference observed between the norm study and the corresponding NCES information within a grade and subject area was less than 1%.

Ethnic characteristics for the norms samples in all content areas, including science concepts and processes as well as general science, are provided in Tables 3.9a through 3.9e. These tables illustrate that the norm samples for reading, language usage, and mathematics closely match the national ethnic percentages at each grade level, even though school-level percentage of free and reduced price lunch categories were used to construct these tables. For the two science areas, Tables 3.9d and 3.9e reveal that Asian/Pacific Islander and Hispanic students were slightly under-represented in the study, while African American and European American students were slightly over-represented; Native American students were well represented. The overall differences in proportions were generally reflected at the individual grade levels. The exceptions were for Native
American students in grades 2, 7, 9, and 10; Hispanic students in grades 9 and 11; and African American students in grades 9 through 11.

Table 3.9a
Sample Size (N)*, Sample Percentage of Ethnic Group Representation for READING, and National School Age Percentages of Ethnic Group Representation by Grade Level

Rows (per ethnic group - Native American/Alaskan Native; Asian/Pacific Islander; African American; Hispanic; European American): N, sample %, and national % at each grade level.
[Cell values not recoverable from this extraction.]

* Based on beginning-of-year and end-of-year samples only.

Table 3.9b
Sample Size (N)*, Sample Percentage of Ethnic Group Representation for LANGUAGE USAGE, and National School Age Percentages of Ethnic Group Representation by Grade Level

Rows (per ethnic group, as in Table 3.9a): N, sample %, and national % at each grade level.
[Cell values not recoverable from this extraction.]

* Based on beginning-of-year and end-of-year samples only.
Table 3.9c
Sample Size (N)*, Sample Percentage of Ethnic Group Representation for MATHEMATICS, and National School Age Percentages of Ethnic Group Representation by Grade Level

Rows (per ethnic group, as in Table 3.9a): N, sample %, and national % at each grade level.
[Cell values not recoverable from this extraction.]

* Based on beginning-of-year and end-of-year samples only.

Table 3.9d
Sample Size (N)*, Sample Percentage of Ethnic Group Representation for GENERAL SCIENCE, and National School Age Percentages of Ethnic Group Representation by Grade Level

Rows (per ethnic group, as in Table 3.9a): N, sample %, and national % at each grade level.
[Cell values not recoverable from this extraction.]

* Based on fall and spring samples.
Table 3.9e
Sample Size (N)*, Sample Percentage of Ethnic Group Representation for SCIENCE CONCEPTS AND PROCESSES, and National School Age Percentages of Ethnic Group Representation by Grade Level

Rows (per ethnic group, as in Table 3.9a): N, sample %, and national % at each grade level.
[Cell values not recoverable from this extraction.]

* Based on fall and spring samples.
CHAPTER 4 Grade Level Growth and Status Norms

Over the past several years, student achievement has been increasingly viewed from two relevant and complementary perspectives: achievement status and achievement growth. When both are considered in concert, important insight into a student's achievement can be gained. Knowing a student's achievement status at a particular point in time is, at best, interesting. Viewing that same status estimate as one of several obtained over multiple time points (i.e., growth) adds, at least, a sense of progress. Similarly, the trajectory of a student's status estimates over multiple time points can be captivating, but without using one of the estimates as a reference for the trajectory, little more than directional meaning is provided. In each case, additional references are needed to make sense of the data. Details of the knowledge and skills associated with an achievement estimate from MAP, such as those provided by DesCartes (NWEA, 2005), can provide important points of reference for instructional purposes. In addition, normative information can provide references to which individual and group achievement estimates can be compared for decision making or understanding. Normative references to student performance within grade levels are the focus of this chapter.

Status norms. In the context of a grade level, status norms allow us to answer some basic and commonly asked questions, including:

- How does the reading performance of our 7th grade students compare to the performance of 7th grade students from across the country who took the same test at approximately the same time?
- Is the average performance of this group in the upper quartile of students in the same grade across the country?
- Is this student's performance relative to the norming sample consistent across domains?
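Questions like these reduce to a lookup against a status-norms table such as those in Appendix A. A minimal sketch, assuming a table of (RIT, percentile) pairs; the values below are hypothetical, not actual norms.

```python
import bisect

# Sketch of a status-norms lookup: map an observed RIT score to a
# percentile rank of the norming sample. The (RIT, percentile) pairs
# below are HYPOTHETICAL placeholders, not values from Appendix A.

grade7_fall_reading = [  # (RIT score, percentile rank), ascending
    (195, 10), (203, 25), (211, 50), (219, 75), (227, 90),
]

def percentile_rank(rit, norms):
    """Return the percentile of the highest tabled RIT not above `rit`."""
    scores = [s for s, _ in norms]
    i = bisect.bisect_right(scores, rit) - 1
    if i < 0:
        return 1  # below the lowest tabled score
    return norms[i][1]

print(percentile_rank(214, grade7_fall_reading))  # 50
```

A score of 214 falls between the tabled values for the 50th and 75th percentiles, so it reports the highest percentile actually attained.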
Discussions of performance standards can involve status norms to determine their likely impact, or their observed impact when incorporated into an accountability system: the difference between desired performance (a standard) and observed performance (a norm). Typical questions that status norms can help to answer include:

- If a performance standard is set at RIT = X and our students' performance is consistent with the norming sample, what percentage of our students is likely not to meet the standard?
- How does the difficulty of the grade 3 performance standards compare to the difficulty of the grade 5 performance standards? Are performance standards consistently difficult moving from grade to grade?
- How far is this score from a particular performance standard?

Growth norms. When successive status estimates from the same student are treated as a unit and aggregated across students sharing a common characteristic (e.g., the same grade level), norms of the rate of movement (i.e., growth) can be calculated. Questions that can be addressed by growth norms are numerous and include, among others:

- If this group of students progresses like similar students in the norming sample, what is its mean achievement level likely to be in 32 instructional weeks? In 74 instructional weeks?
- Are all students in this class progressing at a rate similar to students in the norm sample who began instruction at about the same instructional level?
- How much growth is reasonable to expect from students in a particular grade level over two years of instruction?
- Has the rate of growth for this cohort of students changed in a pattern that is consistent with the rates from the norming samples?

Growth in relation to performance standards allows insight into the standards themselves as well as the movement of students in relation to the standards. When the standards are defined on or transformed onto a RIT scale, many growth- and standards-related questions can be addressed. For example:

- Given the current performance levels of our grade 4 students, what percentage of them are likely to meet the grade 6 performance standard?
- How do the two-year projected performance levels of our students relate to expected performance levels? Do the relationships differ by grade level? By content area?
- For students not expected to grow enough to meet a particular performance standard, how many more weeks of instruction would be required to bring them to the level of the performance standard?
- When performance standards between contiguous grades are not consistently difficult, how will the growth expectations for lower performing students need to change if performance expectations are to be met?

Grades 2 through 11 Standard MAP

Reading, language usage, and mathematics. The procedures for determining grade level growth and status norms were described in Chapter 2. Results from those procedures for students in grades 2 through 11 who took reading, language usage, and mathematics appear in Tables 4.1 through 4.3, respectively. Each of these tables is presented as three sub-tables, a through c.
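One of the status-norm questions above, the percentage of a norm-consistent group likely to fall below a performance standard, can be illustrated with a normal approximation. This is a sketch only, not NWEA's procedure, and the mean, SD, and standard values below are hypothetical placeholders rather than published norms:

```python
# Illustrative only: approximate the percentage of a norm-consistent group
# expected to score below a performance standard, assuming RIT scores are
# roughly normally distributed around the norm mean. The values used here
# (mean 218, SD 15, standard at RIT 210) are hypothetical placeholders.
from statistics import NormalDist

def pct_below_standard(norm_mean: float, norm_sd: float, standard_rit: float) -> float:
    """Percent of the group expected to score below standard_rit."""
    return 100.0 * NormalDist(mu=norm_mean, sigma=norm_sd).cdf(standard_rit)

print(round(pct_below_standard(218.0, 15.0, 210.0), 1))  # roughly 30% below the standard
```

Run against the actual status-norm mean and SD for a grade and season, the same calculation answers the "what percentage of our students is likely to not meet the standard" question directly.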
The a sub-tables contain the grade level growth means for four instructional periods: a) beginning of the year to middle of the year, b) beginning of the year to end of the year, c) beginning of the year to beginning of the following year, and d) end of the year to end of the next year. The lengths of the periods were based on a traditional U.S. school year of 180 days. It should be noted that the values appearing in the standard deviation (SD) columns in these tables represent the variability of the individual student growth trajectories. This is a change from the 2005 norming study, where the standard deviations represented the variability of the change scores. When compared to the corresponding values from the 2005 norming study, those from the current study are typically 30% to 60% higher.

Grade level status norms appear in the b sub-tables. Norms were computed for the beginning of the year (1-7 instructional weeks), the middle of the year, and the end of the year. The data in these tables represent the performance of students in the random samples that were stratified as described in Chapter 2.

The c sub-tables provide indices of the impact of the sample stratification procedures relative to the sample creation procedures used in the 2005 study. Differences are presented for the grade level means, medians, and numbers of students between the data in the a sub-tables and the corresponding values from samples of students as they would have been created under the 2005 norming study procedures. Negative values in these tables indicate that the value from the 2008 study was that amount lower than it would have been had the 2005 sample creation procedure been used.

Upper level mathematics (Algebra 1, Geometry, and Algebra 2) results are included in Tables 4.4a and 4.4b. The records selected for these tables were not subjected to the same stratification procedures used for Tables 4.1 through 4.3.
Tables 4.4a and 4.4b differ only with respect to when in the school year the tests were taken. Table 4.4a restricted the test events to those taken at the end of the school year, within the time frame used to define the end-of-year period. Test events summarized in Table 4.4b were allowed to take place at any time during the school year. Table 4.4b therefore likely includes test events that were administered as pretests prior to formal instruction in the test content. While this practice is discouraged, it does happen. The 2005 norming study did not include a specific procedure to minimize the impact of pretests on the norms. This is the most likely explanation for the slightly higher norms observed in this study (Table 4.4a) compared to the 2005 upper level mathematics norms.
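The growth-norm questions earlier in this chapter, about a group's likely mean achievement after 32 or 74 instructional weeks, reduce to a simple projection once growth is expressed per instructional week, as Chapter 5 does. A minimal sketch, using a hypothetical per-week growth rate rather than any value from the tables:

```python
# Sketch of a growth-norm projection, assumed linear in instructional weeks.
# The starting RIT (201.0) and the rate (0.25 RITs/week) are hypothetical,
# not published NWEA values.
def project_rit(start_rit: float, weekly_growth_rate: float, weeks: int) -> float:
    """Project a mean RIT after a given number of instructional weeks,
    assuming growth accrues at a constant per-week rate."""
    return start_rit + weekly_growth_rate * weeks

print(project_rit(201.0, 0.25, 32))  # after 32 instructional weeks
print(project_rit(201.0, 0.25, 74))  # after 74 instructional weeks (two school years)
```

In practice the per-week rate would be read from the growth norms for the group's grade and starting RIT level; the linearity assumption is only as good as the modeled trajectories behind those norms.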
Table 4.1a. READING - Modeled Growth Means, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), Numbers of Students (N), and Mean Observed Initial RIT for Four Instructional Time Periods: Beginning of Year to Middle of Year (≈17 weeks), Beginning of Year to End of Year (≈32 weeks), Beginning of Year to Beginning of Next Year (≈36 weeks), and End of Year to End of Next Year (≈36 weeks). [Table values omitted.]

Table 4.1b. READING - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level of Test Events Randomly Selected from Ethnicity and School Level Poverty Stratification Cells and Meeting the Beginning, Middle and End of Year Time Frame Criteria. [Table values omitted.]
* Values based on fewer than 2,000 cases are shaded. Exercise caution when using these values.

Table 4.1c. READING - Differences in Medians, Means and Numbers of Students Between Stratified Samples and the Student Candidate Pool as It Would Have Been Included Using the 2005 Norm Study Inclusion Criteria (Stratified Sample Minus Candidate Pool Values). [Table values omitted.]
Table 4.2a. LANGUAGE USAGE - Modeled Growth Means, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), Numbers of Students (N), and Mean Observed Initial RIT for Four Instructional Time Periods: Beginning of Year to Middle of Year (≈17 weeks), Beginning of Year to End of Year (≈32 weeks), Beginning of Year to Beginning of Next Year (≈36 weeks), and End of Year to End of Next Year (≈36 weeks). [Table values omitted.]

Table 4.2b. LANGUAGE USAGE - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level of Test Events Randomly Selected from Ethnicity and School Level Poverty Stratification Cells and Meeting the Beginning, Middle and End of Year Time Frame Criteria. [Table values omitted.]
* Values based on fewer than 2,000 cases are shaded. Exercise caution when using these values.

Table 4.2c. LANGUAGE USAGE - Differences in Medians, Means and Numbers of Students Between Stratified Samples and the Student Candidate Pool as It Would Have Been Included Using the 2005 Norm Study Inclusion Criteria (Stratified Sample Minus Candidate Pool Values). [Table values omitted.]
Table 4.3a. MATHEMATICS - Modeled Growth Means, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), Numbers of Students (N), and Mean Observed Initial RIT for Four Instructional Time Periods: Beginning of Year to Middle of Year (≈17 weeks), Beginning of Year to End of Year (≈32 weeks), Beginning of Year to Beginning of Next Year (≈36 weeks), and End of Year to End of Next Year (≈36 weeks). [Table values omitted.]

Table 4.3b. MATHEMATICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level of Test Events Randomly Selected from Ethnicity and School Level Poverty Stratification Cells and Meeting the Beginning, Middle and End of Year Time Frame Criteria. [Table values omitted.]
* Values based on fewer than 2,000 cases are shaded. Exercise caution when using these values.

Table 4.3c. MATHEMATICS - Differences in Medians, Means and Numbers of Students Between Stratified Samples and the Student Candidate Pool as It Would Have Been Included Using the 2005 Norm Study Inclusion Criteria (Stratified Sample Minus Candidate Pool Values). [Table values omitted.]
Table 4.4a UPPER LEVEL MATHEMATICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) for NWEA End-of-Course MATHEMATICS Tests from Within the End-of-Year Time Frame Criterion (Algebra 1, Geometry, Algebra 2). [Table values omitted.]
* Sample was not stratified based on ethnicity or school level percentage of free and reduced price lunches.
** Based exclusively on spring test events; more likely that test events were end-of-course tests.

Table 4.4b UPPER LEVEL MATHEMATICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) for NWEA End-of-Course MATHEMATICS Tests Included Under the Same Criteria Used for the 2005 Norming Study (Algebra 1, Geometry, Algebra 2). [Table values omitted.]
* Based on both fall and spring test events; greater likelihood of an end-of-course test being used as a "pretest."

General science and science concepts and processes. Norms for growth and status in the science areas were developed using different procedures than those used for reading, language usage, and mathematics. The differences are spelled out in Chapter 2. Tables 4.5 and 4.6 contain the norms for general science and science concepts and processes, respectively. In this case, the a sub-tables refer to observed growth for fall-to-spring, fall-to-fall, and spring-to-spring comparisons.
Table 4.5a GENERAL SCIENCE TOPICS - Observed Growth Means, Standard Deviations (SD), Numbers of Students (N) in the Intact Group, and Mean Initial RIT for Fall-to-Spring, Fall-to-Fall, and Spring-to-Spring Comparisons. [Table values omitted.]

Table 4.5b GENERAL SCIENCE TOPICS - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level and Part of the Year (Beginning-of-Year, Middle-of-Year (Interpolated), End-of-Year). [Table values omitted.]
Table 4.6a SCIENCE CONCEPTS AND PROCESSES - Observed Growth Means, Standard Deviations (SD), Numbers of Students (N) in the Intact Group, and Mean Initial RIT for Fall-to-Spring, Fall-to-Fall, and Spring-to-Spring Comparisons. [Table values omitted.]

Table 4.6b SCIENCE CONCEPTS AND PROCESSES - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level and Part of the Year (Beginning-of-Year, Middle-of-Year (Interpolated), End-of-Year). [Table values omitted.]
Grades Kindergarten and 1 MAP for Primary Grades. Growth norms for primary grade students are provided in Table 4.7. This table is unique in that it marks the first time we have observed growth in a lower grade (here, kindergarten) that was lower than growth in the succeeding grade. While we are not sure of the factors contributing to this pattern, our primary hunch is that testing in kindergarten is much less uniform than it is in other grades. Differences in school schedules, school structures, and student levels of independence and maturity seem like plausible avenues for further exploration.

Kindergarten and first grade status norms appear in Table 4.8. The pattern of grade-to-grade means and medians is similar to other NWEA status norms. Within grade, however, the kindergarten distributions for both content areas show a positive skew for all three parts of the year. This is unusual for NWEA computerized adaptive tests and may suggest a lower operational boundary for this version of the MAP for Primary Grades tests.

Table 4.7 Observed Fall-to-Spring Growth Means, Standard Deviations (SD), Numbers of Students (N), and Mean Initial RITs for Primary Grade Students (Reading and Mathematics, Grades K and 1). [Table values omitted.]

Table 4.8 Primary Grade - Medians, Means, Standard Deviations (SD) and Numbers of Students (N) by Grade Level and Part of the Year (Reading and Mathematics; Beginning-of-Year, Middle-of-Year, End-of-Year). [Table values omitted.]
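The positive skew noted for the kindergarten distributions is the kind of pattern that shows up as a mean pulled above the median by a long upper tail. A small illustration with made-up scores, not norming data:

```python
# Illustrative only: a positively skewed score distribution has its mean
# pulled above its median by the long upper tail. These RIT-like values
# are invented for the example.
from statistics import mean, median

scores = [140, 141, 142, 143, 144, 145, 150, 158, 165, 172]
m, md = mean(scores), median(scores)
print(m, md)
print(m > md)  # mean exceeds median under positive skew
```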
CHAPTER 5 Growth by RIT Level

Achievement growth norms were first reported at a finer grain size than grade level in the 2002 RIT Scale Norms (NWEA, 2002). This was the first step in extending our understanding of patterns of academic growth beyond general observations of changes in growth (e.g., students lower in achievement status tend to grow more than their peers who are higher in achievement status) toward greater descriptive detail. The RIT Block Norms from 2002 framed relationships between growth and initial status, where initial status was partitioned into 10 RIT point blocks. This procedure is repeated in this study for the growth of early primary grade students in reading and mathematics; the relatively small numbers of test records at these grade levels precluded a finer level of detail.

A further step was taken in the 2005 RIT Scale Norms (NWEA, 2005) by using a RIT point perspective to detail relationships between initial status and growth. In addition to giving detail about how initial score status is related to growth, the 2005 study also illustrated how the nature of the relationship between initial status and growth changes with position on the RIT scale, content area, and grade level. This approach is repeated in this study for general science topics and science concepts and processes.

In the current study we have taken the 2005 study's focus on RIT point growth a step further. This step was prompted by the realization of two important and related limitations of the 2005 presentation of RIT point growth. The first is that growth was calculated using two data points: a fall score and a spring score, two fall scores, or two spring scores. The particular scores contributing to any score difference could have come from any point within the fall or spring testing seasons, each of which is roughly 14 weeks long.
Thus, a student's change could, at the extremes, be based on two tests separated by as few as 8 weeks or as many as 35 weeks in a fall-to-spring comparison. This adds variability to the season-to-season mean differences. More importantly, it also undermines interpretation when attempting to evaluate a student's growth against an expectation based on a very different time span. The second limitation is a natural result of the first. By limiting estimates to two terms of data, we lose the ability to make longer-term estimates and thereby determine reasonable growth expectations over time periods spanning multiple terms. To set a multi-year growth expectation for a student, we are faced with basing part of the expectation on empirical data from the norms and then deciding how the norms should be used to set expectations for subsequent years to arrive at a single
expectation. Norms for subsequent years might be treated additively, differentially weighted, or treated idiosyncratically based on past performance or substantive theory. Unfortunately, the 2005 RIT point growth norms provide little guidance for choosing from among these alternatives.

To address these limitations for reading, language usage, and mathematics in grades 2 through 10, we implemented the two-level growth models outlined in Chapter 2. As noted, this approach only included individual test histories with a minimum of three test events. The advantage of this approach is that growth is modeled as a function of instructional weeks. The instructional-weeks information present in each student's test history was used to calculate the mean growth rate for students starting out at the same RIT level (± 2 RITs). Using this approach, reasonable growth expectations for students in grades 2 through 8 can be calculated for any number of instructional weeks up to three full instructional years (≈108 instructional weeks). For students in grades 9 and 10, 72 instructional weeks is a more realistic maximum period. In Tables 5.1a through 5.27b, presented below, mean growth for each starting RIT value is given for 17, 32, and 36 instructional weeks from the beginning of the instructional year, as well as for 36 instructional weeks from the end of the instructional year to the end of the next instructional year.

RIT point growth norms for general science topics and for science concepts and processes are presented beginning with Table 5.28. It is important to note that these norms are based on a sparser data set than those for reading, language usage, and mathematics. This is due to differences in testing practices for science between districts, schools, and grade levels. The structure of the available data did not lend itself to modeling growth as a function of instructional weeks.
For this reason, RIT point growth was calculated as it was in the 2005 norming study: as the mean difference between tests from two testing seasons, with the test scores from the first testing season constrained to be equal. The limitation of the data set is obvious in the tables; only fall-to-spring and fall-to-fall comparisons could be computed. Finally, the relationship between initial status and growth for early primary grade students is detailed using the RIT Block Norms procedure developed for the 2002 norms. These norms are presented in Table 5.46 below.
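The idea behind the grade-level RIT-point norms, a per-week growth rate for students starting at the same RIT level (± 2 RITs), can be sketched crudely with per-student least-squares slopes. NWEA's actual estimates come from the formal two-level growth models described in Chapter 2; the data and the simple pooling below are hypothetical illustrations only:

```python
# Sketch only: fit a least-squares slope (RITs per instructional week) to
# each student's test history (>= 3 events), then average slopes across
# students whose starting RIT is within +/- 2 RITs of a target value.
# All histories below are invented; NWEA used two-level growth models.
from statistics import mean

def slope(weeks, rits):
    """Ordinary least-squares slope of RIT on instructional weeks."""
    n = len(weeks)
    wbar, rbar = sum(weeks) / n, sum(rits) / n
    num = sum((w - wbar) * (r - rbar) for w, r in zip(weeks, rits))
    den = sum((w - wbar) ** 2 for w in weeks)
    return num / den

def mean_rate_at_rit(histories, target_rit, band=2):
    """Mean per-week growth for students starting within +/- band of target_rit."""
    rates = [slope(w, r) for w, r in histories if abs(r[0] - target_rit) <= band]
    return mean(rates)

histories = [
    ([0, 17, 32], [200, 205, 208]),  # starts at RIT 200
    ([0, 17, 32], [201, 204, 209]),  # starts at RIT 201
    ([0, 17, 32], [215, 219, 222]),  # starts at RIT 215; excluded for target 200
]
print(round(mean_rate_at_rit(histories, 200), 3))
```

Multiplying such a rate by 17, 32, or 36 instructional weeks yields growth expectations of the kind tabulated in Tables 5.1a through 5.27b.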
52 42 NWEA 2008 RIT Scale Norms Table 5.1a READING Grade 2 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,682 * Starting RIT score is always based on grade 2 performance. Shaded cells indicate that starting RIT was outside the percentile range of Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
53 NWEA 2008 RIT Scale Norms 43 Table 5.1b READING Grade 2 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,407 * Starting RIT score is always based on grade 2 performance. Shaded cells indicate that starting RIT was outside the percentile range of Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
54 44 NWEA 2008 RIT Scale Norms Table 5.2a READING Grade 3 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,834 * Starting RIT score is always based on grade 3 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
55 NWEA 2008 RIT Scale Norms 45 Table 5.2b READING Grade 3 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,081 * Starting RIT score is always based on grade 3 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
56 46 NWEA 2008 RIT Scale Norms Table 5.3a READING Grade 4 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,468 * Starting RIT score is always based on grade 4 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
57 NWEA 2008 RIT Scale Norms 47 Table 5.3b READING Grade 4 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,084 * Starting RIT score is always based on grade 4 performance. Shaded cells indicate that starting RIT was outside the percentile range of Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
58 48 NWEA 2008 RIT Scale Norms Table 5.4a READING Grade 5 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,789 * Starting RIT score is always based on grade 5 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
59 NWEA 2008 RIT Scale Norms 49 Table 5.4b READING Grade 5 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,231 * Starting RIT score is always based on grade 5 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
60 50 NWEA 2008 RIT Scale Norms Table 5.5a READING Grade 6 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,742 * Starting RIT score is always based on grade 6 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
61 NWEA 2008 RIT Scale Norms 51 Table 5.5b READING Grade 6 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,055 * Starting RIT score is always based on grade 6 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
62 52 NWEA 2008 RIT Scale Norms Table 5.6a READING Grade 7 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,662 * Starting RIT score is always based on grade 7 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
63 NWEA 2008 RIT Scale Norms 53 Table 5.6b READING Grade 7 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,666 * Starting RIT score is always based on grade 7 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
64 54 NWEA 2008 RIT Scale Norms Table 5.7a READING Grade 8 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals* Beginning of Yr to Middle of Yr ( 17 Instr'l Wks) Beginning of Yr to End of Yr ( 32 Instr'l Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,201 * Starting RIT score is always based on grade 8 performance. Shaded cells indicate that starting RIT was outside the percentile range of Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
65 NWEA 2008 RIT Scale Norms 55 Table 5.7b READING Grade 8 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) Four Comparison Intervals* (cont.) Beginning of Yr to Beginning of Next Yr ( 36 Instr'l. Wks) End of Yr to End of Next Yr ( 36 Instr'l. Wks) Start Start Start Start RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N RIT Estimate SD N , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,010 * Starting RIT score is always based on grade 8 performance. Shaded cells indicate that starting RIT was outside the percentile range of Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.8a/5.8b: READING Grade 9 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 9 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.9a/5.9b: READING Grade 10 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 10 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.10a/5.10b: LANGUAGE USAGE Grade 2 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 2 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.11a/5.11b: LANGUAGE USAGE Grade 3 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 3 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
Table 5.12a/5.12b: LANGUAGE USAGE Grade 4 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 4 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.13a/5.13b: LANGUAGE USAGE Grade 5 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 5 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.14a/5.14b: LANGUAGE USAGE Grade 6 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 6 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.15a/5.15b: LANGUAGE USAGE Grade 7 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 7 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.16a/5.16b: LANGUAGE USAGE Grade 8 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 8 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.17a/5.17b: LANGUAGE USAGE Grade 9 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 9 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.18a/5.18b: LANGUAGE USAGE Grade 10 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 10 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.19a/5.19b: MATHEMATICS Grade 2 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 2 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.20a/5.20b: MATHEMATICS Grade 3 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 3 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
Table 5.21a/5.21b: MATHEMATICS Grade 4 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 4 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
Table 5.22a/5.22b: MATHEMATICS Grade 5 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 5 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
Table 5.23a/5.23b: MATHEMATICS Grade 6 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 6 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
Table 5.24a/5.24b: MATHEMATICS Grade 7 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 7 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99.
Table 5.25a/5.25b: MATHEMATICS Grade 8 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 8 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.26a/5.26b: MATHEMATICS Grade 9 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 9 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Table 5.27a/5.27b: MATHEMATICS Grade 10 Growth Estimates, Standard Deviation of Individual Test Scores Around Growth Trajectories (SD), and Numbers of Students (N) for Four Comparison Intervals: Beginning of Yr to Middle of Yr (17 Instr'l Wks); Beginning of Yr to End of Yr (32 Instr'l Wks); Beginning of Yr to Beginning of Next Yr (36 Instr'l Wks); End of Yr to End of Next Yr (36 Instr'l Wks).
[Table values (Start RIT, growth Estimate, SD, N) could not be recovered in this copy.]
* Starting RIT score is always based on grade 10 performance. Shaded cells indicate that starting RIT was outside the percentile range of 1-99. Italicized estimates are based on fewer than 1,000 cases and should be used with caution.
Tables 5.28 through 5.36. GENERAL SCIENCE TOPICS, Grades 2 through 10: Term-to-Term Mean Growth, Standard Deviations (SD), and Numbers of Students (N) by Starting RIT, for Fall-to-Spring and Fall-to-Fall intervals. In each table, the starting RIT level is always based on performance in the table's grade. Shaded cells indicate that the starting RIT was outside the percentile range of 1-99.
Tables 5.37 through 5.45. SCIENCE CONCEPTS and PROCESSES, Grades 2 through 10: Term-to-Term Mean Growth, Standard Deviations (SD), and Numbers of Students (N) by Starting RIT, for Fall-to-Spring and Fall-to-Fall intervals. In each table, the starting RIT level is always based on performance in the table's grade. Shaded cells indicate that the starting RIT was outside the percentile range of 1-99.
CHAPTER 6 Achievement Growth: Observed and Modeled

Sound estimates of achievement growth depend on a few fundamental elements. These include:

- A measurement scale that spans the time period of interest (commonly called a vertical scale). The five major RIT scales are examples of this element.
- Test content that is aligned at least to the content standards of the declared curriculum, if not to the instructed curriculum.
- Tests that minimize the measurement error in each score. Kingsbury and Hauser (2004) suggest that when the ratio of the standard error of measurement to the standard deviation does not exceed .3, a reasonable level of precision is present in the measure.
- A standardized method of accounting for time between measurement occasions. Ideally, the method would establish a unit of time that best balances the intuitive interest of consumers of the data, the amount of change that might be expected within a single unit, and flexibility of use.

These elements certainly don't guarantee that growth will be estimated well, but in the absence of any one of them, the quality of estimates will be adversely affected, regardless of the analytic model used to make them.

As indicated earlier, this study departs from the practice in previous NWEA norm studies of using simple difference scores as estimates of student growth. Rather, growth norms were created by modeling changes in achievement as a function of time. The impetus for this shift owes nothing to the historic and persistent arguments that difference scores are inherently unreliable and unfair; these arguments were revealed as longstanding myths by the work of Rogosa (1995) and his colleagues (Rogosa & Willett, 1983; Rogosa, Brandt, & Zimowski, 1982).
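The precision criterion in the third element above is simple arithmetic: compare the standard error of measurement (SEM) to the standard deviation of the scores. A minimal sketch, with hypothetical SEM and SD values:

```python
# Sketch of the Kingsbury and Hauser (2004) precision check: the ratio
# of the standard error of measurement to the score standard deviation
# should not exceed .3. The numeric inputs below are hypothetical.

def precision_ratio(sem: float, sd: float) -> float:
    """Ratio of the standard error of measurement to the score SD."""
    return sem / sd

def reasonably_precise(sem: float, sd: float, threshold: float = 0.3) -> bool:
    # A ratio at or below .3 signals a reasonable level of precision.
    return precision_ratio(sem, sd) <= threshold

print(reasonably_precise(3.0, 15.0))  # ratio 0.2 -> True
```

A SEM of 3 points against a 15-point score SD gives a ratio of .2, comfortably inside the criterion; a 6-point SEM against the same SD (ratio .4) would not.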
What did provide the impetus for this shift was foreshadowed in Chapter 5: measuring change over only two points in time limits our frame of reference for using growth estimates, both in establishing growth expectations for individual students and in evaluating growth against those expectations. When student achievement growth is aggregated for a group of students sharing a general proxy variable for experience, such as grade level, it normally shows a positive trend over a single year and over multiple years. When large samples are involved, it is very unusual for a later grade to show more growth than the grade immediately preceding it. While we understand, intellectually,
that this is not always true for individual students, it can be difficult to explain when a student shows much less growth one year than the previous year. In fact, both positive and negative changes in growth and achievement estimates for individual students are commonplace in most large datasets of longitudinal student performance. This is illustrated well in Figure 6.1.

Figure 6.1. Reading growth trajectories for students ending grade 2 in three different positions in the achievement distribution (panels: 15th, 50th, and 85th percentiles; horizontal axis: instructional weeks).

Shown in the figure are the score histories of 300 students. All students represented were randomly selected from those whose RIT score at the end of grade 2 was within two RIT points of the 15th, 50th, or 85th percentile rank in reading. In addition, each student had a beginning-of-year and an end-of-year test score for all of grades 3 through 6. Instructional weeks were determined in the manner described in Chapter 2. The small rectangle close to the Y-axis of each panel represents the range of initial RIT scores from the end of grade 2. The trend line in each panel is the result of fitting a quadratic growth model to the data. In each panel there is an ascending growth curve across the 140-instructional-week period. If we focus in each panel on the differences between the green box and the points located at zero instructional weeks, we see the great variability in scores from the end of one school year to the beginning of the next, a variability that is largely mimicked throughout the panel. Moving across panels, from lower to higher levels of initial status, the occasion-to-occasion variability shrinks somewhat.
However, in each panel, for any period that would constitute an instructional year of, for example, 36 instructional weeks, we see a great deal of variability.

What do we gain (or lose) by modeling achievement growth rather than using observed difference scores?

In view of the variability of scores from occasion to occasion, this is a reasonable question. Modeling growth over longer time spans will not reduce variability in longitudinal data, but it will help us determine growth expectations and evaluate growth as students move through the time span. Using the growth trajectories of many students who have been grouped within a small range of initial RIT scores, the growth models determine the rate at which students grow across the range of interest. Like the use of difference scores between two points in time to determine mean growth for a group of similar students, the use of growth models assumes that future students who share the same characteristics will grow in much the same way as the group of students used to determine the growth estimates for the norms. The differences lie in the data used to make the estimates and in how time is used. In terms of results, the difference score method yields a single estimate, while the modeled growth method can yield an estimate for any number of time units (instructional weeks) from the beginning of the time span of interest.

It is also reasonable to ask what differences would be expected between the two methods. The panels in Figures 6.2 through 6.4 provide these differences for reading, language usage, and mathematics, respectively. The observed growth appearing in each panel was calculated by subtracting the initial RIT score from the same student's score obtained 24 to 35 weeks later. The growth model was the one used for the same grade and target RIT level that was presented in the beginning-of-year to end-of-year (approximately 32 instructional weeks) sub-tables in Chapter 5.
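The modeled-growth idea described above can be sketched in miniature: fit a quadratic curve (RIT as a function of instructional weeks) and read growth off the curve for any number of weeks. The series below is hypothetical, and a single-series least-squares fit is only a toy stand-in for the hierarchical growth models fit across many students in the actual norming work:

```python
# Toy illustration: fit y = b0 + b1*t + b2*t^2 by ordinary least squares
# and use the fitted curve to get a growth estimate for an arbitrary
# number of instructional weeks. Data are hypothetical.

def fit_quadratic(weeks, scores):
    """OLS fit of y = b0 + b1*t + b2*t^2 via the 3x3 normal equations."""
    sx = [sum(t ** k for t in weeks) for k in range(5)]          # power sums of t
    a = [[sx[i + j] for j in range(3)] for i in range(3)]        # normal matrix
    b = [sum(y * t ** k for t, y in zip(weeks, scores)) for k in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, 3):
            f = a[row][col] / a[col][col]
            for c in range(col, 3):
                a[row][c] -= f * a[col][c]
            b[row] -= f * b[col]
    coef = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):                                        # back substitution
        coef[row] = (b[row] - sum(a[row][c] * coef[c]
                                  for c in range(row + 1, 3))) / a[row][row]
    return coef

def modeled_growth(coef, t):
    """Predicted RIT change from week 0 to week t on the fitted curve."""
    b0, b1, b2 = coef
    return b1 * t + b2 * t * t

# Hypothetical series lying on the curve 180 + 0.5t - 0.002t^2.
weeks = list(range(0, 150, 10))
scores = [180 + 0.5 * t - 0.002 * t ** 2 for t in weeks]
coef = fit_quadratic(weeks, scores)
print(round(modeled_growth(coef, 32), 2))  # growth over 32 instructional weeks
```

Unlike a two-occasion difference score, the same fitted curve yields a growth estimate at 17, 32, 36, or any other number of instructional weeks.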
The series labeled 2005 are the RIT point growth norms taken from the 2005 norm study. The figures all show a similar picture. When the initial RIT scores are at the lower achievement levels in the earlier grades, growth estimates based on difference scores (observed) are slightly higher than the corresponding (same initial RIT level) estimates based on the growth model. The 2005 series tend to fall between the observed and the modeled series and tend to be more consistent with the modeled growth series. For initial RIT scores that are much higher in the grade level, the modeled growth series tend to be slightly higher than the observed series, but there is little consistency in the differences between these series and the 2005 series. We can point to several factors that are likely to have contributed to these differences. Chief among them are:

- Test inclusion criteria: the criteria in this study were more restrictive in terms of individual score validity than was the case in the 2005 study.
- Time between test occasions: the time between test occasions used to calculate difference scores in this study was held to a fairly narrow range (24 to 35 weeks), while in the 2005 study the actual time between test occasions was left free to vary and was bound only by occurrence within a term.
- Sampling differences: the current study was restricted to school districts with known calendars; this was not a restriction for the 2005 study.
- Modeling growth: when time is included as a continuous factor in a growth model, growth estimates will be based on the relationship between achievement estimates and time. The further test series extend beyond the initial test, the more likely it is that the (modeled) growth estimates will differ from (shorter-term) observed growth estimates. Differences are likely to be greatest when the data are initially most variable (lower initial RIT scores).

Figure 6.2. Observed and modeled growth in reading by grade and initial RIT score (32-week change; grades 2 through 7).

Figure 6.2 (cont.) Observed and modeled growth in reading by grade and initial RIT score (32-week change; grades 8 through 10).
Figure 6.3. Observed and modeled growth in language usage by grade and initial RIT score (32-week change; grades 2 through 7).

Figure 6.3 (cont.) Observed and modeled growth in language usage by grade and initial RIT score (32-week change; grades 8 through 10).
Figure 6.4. Observed and modeled growth in mathematics by grade and initial RIT score (32-week change; grades 2 through 7).

Figure 6.4 (cont.) Observed and modeled growth in mathematics by grade and initial RIT score (32-week change; grades 8 through 10).

Conclusions

The method of estimating achievement growth in this study represents a change in the way growth norms are calculated. It will have a slight impact on how growth expectations (targets) are set at the individual student level, and it will therefore also affect the criteria used to evaluate individual student growth. This change in growth norms estimation will not affect the way student growth is represented in daily practice; growth will continue to be based on differences between observed RIT scores for the foreseeable future. However, the change requires a minor adjustment in thinking about individual student growth and holds some important advantages. These include:

- Meeting or exceeding an individual growth expectation (target) indicates that the student's growth met or exceeded the long-term growth trend of students in the same grade who started at the same RIT level. Whether or not the result is adequate is still a judgment best
left to the student and the student's teacher. Individual, aspirational growth goals are still encouraged.

- Previous suggestions regarding the plausibility of unexpected growth (NWEA, 2005) remain unchanged. Further, the suggestion for evaluating odd or unexpected growth using the standard errors of the RIT scores can benefit from using target values of 4.7 for reading and language usage and 5.0 for mathematics (Kingsbury & Hauser, 2007; Hauser, Kingsbury & Wise, 2008). Change errors that exceed these values will generally involve a test event with low individual validity.
- The time period over which growth can be estimated is linked only to instructional weeks, and growth can be specified for any reasonable time period (e.g., instructional weeks 110 for grades 2 through 8; 75 for grades 9 and 10).
- The expected growth component of the Hybrid Success Model (Kingsbury & McCall, 2006) can be set with greater confidence when an achievement status target is two to four years in the future.
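The change-error screen mentioned above can be sketched as follows. The root-sum-of-squares combination of the two scores' standard errors is an assumption about the intended formula (it is the usual standard error of a difference between two independent scores; the exact expression in the source did not survive extraction), and the SE values in the example are hypothetical:

```python
# Sketch of a change-error screen using the target values cited above
# (4.7 for reading and language usage, 5.0 for mathematics). The
# combination formula is an assumed standard-error-of-difference.
import math

TARGETS = {"reading": 4.7, "language usage": 4.7, "mathematics": 5.0}

def change_error(se_first: float, se_second: float) -> float:
    # Assumed: SE of the difference between two independent scores.
    return math.sqrt(se_first ** 2 + se_second ** 2)

def flag_low_validity(se_first: float, se_second: float, subject: str) -> bool:
    # Change errors exceeding the subject's target value generally
    # involve a test event with low individual validity.
    return change_error(se_first, se_second) > TARGETS[subject]

print(flag_low_validity(3.2, 3.1, "reading"))  # error about 4.46 -> False
```

A pair of reading scores with standard errors of 3.2 and 3.1 yields a change error of about 4.46, under the 4.7 target; errors of 3.6 and 3.4 would combine to about 4.95 and be flagged.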
References

Hauser, C., Kingsbury, G. G., & Wise, S. L. (2008, March). Individual validity: Adding a missing link. Paper presented at the annual meeting of the American Educational Research Association, New York, NY.

Ingebo, G. S. (1997). Probability in the measure of achievement. Chicago: MESA Press.

Kingsbury, G. G. (2003, April). A long-term study of the stability of item parameter estimates. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.

Kingsbury, G. G., & Hauser, C. (2004, April). Computerized adaptive testing and No Child Left Behind. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.

Kingsbury, G. G., & Hauser, C. (2007, April). Individual validity in the context of an adaptive test. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.

Kingsbury, G. G., & McCall, M. S. (2006). The Hybrid Success Model: Theory and practice. In R. Lissitz (Ed.), Longitudinal and value added models of student performance. Maple Grove, MN: JAM Press.

Kingsbury, G. G., McCall, M., & Hauser, C. (2008). Tools for measuring academic growth. In E. V. Smith & G. E. Stone (Eds.), Applications of Rasch measurement in criterion-referenced testing: Practice analysis to score reporting (in press). Maple Grove, MN: JAM Press.

Northwest Evaluation Association. (2002, August). RIT scale norms: For use with Achievement Level Tests and Measures of Academic Progress. Portland, OR: Author.

Northwest Evaluation Association. (2005, August). RIT scale norms: For use with Achievement Level Tests and Measures of Academic Progress. Portland, OR: Author.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage Publications.

Rogosa, D. (1995). Myths and methods: "Myths about longitudinal research" plus supplemental questions. In J.
Gottman (Ed.), The analysis of change (pp. 3-66). Mahwah, NJ: Lawrence Erlbaum Associates.

Rogosa, D. R., & Willett, J. B. (1983). Demonstrating the reliability of the difference score in the measurement of change. Journal of Educational Measurement, 20,
Rogosa, D., Brandt, D., & Zimowski, M. (1982). A growth curve approach to the measurement of change. Psychological Bulletin, 92,

Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York, NY: Oxford University Press.

SPSS for Windows, Version 15.0 [Computer software]. (2006). Chicago: SPSS, Inc.
APPENDIX A

RIT Score to Percentile Rank Conversion Tables for Reading, Language Usage, Mathematics, Upper Level Mathematics, General Science Topics, and Science Concepts and Processes: Beginning, Middle, and End of Instructional Year
Beginning of School Year: READING RIT Score to Percentile Rank Conversion (Kindergarten through Grade 11)

Middle of School Year: READING RIT Score to Percentile Rank Conversion (Kindergarten through Grade 11)

End of School Year: READING RIT Score to Percentile Rank Conversion (Kindergarten through Grade 11)

Shaded cells are based on samples that were NOT stratified to reflect the U.S. school-age population.
Beginning of School Year: LANGUAGE USAGE RIT Score to Percentile Rank Conversion (Grades 2 through 11)

Middle of School Year: LANGUAGE USAGE RIT Score to Percentile Rank Conversion (Grades 2 through 11)

End of School Year: LANGUAGE USAGE RIT Score to Percentile Rank Conversion (Grades 2 through 11)
Beginning of School Year: MATHEMATICS RIT Score to Percentile Rank Conversion (Kindergarten through Grade 11)

Middle of School Year: MATHEMATICS RIT Score to Percentile Rank Conversion (Kindergarten through Grade 11; italicized cell entries are interpolated values)

End of School Year: MATHEMATICS RIT Score to Percentile Rank Conversion (Kindergarten through Grade 11)

Shaded cells are based on samples that were NOT stratified to reflect the U.S. school-age population.
Beginning of School Year: GENERAL SCIENCE RIT Score to Percentile Rank Conversion (Grades 2 through 10)

Middle of School Year: GENERAL SCIENCE RIT Score to Percentile Rank Conversion (Interpolated) (Grades 2 through 10)

End of School Year: GENERAL SCIENCE RIT Score to Percentile Rank Conversion (Grades 2 through 10)
162 152 NWEA 2008 RIT Scale Norms Northwest Evaluation Association Beginning of School Year SCIENCE CONCEPTS AND PROCESSES RIT Score to Percentile Rank Conversion %ile Grd 2 Grd 3 Grd 4 Grd 5 Grd 6 Grd 7 Grd 8 Grd 9 Grd 10 %ile
163 NWEA 2008 RIT Scale Norms 153 Northwest Evaluation Association Beginning of School Year SCIENCE CONCEPTS AND PROCESSES RIT Score to Percentile Rank Conversion %ile Grd 2 Grd 3 Grd 4 Grd 5 Grd 6 Grd 7 Grd 8 Grd 9 Grd 10 %ile
164 154 NWEA 2008 RIT Scale Norms Northwest Evaluation Association Middle of School Year SCIENCE CONCEPTS AND PROCESSES RIT Score to Percentile Rank Conversion (Interpolated) %ile Grd 2 Grd 3 Grd 4 Grd 5 Grd 6 Grd 7 Grd 8 Grd 9 Grd 10 %ile
165 NWEA 2008 RIT Scale Norms 155 Northwest Evaluation Association Middle of School Year SCIENCE CONCEPTS AND PROCESSES RIT Score to Percentile Rank Conversion (Interpolated) %ile Grd 2 Grd 3 Grd 4 Grd 5 Grd 6 Grd 7 Grd 8 Grd 9 Grd 10 %ile
166 156 NWEA 2008 RIT Scale Norms Northwest Evaluation Association End of School Year SCIENCE CONCEPTS AND PROCESSES RIT Score to Percentile Rank Conversion %ile Grd 2 Grd 3 Grd 4 Grd 5 Grd 6 Grd 7 Grd 8 Grd 9 Grd 10 %ile
167 NWEA 2008 RIT Scale Norms 157 Northwest Evaluation Association End of School Year SCIENCE CONCEPTS AND PROCESSES RIT Score to Percentile Rank Conversion %ile Grd 2 Grd 3 Grd 4 Grd 5 Grd 6 Grd 7 Grd 8 Grd 9 Grd 10 %ile
168 158 NWEA 2008 RIT Scale Norms Northwest Evaluation Association End of Course UPPER LEVEL MATHEMATICS RIT Score to Percentile Rank Conversion %ile Algebra 1 Geometry Algebra 2 %ile %ile Algebra 1 Geometry Algebra 2 %ile
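The Appendix A tables convert a RIT score into a percentile rank for a given grade and term. As a minimal sketch of how such a table is read, the lookup below finds the highest tabled percentile whose RIT cutoff the score meets. The table values here are hypothetical illustrations, not the published 2008 norms.

```python
# Sketch of a RIT-to-percentile lookup against a conversion table.
# The (RIT cutoff, percentile) pairs below are HYPOTHETICAL, not actual norms.
import bisect

# For one grade/term: sorted (RIT score, percentile rank) pairs.
# A real table lists a RIT cutoff for each percentile from 1 to 99.
hypothetical_table = [(185, 10), (195, 25), (204, 50), (213, 75), (222, 90)]

def rit_to_percentile(rit, table):
    """Return the highest tabled percentile whose RIT cutoff the score meets."""
    cutoffs = [row[0] for row in table]
    idx = bisect.bisect_right(cutoffs, rit) - 1
    if idx < 0:
        return 1  # below the lowest tabled cutoff
    return table[idx][1]

print(rit_to_percentile(210, hypothetical_table))  # -> 50
```

A published table would simply be read row by row; the binary search stands in for that lookup when the table is held in memory.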
APPENDIX B Transition from 2005 to 2008 Norms

Norms created at different points in time for any behavior or type of performance are likely to vary, even if all procedures are identical across studies. Much of the variation from one study to another arises simply because each study depends on the sample of people used to develop its norms. The potential sources of sample differences are not hard to identify: they include changes in the demographic makeup of school districts as well as changes in teaching methods, testing programs, or school effectiveness. Differences between the 2008 RIT scale norming study results and those of the 2005 study can, of course, be attributed to the same types of sample differences observed for other norming studies. There is, however, an important difference between this study and the 2005 study that was not present when moving from the 2002 study to the 2005 study: the stratified random sampling used for reading, language usage, and mathematics in grades 2 through 11. In the 2005 study, all test events that were valid and met the inclusion criteria (virtually identical to those detailed in Chapter 2) were included. In the current study, the test events that met the inclusion criteria were further sampled to match the U.S. school-age population with respect to ethnicity and the school-level percentage of students eligible for free and reduced price lunch. In addition, the current study restricted inclusion to test events that occurred within the defined beginning-, middle-, and end-of-year time frames relative to the calendar of the student's district. These are fairly dramatic changes in the way the sample for status norms was determined compared to the procedures used for the 2005 study.
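The stratified sampling idea described above can be sketched as follows: from the pool of eligible test records, draw a sample whose stratum proportions match target population proportions. The stratum labels and target proportions below are hypothetical stand-ins, not the study's actual ethnicity and free/reduced-price-lunch cells.

```python
# Hedged sketch of proportional stratified sampling from a record pool.
# Stratum labels and target proportions are HYPOTHETICAL illustrations.
import random

def stratified_sample(records, cell_of, targets, n, seed=2008):
    """records: list of test records; cell_of: record -> stratum label;
    targets: stratum label -> target proportion (summing to 1); n: sample size."""
    rng = random.Random(seed)
    by_cell = {}
    for r in records:
        by_cell.setdefault(cell_of(r), []).append(r)
    sample = []
    for cell, prop in targets.items():
        want = round(n * prop)                    # target count for this cell
        pool = by_cell.get(cell, [])
        sample.extend(rng.sample(pool, min(want, len(pool))))
    return sample

# Toy pool: 1000 records tagged with a hypothetical stratum label.
pool = [{"id": i, "stratum": "A" if i % 4 else "B"} for i in range(1000)]
targets = {"A": 0.6, "B": 0.4}   # hypothetical population proportions
s = stratified_sample(pool, lambda r: r["stratum"], targets, 200)
print(len(s))  # -> 200
```

The resulting sample mirrors the target proportions (here 120 records from stratum A and 80 from stratum B) regardless of how the pool itself is distributed, which is the point of the stratification.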
In the face of these procedural differences, we saw little reason to speculate about other sources of differences in the results. We expected to see differences between the 2008 norms and the 2005 norms corresponding to these changes. The differences observed were quite modest. For beginning-of-year (fall) reading tests across grades 2-10, the 2008 norms were an average of .9 RITs higher (range RITs) than the fall 2005 norms. For end-of-year (spring) reading tests, the 2008 norms were an average of .5 RITs higher (range RITs). In language usage, the beginning-of-year and end-of-year differences were of the same general size ( -.5 RITs), but the 2008 norms were slightly lower than the 2005 norms. Differences in mathematics status norms followed the pattern for reading, but the average differences did not exceed .3 RITs. The grade level 2008 norms for reading and language
usage were accompanied by 5% and 3% less variance, respectively, than the corresponding 2005 norms. Variance for mathematics remained virtually the same across grades between the two studies. From the standpoint of grade-level distributions, the vast majority of differences between RITs at corresponding percentile ranks (2008 minus 2005) fell between -2 RITs and +2 RITs. Larger differences were noted mostly in the lower quartile of reading and language usage for grades 2 through 4; this was also the case for mathematics, but less pronounced. In all of these differences within the lower quartile, 2008 RITs were higher for the corresponding percentile rank. In grades 9 and 10, the 2008 RITs were associated with somewhat lower percentiles than the 2005 RITs. These differences were typically below the median and ranged from 1 to 2 RITs in reading, from 3 to 5 RITs in language usage, and from 3 to 4 RITs in mathematics. These differences are provided in a Microsoft Excel file (INTERACT Percentiles 2008.xlsx & .xls) that is available from NWEA. Average beginning-of-year to end-of-year (fall-to-spring), beginning-of-year to beginning-of-the-next-year (fall-to-fall), and end-of-year to end-of-the-next-year (spring-to-spring) growth differences were all less than .65 RITs across grades and were most frequently below .25 RITs. Within individual grade-content area combinations, the largest difference observed was 1.44 RITs for end-of-grade 8 to end-of-grade 9 mathematics; growth was higher in the 2005 study. Given the changes from the 2005 norm study to the 2008 norm study, these differences seem neither unreasonable nor unexpected. The three elapsed calendar years, the change in the sample, and the change in the sampling methodology had little impact on student performance and growth. The changes observed represent minor changes in a reference against which the performance of an individual student or group of students is compared.
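The percentile-rank comparison described above amounts to subtracting, at each percentile, the 2005 RIT from the 2008 RIT for a given grade and term. A minimal sketch with hypothetical values (not the published norms):

```python
# Sketch of the 2008-minus-2005 comparison at corresponding percentile ranks
# for one grade and term. All RIT values below are HYPOTHETICAL.
norms_2005 = {25: 196, 50: 205, 75: 213}  # percentile -> RIT (hypothetical)
norms_2008 = {25: 198, 50: 205, 75: 212}

diffs = {p: norms_2008[p] - norms_2005[p] for p in norms_2005}
print(diffs)  # -> {25: 2, 50: 0, 75: -1}

# The report notes most observed differences fell between -2 and +2 RITs.
within_band = all(-2 <= d <= 2 for d in diffs.values())
print(within_band)  # -> True
```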
What is important is that the underlying scale used to measure achievement remains invariant across time. This is a demonstrated characteristic of the RIT scales (Kingsbury, 2003). It means, for example, that a RIT score of 211 carries the same meaning regardless of when the test was taken, the grade level of the student who took it, and which set of norms was in place at the time. This characteristic allows us to measure student growth against a constant scale. The fact that little change in status and growth was noted between the 2005 study and the current study is not unexpected, since the sample sizes in each study are so large that a major change in education would be needed to affect the norms substantially. On the other hand, individual schools and districts show remarkable differences in how their students grow. While this is a fascinating story, it is one for another time.