Office of Superintendent of Public Instruction Smarter Balanced Assessment Consortium Request for Information 2013-31 Test Delivery Certification Package September 4, 2013
Table of Contents
Introduction
Background
Key Components of the Summative Assessment Technology Platform
Vendor Relations
Costs
Request for Information
General Questions
Introduction
The Office of Superintendent of Public Instruction (OSPI) seeks responses from qualified RESPONDENTs experienced in specification development, product certification and systems integration to develop a set of three (3) certification packages: Test-Taking Devices, Test Administration, and Item Scoring. Each package will contain all of the requirements, specifications, and services that a product vendor needs to prepare and submit a product for certification. Administration of the certification service itself will be the subject of a separate RFP to be issued at a future date. Your responses to this Request for Information (RFI) will help Smarter Balanced and OSPI understand the capabilities within the existing testing and test vendor community. They will also inform key decisions about establishing a certification program to aid implementation of the Smarter Balanced Assessment System. Smarter Balanced and OSPI are currently studying the various elements of system and application integration and how the testing and test vendor community will be positioned to support state implementation of next-generation assessments. Included are high-level system requirements for a certification framework based on present Consortium understanding. This document also contains the questions Smarter Balanced and OSPI need answered in response to this RFI; they are intended to focus your efforts on the areas we feel are most important based on currently identified needs.
Background
Authorized under the American Recovery and Reinvestment Act of 2009 (ARRA), the Race to the Top Assessment Program provides funding to consortia of states to support the development and implementation of new common high-quality assessments that: 1) are aligned with Smarter Balanced's common set of college- and career-ready K-12 standards (the basis of which is the Common Core State Standards), 2) are valid and instructionally useful, 3) provide accurate information about what students know and can do, and 4) measure student achievement against standards or expectations designed to ensure that all students gain the knowledge and skills needed to succeed in college and the workplace. Over the past decade, state assessment results have brought much-needed visibility to disparities in achievement among groups of students and have helped meet increasing demand for data that can be used to improve teaching and learning. These new assessments are intended to play a critical role in educational systems, providing administrators, educators, parents, and students the data and information needed to continuously improve teaching and learning. To fully meet the twin needs of accountability and instructional improvement, however, states need assessment systems that are based on standards designed to prepare students for college and the workplace, and that more validly measure what students know and can do. Further, states need assessment systems that better reflect good instructional practice and support a culture of continuous improvement in education by providing information that can be used meaningfully and in a timely way to determine school and educator effectiveness, identify professional development and support needs, improve programs, and guide instruction.
Overview of Comprehensive Assessment Systems Grants
This grant category supports the development of assessment systems by consortia of states that provide valid, reliable, and fair performance results for individuals and groups of students that can be used for accountability purposes and to guide best instructional practice. Comprehensive Assessment Systems grants provide funding for the development of new assessment systems that measure student knowledge and skills against a common set of college- and career-ready standards (as defined in the U.S. Department of Education's Notice of Invitation for Applications, or NIA) in mathematics and English language arts in a way that covers the full range of those standards, elicits complex student demonstrations or applications of knowledge and skills where appropriate, and provides an accurate measure of student achievement across the full performance continuum. Assessment systems developed with Comprehensive Assessment Systems grants must include one or more summative assessment components in mathematics and in English language arts that are administered at least once during the academic year in Grades 3 through 8 and at least once in high school, and that produce student achievement data and student growth data (both as defined in the NIA) that can be used to determine whether individual students are college- and career-ready (as defined in the NIA) or on track to being college- and career-ready (as defined in the NIA). In addition, assessment systems developed with Comprehensive Assessment Systems grants must assess all students, including English learners (as defined in the NIA) and students with disabilities (as defined in the NIA). Finally, assessment systems developed with Comprehensive Assessment Systems grants must produce
data (including student achievement data and student growth data) that can be used to inform (a) determinations of school effectiveness; (b) determinations of individual principal and teacher effectiveness for purposes of evaluation; (c) determinations of principal and teacher professional development and support needs; and (d) teaching, learning, and program improvement. In addition to meeting the need for assessment systems that can be used to determine whether students are college- and career-ready, this grant category seeks to ensure that the results from those systems will, in turn, be used meaningfully by institutions of higher education (IHEs). Under this grant category, we intend to promote collaboration and better alignment between public elementary, secondary and postsecondary education systems by establishing a competitive preference priority for applications that include commitments from public IHEs or IHE systems to participate in the design and development of Smarter Balanced's high school summative assessments and to implement policies that exempt from remedial courses, and place into credit-bearing college courses, students who meet the Smarter Balanced-adopted achievement standard (as defined in the NIA) for those assessments.
Key Components of the Summative Assessment Technology Platform
Smarter Balanced has previously awarded contracts to develop the following components, which collectively form the technology platform:

Item Authoring Tool: A tool for creating new assessment items and managing the workflow involved in reviewing items, adding accommodations and approving them for use in Smarter Balanced assessments.

Item Bank: A storage service that maintains the collection of Smarter Balanced and other assessment items, including metadata that indicates the learning objectives to which the items are aligned, difficulty calibration data, usage data and so forth. The item bank has tools to collect a set of assessment items into a test package for use within a test administration system.

Test Delivery System: A set of web applications that manage the registration of students for tests, the delivery of those tests to the students, scoring of test items, integration of item scores into an overall test score and delivery of scores to the data warehouse.

Data Warehouse: A comprehensive store of all Smarter Balanced assessment registrations and results, and a system to generate reports on that data.

Secure Browser: A special web browser that limits student access to authorized applications for the duration of the test and also facilitates accommodations for students with special needs. An exception to the secure browser requirement will be considered if an operating system and/or browser are otherwise able to meet the consortium's test security requirements.

While these components are being developed using traditional project management, each will be converted into an open source project with associated open source licensing, an open code repository and community coordination. Smarter Balanced will deploy and operate the Item Authoring, Item Bank and Data Warehouse services. States are responsible for deploying and operating Test Delivery Systems.
Smarter Balanced expects most states to procure test administration services from vendors whose products are certified to deliver Smarter Balanced assessments. To ensure that tests are delivered and scored per the Consortium specifications, Smarter Balanced will create a certification program for test delivery systems. The intent of this RFI is to determine the specific processes, program expectations, and resource demands required to develop a quality certification package (or packages) to support the Smarter Balanced Assessment System. The certification package will contain all of the specifications, sample data, test harnesses and other services a vendor needs to develop and prepare a product to be certified. A possible outcome of this RFI may be a future Request for Proposals (RFP) issued to procure actual development and performance of the certification service, or a possible alternative contractual arrangement such as an intergovernmental agreement with another public-sector entity. Test Registration, Test Delivery and Scoring are the three (3) operations that compose a Test Delivery System. The following figure shows how these state-procured systems are intended to
interact with Smarter Balanced-hosted systems.

[Figure: Consortium-hosted systems (Item Bank with authoring, approval and versioning; Test Bank with blueprints and test items; Data Warehouse; Reporting) exchanging data with a state-procured Test Delivery System comprising Test Registration, Test Scheduling, Test Delivery, Item Scoring (Deterministic, AI, Hand) and an Adaptive Engine that determines the next set of items. The Test Delivery System is hosted at a state or LEA data center, or by a Smarter Balanced-certified vendor. Student registration data flows in from the State Student Data System (or the district SIS where no state system is available), and scored item responses flow out to the Data Warehouse. Student-level reports are generated by the Consortium only if the state allows student PII to be stored at the Consortium; otherwise the state hosts its own instance of the Data Warehouse and Reporting components.]

The Test Registration system accepts data from district or state Student Information Systems and schedules students for testing. It includes a user interface for validating and updating student registration information and for adding accommodation requests. Notably, Test Registration must validate registration using an application programming interface (API) service provided by the data warehouse. Test Delivery must render assessment items to the student's browser with authenticity, collect student responses and store them in the proper format. Student responses are then sent to Item Scoring. Because Smarter Balanced uses computer adaptive testing, the adaptive engine selects test items in real time based on student performance on previous items. Items can be scored in three (3) ways:

Deterministic Scoring (sometimes called machine scoring) is used for items that have an unambiguous definition of correctness that can be determined by algorithm.
Deterministic scoring is most commonly applied to selected-response items such as multiple choice, but it can also be applied to a subset of constructed-response items where the answer is a mathematical formula, a number, a manipulation of an on-screen item, a keyword and so forth.

Hand-Scoring is used for items that require human application of a rubric to assign the score.
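To make the deterministic case concrete, a scorer for a numeric constructed-response item might be sketched as below. This is an illustrative sketch only; the item fields (answer_key, tolerance) and the 0/1 score scale are hypothetical and are not drawn from any Smarter Balanced specification.

```python
# Minimal sketch of deterministic scoring for a numeric
# constructed-response item. Field names are hypothetical.

def score_numeric_item(item: dict, response: str) -> int:
    """Score 1 if the response parses as a number within the
    item's tolerance of the answer key, else 0."""
    try:
        value = float(response.strip())
    except ValueError:
        return 0  # a non-numeric response scores zero
    tolerance = item.get("tolerance", 0.0)
    return 1 if abs(value - item["answer_key"]) <= tolerance else 0

item = {"answer_key": 3.14, "tolerance": 0.01}
print(score_numeric_item(item, " 3.14159 "))  # within tolerance: 1
print(score_numeric_item(item, "three"))      # unparseable: 0
```

The property that matters for certification is that any conforming implementation must produce the identical score for the identical response, which is what makes deterministic scoring verifiable by automated comparison.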
Artificial Intelligence (AI) Scoring uses statistical methods, usually trained on a sampling of hand-scored responses, to assign the score.

To be certified, a Test Administration System must do the following:
- Register students based on imported data from state or district systems.
- Validate registration against the Smarter Balanced data warehouse.
- Deliver Smarter Balanced assessments with authenticity.
- Score assessment items accurately and consistently.
- Apply the adaptive algorithm, consistently choosing the next item for the student based on previous performance.
- Manage items appropriately for non-real-time item scoring (such as Hand and AI scoring) so that they are available to be included in the test scoring process and potentially available for informing subsequent item selections for the student via the adaptive algorithm.
- Integrate item scores into test scores.
- Deliver student responses and scores to the data warehouse in the proper format.
- Anything else determined necessary through due diligence of the supporting vendor.

Smarter Balanced anticipates that the following components would be required in a certification package:
- Assessment Item Format Specification
- Test Package Format Specification
- Sample Test Package including items that exercise all required features of the assessment item format
- Adaptive Algorithm (specifies how item results are used to select the next item to be displayed to the student)
- Deterministic Scoring Algorithm
- Sample Test Package that exercises the adaptive engine
- Data Warehouse API Specifications (for test registration validation and for results reporting)
- Item Rendering Source Code (capable of rendering items in the specified format in a web browser)
- Deterministic Scoring Source Code
- Adaptive Engine Source Code
- Source code for the balance of the Test Delivery System
- Test harness that manifests the Data Warehouse APIs against which vendors can test their systems
- Test harness that manifests the Security Assertion Markup Language (SAML) Single Sign-On APIs against which vendors can test their systems
- Simulated Student Registration Data
- Simulated Student Response Data
- Simulated Scored Response Data
- Specifications for the certification tests, including exercises to be performed and expected results
- Other specifications, services, test harnesses, source code and sample data as determined by RESPONDENT's due diligence

The Test Administration package being developed by American Institutes for Research (AIR)
under contract to Smarter Balanced (through OSPI) should meet the certification requirements and serve as a reference implementation. It will be released under an open source license. Since operational tests are scheduled for the 2014-2015 school year, vendors are eager to begin preparing for certification before the full source code release. For this reason, source code for certain features is called out separately in the list above, and it will be a priority to release these components in advance of the full package.

Vendor Relations
With operational tests scheduled for the 2014-2015 school year, developers are eager to begin building certifiable products before the entire certification packages are available. As a result, RESPONDENT should identify the components of the certification packages that represent logical delineation points for incremental release in draft form, where appropriate, to support test vendors in building test delivery products. RESPONDENT should also address elements of developer relations management, identifying key areas of technology platform development that Smarter Balanced, its member states, and the supporting test product vendors should attend to during the product building and certification process. Additionally, RESPONDENT should address the implications and resource demands expected in attending to relations management in the areas of responding to field inquiries, answering questions about the specifications and gathering system feedback. Feedback would be expected to improve the certification packages, to prioritize releases and to inform the other contractors developing the specifications and the open source implementation of the technology platforms.

Costs
RESPONDENT is requested to provide information regarding anticipated costs for developing a certification package (or set of packages) based on the details delineated in this RFI.
Where use of existing certification tools is warranted, specify the anticipated cost savings and provide the supporting technical rationale for employing an existing tool. If the response focuses on custom-designed tools, provide the technical rationale for expending resources to build a unique package or set of packages.
Request for Information
Respondents must indicate if the response refers to a system or program used in another capacity for similar purposes. If so, please indicate the specific design intentions of the system, the respondent's analysis supporting implementation for Smarter Balanced's need, the duration of the system/program use, and a summary of the technical attributes. Vendors must completely describe the architecture and technical details of a proposed solution, including use of any third-party software. If a referenced system or program meets only part of Smarter Balanced's requirements, respondents must specify which of the needed elements are present in the referenced system/program, which are partially present, and which are not present.

Estimated Schedule of Activities
Issue Request for Information: September 4, 2013
Question and answer period: September 5-17, 2013
Responses due: September 24, 2013
Review responses: September 25 to October 4, 2013

RFI Coordinator
The RFI Coordinator is the sole point of contact for Smarter Balanced and OSPI for this procurement. All communication intended by a prospective respondent, upon receipt of this RFI, shall be with the RFI Coordinator:

Name: Debra Crawford
Physical Address: 600 Washington Street South
Mailing Address: PO Box 47200
City, State, Zip Code: Olympia, WA 98504-7200
E-Mail Address: debbie.crawford@k12.wa.us

Any other communication will be considered unofficial and non-binding on Smarter Balanced and OSPI. Consultants are to rely on written statements issued by the RFI Coordinator.
General Questions
The following questions have been developed by our project team and express the range of information we are seeking. We encourage you to tell us about any relevant capacity, experience or insights your team may have in addition to the areas identified below.

Background
These questions are intended to give us perspective on the balance of your response and should not be interpreted as minimum qualifications.
1. What experience do you have developing software and IT applications?
2. What experience do you have hosting large IT operations?
3. Have you developed or consumed specifications for data formats? Please elaborate.
4. Have you developed or consumed specifications for protocols? Please elaborate.
5. Have you worked with education technology standards such as SIF, IMS QTI, IMS APIP, etc.?
6. Have you participated in standards compliance efforts, either as provider or consumer?
7. Are you a member of any standards organizations? Do you participate in their standards development processes?
8. Have you developed requirements for software or IT procurement? Please elaborate.
9. Have you participated in open source software development? Please elaborate.

Specifications
10. How would you translate internal specifications and code developed by existing vendors into public-facing specifications? What elements do you expect to add, change or remove in that process?
11. What process would you recommend for specification review? How would that process differ from typical approaches, given that software is already under development using internal specifications?
12. How do you see the specifications and the open source reference implementation relating to each other in terms of release schedules, distribution, repository and management?
13. How would you develop and represent the specifications for the adaptive algorithm and scoring algorithm? How would the specifications relate to the source code for those algorithms?

Sample Data
14.
How much sample data will be necessary in the following categories? Why?
- Simulated student registration data
- Sample test items
- Student responses
- Scored results
15. Are there other categories of sample data that you recommend? Please elaborate.
16. How would you go about creating the sample data?
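As one illustration of how simulated registration data of the kind listed above might be produced, the sketch below generates reproducible synthetic records. The field names and code values are hypothetical and do not reflect any Smarter Balanced data specification.

```python
# Sketch: generate reproducible simulated student registration
# records. Field names and value sets are hypothetical.
import random

GRADES = list(range(3, 9)) + [11]            # grades 3-8 plus high school
ETHNICITY_CODES = ["A", "B", "H", "W", "M"]  # placeholder codes

def make_registration(seq: int, rng: random.Random) -> dict:
    return {
        "student_id": f"SIM{seq:07d}",  # synthetic ID, never real PII
        "school_id": f"SCH{rng.randint(1, 500):04d}",
        "grade": rng.choice(GRADES),
        "ethnicity": rng.choice(ETHNICITY_CODES),
        "accommodations": rng.random() < 0.1,  # flag roughly 10% of records
    }

rng = random.Random(42)  # fixed seed so every run yields identical data
records = [make_registration(i, rng) for i in range(1000)]
print(records[0]["student_id"])  # SIM0000000
```

Seeding the generator means every vendor can regenerate exactly the same data set, which keeps certification exercises comparable across implementations.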
Test Harnesses
17. We anticipate the need for test harnesses that simulate APIs for Single Sign-On and the Data Warehouse. Do you recommend any others?
18. How would you expect the test harnesses to relate to the actual open-source implementations of the respective services? Would that change over time?
19. How would you go about developing and deploying the test harnesses?
20. How do you recommend that the test harnesses be hosted? Who would be responsible for managing the hosted instance? How would that change over time?

Certification Tests
21. How would you manage the development and review of the certification test specifications? What form would they take?
22. Do you expect that any of the certification tests could be automated? Why or why not?
23. Should any of the certification process be delegated to other organizations (e.g., standards bodies)? Why or why not?
24. Following development of the certification package (described by this RFI), Smarter Balanced will procure administration of the certification program (actual testing of vendor products). Should these two (2) services be provided by the same or different organizations? Please elaborate.

Vendor Relations
25. How would you identify potential candidates for certification and make them aware of the opportunity and the certification package?
26. How would the vendor community participate in development and review of certification package components?
27. How would you manage vendor support in such a way that treats all vendors equitably? For example, how would you respond to queries in a way that doesn't give excess favor to the vendor that asked the question?

Cost
The cost information being requested below will be used to help us allocate budget and make a go/no-go decision. If we decide to proceed, you will have an opportunity to refine these figures based on our Request for Proposals.
28.
Given the information you have about our requirements and any additional recommendations you have made in response to the above questions, are you able to provide an estimated cost for development of the certification package? If so, please provide an itemization of major tasks to be performed and the estimated cost of each. Where use of existing certification tools is warranted, specify the anticipated cost savings and provide the supporting technical rationale for employing an existing tool. If the response focuses on custom-designed tools, provide the technical rationale for expending resources to build a unique package or set of packages.