e-asttle Glossary

The following terms are commonly used in e-asttle:



Similar documents
Glossary for the Database on Learning Assessments

Guidelines for Best Test Development Practices to Ensure Validity and Fairness for International English Language Proficiency Assessments

GLOSSARY of Assessment Terms

Chapter 6: The Information Function. Chapter 7: Test Calibration

IEA's TIMSS 2003 International Report on Achievement in the Mathematics Cognitive Domains

The development and implementation of COMPASS: Competency Based Assessment in Speech Pathology

5. Formally register for INS299 for the number of semester hours indicated on your contract.

Writing learning objectives

Writing Good Learning Objectives

Educational Evaluation and Testing. Prepared by Ridwan Mohamed OSMAN

Purposes and Processes of Reading Comprehension

TAXONOMY OF EDUCATIONAL OBJECTIVES (Excerpts from Linn and Miller, Measurement and Assessment in Teaching, 9th ed.)

ALIGNING TEACHING AND ASSESSING TO COURSE OBJECTIVES. John Biggs

Assessment Fact Sheet: Performance Assessment using Competency Assessment Tools

Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias

Cognitive Domain (Bloom)

USING THE PRINCIPLES OF ITIL; SERVICE CATALOGUE

USING THE PRINCIPLES OF ITIL; SERVICE CATALOGUE. Examination Syllabus V 1.2. October 2009

The Effects of Testlets on Reliability and Differential Item Functioning *

EXAM PREPARATION GUIDE

Computerized Adaptive Testing in Industrial and Organizational Psychology. Guido Makransky

CREATING LEARNING OUTCOMES

Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models

The Alignment of Common Core and ACT's College and Career Readiness System. June 2010

CODING THE NATURE OF THINKING DISPLAYED IN RESPONSES ON NETS OF SOLIDS

Assessment Terminology for Gallaudet University. A Glossary of Useful Assessment Terms

How To Write The English Language Learner Can Do Booklet

HECAT: Chapter 5 curriculum fundamentals

Assessment in Singapore:

TIMSS 2011 User Guide for the International Database

TEST AND TYPES OF TEST

Nottingham Trent University. Centre for Academic Development and Quality. Guidance on how to construct and use grading matrices.

Factsheet ITIL V3 Capability module Service Offerings and Agreements

TESTING CENTER SARAS USER MANUAL

ACT Research Explains New ACT Test Writing Scores and Their Relationship to Other Test Scores

The Galileo Pre-K Online Educational Management System

Assessment and Outcomes for Counselor Education Programs

PSYCHOLOGY Vol. II - The Construction and Use of Psychological Tests and Measures - Bruno D. Zumbo, Michaela N. Gelin, Anita M.

oxfordenglishtesting.com

Section 1. Test Construction and Administration

Glossary of Testing, Measurement, and Statistical Terms

Mathematics Cognitive Domains Framework: TIMSS 2003 Developmental Project Fourth and Eighth Grades

Factsheet ITIL V3 Capability module Release, Control and Validation

THE ATLANTIC STUDY. Novel applications of modern measurement theory, now and in the future

INTERMEDIATE QUALIFICATION

Civic Education in California: Policy Recommendations

Outcomes for All Professional Degrees in Music Utah State University 2015

Assessment Terminology Accountability Achievement Test Action Research Affective Alternative Assessment Analytic Scoring Aptitude Test

Journey Towards International Benchmarking: A Story about Private Schools in the United Arab Emirates (UAE)

Abstract Title Page. Title: Conditions for the Effectiveness of a Tablet-Based Algebra Program

Text of article appearing in: Issues in Science and Technology, XIX(2), Winter. James Pellegrino, Knowing What Students Know

New Mexico 3-Tiered Licensure System

VOLUME MEASUREMENT AND CONSERVATION IN LATE PRIMARY SCHOOL CHILDREN IN CYPRUS

The Learning Skills Pyramid

TIPS FOR WRITING LEARNING OBJECTIVES

CRCT Content Descriptions based on the Georgia Performance Standards. Reading Grades 1-8

Evaluating the Fit between Test Content, Instruction, and Curriculum Frameworks: A Review of Methods for Evaluating Test Alignment

A Guide to Curriculum Development: Purposes, Practices, Procedures

Writing Goals and Objectives. If you're not sure where you are going, you're liable to end up some place else. ~ Robert Mager, 1997

Understanding Your Praxis Scores

INTERMEDIATE QUALIFICATION

Transcript: What Is Progress Monitoring?

Broad and Integrative Knowledge. Applied and Collaborative Learning. Civic and Global Learning

Team-Based Learning Student Study Guide

Demonstrating Understanding Rubrics and Scoring Guides

MOOC takers outcome beliefs, their attitude, and their intention toward enrolling in MOOCs and completing them

Appendix A: Annotated Table of Activities & Tools

Instructional Design for Engineering Programs

INTERMEDIATE QUALIFICATION

Critical Thinking in Community Colleges. ERIC. Digest.

Performing a data mining tool evaluation

EXAM PREPARATION GUIDE

Learning theories Judy McKimm

INTERMEDIATE QUALIFICATION

Sample Questions with Explanations for LSAT India

Effects of Different Response Types on Iranian EFL Test Takers' Performance

Data Coding and Entry Lessons Learned

Alignment of Taxonomies

Keith A. Boughton, Don A. Klinger, and Mark J. Gierl. Center for Research in Applied Measurement and Evaluation, University of Alberta

Rasch Measurement. Doctoral course at Karlstad University, autumn term 2014

History and Purpose of the Standards for Educational and Psychological Testing

Responsibilities of Users of Standardized Tests (RUST) (3rd Edition) Prepared by the Association for Assessment in Counseling (AAC)

Using Eggen & Kauchak, Educational Psychology: Windows on Classrooms for the New York State Teacher Certification Examinations

Qualification details

MODULE 12: EVALUATION TECHNIQUES

Catering for students with special needs

Teaching, Learning and Assessment Policy


Achievement Objective (AO)
A specified part of the unpacked subject curriculum described in the e-asttle curriculum map.

Advanced (A)
The student is consistently meeting the criteria at this level, and little disconfirming evidence is found. This student is ready to move on to material at the next curriculum level.

Analytic scoring
A method of subjective scoring, often used in the assessment of writing and speaking skills, where a separate score is awarded for each of a number of features of the task, as opposed to one global score (see holistic scoring).

asttle
Assessment Tools for Teaching and Learning.

Basic (B)
The student is showing signs of the criteria elements; the elements are evident in embryonic form. This is the entry-level behaviour described by the curriculum for this level.

Calibration
Determining the value of a test item against a particular measurement scale; the value reflects item difficulty. IRT methods are used for this (see also item calibration, and the worked equation after this group of entries).

Cognitive processing
The level of complexity required to respond to an item. e-asttle uses a SOLO taxonomic description of an item. Two categories are used: surface (uni-structural and multi-structural) and deep (relational and extended abstract).

Component
A high-level description that groups related functionality together.

Construct
The trait or traits that a test is intended to measure. Can also be defined as an ability or set of abilities that will be reflected in test performance, and about which inferences can be made on the basis of test scores.

Constructed response
An item that requires the test taker to provide their own response. Sometimes called supply items, as the test taker has to supply the answer. Examples include short answer, essay, cloze procedure, and performance assessments. (See also selected response.)

Content
The big ideas of the curriculum map. These may not correspond exactly with the strands in the related New Zealand Curriculum document.

CTT
Classical Test Theory.
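
The worked equation below makes the calibration entry concrete. It shows the Rasch model, the simplest IRT model, in LaTeX notation; this one-parameter form is an illustrative assumption, since the exact model e-asttle uses is specified in its technical reports.

    P(X_{pi} = 1 \mid \theta_p, b_i) = \frac{e^{\theta_p - b_i}}{1 + e^{\theta_p - b_i}}

Here \theta_p is person p's ability and b_i is item i's difficulty, both in logits. When they are equal, the probability of a correct response is 0.5; calibration is the task of estimating each b_i from observed responses.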

Curriculum level
The levels specified in the New Zealand Curriculum that students should progress through as they move through their schooling, from Level 1 (entry) to Level 8 (at the end of Year 13).

Curriculum map (e-asttle)
An unpacking of the New Zealand Curriculum statement for a given subject by expert advisers.

Cut score
The logit value that marks the boundary between two different levels. Arrived at as a result of the standard-setting exercise.

Deep
Relational and extended abstract items, when assessed using the SOLO taxonomy.

Dichotomous
An item that is marked (scored) for one of two possible outcomes: T/F, Y/N, or correct/incorrect. Marks/scores are awarded as 0 (incorrect) or 1 (correct).

Differential item functioning (DIF)
A feature of an item that shows up in analysis as a group difference in the probability of answering that item correctly. The presence of differentially functioning items in a test has the effect of boosting or diminishing the total test score of one or another of the groups concerned.

Distractors
The incorrect response options supplied in multiple-choice items.

Domain
That portion of the total universe of subject matter that is being tested, and for which inferences can therefore be made.

Estimates
The term used for measures of person ability and item parameters produced in latent trait models (IRT). These estimates are given in logits, units of measurement on a logarithmic scale (a sketch of the arithmetic follows this group of entries).

Factor analysis
A method of reducing the number of variables accounting for performance by identifying the underlying factor(s) shared by a set of test items. For example, a language test may have many items related to the factors of listening, speaking, reading, and writing.

Holistic scoring
A marking procedure that judges a piece of work (writing, speech, etc.) impressionistically according to its overall properties, rather than as the sum of its parts (see analytic scoring).

IEA
International Association for the Evaluation of Educational Achievement. Based in the Netherlands; conducts several international benchmarking studies (e.g., PIRLS, TIMSS).
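
A minimal sketch of the logit arithmetic behind the estimates entry above. The function below illustrates the scale only; it is not e-asttle's calibration procedure, which iterates over the full matrix of responses.

    import math

    def naive_difficulty(p_correct: float) -> float:
        """Rough item difficulty in logits: the negative log-odds of the
        proportion of examinees answering the item correctly."""
        return -math.log(p_correct / (1.0 - p_correct))

    print(naive_difficulty(0.50))  # 0.0 logits: an average item
    print(naive_difficulty(0.25))  # about +1.10 logits: a hard item
    print(naive_difficulty(0.95))  # about -2.94 logits: a very easy item

An item that only a quarter of examinees answer correctly sits roughly one logit above the group's average ability, which is why calibrated items usually fall between -3 and +3 logits.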

IRT
See Item Response Theory.

Item
Osterlind (1990) offers this definition: "a unit of measurement with a stimulus and prescriptive form for answering; and, it is intended to yield a response from an examinee from which performance in some psychological construct (such as knowledge, ability, predisposition, or trait) may be inferred" (p. 3). Essentially, it is a question/stimulus/prompt to which a student provides some response that can be scored, and from which we can determine their ability in the content area being assessed.

Item bank
A relatively large and accessible collection of items with known properties.

Item calibration
The process of estimating the position of an item along a continuum (or line of the variable) along which persons are being measured. Items at one end of the continuum will be more difficult than those at the other end.

Item format
A description of the different ways in which items can be constructed. For example, multiple choice, true/false, cloze procedure.

Item Response Theory (IRT)
Modern test theory that enables test items and students to be placed on the same scale of proficiency. Used in e-asttle as the basis for calculating the logits for each item, and hence student and group proficiency for the different reports.

Key (Answer)
The correct answer for an item that will be awarded a score if present. It should include all possible variations and forms of acceptable answer.

Key (Answer) rules
The rules to apply for scoring an item. For example, "both required for one mark" where two responses are required.

Link items
Items that occur in more than one trial paper and are used as comparison points for analysis purposes.

Logit
An IRT statistic for each item that indicates the difficulty of the item. Used by the test compiler to select items for a test, and also used to determine a student's score on the relevant scale (see the sketch after this group of entries). The usual range for logits is -3 to 3, but values can occur outside this range.

Module
A high-level description that groups a number of components into logical groupings.
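
The logit and IRT entries above describe placing students and items on the same scale. The sketch below shows one standard way this can be done under the simple Rasch model: a Newton-Raphson maximum-likelihood estimate of a student's ability from calibrated item difficulties. It is an illustrative assumption, not e-asttle's actual estimation routine, and it requires a mixed response pattern (an all-correct or all-wrong script has no finite estimate).

    import math

    def rasch_p(theta: float, b: float) -> float:
        """Rasch probability of a correct response: ability theta, difficulty b."""
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    def estimate_ability(responses, difficulties, iterations=20):
        """Newton-Raphson maximum-likelihood ability estimate, in logits.
        responses: 0/1 item scores; difficulties: calibrated item logits."""
        theta = 0.0
        for _ in range(iterations):
            probs = [rasch_p(theta, b) for b in difficulties]
            gradient = sum(x - p for x, p in zip(responses, probs))
            information = sum(p * (1.0 - p) for p in probs)
            theta += gradient / information
        return theta

    # Five calibrated items; the student answers the three easiest correctly.
    difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
    responses = [1, 1, 1, 0, 0]
    print(round(estimate_ability(responses, difficulties), 2))  # about 0.59

The estimate settles where the expected number of correct answers matches the observed number, which here places the student just above the middle item's difficulty.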

Polytomous
Items that cannot be scored dichotomously (i.e., as a simple T/F or Y/N); items where more than two values can be assigned as a score. Common on attitude and personality scales with multiple response categories. In asttle, when an item has a score value of 2 or more, it is scored polytomously, as it is possible to obtain a score of 0, 1, or 2 (up to the maximum score value).

Proficient (P)
There is evidence that the student is controlling or mastering the criteria elements. They should correctly answer items at this level about two-thirds of the time.

RUMM
Rasch Unidimensional Measurement Model. An analytical software tool that uses Item Response Theory (IRT) to provide the underlying performance figures that drive the item selection and reporting functions of e-asttle.

Scope and Sequence
In many curricula around the world there is what is called a "scope and sequence", where certain standards need to be taught in a certain order. For example, there may be a group of standards taught pre-March and another set post-March; if the timeframe is pre-March, students should only be assessed on those standards.

Score type
Indicates whether an item is scored dichotomously or polytomously.

Scoring
The process of assigning a numerical value (score) to an answered item.

Selected response
An item where the test taker has to select or choose the correct answer from a set of answers that are provided. For example, multiple choice. (See also constructed response.)

SOLO
Structure of Observed Learning Outcomes. A cognitive processing taxonomy (classification system) devised by Biggs and Collis, and used in e-asttle to identify items that are surface and deep in the cognitive processing required.

Standard setting
The process of arriving at the cut scores that distinguish different levels of achievement within the curriculum (a sketch of how cut scores are applied follows this group of entries).

Stem
The part of the item that asks the question or sets up the situation for response.

Stimulus
The material (text, diagram, graph, photo, etc.) provided for which an item is written and that requires a response from the test taker.

Surface
Uni-structural and multi-structural items, when assessed using the SOLO taxonomy.
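
Once standard setting has produced cut scores, applying them is a simple lookup. The boundary values below are invented for illustration; the real values come from the standard-setting exercise described above.

    # Hypothetical cut scores (in logits) marking the lower boundary of each
    # curriculum level; real values come from standard setting.
    BOUNDARIES = [(-1.5, "Level 2"), (-0.5, "Level 3"), (0.5, "Level 4"), (1.5, "Level 5")]

    def curriculum_level(theta: float) -> str:
        """Return the curriculum level whose band contains ability theta."""
        level = "Level 1"
        for cut, label in BOUNDARIES:
            if theta >= cut:
                level = label
        return level

    print(curriculum_level(-2.0))  # Level 1
    print(curriculum_level(0.0))   # Level 3
    print(curriculum_level(2.0))   # Level 5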

Technical reports
Documents that describe the research base of e-asttle, and the reasons for the processes/actions taken.

Testlet
Probably the most misunderstood term we use. The term testlet was first used by Wainer and Kiely (1987), who defined it as an aggregation of items based on a single stimulus; for instance, a reading comprehension test with a passage and a set of (say) four to twelve items that accompany it. These items are not independent of each other: misinterpretation, subject expertise, fatigue, and so on make a test taker's responses to the items more highly related to each other than would occur with a set of totally independent items. For e-asttle, we use the term testlet to refer to stimulus material that has several items that can appear together or independently. Testlet images are stored separately from the question/item, so that the items can access the testlet image independently (a sketch of this structure follows).

Types of items
A description of the different ways in which items can be constructed. For example, multiple choice, true/false, cloze procedure, short answer.
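
A minimal sketch of the testlet structure described above: a shared stimulus stored once and referenced by several items. The field names and schema are assumptions for illustration, not the e-asttle data model.

    from dataclasses import dataclass, field

    @dataclass
    class Testlet:
        """A shared stimulus with the items attached to it (illustrative only)."""
        stimulus_text: str
        image_ref: str = ""                 # stored separately from the items
        item_ids: list[str] = field(default_factory=list)

    passage = Testlet(
        stimulus_text="The kakapo is a flightless parrot found only in New Zealand...",
        image_ref="images/kakapo-01.png",   # hypothetical path
        item_ids=["Q101", "Q102", "Q103"],  # may appear together or on their own
    )

    # Each item references the testlet, so any subset of its items can be
    # delivered in a test while reusing the single stored stimulus/image.
    print(passage.image_ref, passage.item_ids)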