Kurt F. Geisinger Changes in Outcomes Assessment for Accreditation: The Case of Teacher Education in the United States
Overview of Accreditation
Accreditation

Three accreditation agencies for tests in the United States are:
American National Standards Institute (ANSI)
Institute for Credentialing Excellence (ICE)
Buros Center for Testing

ICE was formerly known as the National Organization for Competency Assurance (NOCA); its accrediting body is the National Commission for Certifying Agencies (NCCA).
Buros Center for Testing

The Buros Center for Testing, at the University of Nebraska-Lincoln, is an independent not-for-profit organization that has identified essential components for proprietary testing programs and established a mechanism to determine whether these organizations meet these standards. The Psychometric Consulting arm of the Buros Center for Testing offers an accreditation program based directly on the Standards for Educational and Psychological Testing (1999). This program is revised periodically to conform to the most current version of the Standards, such as the newly released Standards (2014). The testing programs that Buros has served in this capacity (for accreditation) are proprietary in nature and generally offer certification and licensure tests. Under the Buros protocol, there is considerably less emphasis on program management except where those processes bear on psychometric concerns.
Buros Standards for Proprietary Testing Programs

Purpose of the Testing Program
Structure and Resources of the Testing Program
Examination Development: Job Analysis or Content Specification; Item Development; Pilot Testing; Creation of Final Exam Forms; Periodic Review; Test Adaptation
Examination Administration: Application and Registration; Accommodations; Administration Sites; Test Administrators and Proctors; Procedures for Administration
Scoring and Score Interpretation: Scoring; Scaling; Standard Setting; Norm-Referenced Interpretation; Equating/Comparability
Score Reporting and Record Keeping
Exam Security
Responsibilities to Examinees and the Public
Stage 1 Audit/Accreditation

Stage One of the accreditation process is a review of a testing program's practices and procedures. This includes, but is not limited to, examination of the testing program's organizational structure, procedures associated with test development, test administration, psychometric methods, analyses, test security, scoring, reporting, and record maintenance.

Stage One of the Buros Psychometric Consulting audit process:
1. The testing program submits materials documenting the policies, procedures, and results of its testing program(s) to Buros for review.
2. Buros conducts a thorough review of all materials and evaluates policies and procedures against the Buros Standards for Proprietary Testing Programs.
3. Buros conducts a site visit to the testing program's office to observe processes and interview key personnel.
4. The testing program's key personnel are given an opportunity to review the draft report for factual accuracy.
5. Buros prepares the final Stage One audit report.

Successful completion of Stage One results in a general accreditation of the provider's examination program but does not accredit specific tests. Many clients opt to participate in Stage One as an audit of their program.
Stage 2 Test Accreditation

If Stage One accreditation is awarded, the program is eligible to move to Stage Two of the accreditation process. This stage involves review of the psychometric characteristics of specific tests that were developed using the processes accredited in Stage One.

Stage Two of the Buros Psychometric Consulting test accreditation process:
1. The testing program submits materials documenting the processes and practices associated with the results and data analyses of each test to Buros for review.
2. Buros conducts a thorough review of these materials against the Buros Standards for Proprietary Testing Programs. This review includes examining the psychometric properties of the test, including the consistency of the test form with the table of specifications, reliability, decision consistency, the equated passing score, item analyses, and other analyses.
3. Buros prepares a report of findings detailing the extent to which the characteristics of the test(s) adhere to the Buros standards.

The second stage, if successfully passed, results in the accreditation of specific tests developed by the test provider.
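To make one of the psychometric indices above concrete: internal-consistency reliability is often estimated with Cronbach's alpha. The sketch below is a minimal illustration only, not Buros's review procedure; the `cronbach_alpha` function and the response matrix are hypothetical.

```python
# Minimal sketch: Cronbach's alpha, one common internal-consistency
# reliability index a Stage Two review might examine.
# All response data below are hypothetical.

from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for an examinees-by-items matrix of item scores."""
    n_items = len(scores[0])
    # Variance of each item column across examinees
    item_vars = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of examinees' total scores
    total_var = pvariance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 examinees x 4 items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(responses), 3))
```

In practice an operational program would compute such indices on full operational data and report them alongside decision consistency and item analyses.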
Standards: AERA, APA, & NCME

The purpose of publishing the Standards is to provide criteria for the evaluation of tests, testing practices, and the effects of test use. Although the evaluation of a test or testing application should depend heavily on professional judgment, the Standards provide a frame of reference. The 2014 Standards for Educational and Psychological Testing is a collaborative publication of three organizations: the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME).
Accreditation in Higher Education

According to the Commission on Higher Education, one of the criteria for accreditation is outcomes or institutional effectiveness. The Council for Higher Education Accreditation (CHEA) serves as a voice for colleges and universities on matters related to accreditation. Accreditation efforts in higher education settings are often placed within the portfolio of Student Affairs departments. Assessment is vital to demonstrate accountability and commitment to continuous improvement.
Accreditation and Higher Education in the United States of America
Assessment is typically situated in the Student Affairs portfolio

Assessment can provide a connection between goals and outcomes, define quality, and determine whether services were delivered, or perceived to be delivered, successfully.
National pressure to demonstrate effectiveness and accountability
Importance of survival rates
Policy development and strategic planning
Accessibility
Impact of non-academic factors on outcomes
Academic Accreditation: Universities, Schools/Colleges, and Programs

Once upon a time, accreditation was largely based upon inputs.
Now there is much more focus on outcomes:
Not how good the students are coming in
Rather how good they are going out
Much less emphasis upon resources
However, more colleges and universities lose accreditation over a lack of fiscal resources than anything else.
After all, the purpose of accreditation is protecting the public.
Years Ago

Accreditation talked about the following topics:
Institutional context
Faculty
Students
Curriculum
Resources
Everything was inputs. Times have changed.
Politics and Policy Questions

Politicians and policy makers ask very tough questions of academic institutions for reasons that serve very different purposes, but which put demands on institutions to answer or suffer consequences.
1. What are your college's contributions to learning?
2. Do your graduates know what you think they know?
3. Can your graduates do what your degree implies?
4. How do you ensure #2 and #3?
5. At what level are students learning what you are teaching?
6. What combination of effort from the institution and students would it take to increase the level of student learning?
Politics and Policy Questions

Because it is not prudent to ignore these questions, colleges and universities develop assessments to provide a systematic approach and collect evidence to support their responses. Conducting assessments on a routine basis is encouraged to contribute to institutional improvement. Another major question being asked in the USA pertains to diversity and accessibility: Is your college accessible to all qualified students regardless of gender, age, race, and other demographic background variables?
US Accreditation

According to the Commission on Higher Education, one of the criteria for accreditation is outcomes or institutional effectiveness.

Assessment Model in Higher Education:
1. Tracking
2. Needs Assessment
3. Satisfaction Assessment
4. Student Cultures & Campus Environment Assessment
5. Outcomes Assessment
6. Comparable Institution Assessment
7. National Standards
8. Cost Effectiveness
Higher Education (USA)

The Council for Higher Education Accreditation (CHEA) serves as a voice for colleges and universities on matters related to accreditation. There are six regional accrediting bodies in higher education in the USA:
Middle States Association of Colleges and Schools
New England Association of Schools and Colleges
North Central Association of Colleges and Schools
Northwest Association of Schools and Colleges
Southern Association of Colleges and Schools
Western Association of Schools and Colleges
Example of General Criteria for Higher Ed. Accreditation

Standards for accreditation vary from region to region, and in some cases specific programs within an institution go through their own accreditation process by means of a separate accrediting body (e.g., medicine, law, teacher preparation).
Resources: The institution has appropriate resources to accomplish its purposes and deliver services.
Capability: The institution demonstrates the capacity to accomplish its purposes and commitments.
Sustainability: The institution gives reason to believe that it will continue to accomplish its mission.
Standards for Accreditation of Higher Ed. Institutions

Purposes:
Clearly stated
Publicly stated
Consistent with its mission
Appropriate to an institution of higher education

Resources:
Effectively organized (human, financial, and physical)
Adequate to accomplish purposes

Status:
Currently accomplishing its educational and other purposes
Capacity to sustain current status and strengthen educational effectiveness
Demonstrates integrity in its practices
University Accreditation

Focuses upon general issues, e.g.:
What is the curriculum for all students?
What are the resources for all students?
How are faculty compensated?
Do faculty have the appropriate degrees?
School and Program Accreditation

More specific, obviously.
Here at KFUPM, I understand that you have ABET and AACSB accreditation.
These can be more focused; AACSB focuses on scholarly productivity as well as teaching.
All Accreditation

Much more focused on what students are learning.
How do you know what students are learning?
Used to be based upon the planned curriculum and the syllabi
Now focused upon assessment outcomes
Assessment Strategies

Needs Assessments
Outcome Assessments
Institutional Support
Learner Outcomes

Skills, Knowledge, Ability

Knowledge is more than information stored or memorized; there are many dimensions to the assessment of knowledge. Knowledge is important in building a foundation for higher-level thinking, but testing exclusively in this domain will not establish the full range or depth to which a subject area was learned.
Skills are complex acts that require knowledge and generally involve some degree of practice to achieve mastery.
Ability, physical or mental (e.g., critical thinking, creativity), involves a very complex relationship with knowledge and skills as well. Abilities typically take a long time to develop or train, and in some respects they may be influenced by natural talent.
Needs Assessment/Need for the Program

Are there jobs for the graduates?
What are the success rates of graduates getting positions?
How do they do in those positions?
Assessment: Ethical Issues

Respecting Autonomy (informed consent)
Doing No Harm (IRB)
Benefiting Others
Being Just
Being Faithful
Data Access
Data Ownership
Role Conflicts
Ethical Dilemma Exercise

1. You are offered a substantial fee to assess the student life office at a college in a neighboring municipality. Your college competes strongly with this college for students. Should you accept the invitation?
2. Under the cover of confidentiality, you learn during an assessment of student government operations that certain financial irregularities have occurred. Your informant does not want the situation investigated, since the person's identity will be revealed. What do you do?
Case Study: Assessing the Assessments of Teacher Preparation
Teacher Education Accreditation

Voluntary (in some states)
Two organizations: NCATE and TEAC
NCATE focused upon the older model: resources
TEAC focused upon the newer model: plans and assessment
They merged, with two tracks, but perhaps more focus on the latter.
I was on the commission that combined them and developed the new standards for accreditation.
Task Force

The American Psychological Association's (APA) Education Directorate, with support from the Council for the Accreditation of Educator Preparation (CAEP), convened a Task Force in December 2012 to develop a practical, user-friendly resource based on the psychological science of program assessment.

Frank C. Worrell (Task Force Chair), Professor in the Graduate School of Education, University of California, Berkeley
Mary M. Brabeck, Professor of Applied Psychology and Dean Emerita, New York University
Carol Anne Dwyer, Distinguished Presidential Scholar Emerita, Educational Testing Service
Kurt F. Geisinger, Meierhenry Distinguished University Professor and Director, Buros Center for Testing, University of Nebraska-Lincoln
Ronald W. Marx, Professor of Educational Psychology and Dean of Education, University of Arizona
George H. Noell, Professor, Department of Psychology, Louisiana State University
Robert C. Pianta, Dean of Education and Novartis Professor of Education, University of Virginia
Rena F. Subotnik, Director, Center for Psychology in Schools and Education, and Associate Executive Director, Education Directorate, American Psychological Association

The report, Assessing and Evaluating Teacher Preparation Programs (Worrell et al., 2014), informs teacher education practitioners and policy makers about how best to use data to make decisions about evaluating and improving educator preparation programs.
Concerns

The use of standardized assessments to judge the adequacy of teacher education programs
The impact of institutional selectivity
The interplay of labor market competition for scarce versus abundant openings in school districts around the nation
The need for a high level of technical expertise to estimate precise bounds and avoid the risk of confounded analyses
Technical issues such as: the availability of nonacademic predictors; test quality; the level of missing data
Focus

The potential utility of three methods for assessing teacher education programs:
value-added assessments of student achievement gains,
standardized observation protocols of teacher behavior,
surveys of graduates', principals', and students' views of teacher performance.
Basic Assessment Principles 1/2

Student learning is recognized as the most critical outcome of effective teaching and should be integrated into teacher preparation programs' fidelity assurance; programs need such evidence both to assure the public and to improve their programs.
The Standards for Educational and Psychological Testing:
represent the consensus of the measurement field regarding the technical quality of assessment;
recognize the importance of using assessment appropriately in the service of improving learning outcomes;
should be consulted when building a program assessment system.
Basic Assessment Principles 2/2

The Standards identify validity as the most important assessment characteristic, the foundation for judging technical quality. Validity is a comprehensive concept encompassing:
reliability;
intended and unintended consequences of the assessment;
fairness, including the introduction of irrelevant variation (error).
In cases where there are implications of considerable change, those changes will need to be phased in gradually.
Value Added Assessment (In Principle)

Academic achievement can be assessed rigorously through standardized test scores. Assessment of new teachers' impact on student learning is the most important information for accountability and continuous improvement of teacher preparation programs.
Value Added Assessment (VAA): using student learning outcome data to assess teacher education programs. VAA assesses changes in student learning over a time period, typically an academic year.
Typical observational ratings by principals (supervisors) tend to lack rigor and utility.
Achievement gains of pupils taught by a program's graduates can be used to assess teacher preparation programs.
Data averaged over graduates of a program are more reliable than individual data.
When using VAA to evaluate programs:

Standardized assessments must be psychometrically sound, reasonably related to one another across years, and aligned with instruction.
The available links between students across years must be sufficiently complete so that the analyses are not undermined by large-scale and/or selective attrition due to unmatched records.
When using VAA to evaluate programs:

VAA will provide information about only a limited and selective sample of new teachers: those who obtain public school teaching positions (a) in the states where they were trained and (b) in tested grades and subjects.
Data must be available that describe critical information about students that is anticipated to influence results.
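The core idea of VAA, assessing changes in student learning over a year and averaging gains over a program's graduates, can be sketched in a few lines. This is an illustrative toy only: operational VAA systems use statistical models with covariate adjustment, not raw gain averages, and all names and data below are hypothetical.

```python
# Toy sketch of the VAA idea: average pupil gains (post - pre) per
# teacher preparation program. Operational VAA uses far more elaborate
# models; this only illustrates why program-level averages are more
# stable than individual classroom results. All data are hypothetical.

from statistics import mean

# Hypothetical records: (program, pupil pre-score, pupil post-score)
records = [
    ("Program A", 48, 61),
    ("Program A", 55, 60),
    ("Program A", 40, 52),
    ("Program B", 50, 54),
    ("Program B", 62, 63),
]

def program_gains(rows):
    """Average pupil gain per preparation program."""
    by_program = {}
    for program, pre, post in rows:
        by_program.setdefault(program, []).append(post - pre)
    return {p: mean(gains) for p, gains in by_program.items()}

print(program_gains(records))
```

Averaging over many pupils of many graduates damps the noise in any single classroom's scores, which is the sense in which program-level data are more reliable than individual data.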
Using Standardized Observations to Evaluate Teacher Education Programs

Standardized observations of behaviors can be used to identify effective classroom practices and programs' success in preparing teachers.
Teachers' interactional behaviors: management; social interactions with students.
Evidence from large-scale studies demonstrates that scores obtained from standardized observations predict a range of student outcomes in areas of achievement, social behavior, and engagement.
Using Standardized Observations to Evaluate Teacher Education Programs

Although 95% of teacher candidates are observed during their teaching placements, fewer than 15% of these observation instruments meet sufficient levels of evidence for the reliability or validity of the data they collect to warrant the inferences being drawn (Ingersoll & Kralik, 2004).
To ensure the fair, accurate, and valid use of observation measures:

First and most important is the extent to which the observation instrument is standardized in its administration procedures. Does it offer clear directions for conducting observations and assigning scores? Does it ensure consistency and quality control? Are observers qualified, and is the length of observations uniform?
Second, does the observation provide reliability information and training criteria?
The third essential question concerns validity. Is there evidence that data from observations of teaching (scores, ratings) actually predict student learning? Are the data drawn from samples of teacher candidates or students similar to those in a specific teacher preparation program?
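One common way to answer the second question above, whether an observation protocol yields reliable data, is to check agreement between two trained observers rating the same lessons, for example with Cohen's kappa. The sketch below is a minimal illustration with hypothetical ratings, not a prescribed procedure.

```python
# Minimal sketch: inter-rater agreement via Cohen's kappa for two
# observers rating the same set of lessons. All ratings are hypothetical.

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same units."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed proportion of agreement
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical lesson ratings on a 3-point scale (1 = low, 3 = high)
a = [3, 2, 3, 1, 2, 3, 2, 1]
b = [3, 2, 2, 1, 2, 3, 3, 1]
print(round(cohen_kappa(a, b), 3))
```

Kappa corrects raw percent agreement for agreement expected by chance, which is why it is preferred to simple agreement rates when documenting observer training criteria.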
Using Surveys to Evaluate Teacher Education Programs

(a) surveys of teachers' satisfaction with the training they received and their perceived competence
(b) surveys of employers (e.g., principals) about the performance of graduates of particular programs
(c) surveys of students regarding their teachers' performance and behavior
Student Surveys

Student surveys of teacher effectiveness have considerable support in the empirical literature. They:
are related to achievement outcomes;
are more highly correlated with student achievement than are teacher self-ratings and ratings by principals;
are more accurate in distinguishing between more and less effective teachers than other metrics.
Student surveys may be particularly useful in formative evaluation, and the scores can isolate areas in which programs need to improve. Graduates can provide useful feedback about how well prepared they were to operate effectively in the classroom as a result of various aspects of their teacher preparation program. There is consensus that student surveys should not be used in isolation, and that data should be collected at multiple times and from multiple classes.
Conclusion of the APA Report

The report urges teacher education practitioners to use procedures, data, and methods that are informed by well-established scientific principles. Such efforts, when they involve multiple sources of reliable and valid data, can help faculty evaluate programs and improve their efforts to educate highly effective teachers. Faculty who use valid methods of data collection, analysis, and interpretation in combination, rather than relying on any single method, will derive the most valid, fair, and useful profiles of program effectiveness.
Conclusion of the APA Report

The use of scientifically based procedures will provide the public with evidence that teacher education is rigorous and accountable. The report encourages teacher education faculty to work in partnership with school districts and states to invest time and resources in the development of systems that allow them to demonstrate with confidence that candidates completing their programs are making substantive contributions, as new teachers, to the learning outcomes of all of the students that they teach.
In the context of your reality!

What is the mission of your program?
What are the expected outcomes?
How do you, your institution, and your community define success?
Outline the current organizational structure for: management, academics, program evaluation, accountability.
What measures/assessments are currently being used?
What is missing or in need of improvement?
What's next? How can you contribute?