
NBEO FAQs Regarding the December 2, 2014 PAM/TMOD Exams

1. What exactly occurred on December 2, 2014 during the PAM/TMOD exams?

According to Pearson VUE, the December 2, 2014 Part II exam publication at Pearson VUE testing centers contained an erroneous navigational setting that prevented candidates who had reached the final Review Screen from returning to the exam to answer items they had missed or to change previous responses. This erroneous navigational setting was due to a Pearson VUE coding error that was not caught during the Pearson VUE QA process.

2. What steps have been taken to prevent recurrence of the same type of testing irregularity?

Pearson VUE has updated its Quality Assurance scripts to rigorously evaluate the Section Navigation switch that governs this behavior. Pearson VUE has run the upgraded scripts on new exams, has confirmed that they work correctly in the field, and will continue to do so for all future exams. In addition, the NBEO will review all future computer-based examinations using software that gives staff unlimited attempts to test an exam exactly as a Candidate will see it. This software gives the NBEO professional staff the opportunity to take the exam under the conditions that actually exist at the test center, so the NBEO can more accurately determine whether the exam will be administered faithfully to its standards of high quality. Beyond correcting the coding error and updating its Quality Assurance practices, Pearson VUE has been very responsive and supportive of the NBEO in providing follow-up remedies for affected exam candidates.

3. Were there scoring algorithm(s) that accommodated the various circumstances under which students took the examination?
As indicated in Question 1 and according to Pearson VUE, the Part II PAM exam publication contained an erroneous navigational setting that did not allow candidates to go back and review flagged items. It is because of this error that the NBEO has worked to provide candidates with various avenues of redress. The scoring algorithm is the same one used in previous Part II PAM administrations (i.e., number-correct scoring: a correct response earns 1 raw point). It was not feasible to design and develop a scoring algorithm that would account for the different flagged questions on the Tuesday Part II PAM administration in a way that would accommodate all impacted candidates, and it is imperative to maintain a consistent scoring method for everyone who took the December 2 administration of Part II PAM.

4. Why is the TMOD pass rate, both locally and nationally, so much lower than in prior years? The national percentages of correct scores and standard deviations are almost identical over the last 3 years, but the pass rate is 25% lower in 2014.

The lower pass rate on the TMOD portion of the test is possibly due to several factors:

1) A new standard setting took place on January 9-10, 2015, and a new cut score (89 raw points out of a total of 112 raw points) was established for the TMOD test embedded within Part II PAM.

2) The erroneous navigational setting did not allow Candidates to review flagged items on session 2 of the Part II PAM examination.

3) Candidate preparedness varies from year to year.

These three factors are probable causes of the reduction in the TMOD pass rate for the December 2014 administration of Part II PAM. The NBEO could not have predicted the glitch with the navigational setting, nor could it have predicted how 23 standard setters from various parts of the country would respond to the question: Would a Minimally Qualified Candidate answer this item correctly or not? The definition of an MQC was agreed to by all of the standard setters before they looked at a single item (full case, minicase, or solo item) on the examination. Hence, it is not a change in the items that established a new definition of the MQC, because the standard setters did not review any items while deriving the definition. All data presented are taken from the Institutional aggregate reports provided to the institutions upon request. The table below provides data for the December (targeted) administrations of TMOD.
               Dec-10   Dec-11   Dec-12   Dec-13   Dec-13
Form           A        A        A        A        B
% corr         81.09    76.87    81.76    83.99    79.98
Std Dev        9.09     9.64     8.88     7.89     9.61
Pass Rate (%)  91.02    90.12    89.69    97.37    90.87

One can conclude that the candidates' performance did not change over the past years if, and only if, the candidates' means and variances remain the same after adjusting for form difficulty (equating the different forms of the test). The data show that performance has not been the same over the past 4 years, especially on the TMOD portion of the test. Furthermore, the TMOD test itself has not been the same: the number of items on the TMOD has varied from one administration to another.
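The point in Question 4 — that a cohort with a nearly unchanged mean and standard deviation can still show a much lower pass rate once the cut score moves — is easy to see with a rough sketch. The numbers below are hypothetical: the mean and SD loosely echo the table above, the older effective cut of 72% is invented for illustration, and real score distributions are neither published nor exactly normal. Only the direction of the effect is the point.

```python
import math

def normal_cdf(x: float, mean: float, sd: float) -> float:
    """P(X <= x) for a normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def pass_rate(cut_pct: float, mean_pct: float, sd_pct: float) -> float:
    """Fraction of candidates at or above the cut, under a normal approximation."""
    return 1.0 - normal_cdf(cut_pct, mean_pct, sd_pct)

# Hypothetical cohort: mean 81% correct, SD 9 (similar to the table above).
old_cut = 72.0              # illustrative older effective cut (an assumption)
new_cut = 89 / 112 * 100    # the 2015 raw cut of 89/112 expressed as a percentage

print(f"pass rate at cut {old_cut:.2f}%: {pass_rate(old_cut, 81.0, 9.0):.3f}")
print(f"pass rate at cut {new_cut:.2f}%: {pass_rate(new_cut, 81.0, 9.0):.3f}")
```

With an unchanged score distribution, raising the cut from 72% to roughly 79.46% drops the modeled pass rate substantially, which is the mechanism factor 1) describes.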

It is expected that the means and variances for the past four targeted administrations have varied, and the pass rates have varied as well. But that does not mean the pass rates depend on the means and variances. It simply means that, after adjusting for form difficulty, the equating produced an equated cut score that varies from administration to administration as a result of variability in the ability of the pool of candidates taking the test. The December 2014 administration did not undergo equating; its pass rates were therefore determined by the results of the standard setting study alone and cannot be compared to the previous four administrations' pass rates. The TMOD national percentages of correct scores for 2010, 2011, 2012, and 2013 are 81.09, 76.87, 81.76, and 83.99, respectively; the national standard deviations for those years are 9.09, 9.64, 8.88, and 7.89, respectively. The NBEO does not expect the variations to be almost identical, because they depend on the average TMOD score for each year as well as on the number of Candidates taking the TMOD. It is normal for the average to change even when the same test is administered to the same group of Candidates, and the standard deviation is therefore expected to vary from administration to administration. Once the Candidates who received their December 2, 2014 results had exercised their option to have their Part II PAM scores cancelled (and were given the opportunity for a free retest), the pass rates for first-time student candidates and for all candidates were 91.7% and 89.2%, respectively. In addition, the pass rate for the embedded TMOD was 75.4%.

5. Was the reduction in performance/pass rate on the TMOD related to the examination delivery problems experienced in December 2014?
The erroneous navigational setting affected the December 2, 2014 administration of Part II PAM, including the embedded TMOD items, and may have contributed to a reduced level of performance on PAM and TMOD items. It is because of this impact that the NBEO has worked to provide candidates with different avenues of redress.

6. How was the cut score established for the December 2014 PAM examination? When was it established, and how did the exam irregularities affect those final numbers?

Unlike the previous Part II examination (12/2009-04/2014), the Part II PAM examination administered in December 2014 consisted of classic patient cases plus two additional item-type formats: solo items and minicases. These new items, and updates to existing items, needed to be added to the established equating block. The standard setting facilitated this need by rendering the expanded collection of equating material into psychometrically useful representatives of the exam in terms of content and statistical specifications.

Furthermore, the last Part II/TMOD standard setting event took place in 2009, and a new standard setting study was deemed necessary to determine how much knowledge is just enough for safe and effective contemporary entry-level practice. Therefore, a standard setting meeting was held on January 9-10, 2015 at the NBEO headquarters in Charlotte, NC. A total of 23 diverse OD panelists participated in the meeting. Consistent with the 2009 standard setting event, the Angoff method was used to direct the standard setting process. The erroneous navigational setting that prevented Candidates from reviewing their flagged items on Session 2 of the Part II examination had no effect whatsoever on the final numerical conclusions drawn during the standard setting event. The NBEO did not change the definition of a Minimally Qualified Candidate (MQC) or of entry-level competency. The definition and characteristics of an MQC came from the standard setters themselves; the NBEO had no role in establishing, modifying, or altering it. The primary point is that the definition was not imposed but rather was the collective definition of the 23 standard setters combined. The standard setting method, the judgment decision, the demographics of the participants, and the training prior to conducting the study were all consistent with the December 2009 standard setting study. As in 2009, each standard setter indicated whether an MQC would answer an item correctly or not. The sum of the 23 ratings divided by the total number of raters provides a measure of the difficulty of each item on the test. Standard setting is conducted because of the items (multiple-choice items based on minicases and standalone multiple-choice items) that are included on the exam, and to check whether the standard has remained the same since December 2009.
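The rating arithmetic described above — each panelist judges whether an MQC would answer each item correctly, the judgments are averaged per item, and the per-item averages are summed into a recommended raw cut — can be sketched as follows. The panel and items here are invented miniatures, not NBEO data:

```python
# Yes/no Angoff sketch: ratings[r][i] = 1 if panelist r judges that a
# Minimally Qualified Candidate (MQC) would answer item i correctly, else 0.

def angoff_cut_score(ratings: list[list[int]]) -> float:
    """Sum of per-item mean ratings = recommended raw cut score."""
    n_raters = len(ratings)
    n_items = len(ratings[0])
    cut = 0.0
    for i in range(n_items):
        # Mean judgment across raters: the expected MQC score on this item.
        item_mean = sum(ratings[r][i] for r in range(n_raters)) / n_raters
        cut += item_mean
    return cut

# Three hypothetical panelists, four hypothetical items:
panel = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(angoff_cut_score(panel))  # item means 1.0, 0.667, 0.333, 1.0 -> cut = 3.0
```

On the real exam this sum is taken over all 112 TMOD raw points across the 23 panelists' judgments, yielding the raw cut of 89.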
We know that a portion of the test has changed, and also that the NBEO strives to develop examinations that reflect the contemporary practice of optometry. Every year, the test development committees meet and develop, modify, and change the questions on the test, consistent with the NBEO's mission and with contemporary changes to the practice of optometry. The items are therefore constantly being updated and undergo extensive reviews, so the questions that were administered in previous standard settings are no longer the same. As an example, assume that the same group of setters participated in two standard settings (2009 and 2015) and that the same items were used again in 2015. It is plausible that the group of participants would, on average, assign the same ratings, or change their assigned ratings. In our view, the ratings assigned by the standard setters are test-dependent and item-dependent. For example, if an item concerns the anatomy of the eye, a group of standard setters would indicate that an MQC would answer it correctly whether asked in 2009 or 2015. On the other hand, an MQC may not have been expected to answer a basic scleral lens question correctly in 2009, but perhaps would have been in 2015. Hence, the stability of the ratings is item- and test-dependent. Consistency is better addressed in the way the standard setting study was designed, in which the groups underwent two rounds of ratings and their round-one and round-two ratings were compared (see the split-panel design section below). The main reason that psychometric best practice calls for holding the study post-administration is to provide the standard setters with empirical data (i.e., the difficulty of the items) prior to making judgments in round two of the study. With real item performance data, the judges are better equipped to make realistic judgments about the performance of an MQC.

7. Has the NBEO studied the calibration and repeatability of using a panel of experts to set the pass-fail cutoff score for the PAM examination?

Providing cut score ranges based on standard errors: If it were possible to positively identify a large group of Minimally Qualified Candidates (MQCs), they could be given the test and their scores analyzed. Since such a group would already have been identified as a cohort of minimally qualified candidates, it would be reasonable to establish a cut score based on their average score on the Part II examination. Furthermore, if all possible qualified and diverse participants could be involved in a standard setting panel, we could determine the true cut score (i.e., the average of all possible recommendations from qualified panelists). In practice, however, we must use the judgments of a sample of standard-setting panelists. Because a sample of diverse panelists was selected from the full population of optometrists, there is a sampling error associated with the panel's recommendations. The reported ranges offer approximate 95% confidence intervals about the median panel recommendations. The interpretation of these intervals can be stated as: 95% of similarly constructed intervals will contain the true cut score.
In other words, if 100 independent panels recommended cut scores and we calculated a confidence interval from each panel's feedback, approximately 95 of those intervals would contain the true cut score (as described above). The width of the confidence intervals is determined by: 1) the variability of the panelists' recommendations; and 2) the size of the panel; all else being equal, a larger panel will have a smaller interval. The confidence intervals are approximately 30 points wide for each of the two assigned standard setting groups and approximately 20 points wide for the full panel. We do not know which (if any) of these ranges contain the true cut scores, but in order to maximize the likelihood, a cut score was selected from the score range where these ranges overlapped (see Figure 1, below).
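A minimal sketch of the interval calculation described above, using the mean rather than the median for simplicity and invented panelist recommendations (not the actual NBEO panel data). It illustrates the stated point that, all else being equal, a larger panel yields a narrower interval:

```python
import math

def cut_score_ci(recommendations: list[float], z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for the true cut score, based on
    panelist recommendations. (The NBEO analysis used medians; the mean-based
    version is shown here for simplicity.)"""
    n = len(recommendations)
    mean = sum(recommendations) / n
    var = sum((x - mean) ** 2 for x in recommendations) / (n - 1)  # sample variance
    se = math.sqrt(var / n)  # standard error shrinks as the panel grows
    return (mean - z * se, mean + z * se)

# Invented recommendations for one half-panel and for the combined full panel:
half_panel = [84, 92, 88, 95, 86, 90, 83, 91, 87, 94, 89]
full_panel = half_panel + [85, 93, 88, 90, 86, 92, 87, 91, 89, 90, 88, 92]

lo_h, hi_h = cut_score_ci(half_panel)
lo_f, hi_f = cut_score_ci(full_panel)
print(f"half panel CI width: {hi_h - lo_h:.2f}")
print(f"full panel CI width: {hi_f - lo_f:.2f}")
```

Because the standard error divides by the square root of the panel size, the full 23-member panel's interval comes out narrower than either half-panel's, which is why the reported full-panel range (about 20 points) is tighter than the group ranges (about 30 points).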

Utilizing a split-panel design: In keeping with recommendations in the standard setting literature (e.g., Hambleton & Pitoniak, 2006), the relatively large panel was divided into two smaller groups, each sufficiently large to produce an acceptable level of dependability according to recommendations in the literature. This permits evaluation of the stability of cut score recommendations across different, but similarly qualified and representative, standard setting panels. Participants were randomly assigned to one of the two groups by Alpine Testing Solutions, after which some panelists were switched between groups to achieve better balance on important demographic variables (e.g., mode of practice, optometry school or college, length of time since graduation). Both panels received common training and worked together during the development of the MQC definition and the standard-setting practice activities. The groups were then divided and worked independently, with different facilitators, for the operational ratings and evaluations. The demographic composition of each group and the evaluation results were reviewed to see whether there was any reason to favor the recommendations of one group over the other; no compelling reason was identified. Therefore, the group results were combined to determine the recommended cut score.

8. We have received aggregate statistical reports for the December administration of Part II. However, a number of students who failed the Tuesday administration of the exam will retake it in April and have their results reported as December administration scores. Will we then receive new pass rate information for first-time takers that reflects this new data? How can we consistently report our pass rate for first-timers?

Candidates had several options available to them.
Candidates who opted to have their December 2014 Part II PAM and TMOD scores cancelled were able to do so by contacting the NBEO by February 11, 2015. Their April 2015 scores will be reported as April 2015 scores, not December 2014 scores, since the latter have been cancelled from both the official score report and the database. The updated score reports and the Aggregate Statistical reports for the December 2014 Part II PAM administration have been reissued.

9. Will the students who are retaking the exam in April for free due to the exam irregularities be considered first-time takers? Will they be reported as first-time takers in December or in April?

If a December 2014 first-time Candidate chose to have their score cancelled after receiving their results on January 21, 2015, they will be considered a first-timer in April 2015 because they will not have pre-existing Part II PAM scores.

10. Several schools have students on international rotations in April. If their scores were affected by the computer problems, will their travel to an approved testing city be reimbursed, or are there Pearson VUE centers internationally?

The NBEO has never offered international computer-based testing at any site, including the Pearson VUE testing centers. Candidates who had international externships were encouraged to take the repeat Part II PAM on January 5, 2015. Candidates who did not take the January 5, 2015 administration will not be provided free travel or additional compensation.

11. What were the communications sent to students and administrators about matters related to the December tests, including but not limited to retakes, scoring, actual score reporting, and reversals of TMOD scores?

The Deans and Presidents were provided copies of all of these communications to the Candidates.

12. Why did some Candidate scores change from pass to fail or fail to pass after the scores were released? What steps will be taken to prevent that from happening again?

No scores were changed. A select few passing scores were displayed incorrectly as failing scores. The issue was limited to the unofficial scores displayed online for the Candidates to review: a scaled score of 74.5 was not rounded up to 75P, as the in-house scoring system is designed to do. Instead, Candidates with a scaled score of 74.5 saw a 74F while viewing their unofficial scores online. Once this display rounding issue was verified, the NBEO IT Department amended the online code and the affected Candidates were notified.

13. How will the optometric community be notified of the testing problems that occurred in December? The impact on the reputations of the schools and the individual students remains a concern for future job placements, residency positions, and advancement. Will this be addressed as an open statement?
Through ASCO's support in developing this FAQ sheet, and by posting this information on the NBEO website, the information will receive broad distribution. Many Candidates have chosen to retest on April 7, 2015 and to have their previous scores cancelled.

14. Does the National Board utilize the services of an external auditor to ensure that best practices are followed in the development and administration of examinations and analysis of the results?

The NBEO employs a full-time, in-house psychometrician and routinely consults with the psychometric team at Alpine Testing Solutions to ensure that best practices are followed at all times, for all National Board exams. In fact, well before the December 2014 PAM exams were administered, all psychometricians involved had agreed that the Part II PAM was due for an updated standard setting in January 2015 because of the introduction of the new PAM/TMOD item types (solo items and minicases).

15. Has this recent experience slowed the plans to move Part I to a computerized administration? Have other testing centers been considered?

The timetable for moving to a computer-based test for Part I ABS has NOT been slowed. However, the date of this transition has not been announced because a multitude of factors must be considered, most notably the number of different forms and the number of items on each form that will be required.

16. Are any of the NBEO staff, other than the Executive Director, authorized to respond to student questions about problems with the examinations?

To answer questions from the Candidates who were affected by the December 2, 2014 administration of Part II PAM, all appropriate members of the NBEO staff were briefed frequently on what had occurred and assisted in answering questions by both email and telephone.

17. When the states require a 75% on TMOD, why are students with raw score percentages between 75% and 79.46% receiving a failing grade?

No optometric jurisdiction requires a raw score of 75% to pass TMOD. The requirement since the inception of TMOD has been a scaled score of 75. In other words, the standard setting determined that the cut score is 89 raw points, and that raw cut of 89 is converted to a scaled score of 75.
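The arithmetic behind Questions 12 and 17 can be checked directly. The NBEO's full raw-to-scaled conversion is not published, so the sketch below only reproduces the percentage computation behind the 79.46% figure and illustrates a rounding pitfall of the same class as the 74F display issue (Python's built-in round() happens to use round-half-to-even, so a naive implementation would show 74 where round-half-up yields 75):

```python
import math

RAW_CUT, RAW_TOTAL, SCALED_CUT = 89, 112, 75

# The raw cut expressed as a percentage of items: the 79.46% cited in Question 17.
raw_cut_pct = RAW_CUT / RAW_TOTAL * 100
print(f"{raw_cut_pct:.2f}")  # 79.46

# Round-half-up, as the in-house scoring system is designed to do for 74.5 -> 75P.
def round_half_up(x: float) -> int:
    return math.floor(x + 0.5)

# Python's round() rounds half to even, so 74.5 displays as 74 -- the same
# class of behavior as the online display bug described in Question 12.
print(round(74.5), round_half_up(74.5))  # 74 75
```

So a candidate with a raw percentage just below 79.46% is below the raw cut of 89 items, and therefore below the scaled 75, even though their raw percentage exceeds 75%.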