Draft proposal: Standards for Assessment in RACP Training Programs
February 2014
Contents

Introduction
Quick guide to the proposed RACP standards for assessment
Standards for assessment in medical education
- Australian Medical Council (AMC) and Medical Council of New Zealand (MCNZ)
- International standards for assessment
- General Medical Council (GMC), UK
- Royal College of Physicians and Surgeons of Canada (RCPSC)
- Accreditation Council for Graduate Medical Education (ACGME), USA
The need for RACP standards for assessment
Principles of good assessment practices
Proposed RACP standards for assessment
- Structure of the standards
Plan
- Educational value and purpose of assessments
- Aligned
- Program of assessment
- Fit for purpose
- Plan standards
Implement
- Fair and transparent process and decision making
- Sustainable
- Feedback
- Communication and training
- Implement standards
Evaluate
- Evidence informed and practice based
- Evaluate standards
References
Appendix 1: Commonalities of existing standards for assessment
Introduction

In November 2011, a panel of external experts was formed to review the RACP's assessment strategy. Their review was based on submissions from, and consultation with, RACP Fellows, trainees and staff across the Divisions, Faculties and Chapters in Australia and New Zealand. Following the review, the panel made a series of recommendations, documented in the April 2012 Report to RACP, the External Review of Formative and Summative Assessment.

These recommendations were considered by staff and Fellows with leadership roles in assessment, teaching and learning at the External Review of Assessments Planning Forum in November 2012. Forum participants identified the development of assessment standards for RACP training programs as a priority.

This paper outlines the development process for the assessment standards, background information on existing local and international standards for assessment, and the draft proposal of Standards for Assessment in RACP training programs.

The draft proposal of Standards for Assessment in RACP training programs will be used to guide the development, implementation and evaluation of assessments in all RACP training programs. It is intended that all RACP training programs adhere to the standards. It is recognised that implementation of the standards may take some time, and that any changes to existing assessment practices will need to be carefully planned and managed through the annual revision of training program requirements. These standards will inform the development of a College-wide policy on assessments, to be developed during 2014.
Executive summary

What are standards for assessment?

Standards for assessment are the guiding principles for organisations when setting assessment tasks. They define how assessment tools are chosen, implemented and evaluated. They also clarify the expectations of trainees and assessors regarding the purpose and use of assessments within a training program, and how information from those tools is translated into defensible decisions.

Why does the RACP need standards for assessment?

RACP standards for assessment will help to ensure that assessment methods are reliable, valid, cost efficient, acceptable and feasible, have educational impact, and reflect the objectives of the training program. A common set of standards for assessment will help create consistency in the design and implementation of assessments across College training programs.

What are the principles for assessment?

The establishment of best-practice principles underpins all robust assessment processes. The following key principles have been identified based on a review of best practice from other specialist medical colleges and standard setting bodies, both locally and internationally, and consideration of the literature on assessment design and implementation:
- Educational value and rationale
- Aligned
- Program of assessment
- Fit for purpose
- Fair and transparent processes and decision making
- Sustainable
- Feedback
- Communication and training
- Evidence informed and practice based
What do the proposed RACP standards for assessment look like?

The proposed RACP standards for assessment comprise 32 standards, organised under nine themes or subcategories within a framework of three overarching categories. These standards will be used to develop new assessment practices and to review existing ones.

Plan
1. Educational value and rationale
2. Aligned
3. Program of assessment
4. Fit for purpose

Implement
5. Fair and transparent processes and decision making
6. Sustainable
7. Feedback
8. Communication and training

Evaluate
9. Evidence informed and practice based
Standards for assessment in medical education

Standards for assessment are the guiding principles for education providers when setting assessment tasks. Standards for assessment are used to define how assessment tools are chosen, implemented and evaluated, and serve to clarify expectations of trainees and assessors regarding the purpose and use of assessments within a training program. Such standards also show how information from those tools is translated into defensible decisions about progression through training.

Many key bodies, both within Australia/New Zealand and internationally, have undertaken considerable work in defining standards for assessment. These bodies include the Australian Medical Council (AMC), the UK's General Medical Council (GMC), the Royal College of Physicians and Surgeons of Canada (RCPSC) and the Accreditation Council for Graduate Medical Education, USA (ACGME).

A number of recommendations relating to standards for assessment were made following the RACP External Review of Formative and Summative Assessments in April 2012. Commonalities exist between these recommendations and the published standards for assessment of the AMC, GMC, RCPSC and ACGME, as shown in Appendix 1: Commonalities of existing standards for assessment.

Australian Medical Council (AMC) and Medical Council of New Zealand (MCNZ)

The AMC's Standards for Assessment and Accreditation of Specialist Medical Education Programs and Professional Development Programs set out the standards for accreditation that must be met by specialist medical training colleges. These standards have been jointly agreed and applied by the AMC and the MCNZ. A number of these standards apply to the assessment of learning, and cover areas such as curriculum and assessment alignment, performance feedback and assessment quality.
International standards for assessment

The GMC, RCPSC and ACGME have each published a set of standards relating to the assessment of the training programs that they oversee.

General Medical Council (GMC), UK

The Standards for Curricula and Assessment Systems (GMC, UK) lists detailed standards under the five headings of planning, content, delivery, outcomes and review. Each of the 17 standards lists a number of mandatory requirements that further clarify the standard.

Royal College of Physicians and Surgeons of Canada (RCPSC)

The General Standards Applicable to All Residency Programs: B Standards (RCPSC) provides one standard for the assessment of resident performance, detailed in a number of points relating to the interpretation of assessment data.
Accreditation Council for Graduate Medical Education (ACGME), USA

The ACGME Program Requirements for Graduate Medical Education in Internal Medicine (ACGME, USA) sets out the program requirements for graduate medical education in internal medicine, and includes requirements for the formative and summative evaluation of residents.

The GMC standards are comprehensive, with a number of standards stipulated for both curricula and assessments. In comparison, the RCPSC and the ACGME stipulate fewer standards for assessments. Regardless of these differences, a number of commonalities exist between these international standards (see Appendix 1).

The need for RACP standards for assessment

In November 2011, a panel of external experts was formed to review the assessment strategy of the RACP. The three external reviewers were Dr Tim Allen (Canada), Prof David Boud (Australia) and Dr Simon Newell (UK). Their review was based on submissions and consultation with a number of focus groups comprising RACP Fellows, trainees and staff across the Divisions, Faculties and Chapters in Australia and New Zealand. Following the review, the panel made a series of recommendations, documented in the April 2012 Report to RACP, the External Review of Formative and Summative Assessment.

Recommendations related to standards for assessment included:
- College-wide standards should be adopted for the design and administration (including marking and pass/fail decisions) of all summative examinations, and the College should provide support and direction to those who need help meeting these standards.
- Use the Van der Vleuten utility index (or a derivative) to review the assessment program and to justify future development or changes. Without neglecting the other factors in the index, consider the main purpose of the assessment program to be the driver of learning towards the achievement of competence and readiness for expert practice.
Principles of good assessment practices

Underpinning all robust assessment processes is the establishment of best-practice principles. The following key principles have been identified based on a review of best practice from other specialist medical colleges and standard setting bodies, both locally and internationally, and consideration of the literature on assessment design and implementation.

Educational value and rationale. It is widely accepted that assessment drives learning. Because assessment shapes what trainees will learn, it is vital that assessments reflect the purpose of, and what is valued within, the educational program.

Aligned. Assessments should be aligned with all components of the training program, including curriculum standards, teaching and learning activities, program requirements and the workplace setting in which they are conducted. Alignment is important so that conflicting messages are not sent and what should be learnt is what is assessed.

Program of assessment. Assessments should be designed as a whole program rather than as isolated, ad hoc assessments. Developing a program of assessment using a variety of well-chosen tools will ensure adequate sampling across the breadth and depth of the curriculum.

Fit for purpose. Consideration should be given to whether the assessment methods satisfactorily assess the intended content. Judging how fit they are for purpose includes consideration of their validity, reliability, feasibility and acceptability. While this applies to individual assessment tools, it should also apply to the program of assessment as a whole.

Fair and transparent processes and decision making. Assessments are generally high stakes and can determine whether a trainee is able to continue in their training program and reach their career goals. It is therefore vital that assessment standards and processes are fair and publicly available.

Sustainable. Programs of assessment should be sustainable and achievable within existing resources.

Feedback. Trainees should be provided with feedback following assessments to allow them to change and improve their practice.

Communication and training. The implementation of assessments should be supported by clear communication and the training of relevant assessors.

Evidence informed and practice based. Assessments should be evidence based and subject to a process of continuous improvement. The design, implementation and evaluation of assessments will be a collaborative process, including consultation with all affected stakeholders and consideration of local needs.
Proposed RACP standards for assessment

The development of the proposed RACP standards for assessment has been informed by:
- a review of best practice from other specialist medical colleges and standard setting bodies, both locally and internationally
- consideration of the literature on assessment design and implementation
- feedback from Fellows and trainees involved in the External Review of Assessments Planning Forum in November 2012.

This background research has been incorporated to create the proposed RACP standards for assessment.

Structure of the standards

It is important to consider the structure and categorisation of the standards in order to promote their clarity and usability. This approach has been followed in the GMC's Standards for Curricula and Assessment Systems, which are separated into the five distinct categories of planning, content, delivery, outcomes and review.

A similar approach has been taken for the proposed RACP standards for assessment, with standards grouped into the three categories of plan, implement and evaluate. These categories align with the development cycle for programs of assessment: a period of planning, followed by implementation and then evaluation. Following evaluation, it is expected that the program of assessment may be modified, bringing the process back to the planning phase (see Figure 1).

Each of the three categories of the proposed RACP standards for assessment contains a brief description outlining the purpose of the category and key areas for consideration, followed by the standards listed for that category.

Figure 1. The assessment development cycle: plan, implement, evaluate.
Plan

This section contains standards for the development of assessments, including determining the educational value and rationale of assessments, creating alignment, developing programs of assessment, ensuring that assessments are fit for purpose, and ensuring that the development of assessments is informed by evidence and by the context in which they will be used.

Miller's pyramid provides a conceptual framework for the assessment of clinical competence. Assessment methods can be mapped against the tiers of the pyramid:
- Knows. Tests factual recall, e.g. MCQs, EMQs, essays, oral exams.
- Knows how. Assesses the application of knowledge within context to solve problems and make decisions, e.g. MCQs, EMQs, essays, oral exams.
- Shows how. Assesses demonstration of competence, e.g. long and short cases, OSCEs, clinical simulations, standardised patient tests.
- Does. Assesses actual performance in a real-life context, e.g. work-based assessments, entrustable professional activities, portfolios.

Ideally, assessments should evaluate performance in the context in which it occurs, assessing the "does" level of Miller's pyramid. This is an ongoing area of focus for medical education research and is of particular importance to issues such as certification and revalidation of practitioners. Workplace-based assessment aligns strongly with the "does" level, as it is authentic assessment of performance in the workplace context.

Educational value and purpose of assessments

It is widely accepted that assessment drives learning. To ensure that assessments direct the intended learning, the educational value and purpose of assessments should be clearly thought through when planning assessments. Users of assessments should also be aware of the intended educational value and purpose of assessments so that they can best reinforce that intention.
Aligned

Constructive alignment between learning outcomes, learning activities and assessments is based on the premise that assessment drives learning (Biggs, 2003; Wass, Bowden & Jackson, 2007). To create constructive alignment, a process of blueprinting is undertaken, which involves reviewing each learning outcome and determining how it could best be learned and assessed (Holsgrove, Malik & Bhugra, 2009).

Program of assessment

Assessments should be considered as a whole program, as opposed to a collection of isolated, individual assessment tools. Constructive alignment of learning outcomes, learning activities and assessments supports the development of a program of assessment. While it is not feasible to assess each individual learning outcome within the curriculum, a program of assessment ensures that groups of learning outcomes are assessed. Ideally, each assessment should also provide feedback on a range of knowledge, skills and behaviours (Schuwirth & Van der Vleuten, 2011). In developing a program of assessment, focus is given to the collection of assessment data across the breadth and depth of training, to ensure that an overall view of competency can be formed.

Fit for purpose

The purpose of each type of assessment should be aligned to the purpose of the learning outcome to ensure validity. For example, direct observations can be used to gain data on a specific encounter as it occurs in situ, while written examinations can be used to gain a broad picture of the breadth of clinical knowledge. Other factors such as educational impact, reliability, acceptability and feasibility should also be considered when evaluating the usefulness of assessments; Cees van der Vleuten's utility index is commonly used for this purpose. The index defines the utility (U) of an assessment as a product of several criteria (PMETB, 2007):

U = E x V x R x A x F

- Educational impact (E). What is the educational purpose of the assessment? What are you aiming to assess?
- Validity (V). Did the assessment measure what it was intended to measure?
- Reliability (R). What is the quality of the results of the assessment? Are they consistent and reproducible?
- Acceptability (A). Is the assessment going to be accepted by the trainees and assessors? Will it drive learning or detract from it?
- Feasibility (F). Can this assessment be implemented? What are the practicalities, e.g. cost, resources, availability of assessors?
Plan standards

A program of assessment includes a mix of assessment activities, with methods that are matched to the required purpose or intent of the assessment and implemented at an appropriate stage of training. Integrated assessment programs, aligned to the desired learning outcomes, are important for gaining a more complete picture of competence.

Programs of assessment are developed by the relevant Training Committee for each training program. The process of planning assessments also includes consideration of how changes will affect stakeholders and how assessments will be consulted on, implemented and evaluated (see Evaluate standards). All programs of assessment should be approved by the College Education Committee prior to implementation.

1. Educational value and rationale
Standard 1.1 The educational value and purpose of the proposed assessment program should be stated and reviewed.
Standard 1.2 The intended learning outcomes to be covered by each assessment should be identified.

2. Aligned
Standard 2.1 The purpose and context of assessment methods should be aligned to the learning outcomes.
Standard 2.2 Programs of assessment should be developed by blueprinting assessments against the established curriculum framework for each RACP training program.
Standard 2.3 Programs of assessment should be mapped against the RACP standards framework, curriculum standards, training setting outcomes, teaching and learning requirements and evaluations.

3. Program of assessment
Standard 3.1 A program of assessment should involve consideration of the intent and interaction of separate assessments to promote learning across the whole program.
Standard 3.2 A mix of assessment methods should be utilised to ensure coverage of the depth and breadth of learning outcomes, including professional, clinical, procedural and field skills where appropriate.

4. Fit for purpose
Standard 4.1 Recognised models should be used to evaluate the usefulness of both assessment tools and programs of assessment, for example Van der Vleuten's utility index, which considers the validity, reliability, feasibility, educational impact and acceptability of the proposed assessment.
Standard 4.2 Authentic assessment across a range of domains should be used to inform decisions about progression through training.
Implement

This section contains standards for the implementation of assessments, including fairness and transparency, the sustainability of assessments, the provision of feedback, and implementation using appropriate methods of communication and training where necessary.

Fair and transparent processes and decision making

In order for assessments to be successfully implemented, their organisation and conduct should be well planned. Trainees and assessors should be provided with adequate notice prior to the introduction of new assessments or changes to existing assessments, to ensure that they are sufficiently prepared and able to make best use of the assessments. It is also important for the processes and policies surrounding assessments to be transparent and defensible, to allow openness and accountability to regulators and the public.

Sustainable

Assessments need to be designed in such a way that the input of resources required is sustainable over time. Resources may include input from subject matter experts, for example assessment content from Fellows or assessment tool information from educationalists; costs to trainees or the College; and health system resourcing, such as arranging cover for Fellows or trainees completing College assessment work.

Feedback

The provision of feedback is an important aspect of both formative and summative assessments. Feedback should be provided on performance on the task at hand, as well as on overall progress through the curriculum and training program. Feedback should be used by trainees and their supervisors to plan future learning and amend their practice accordingly.

Communication and training

For the successful implementation of assessments, stakeholders should have a high level of knowledge about the purpose and process of the assessments. This may involve training sessions or other appropriate forms of communication, such as emails or the availability of online resources.
Implement standards

A number of supporting structures should be put in place when implementing a high quality program of assessment. These include: fair and transparent assessment processes and decision making; sustainability of assessments and assessment processes; the provision of feedback to trainees as a result of assessments; and the development of communication and training resources to engage stakeholders.

5. Fair and transparent assessment processes and decision making

Processes
Standard 5.1 The level of performance expected for each assessment should be determined according to the standards or milestones contained within each curriculum.
Standard 5.2 Standards for summative assessments should be criterion referenced where appropriate, or developed using a recognised methodology for standard setting (e.g. the Angoff method for the determination of pass standards for written examinations).
Standard 5.3 Publicly available PREP Program Requirement Handbooks and education policies should set out the requirements for assessments and programs of assessment, including the consequences of failing to complete each assessment and the circumstances for special consideration.
Standard 5.4 Implementation of new assessments and review of existing assessments and programs of assessment should follow the processes associated with the annual review of PREP program requirements.
Standard 5.5 Adequate notice should be given prior to changes to assessments.
Standard 5.6 Logistical measures should be appropriately managed and resourced, including resources for the development and maintenance of assessments.
Standard 5.7 The roles of examiners and assessors should be clear.

Decision making
Standard 5.8 In determining formative workplace-based assessment outcomes, assessors should rate trainees using the performance standards outlined in the relevant training curriculum.
Standard 5.9 In determining summative decisions for high stakes examinations, such as the Clinical and Written Examinations, assessors should rate trainees using predetermined criteria which are available to the public through the RACP website.
Standard 5.10 In making summative decisions relating to certification of training, assessors should gather evidence from multiple assessment points, including formative assessments and feedback from other assessors.
Standard 5.11 The College's Reconsideration, Review and Appeals Process By-Law should provide an internal grievance mechanism to ensure that those affected by decisions of the College are treated fairly and consistently.

6. Sustainable
Standard 6.1 The implementation of the assessment should be monitored in terms of its sustainability.

7. Feedback
Standard 7.1 Summative and formative assessments should generate feedback for trainees on their progress in the particular areas being assessed, as well as on progress through the curriculum overall.
Standard 7.2 Feedback obtained by trainees and their supervisors during the assessment process should be used to plan future learning needs and opportunities.

8. Communication and training
Standard 8.1 Assessors should be provided with adequate information and training to conduct high quality assessments.
Standard 8.2 Trainees should be provided with clear and transparent information about the purpose and processes of assessments.
Standard 8.3 Resources to support trainees and assessors should be available online prior to the implementation of new or revised assessments.
Evaluate

This section contains standards for the evaluation and continuous improvement of assessments.

Evidence informed and practice based

The process of developing quality assessments draws on two important sources of information. Firstly, assessments should be developed on the basis of evidence of their effectiveness in promoting learning and quality assessment. Secondly, the development of assessments should be guided by local content experts, such as the Fellows and trainees who work in the settings in which the assessments are to be used and who will be required to use them. This involves careful piloting of new assessments in the planning phase and reference to the literature on assessment.

Regular evaluation of assessments and assessment programs is essential to maintain quality assessments that sit well within the workplace setting. Evaluation should be conducted using published research and feedback from trainees, Fellows and other relevant stakeholders.
Evaluate standards

Regular evaluation of assessment tools and programs of assessment is essential to maintain quality assessments that sit well within the workplace setting. Evaluation should be conducted using published research and feedback from trainees, Fellows and other relevant stakeholders. Evaluation underpins the planning and implementation of assessments and programs of assessment.

9. Evidence informed and practice based

During the plan stage
Standard 9.1 Published literature and existing assessment approaches from other medical and health training bodies, both nationally and internationally, should be reviewed during the process of assessment development.
Standard 9.2 The development of all assessments and programs of assessment should include a period of consultation with relevant trainees, Fellows and additional stakeholders, prior to ratification by the College Education Committee.

During the implement stage
Standard 9.3 Programs of assessment should be reviewed annually as a part of the PREP Program Requirement Handbook development process.
Standard 9.4 Major review of programs of assessment should occur during the curriculum review process.

During the evaluation stage
Standard 9.5 Data gained from standard setting processes and psychometric evaluation should be used to inform the continuous improvement of summative assessments.
Standard 9.6 Assessments should be evaluated regularly using de-identified data collected from formative and summative assessments and relevant published research. Evaluation data should be used to review the efficacy of various types of assessment and to increase utility (validity, reliability, acceptability, educational impact and feasibility).
References

Accreditation Council for Graduate Medical Education (2013). ACGME Program Requirements for Graduate Medical Education in Internal Medicine. Retrieved July 30, 2013 from http://www.acgme.org/acgmeweb/portals/0/pfassets/2013-pr-faq-PIF/140_internal_medicine_07012013.pdf

Allen, T., Boud, D., & Newell, S. (2012). Report to RACP, the External Review of Formative and Summative Assessment, Final Report, April 2012. Retrieved July 30, 2013 from http://www.racp.edu.au/index.cfm?objectid=ab250bcb-aadd-850d-2CB7F04F332332B1

Australian Medical Council (2010). Standards for Assessment and Accreditation of Specialist Medical Education Programs and Professional Development Programs by the Australian Medical Council 2010. Kingston: Australian Medical Council Limited.

Biggs, J. (2003). Teaching for quality learning at university: What the student does (2nd ed.). Maidenhead, UK: Open University Press.

Epstein, R. (2007). Assessment in medical education. The New England Journal of Medicine, 356(4), 387-396.

General Medical Council (2010). Standards for curricula and assessment systems. Retrieved July 30, 2013 from http://www.gmcuk.org/education/postgraduate/standards_for_curricula_and_assessment_systems.asp

Hodges, B. (2013). Assessment in the post-psychometric era: Learning to love the subjective and collective. Medical Teacher, 35, 564-568.

Holsgrove, G., Malik, A., & Bhugra, D. (2009). The postgraduate curriculum and assessment programme in psychiatry: the underlying principles. Advances in Psychiatric Treatment, 15, 114-122.

PMETB (2007). Developing and maintaining an assessment system: a PMETB guide to good practice. Retrieved November 6, 2012 from http://www.gmcuk.org/assessment_good_practice_v0207.pdf_31385949.pdf

Schuwirth, L. W. T., & Van der Vleuten, C. P. M. (2011). Programmatic assessment: From assessment of learning to assessment for learning. Medical Teacher, 33, 478-485.

The Royal College of Physicians and Surgeons of Canada (2011). General Standards Applicable to All Residency Programs, B Standards. Canada: Author.

Wass, I., Bowden, R., & Jackson, N. (2007). The principles of assessment design. In N. Jackson, A. Jamieson & A. Khan (Eds), Assessment in medical education and training: a practical guide (pp. 11-26). Oxford: Radcliffe.
Appendix 1: Commonalities of existing standards for assessment

The following summarises commonalities between a range of existing standards for assessment and the recommendations from the External Review of Formative and Summative Assessments.

Sources compared:
- AMC: Standards for Assessment and Accreditation of Specialist Medical Education Programs and Professional Development Programs by the Australian Medical Council 2010
- GMC: The Standards for Curricula and Assessment Systems
- RCPSC: The General Standards Applicable to All Residency Programs: B Standards (Royal College of Physicians and Surgeons of Canada)
- ACGME: The ACGME Program Requirements for Graduate Medical Education in Internal Medicine (Accreditation Council for Graduate Medical Education, USA)
- External review: Recommendations from the External Review of Formative and Summative Assessments

Common standards and recommendations:
- Alignment between assessments and the goals and objectives of the curriculum
- Systematic planning of assessments to help collect assessment data
- Performance standards for assessments are clear and identified
- A clear purpose for all assessments that is stated and available to trainees, educators, employers, professional bodies (including the regulatory bodies) and the public
- Valid forms of assessment, meaning that each different type of knowledge and skill is assessed using a different form of assessment (clinical, procedural, communication and professionalism skills assessed using methods such as written examinations, direct observation, multi-source feedback, etc.)
- Measures of validity, reliability, feasibility, cost effectiveness, opportunities for feedback and impact on learning used in the selection of assessments
- Use of lay and patient feedback in the development, implementation and feedback during assessments
- Regular assessments provide relevant feedback to trainees
- Feedback to supervisors on trainees' progress is facilitated through assessments
- Documentation of assessment outcomes throughout the training program
- Evidence from multiple sources is used to determine summative assessment decisions
- Training for assessors, trainees and examiners in the use of assessments
- A clear role for assessors/examiners that is used as the basis for recruitment and appointment
- Trainees are informed when serious concerns exist with their training and given the opportunity to correct their performance
- Policies relating to disadvantage and special consideration for trainees during assessment are in place
- A final report confirming sufficient competence for independent practice is completed for trainees at the completion of their training
- Regular evaluation and improvement of education programs occurs