Developing professional standards for EFL testing in China: Contexts, considerations, and challenges


Presenter: Jinsong Fan, Ph.D., Fudan University
Email: jinsongfan@fudan.edu.cn

Overview
1. Standards in language testing and assessment
2. Language testing in China: What's special?
3. Standards development and implementation in other testing contexts
4. Fundamental considerations for developing professional standards in China
5. Challenges facing standards development and validation
6. Conclusions and future studies

1. Standards in language testing and assessment
A dictionary definition of "standard": Standard refers to a level of quality, skill, ability or achievement by which someone or something is judged, that is considered necessary or acceptable in a particular situation. (Longman Advanced American Dictionary, 2000, p. 146)

Two interpretations of standards in language testing
Definition 1: The skills and/or knowledge required to achieve mastery and proficiency levels leading to mastery, along with the measures that operationalize these skills and/or knowledge and the grades indicative of mastery at each level (Davies, 2008, p. 437), e.g. cut-score, CEFR, ACTFL
Definition 2: An agreed set of guidelines which should be consulted and, as far as possible, heeded in the construction or evaluation of a test (Alderson, Clapham, & Wall, 1995, p. 236), e.g. Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999), ETS Standards for Quality and Fairness (ETS, 2002)

Why are standards important?
1. Developing a language test and ensuring that it is valid or useful is extremely difficult (see e.g. Bachman, 1990; Alderson, Clapham, & Wall, 1995; Bachman & Palmer, 1996; Henning, 2001; Fulcher & Davidson, 2007, etc.).

2. The important role that language tests play in modern society: Results on language tests are often used to make a variety of high-stakes decisions such as admissions, employment, promotion, immigration, and citizenship (see, e.g. Spolsky, 1995; Shohamy, 2001a, 2001b; McNamara, 2005, etc.).

3. The call for better accountability, transparency and fairness in language testing practices (see e.g. Kunnan, 2000, 2004; Shohamy, 2001a, 2001b; Bachman, 2005; Bachman & Palmer, 2010; Xi, 2010, etc.).

Accountability
Responsible professionalism (Boyd & Davies, 2002)
The need for those involved in the testing act to assume responsibility for the tests and their uses (Shohamy, 2001b)
"(Accountability) requires shared authority, collaboration, involvement of different stakeholders (test takers included), as well as meeting the various criteria of validity. The cost of this approach is high: it takes more time, it involves more people, it requires greater resources. But the cost is worth paying in order to demonstrate the ethicality of the profession" (Shohamy, 2001b, pp. 161-2)

Transparency
Is test-related information equally accessible to all test candidates?
Can test users access relevant information to make informed decisions about test candidates?
Is information about the test quality available to all stakeholders?
Can independent researchers access relevant test data to investigate the quality of language tests?

Fairness
Traditional conceptualization of test fairness: Fairness as absence of bias
The relationship between test validity and test fairness: "A test has to be valid to be fair" or "A test has to be fair to be valid"?
Theoretical frameworks of test fairness (Kunnan, 2000, 2004) and approaches to investigate test fairness (AERA, APA, & NCME, 1999; Xi, 2010)

2. Language testing in China: What's special?
1) A huge testing country with a large number and variety of English language tests developed, administered and used at different levels and for different purposes

2) Large scale, high stakes, and strong washback effects on English teaching and learning
Large scale: Huge test population
High stakes: The results of English language tests are often used to make important decisions such as admissions into higher education, employment, the awarding of academic degrees, application for the permanent residential permit in major cities, etc.
Strong washback effects: What is tested becomes what is taught: the assessment tail wagging the educational dog (Briggs, 1992, p. 11)

3) A highly centralized testing system
Almost all tests are developed and administered by the relevant examinations authorities.
The quality and authority of language tests are seldom questioned or challenged.
Language testing is considered more an administrative than an academic activity (Yang & Gui, 2007).
Test developers are believed to be solely responsible for the fairness of all testing operations.
Stakeholders such as test takers and teachers cannot participate in the testing process as equal partners (see also Shohamy, 2001a).

3. Standards developed and implemented in other testing contexts
Report of the Task Force on Testing Standards (TFTS) to the International Language Testing Association (ILTA) (ILTA TFTS, 1995): A survey of language testing standards worldwide
A collection of 110 standards from all over the world with 58 standards identified as guidelines of good testing practice.
Many standards in the collection have been revised or updated since the completion of the project (e.g. AERA, APA, & NCME, 1985).

Language testing standards in use: A few examples
International Language Testing Association (ILTA): ILTA Code of Ethics (ILTA, 2000; see also Davies, 1997; Boyd & Davies, 2002); ILTA Guidelines for Practice (ILTA, 2007)
Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999)
Code of Fair Testing Practices in Education (JCTP, 2004)
ETS Standards for Quality and Fairness (ETS, 2002)

Association of Language Testers in Europe: The ALTE Code of Practice (ALTE, 1994; see also Avermaet, Kuijper, & Saville, 2004)
European Association for Language Testing and Assessment: EALTA Guidelines for Good Practice in Language Testing and Assessment (EALTA, 2006)
Japan Language Testing Association: The JLTA Code of Practice (JLTA, 2007; see also Thrasher, 2004)

4. Considerations for developing standards in the Chinese context
Consideration 1: Why a new set of standards?
Why reinvent the wheel, particularly if there is an excellent wheel already? (McNamara & Roever, 2006, p. 146; see also Alderson, Clapham, & Wall, 1995).

Towards a new set of standards
1) The special features of language testing in China: Applicability and validity of other sets of standards
2) The product and process perspectives
The product perspective: A set of guidelines for language testing in the Chinese context
The process perspective: Awareness-raising, more discussions about quality and professionalism, transparency and fairness, etc.

Consideration 2: Who are the targeted audiences/users of the standards?
In any testing context, be it centralized or decentralized, test validity and fairness will eventually be compromised without the collaboration of all stakeholders in the testing process, including test design, administration, and use (Fan & Jin, 2013).

Figure 1: The targeted audiences of the standards. Labels in the figure: Testing standards; Test developers; Test takers; Test users; EFL teachers; Other stakeholders.

Consideration 3: What purpose(s) do the standards serve?
Primary purposes (educational and aspirational): To enhance the awareness of quality among test developers; to promote among the stakeholders the basics of language testing
Targeted audiences: Test developers; test takers; EFL teachers; test users; educational officials and administrators and other stakeholders
Expected outcomes: Provide a benchmark of good testing practices; enhance professional awareness; promote dynamics between test developers and the other stakeholders; pursue better washback effects
Figure 2: Purposes, audiences, and expected outcomes

Consideration 4: How to generate the standards?
Three possible approaches to standards development
The top-down approach: An organization develops the standards and imposes them on all.
The bottom-up approach: The standards reflect the consensus of the individual test developers.
The interactive approach: A combination of the top-down and bottom-up approaches

Considerations for validity
Theoretical frameworks: Good practices in language testing; validity and validation; test administration and use; test fairness, etc.
Contextual features: Identify the features of the local context; investigate the current situation of language testing practices
Language testing standards

5. Challenges facing standards development and validation
Challenge 1: The collection, review, and critique of the standards in existence
What standards to include, and what standards to exclude: The criteria for standards selection
How to review the standards in the collection: Theoretical frameworks and research methodologies
How to build upon the existing research on language testing standards in the generation of our own standards?

Challenge 2: The reflection of local features in the standards
How to determine the macro-structure of the standards?
If some qualities (e.g. practicality) are more relevant to the local contexts, shall these qualities be prioritized over others in the standards?
If some areas are identified as particularly problematic in the current language testing practices, shall these areas be emphasized in the guidelines?
If requirements of the local features run into conflict with what is generally held as proper in language testing (e.g. protecting the privacy of test candidates' scores), how shall these requirements be reflected in the guidelines?

Challenge 3: Enforceable or not enforceable?
The ILTA Code of Ethics (ILTA, 2000): The failure to uphold the Code may have serious penalties, such as the withdrawal of the ILTA membership on the advice of the ILTA Ethics Committee.
ETS Standards for Quality and Fairness (ETS, 2002): The audit requirements help to ensure that ETS products and services will be evaluated with respect to a uniform, rigorous set of standards through a well-documented process.

Are enforcement mechanisms feasible?
Are standards without enforcement mechanisms still meaningful and valuable?
Can the standards without enforcement mechanisms be adopted and adapted by individual test developers so that effective local enforcement mechanisms can be built?
How to develop the enforcement mechanisms and to ensure their implementation in language testing practices?

Challenge 4: Standards validation
Are the standards applicable to the testing context?
How to validate the standards?
How to continuously improve the standards based on the validation research?
Shall standards validation include the investigation of the impact and consequences of the standards?
Who shall be held responsible for investigating the validity of the standards?
Who shall get involved in the process of standards validation?

6. Conclusion and future research
The language testing standards developed and implemented in different parts of the world clearly indicate the pursuit of better quality and professionalism in language testing and assessment (see also Davies, 1997, 2004; Boyd & Davies, 2002).
The salient features of language testing in China call for a set of professional standards which can cater to the needs and circumstances of language testing in the Chinese context (see also Yang & Gui, 2007; Fan & Jin, 2011, 2013).

A new set of standards will help to raise the awareness of professionalism among test developers and enhance the involvement of other stakeholders in the process of language testing.
The standards shall be targeted at both test developers and the other stakeholders, including test takers, EFL teachers, test users, publishers, curriculum designers, and officials with educational or examinations authorities.
An interactive model shall be adopted in the generation of the standards, with a view to reflecting both the theoretical frameworks in language testing and local features.

Future studies
Building a corpus of language testing standards: Standards selection, review and critique; distilling useful experience and avoiding pitfalls (see Fan & Jin, 2010, 2012).
Investigating the current language testing practices in the Chinese context: Identifying the gap between the current practices and the practices as reflected in the professional standards (see Fan & Jin, 2013).
Investigating stakeholders' views and perceptions of good testing practices: Involving stakeholders in standards development.
Investigating the validity of the standards: Applicability, usefulness, impact and consequences, etc.

7. References
1. Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
2. Avermaet, P. V., Kuijper, H., & Saville, N. (2004). A code of practice and quality management system for international language examinations. Language Assessment Quarterly 1 (2 & 3), 137-150.
3. Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
4. Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly 2 (1), 1-34.
5. Bachman, L. F. & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
6. Bachman, L. F. & Palmer, A. S. (2010). Language assessment in practice. Oxford: Oxford University Press.
7. Boyd, K. & Davies, A. (2002). Doctor's orders for language testers: The origin and purpose of ethical codes. Language Testing 19 (3), 296-322.
8. Briggs, J. (1992). The psychology of educational assessment and the Hong Kong scene. Bulletin of the Hong Kong Psychological Society 28/29 (4), 5-26.
9. Davies, A. (Guest editor). (1997). Special issue: Ethics in language testing. Language Testing 14.
10. Davies, A. (Guest editor). (2004). Special issue: Ethics in language testing. Language Assessment Quarterly 1 (2/3).
11. Davies, A. (2008). Ethics, professionalism, rights and codes. In Shohamy, E. & Hornberger, N. H. (Eds.), Encyclopedia of language and education (2nd edition). Vol. 7: Language testing and assessment, 429-433.
12. Fan, J. & Jin, Y. (2011). The way towards a code of practice: A survey of EFL testing in China. Research paper presented at the 33rd Language Testing Research Colloquium (LTRC). Ann Arbor: The University of Michigan.
13. Fan, J. & Jin, Y. (2012). Developing a code of practice for China's EFL tests: A data-based approach. Research paper presented at the 34th Language Testing Research Colloquium (LTRC). Princeton, New Jersey: Educational Testing Service.
14. Fan, J. & Jin, Y. (2013). A survey of EFL testing in China: The case of six examination boards. Language Testing in Asia (3), 7.
15. Fulcher, G. & Davidson, F. (2007). Language testing and assessment: An advanced resource book. London and New York: Routledge.
16. Henning, G. (2001). A guide to language testing: Development, evaluation and research. Beijing: Foreign Language Teaching and Research Press.
17. ILTA-TFTS. (1995). Report of the task force on testing standards (TFTS) to the International Language Testing Association (ILTA). Retrieved from http://www.iltaonline.com/images/pdfs/tfts_report.pdf.
18. Kunnan, A. J. (Ed.). (2000). Fairness and validation in language assessment. Cambridge: Cambridge University Press.
19. Kunnan, A. J. (2004). Test fairness. In Milanovic, M. & Weir, C. (Eds.), European language testing in a global context: Proceedings of the ALTE Barcelona Conference (pp. 27-48). Cambridge: Cambridge University Press.
20. Longman. (2000). Longman advanced American dictionary. The author.
21. McNamara, T. (2005). 21st century Shibboleth: Language tests, identity and intergroup conflict. Language Policy 4 (4), 351-370.
22. McNamara, T. & Roever, C. (2006). Language testing: The social dimension. Oxford: Blackwell Publishing.
23. Shohamy, E. (2001a). The power of tests: A critical perspective of the uses of language tests. London: Pearson Education.
24. Shohamy, E. (2001b). Democratic assessment as an alternative. Language Testing 18 (4), 373-91.
25. Spolsky, B. (1995). Measured words. Oxford: Oxford University Press.
26. Thrasher, R. (2004). The role of a language testing code of ethics in the establishment of a code of practice. Language Assessment Quarterly 1 (2 & 3), 151-160.
27. Xi, X. (2010). How do we go about investigating test fairness? Language Testing 20 (10), 1-24.
28. Yang, H. & Gui, S. (2007). The social dimensions of language testing. Modern Foreign Languages, 4, 368-378.

8. Appendix: Standards in this presentation
1. AERA, APA, & NCME. (1985). Standards for educational and psychological testing. Washington D.C.: AERA.
2. AERA, APA, & NCME. (1999). Standards for educational and psychological testing. Washington D.C.: AERA.
3. ALTE. (1994). The ALTE code of practice. Retrieved from http://www.alte.org/attachments/files/code_practice_eng.pdf.
4. EALTA. (2006). EALTA guidelines for good practice in language testing and assessment. Retrieved from http://www.ealta.eu.org/documents/archive/guidelines/english.pdf.
5. ETS. (2002). ETS standards for quality and fairness. Princeton, New Jersey: Author.
6. ILTA. (2000). Code of ethics. Retrieved from http://www.iltaonline.com/images/pdfs/ilta_code.pdf.
7. ILTA. (2007). The ILTA guidelines for practice. Retrieved from http://www.iltaonline.com/images/pdfs/ilta_guidelines.pdf.
8. JCTP. (2004). Code of fair testing practices in education. Washington D.C.: Author.
9. JLTA. (2007). The JLTA code of good testing practices. Retrieved from http://www.avis.ne.jp/~youichi/cop.html.

9. Acknowledgement
The preparation of this presentation was supported by the National Social Sciences Fund (Guojia Sheke Jijin) under the project "The development and validation of standards in language testing" (Project No. 13CYY032).


Jinsong (Jason) FAN, Ph.D.
jinsongfan@fudan.edu.cn