1 Language assessment for migration and social integration: a case study Paola Masillo Università per Stranieri di Siena
2 Scope of the study This presentation reports partial results of a PhD research project: a comparability study of two Italian language proficiency tests delivered in Italy to assess the language competence of migrants. Legislative decree 286/1998, art. 9, Immigration Act as modified by law 94/2009; Ministerial decree of June 4th 2010 on Methods of implementation of the Italian language test
3 Scope of the study Table 1 Source: Ministry of the Interior - Department for Civil Liberties and Immigration
4 Context of research I. Increasing mobility and migration and easier international and intra-national communication: direct implications on language uses and on language policy (Council of European Union, 2002; European Commission, 2008; Blommaert, Leppänen & Spotti, 2012) Not all kinds of multilingualism are considered as having the same value: not all forms of multilingualism are productive, empowering and nice to contemplate (Blommaert, Leppänen & Spotti, 2012: 1).
5 Context of research II. Shift in the understanding of the functions, status and roles of language tests. From tools used to measure language knowledge, they are viewed today more and more as instruments connected to and embedded in political, social and educational contexts (Shohamy, 2007: 117).
6 Context of research III. Publication in 2001 of the CEFR: the most important reference document and operational tool in the fields of language learning, teaching and assessment (CoE, 2001). Language tests are shielded by policies based on the ideological premise that success in language tests not only shows a clear willingness to integrate on the part of the person taking the test, but also provides the key to success in the workplace and in society as a whole. The power of tests becomes even stronger when test criteria such as rating scales affect language policy and are used to make decisions about people's lives (Shohamy, 2006).
7 Overview of the study Legislative decree 286/1998; Ministerial decree of June 4th 2010: Italian language proficiency test for migrants applying for a long-term EC residence permit. Proficiency level: CEFR A2. VADEMECUM, Indicazioni tecnico-operative per la definizione dei contenuti delle prove che compongono il test, criteri di assegnazione del punteggio e durata del test (MIUR, 2010) [Handbook: technical and operational guidelines for defining the content of the test components, the scoring criteria and the test duration]. Test design: Listening, Reading and Written Interaction. Assessment criteria.
8 Written interaction test
One task; CEFR Level A2; 10 minutes.
Table 2
Correspondence: Can write very simple personal letters expressing thanks and apology.
Notes, messages and forms: Can take a short, simple message provided he/she can ask for repetition and reformulation. Can write short, simple notes and messages relating to matters in areas of immediate need.
The candidate is required to reply to [...], postcards or invitations, or to fill out a form (enrollment in courses or schools, personal data, requests for permits, grants, bank accounts, ...).
9 Assessment criteria
Table 3
Test is performed in a complete and correct way: answers are provided consistently and appropriately to the information required, or the form is filled in all its parts (up to 35 points)
Test is performed in a partial way: answers are not always provided consistently and appropriately to the information required, or the form is filled in partially (up to 28 points)
Test is not evaluable: answers are not provided or the form is not completed (0 points)
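The three bands of Table 3 can be read as a simple mapping from a total score to a performance description. The sketch below is only an illustration of one possible reading: the function name `band_for_score` is hypothetical, and the cut-off between the partial and the complete band (29 points) is an assumption, since the ministerial criteria state only the upper limits "up to 28" and "up to 35".

```python
def band_for_score(score: int) -> str:
    """Map a score on the ministerial 35-point scale to a Table 3 band.

    Assumption: 1-28 is read as "partial" and 29-35 as "complete and
    correct"; the source gives only the upper limits of each band.
    """
    if not 0 <= score <= 35:
        raise ValueError("score must be between 0 and 35")
    if score == 0:
        return "not evaluable"
    if score <= 28:
        return "partial"
    return "complete and correct"


# Hypothetical scores, for illustration only.
for s in (0, 14, 28, 29, 35):
    print(s, band_for_score(s))
```

Under this reading a rater must still choose one value among 28 within the partial band with no further guidance, which is the kind of latitude the criteria leave open.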
10 CEFR Level A2 Can perform and respond to basic language functions, such as information exchange and requests, and express opinions and attitudes in a simple way. Can socialize simply but effectively using the simplest common expressions and following basic routines. Can use some simple structures correctly, but still systematically makes basic mistakes, for example tends to mix up tenses and forget to mark agreement; nevertheless, it is usually clear what he/she is trying to say. (CoE, 2001)
11 Research questions 1. Can the imposed assessment criteria describe and measure the language competence of migrants? 2. Can we recognize validity and reliability in measuring and assessing a basic proficiency level on the basis of appropriateness or consistency?
12 The sample
157 test takers; 314 scripts
40 scripts: 20 scripts for test format 1, 20 scripts for test format 2
CEFR level: A1+ to B1-
11 raters, using the ministerial 35-point scale
13 The sample
Table 4
Columns: Rater n.; Profile; Experience in the field of LT and Assessment (n. years); CEFR levels of competence assessed so far (mostly)
Rater 1; test developer and teacher trainer; 11; A1, A2, B1, B2, C1, C2
Rater 2; teacher and rater; 4; A1, A2, B1
Rater 3; test developer and teacher trainer; 20; A1, A2, B1, B2, C1, C2
Rater 4; teacher and rater; 7; A1, A2, B1, B2, C1
Rater 5; rater; 1; A2
Rater 6; teacher and rater; 3; A2, B1
Rater 7; test developer and teacher trainer; 18; A1, A2, B1, B2, C1, C2
Rater 8; teacher and rater; 4; A1, A2, B
Rater 9; test developer and teacher trainer; 19; A1, A2, B1, B2, C1, C2
Rater 10; test developer and teacher trainer; 20; A1, A2, B1, B2, C1, C2
Rater 11; test developer and teacher trainer; 18; A1, A2, B1, B2, C1, C2
14 Test n. 1
INTERAZIONE SCRITTA
QUESTO È IL MODULO PER LA RICHIESTA DI LAVORO. DEVI COMPILARE IL MODULO
1) Fill in the form
ALL'AGENZIA DI LAVORO
IL SOTTOSCRITTO - LA SOTTOSCRITTA: (NOME)... (COGNOME)...
NATO A (CITTÀ DI NASCITA)... (STATO)...
INDIRIZZO: VIA... NUMERO... CITTÀ...
NUMERO DI TELEFONO...
PATENTE ITALIANA SÌ [ ] NO [ ]
CHIEDE DI LAVORARE COME: ...
SPIEGA PERCHÉ: ...
2) Declare what kind of job you are asking for and explain why
FIRMA ... DATA / / 2013
15 Test n. 2 Send a message to a friend talking about: - Job - Market - Study
16 The method Step 1: Familiarization
Traditionally, rater training attempted to reduce both variability associated with the differences in overall severity and randomness (Lumley, McNamara, 1995: 56).
Rater training can reduce but by no means eliminate the extent of rater variability. Rater training has the effect of reducing extreme differences: outliers in terms of harshness or leniency are brought into line. [...] Rater training is successful in making raters more self-consistent (Lumley, McNamara, 1995: 57).
17 The method Step 2: Task design
Level of clearness: Totally unclear; Slightly clear; Clear; Absolutely clear
Level of appropriateness: Absolutely inappropriate; Slightly inappropriate; Appropriate; Absolutely appropriate
Level of importance for the test taker: Not at all important; Slightly important; Important; Very important
18 The method
Step 3: Rating process
Step 4: Can the evaluation criteria imposed by the Handbook of the Ministry of Education accurately reflect the construct that the test is intended to measure?
Level of acceptability: Totally unacceptable; Slightly acceptable; Acceptable; Perfectly acceptable
19 Findings and Discussion Step 2: Task design
Table 5: Mode of the ratings for Clearness, Appropriateness and Importance, by test task
20 Findings and Discussion Step 3: Rating process
Table 6
Example 1, Test 1, Candidate 07
Chiede di lavorare come: commesa
(You ask to work as: saleswoman)
Spiega perché: perché ho già lavorado in passato como commesa in un negozio di abbigliamento. Per tanto tempo credo di aver la speriencía per il posto.
(Please explain why: because I previously worked as a saleswoman in a clothing store for so long. I think I have the experience for this job.)
Example 4, Test 2, Candidate 9
Mercato (Market): Celine te scrivo questo messaggio x andare il mercoledì al mercato per comprare abigliamenti de questa temporada, anche puoi dire a tua sorella Vilma. Te saluto y aspetto tu messaggio. Civediamo verso le 9.00 am. Baci. Maribel
(Celine, I am writing this message to ask you to go to the market on Wednesday to buy clothing for this season. Also tell your sister Vilma. I say good-bye and I await your message. See you at about 9.00 am. Kisses. Maribel)
21 Findings and Discussion Step 3: Rating process
Table 7: Statistics for Writing T1 totscore and Writing T2 totscore (N valid; N missing: 0 and 0; mean; mode; standard deviation; range; minimum; maximum) (Green, 2013; Weir, Chan, Nakatsuhara, 2013)
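The statistics listed in Table 7 can be computed for any set of total scores with Python's standard library. This is a minimal sketch, not the procedure used in the study: the function name `describe` and the score list are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since the source does not say which formula or software was used.

```python
import statistics


def describe(scores):
    """Return the descriptive statistics listed in Table 7 for one test."""
    return {
        "n_valid": len(scores),
        "mean": statistics.mean(scores),
        "mode": statistics.mode(scores),
        # Assumption: sample standard deviation (n - 1 denominator).
        "sd": statistics.stdev(scores),
        "range": max(scores) - min(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Hypothetical total scores on the 35-point scale, for illustration only.
t1_scores = [28, 30, 28, 35, 20, 28, 25, 30]
for name, value in describe(t1_scores).items():
    print(name, value)
```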
22 Findings and Discussion Step 4: Validity of assessment criteria
11 out of 11 raters: slightly acceptable
Inter-rater reliability: 16 out of 55 rater-pair correlations are good in both cases (.7 or upwards)
i. The main difficulty I found was the choice of the appropriate score with a grid of this kind. It seems shortly interpretable and in the end there is the risk of choosing a score by chance.
ii. The evaluation criteria of the Ministry of Education seem to me to have very little detail and lend themselves to multiple interpretations. An assessment scale from 0 to 28 is very vague.
iii. The evaluation criteria of the Ministry of Education are not adequate; the margins are very wide and leave room for subjective evaluation. A grid to be followed is lacking.
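With 11 raters there are 11 × 10 / 2 = 55 distinct rater pairs, which matches the denominator reported above. The sketch below shows one way to count the pairs that reach the .7 threshold. It assumes Pearson correlation, which the source does not specify; the function names and the ratings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def count_good_pairs(ratings, threshold=0.7):
    """Return (pairs at or above threshold, total pairs).

    `ratings` maps a rater id to that rater's scores for the same scripts,
    listed in the same order for every rater.
    """
    pairs = list(combinations(sorted(ratings), 2))
    good = [
        (a, b) for a, b in pairs
        if pearson(ratings[a], ratings[b]) >= threshold
    ]
    return len(good), len(pairs)


# Hypothetical scores from three raters on four scripts.
example = {
    "rater_1": [10, 20, 30, 35],
    "rater_2": [12, 22, 28, 35],
    "rater_3": [30, 10, 35, 15],
}
print(count_good_pairs(example))
```

In this toy data only the pair rater_1 and rater_2 agrees closely, so one of the three pairs passes the threshold.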
23 Final considerations Results obtained by both analyses show that while the inter-rater reliability remains quite high, raters were also strongly influenced by their intuitive impression, expertise and previous experience (Wu, Ma, 2013). This suggests that the rating scale does not address all the elements that influence raters' decisions (Wu, Ma, 2013) and does not reflect the criterion and the construct that the test intends to measure.
24 Final considerations The findings we obtained can be summarized in the following comments, which suggest the need for a revision of the assessment procedure, particularly with regard to the criteria, in order to enhance scoring validity (Wu, Ma, 2013):
i. the message is not completely effective
ii. I find it very difficult to attribute a score with these parameters, since it does not refer to [...] and efficacy, but only to the completeness of the information.
iii. I choose this mark because the effectiveness is not good due to a wrong grammar the message is clear
iv. lack of effective communication
v. the production is not very effective