THE UNIVERSITY ENTRANCE EXAM IN SPAIN: PRESENT SITUATION AND POSSIBLE SOLUTIONS¹

Dr. Inmaculada Sanz (University of Granada, Spain) and Miguel Fernández (Cicero Public Schools, U.S.A.)²

Any official language test in a member state of the European Union should follow the criteria established in the guidelines of the Council of Europe's Common European Framework (CEF) or in the projects carried out by European language testing associations such as EALTA or ALTE. This implies professionalism in both the construction and the use of the test. However, some tests still do not follow any professional criteria. One of them is the Spanish University Entrance Exam, known as Selectividad. Apart from the problems related to its design and implementation, this test performs two different functions: on the one hand, it is designed as a school-leaving examination; on the other hand, its actual purpose is to serve as a selection criterion for entry to Spanish universities. The English component of the Selectividad exam has been in use for more than 25 years, and over that time only a few changes have been made to it. From our point of view, the test must be redesigned: new specifications should be developed, including a clear definition of contents and aims, and a new test construction protocol should be set to guarantee the quality of the test.

The first step of the research we are conducting has been to collect empirical evidence of the vagueness of Selectividad results. We have compared our students' results (N: 95) on a commercial test with those they obtained on the Selectividad exam.
Our tentative conclusion is that the commercial test we applied, despite its limitations and although it is not appropriate for selecting students for university entrance, is a better predictor of first-year students' achievement, and therefore represents the candidates' language ability better than Selectividad results do. In this poster we show (1) our findings from this research, and (2) the project on which we are currently working, whose objective is to develop a more valid and reliable Selectividad test, based on the methodology proposed by authors in language testing such as Alderson, Clapham, Wall and Hughes, among others.

¹ This material was presented as a poster at the 2005 EALTA Conference. For that reason, the information presented here is schematic and may lack detail.
² Inmaculada Sanz lectures in Applied Linguistics at the University of Granada (Spain). Her current research focuses on quality management systems at university level, language testing and assessment, and ESP. Email: isanz@ugr. Miguel Fernández is a PhD student (English Language Teaching) at the University of Granada. His research interests include the development of a methodology to design high-stakes tests. Email: miguelfer77@hotmail.com

PRELIMINARY CONSIDERATIONS

The Selectividad exam is a high-stakes test: High-stakes tests are the basis for determining admission to the next layer of education (Cheng, 1999: 254). Because this exam affects the lives of many young people, the Spanish government needs to create a test which will:
- consistently provide accurate measures of precisely the abilities in which we are interested;
- have a beneficial effect on teaching (in those cases where the tests are likely to influence teaching);
- be economical in terms of time and money (Hughes, 2003: 6).

If we analyse the present English test we find that:
- It lacks content validity, because it does not include a representative sample of what it supposedly intends to measure: proficiency in English at a lower intermediate level. Two skills are missing: listening and speaking. To measure use of English, there are only four grammar items and four lexis items. There is only one short reading passage.
- It lacks construct validity: the items do not present candidates with meaningful, purposeful activities, and the test does not measure students' communicative language ability.
- It is not reliable. The composition requires subjective marking and is therefore the main source of extrinsic unreliability. Unfortunately, this item is not double-marked; besides, scorers are not consistently trained and a thoroughly constructed scoring key has not been developed. Last year in Granada province, the difference between the severest scorer (mean: 3.5) and the most lenient scorer (mean: 7) was 3.5 points out of 10.
THE PRESENT SELECTIVIDAD EXAM

- Only one reading passage
- Probably adapted
- Unclear text type
- Testing reading and writing at the same time
- Confusing rubrics: a) Use your own words; b) Words or phrases from the text
- Not a representative sample: 4 grammar items, 4 lexis items
- Mechanical activities, not within a meaningful context
- Non-comparable outputs, produced by general topics with the possibility of choosing between two different ones
- No double marking; main source of extrinsic unreliability
CRITERION-RELATED VALIDITY: CAN WE VALIDATE THE SELECTIVIDAD EXAM?

RESEARCH QUESTIONS
1. Do the Spanish Selectividad exam and the QPT (Quick Placement Test, Oxford University Press) place students at the same level?
2. Can the Selectividad exam predict first-year students' academic achievement?

METHOD
N: 95 first-year students at English Philology
Criterion measure:
- Selectividad exam, passed in June or in September
- QPT (Quick Placement Test, Oxford University Press): a 60-item placement test, taken in mid-October
Candidates' future performance to be predicted:
- Academic achievement: number of credits passed and average mark obtained

RESULTS
1. The Selectividad exam and the QPT do not place students at the same level. Some students who failed the Selectividad are placed at a lower intermediate level by the QPT. Even worse, students at a QPT elementary level were given a B at Selectividad. Selectividad seems to have low discriminating power.

Count
                        QPT level
English entrance exam   Beginner   Elementary   Lower Intermediate   Upper Intermediate   Total
Fail                    1          5            4                    0                    10
C                       0          15           13                   1                    29
B                       0          12           21                   10                   43
A                       0          0            1                    3                    4
Total                   1          32           39                   14                   86

2. It seems that students with a higher QPT level obtain a higher average mark at the end of their first English Philology year (see 2.1 below). The regression analysis and the scatterplot with the fit line show this predictive ability. This is not the case with the Selectividad mark (see 2.2 below).
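The overlap described in result 1 can be made concrete by turning each row of the cross-tabulation into percentages. The following is a minimal Python sketch added for illustration, not part of the original analysis; the counts are copied from the table above, while the names `QPT_LEVELS`, `COUNTS` and `row_percentages` are introduced here and are purely hypothetical.

```python
# Counts copied from the cross-tabulation (rows: Selectividad grade,
# columns: QPT level). All names below are illustrative assumptions.
QPT_LEVELS = ["Beginner", "Elementary", "Lower Intermediate", "Upper Intermediate"]
COUNTS = {
    "Fail": [1, 5, 4, 0],
    "C": [0, 15, 13, 1],
    "B": [0, 12, 21, 10],
    "A": [0, 0, 1, 3],
}


def row_percentages(counts, levels):
    """Return, for each grade, the percentage of its students at each QPT level."""
    result = {}
    for grade, row in counts.items():
        total = sum(row)
        result[grade] = {
            level: 100.0 * n / total for level, n in zip(levels, row)
        }
    return result


percentages = row_percentages(COUNTS, QPT_LEVELS)
for grade, shares in percentages.items():
    line = ", ".join(f"{level}: {share:.1f}%" for level, share in shares.items())
    print(f"{grade:>4} (n={sum(COUNTS[grade])}): {line}")
```

Read row by row, the percentages show how widely each Selectividad grade spreads across QPT levels: for example, 4 of the 10 students who failed Selectividad were placed at lower intermediate level by the QPT, and 12 of the 43 students with a B were placed at elementary level.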
2.1 QPT predictive ability

QPT level            N    Credits passed         First-year mean
                          Mean      Std. dev.    Mean      Std. dev.
Beginner             1    36.0000   n/a          1.00000   n/a
Elementary           35   33.7714   22.00145     1.07804   0.620056
Lower Intermediate   41   45.2195   19.59657     1.36334   0.484309
Upper Intermediate   17   51.7059   17.66976     1.85582   0.860186
Very advanced        1    43.0000   n/a          1.55814   n/a
Total                95   42.0421   20.94771     1.34459   0.663656

[Figure: scatterplot of first-year mean (y-axis) against QPT level (x-axis) with fit line; linear R² = 0.146]

Coefficients (dependent variable: first-year mean)
              Unstandardized coefficients   Standardized coefficients
Model 1       B       Std. Error            Beta                        t       Sig.
(Constant)    0.776   0.156                                             4.972   0.000
QPT level     0.312   0.078                 0.382                       3.987   0.000
2.2 Selectividad predictive ability

English entrance exam   N    Credits passed         First-year mean
                             Mean      Std. dev.    Mean      Std. dev.
Fail                    10   32.1000   24.49694     1.21667   0.738241
C                       29   42.0000   16.64546     1.28037   0.475620
B                       43   44.5116   23.71193     1.30522   0.754831
A                       4    49.5000   12.12436     1.95962   0.570386
Total                   86   42.4535   21.33026     1.31698   0.668120

[Figure: scatterplot of first-year mean (y-axis) against English entrance exam mark (x-axis) with fit line; linear R² = 0.017]

Coefficients (dependent variable: first-year mean)
               Unstandardized coefficients   Standardized coefficients
Model 1        B       Std. Error            Beta                        t       Sig.
(Constant)     1.149   0.157                                             7.302   0.000
Selectividad   0.114   0.095                 0.130                       1.203   0.232
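The two coefficient tables can be cross-checked with two standard identities of simple (one-predictor) linear regression: the squared standardized coefficient equals R², and t equals B divided by its standard error. The sketch below is an illustrative Python check added here, not part of the original analysis; the figures are the rounded values reported in 2.1 and 2.2, and the names `MODELS` and `cross_check` are hypothetical.

```python
# Rounded values as reported in sections 2.1 and 2.2.
# Dictionary layout and function name are illustrative assumptions.
MODELS = {
    "QPT level": {"B": 0.312, "se": 0.078, "beta": 0.382, "t": 3.987, "r2": 0.146},
    "Selectividad": {"B": 0.114, "se": 0.095, "beta": 0.130, "t": 1.203, "r2": 0.017},
}


def cross_check(model):
    """Recompute R^2 and t from the other reported coefficients.

    With a single predictor, the standardized coefficient equals the
    Pearson correlation, so its square is R^2; t is B over its standard error.
    """
    r2_from_beta = model["beta"] ** 2
    t_from_b = model["B"] / model["se"]
    return r2_from_beta, t_from_b


for name, model in MODELS.items():
    r2_from_beta, t_from_b = cross_check(model)
    print(
        f"{name}: beta^2 = {r2_from_beta:.3f} (reported R^2 = {model['r2']:.3f}), "
        f"B/SE = {t_from_b:.2f} (reported t = {model['t']:.3f})"
    )
```

Both identities hold within rounding. Read as proportions of variance, the reported R² values mean that QPT level accounts for about 15% of the variance in first-year mean, while the Selectividad mark accounts for under 2%.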
CONCLUSION

Although the results of this analysis cannot be generalized, the way in which the exam is organized already shows that this is not the best test that the Spanish government can produce: it does not seem to measure students' abilities accurately. It should therefore be changed so that the mark students get reflects their true command of English.

CREATION AND PILOTING OF THE NEW TEST

The two models considered in the creation of the new test were the principles proposed by Alderson, Clapham and Wall (1995) on the one hand, and by Hughes (2003) on the other. Following them, we designed the new test in these steps:

1. Presentation of the problem
The essential first step in testing is to make oneself perfectly clear about what it is one wants to know and for what purpose (Hughes, 2003: 59). The research done and the data compiled show that a change in the Selectividad exam is necessary.

2. Design of the test
a. Specifications: Based on Bachman and Palmer's framework (1990), specifications were written, taking into account (1) the characteristics of the setting, (2) the characteristics of the test rubrics, (3) the characteristics of the input, (4) the characteristics of the expected response and (5) the relationship between input and response. These specifications were also based on the present curriculum for Bachillerato (the last two years of high school in Spain). The recommendations in the Common European Framework of Reference have been followed to develop a test at B1 level, which is the intended level for secondary school leavers. The table below gives specific information about the organisation of the new test in preparation and its components:
Section 1: Reading (40 min.)
  Content: 4 parts in which different skills and subskills are assessed through texts of different lengths
  Focus: Assessment of candidate's ability to understand written material at text level

Section 2: Listening (10 min.)
  Content: 3 parts in which different skills and subskills are assessed through recordings of different lengths
  Focus: Assessment of candidate's ability to understand everyday spoken language

Section 3: Writing (30 min.)
  Content: 2 parts in which different skills and subskills are assessed through the writing of a text based on some pictures and the correction of a text with inaccurate information
  Focus: Assessment of candidate's ability to write coherent and cohesive texts

Section 4: Speaking (10 min.)
  Content: 2 parts in which different skills and subskills are assessed through the use of pictures and situations given to the candidates
  Focus: Assessment of candidate's ability to narrate personal experiences and to respond orally to different situations

b. Item development: Based on the specifications, items were written after a thorough compilation of materials (texts for the reading part and recordings for the listening component). Techniques such as text mapping were used in the exploitation of the texts for the reading section. Multiple tasks and item types were also included in all sections, with the objective of avoiding the method effect. These are some of the types of items presented:
- Multiple choice
- Gap-filling on a summary
- Multiple matching
- Composition
- Event ordering
- Editing
- C-test
- Oral report
- True/false questions
- Role-play

c. Piloting: This is the phase in which we are currently working. The test will be piloted with a group of candidates with similar characteristics to those for whom it is designed (pre-university students). We are hoping to pilot the test with between 30 and 200 test takers (the optimal number to be able to run good statistical analyses).

d. Item analysis: Once the results from the piloting are collected, we will analyse them in order to establish the test's validity and reliability. We will be using the Statistical Package for Social Sciences (SPSS), which will help us identify the facility value and the discrimination index of the test items. Further studies, such as the analysis of distractors in the multiple-choice items, will be carried out.

3. Test validation
Our last step in this project will be the validation of the test, with all the necessary changes after the analyses. It will finally be proposed to the Spanish Education Office as the new model to follow for University entrance in the future.

REFERENCES

Alderson, J.C., C. Clapham and D. Wall. 1995. Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
Bachman, L.F. 1990. Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Cheng, L. 1999. Changing assessment: washback on teacher perspectives and actions. Teaching and Teacher Education 15: 253-271.
Council of Europe. 2001. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Strasbourg: Council of Europe.
Hughes, A. 2003. Testing for Language Teachers. Cambridge: Cambridge University Press.
UCLES. 2001a. Quick Placement Test. Oxford: Oxford University Press. (CD-ROM version).
UCLES. 2001b. Quick Placement Test. Oxford: Oxford University Press. (Paper and pencil version).

Poster background image: Sierra Elvira (Granada) at dawn, watched by the moon
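
Note on step 2d (item analysis): the two statistics named there, facility value and discrimination index, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure, which is planned in SPSS. The response matrix is invented (hypothetical data, 1 = correct, 0 = incorrect), the discrimination index is computed with the upper-group/lower-group method using thirds of the ranked candidates, and the function names are introduced here for the example.

```python
def facility_values(responses):
    """Proportion of candidates answering each item correctly."""
    n_candidates = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_candidates for i in range(n_items)]


def discrimination_indices(responses, n_groups=3):
    """Upper-lower discrimination index per item.

    Candidates are ranked by total score; the index is the difference between
    the number of correct answers in the top and bottom groups, divided by the
    group size. Using thirds (n_groups=3) is an assumption of this sketch.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, len(ranked) // n_groups)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    n_items = len(responses[0])
    return [
        (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / group_size
        for i in range(n_items)
    ]


# Hypothetical pilot data: 6 candidates (rows) x 4 items (columns).
RESPONSES = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

for item, (fv, di) in enumerate(
    zip(facility_values(RESPONSES), discrimination_indices(RESPONSES)), start=1
):
    print(f"Item {item}: facility value = {fv:.2f}, discrimination index = {di:.2f}")
```

In this invented data set the fourth item is answered correctly as often by the weakest candidates as by the strongest, so its discrimination index is 0; an item behaving like this in a real pilot would be a candidate for revision.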