School Value Added Measures in England
A paper for the OECD Project on the Development of Value-Added Models in Education Systems

Andrew Ray
Department for Education and Skills
October 2006
Contents

1. Introduction
2. The objectives and development of value added models
3. Data sources for the calculation of value added
4. The value added models
5. The use and presentation of value added models
References
Annex A: The National Curriculum and Key Stage tests
Annex B: MLM residuals and the shrinkage factor
1. Introduction

This paper has been prepared for the OECD Project on the Development of Value-Added Models in Education Systems. It is structured according to the requirements of the OECD group, with an emphasis on a description of the methodology for the value added models currently in use. Although it provides some explanation of the development and objectives of value added modelling, the paper does not discuss in detail the rationale for using value added models or similar data within schools or as part of initiatives to raise standards in attainment. Nor does it cover the wider academic debates concerning approaches to school improvement and school effectiveness and the ways these have affected education policy.

The paper discusses the value added models in England; it does not cover the other countries of the UK, where there are no national value added systems. The focus here is on pupils of compulsory school age (age 4 to 16) and the paper does not discuss value added for older students at colleges. Many of the same issues would apply to the measurement of post-16 value added, but there are differences in terms of the qualification structure, the available data on contextual factors, the relationships between prior attainment and outcome, the types of institutions where study takes place, the extent of full and part-time study, and differences in the ages at which courses are taken [1]. This paper also does not discuss directly the issues in providing value added models for special schools, a type of school that makes provision for pupils with statements of special educational needs.

Before discussing value added models, a few background facts are useful in order to put the English school system into an international context:

- There are approximately 8.2 million pupils in 25,200 state-maintained and independent schools (DfES 2006e); 7% of pupils were in the independent sector. Some pupils with special educational needs are educated in maintained schools; others are educated separately in special schools.
- There are about 17,500 primary schools, which generally cover ages 4-11, and about 3,400 secondary schools, normally covering ages 11-16 (some have sixth forms covering post-16 as well).

[1] For more information on current plans see the Learning and Skills Council website. For an example of post-16 value added analysis, see O'Donoghue et al (1997).
- The average size of a secondary school is 980 pupils (approximately 140 pupils per year group on average); primary schools have about 240 pupils on average, 40 per year.
- Maintained schools are funded through local government: there are 150 Local Authorities covering England. Local Authorities vary considerably in size and characteristics. The smallest is the Isles of Scilly, with just one school; the largest is Kent, with 103 secondary schools and 470 primary schools.
- The school year runs from September to July (dates vary), divided into three terms. Assessment of attainment is made at the end of the school year, so data for the calculation of value added becomes available in the following autumn. This means that, for example, the model shown in Table 4.3 below for 2005 relates to assessments made in summer 2005.
2. The objectives and development of value added models

Summary

In England value added models were developed first in various projects for particular groups of schools. The establishment of a national curriculum and new sources of linked national data enabled consistent value added models to be based on pupil-level data. There have been two main phases in the development of value added models, both of which are discussed here: (1) simple value added scores based on prior attainment only; (2) more complex contextualised value added scores based on a range of factors and calculated using multilevel models. In addition to school level scores, value added and pupil progress information more generally has also been used and presented in graphs and tables.

Value added modelling is now used:
(1) in Performance Tables, to provide information to parents and hold schools to account;
(2) in systems for school improvement, where data is used for self-evaluation and target setting;
(3) to inform school inspections, which are now tied into the school improvement process;
(4) to help select schools for particular initiatives;
(5) to provide information on the effectiveness of particular types of school or policy initiatives.

This section provides a brief overview of the development of value added models (other accounts can be found in the literature, e.g. Schagen and Hutchison, 2003). The key objectives of the current system are then outlined.

2.1 Local models for school improvement

Before the introduction of the National Curriculum (see Annex A) the only examinations sat by all pupils in maintained schools were the GCSEs and similar qualifications at age 16. Although it was possible to benchmark or adjust these results using data about the school [2], there was no information on pupil-level characteristics and no earlier test data with which to calculate value added scores.

[2] See for example Sammons et al (1994), which discussed ways of benchmarking results in the absence of national pupil level data on prior attainment.
However, it was possible to provide value added analysis where pupils in a group of schools took specific tests. Some of these analyses were undertaken by academics, such as Mortimore et al (1988) and Goldstein et al (1993); others were carried out by analysts in Local Authorities. They varied in complexity and purpose: some involved the provision of specific feedback to promote school improvement, others were more concerned with developing value added methods and the evidence on school effectiveness. There is a summary of these early studies in SCAA (1994).

Another model for local value added analysis in the absence of national data is the specialised centre. For example, starting in the 1980s, the CEM centre at Durham University has provided value added analysis for an increasing number of schools and continues to use its own tests, which are not based on the National Curriculum (Tymms and Coe, 2003). Another example is the National Foundation for Educational Research's school improvement work using their QUASE data. Some analysis of the use made by schools of these kinds of value added data is in Saunders (2000). A more recent development has been the Fischer Family Trust's provision of value added analyses to a growing number of Local Authorities [3]. This has utilised the National Curriculum test data (which will be discussed below) rather than alternative types of test.

2.2 Developments in school accountability and improvement data

In the years leading to the introduction of the first national value added models, several significant changes were made in the English school system. In 1988 the National Curriculum was introduced in England, setting out the subjects and programmes of study which maintained schools are obliged to cover from ages 5 to 16. For Key Stages 1-3, covering 5 to 14 year olds, a national system of testing and teacher assessment was established: attainment is assessed against criterion-referenced national curriculum levels at the end of each key stage (see Annex A). The testing system is run by the independent Qualifications and Curriculum Authority (QCA) and National Assessment Agency (NAA).

[3] The Fischer Family Trust is an independent, non-profit organisation which, among other projects, provides analyses and data to support the processes of self-evaluation and target-setting. From a total of 55 authorities in July 2001, the project now covers all LAs in England and Wales.
In 1992 the Performance Tables [4] for schools were introduced, with the aim of informing parents in their choice of school and providing schools with an incentive to raise standards. The first tables showed results in the GCSE exams taken by 16 year olds (along with one indicator for A-levels taken by 18 year olds). In 1996 the first tables for primary schools were produced, with results for the new Key Stage 2 tests taken by 11 year olds. Over time the tables have included more indicators, partly as a result of the greater quantity of information available at national level. The first value added scores for all secondary schools were included in 2002, with value added for primary schools following a year later.

In the same year that Performance Tables were introduced, school inspection was reformed with the creation of Ofsted (the Office for Standards in Education). Ofsted inspects all maintained schools and Local Authorities in England and its inspectors have access to school attainment data, in the form of the Performance AND Assessment (PANDA) Reports [5]. The data in these reports has therefore played an important part in school accountability, as they form part of the evidence base used by inspectors to make judgements about school effectiveness. Ofsted's overall inspection reports are published [6]. Schools are graded as Outstanding, Good, Satisfactory or Inadequate; schools in this last category may be put into special measures or given a Notice to Improve.

The development of national data on attainment and the ability to link outcomes with prior attainment facilitated the provision of a greater range of information to promote school improvement. Although, as noted above, school improvement analysis could already be provided by Local Authorities and academic institutions, not all schools had access to or made use of this kind of data. To fill this gap, the first Autumn Package was produced in 1998, with national patterns of value added figures and statistics for groups of schools, allowing individual schools to benchmark their performance and set targets [7]. The Autumn Package supplemented the Ofsted PANDA, which already contained data for specific schools (although no value added measures until they were introduced into Performance Tables). In recent years the Autumn Package has evolved into an interactive software system, called the Pupil Achievement Tracker [8].

[4] Now called the School and College Achievement and Attainment Tables, but for brevity referred to in this paper as Performance Tables.
[5] Formerly the Pre-inspection Context and School Indicator (PICSI) Report.
[6] Inspection reports can be seen on the Ofsted website.
[7] See DfEE (1998a). Some limited value added information was actually made available one year earlier (QCA, 1998); prior to that there was no possibility of matching pupil level prior attainment to outcomes for the key stages in schools.
[8] See the Pupil Achievement Tracker pages on the DfES website.
In 1997 the government had set in train the development of better pupil-level data and consulted on the introduction of a unique pupil identifier that would help data to be matched throughout the school system. These UPNs were introduced in 1999, following work to consider the practical and data protection issues. Another key development was the move to an annual pupil-level census of schools, which would collect background characteristic data that schools recorded for administrative purposes. The Pupil Level Annual Schools Census (PLASC) was introduced in 2002. Underpinning it there had been discussion of a common basic dataset, i.e. a core set of agreed definitions of all variables that would be collected, to ensure consistency.

The most recent key developments in the systems that use value added are part of an overarching policy called the New Relationship with Schools (NRwS). One of the aims of NRwS was to improve the use of data, taking advantage of the new possibilities opened up by the introduction of UPNs and PLASC. This has led to the introduction of new products and more complex value added analyses. The PANDA and Pupil Achievement Tracker are being merged to create a new software system, RAISEonline, due to be released this autumn. The analysis in RAISEonline, which includes the results of multilevel value added models, will be used by school inspectors and the new School Improvement Partners established under NRwS. Another strand of NRwS is the introduction of School Profiles, which supplement the data in Performance Tables (including value added information) with more background information on schools.

2.3 National value added based on prior attainment

In the early 1990s, the new emphasis on performance data to hold schools accountable gave rise to concerns that schools could not be judged fairly in the absence of value added measures. At the same time, the development of the Key Stage tests offered the possibility of calculating value added scores for each school based on progress between each Key Stage, once national data was available for the relevant cohorts of pupils. In 1994 the School Curriculum and Assessment Authority (forerunner of the QCA, the organisation that administers the Key Stage tests) published a report on value added performance indicators (SCAA, 1994).
Following this report, the Department for Education clarified in a briefing paper its aims in developing school and college value added measures: (1) to compare these institutions consistently at national level; (2) to allow detailed planning and targeting at local level; (3) to help school inspectors make more informed judgements (DfE, 1995). The Department also stated that value added measures should be based on prior attainment. Including additional socio-economic factors would be a challenge because of the difficulty in measuring them, with a danger that adjustments to value added measures on these grounds which were not sufficiently rigorous could justify poor performance or legitimise low expectations.

A further study, the Value Added National Project, was commissioned from researchers at the University of Durham (Fitz-Gibbon, 1997). The remit of this National Project was to advise government on the development of a national system of value-added reporting for schools "based on prior attainment, which will be statistically valid and readily understood". It defined the two major applications of such a system as internal school management and public accountability. The National Project concluded that simple statistical approaches were preferable to more complex multilevel models. It reported that headteachers were concerned about publication of value added measures, but that nevertheless a majority wanted to see value added in the Performance Tables. The researchers noted concerns about year-on-year variability in value added measures and the potential difficulty of taking into account pupil mobility, i.e. the movement of pupils between schools during the key stages.

Taking the findings of the National Project into account, a simple value added method was piloted in the 1998 Performance Tables, making use of the fact that it was now possible to link Key Stage 4 outcomes to Key Stage 3 prior attainment (DfEE 1998b). This method compared each pupil's expected outcome, based on the national median GCSE result for each level of Key Stage 3 prior attainment, with their actual outcome. Value added scores for schools were then taken as the average of these differences for all their pupils. This method will be referred to here as the "median method" and its pros and cons are discussed in Section 4. This method for deriving schools' scores was consistent with the approach taken in presenting national charts and information on value added in the new Autumn Package (DfEE, 1998a).
Although there were positive responses to this 1998 pilot, a major problem identified was that, in covering only Key Stage 3 to 4 (the last two years of secondary school), these measures gave an incomplete picture of overall value added. For example, a school could achieve a good Key Stage 3-4 VA score whilst underperforming between Key Stages 2 and 3. The decision was therefore taken not to publish VA scores nationally until the full secondary age range could be covered. In the first year this was possible, 2001, a further pilot of both KS 2-3 and KS 3-4 measures was undertaken, and the following year value added scores for these key stages were published for all schools (with a few exceptions, e.g. some private independent schools). Value added scores for primary schools were calculated using the same method, piloted in 2002 and published in 2003. The secondary and primary value added scores have appeared in Performance Tables in each subsequent year and were also used by Ofsted in their PANDA reports. It was eventually possible to match Key Stage 2 directly to Key Stage 4 results, and value added scores on this basis were subsequently added to the Tables.

2.4 The move to contextualised value added

Whilst the publication of value added measures in the Performance Tables was generally seen as a positive advance on the publication of raw results only, there were some concerns about aspects of the methodology and presentation (see Section 4). At the same time, the developments in linking data, via the unique pupil numbers, and the first Pupil Level Annual Schools Census (PLASC) in 2002, offered scope to reconsider the possibility of including contextual data alongside prior attainment in value added models. Once PLASC data matched to information on pupil progress became available, towards the end of 2002, statisticians in the Department began analysing it to understand the relationships between the variables and what they said about national performance. Views of a selection of academics in the field were sought on the future direction of the value added work and, although there was no consensus of opinion, there was strong support from some for the development of more complex models that used the new data. Outside the Department, statisticians at the National Foundation for Educational Research (NFER) and the Fischer Family Trust also began building value added models that took into account contextual factors. On the basis of some of the early work by NFER, the National Audit Office (2003) recommended that performance information should take into account not just prior attainment, but also other external influences on performance, based on the new PLASC data.
In January 2004 the schools minister made a speech announcing the New Relationship with Schools (Miliband, 2004). He recognised that "there is a flourishing debate as to whether we should take account of more than the prior attainment when we calculate the value added by schools" and said that "over the coming months, we shall be consulting widely as we move towards a model of value added which commands the confidence of all". This consultation and development process has taken in the views of schools and Local Authorities throughout the country, as well as further ongoing discussion with academics and between statisticians in the Department and Ofsted. In October 2004 a prototype contextualised value added (CVA) model covering the Key Stage 2-4 age range was discussed with schools. The following year a system of KS2-4 CVA scores was piloted for use in Performance Tables. The model used PLASC data on pupil background factors and a different methodology for calculating school scores: a two-level multilevel model. Ofsted also started to use this pilot CVA model, along with equivalents for other key stages, in their PANDA reports. The model is discussed in Section 4.2 below. Following the pilot, some small amendments are being made to the method and a new 2006 contextualised value added model will be used this autumn.

The last ten years have been a continued period of transition and this process will continue. The evolution of value added analysis has allowed concepts to be tested and more complex approaches to be introduced after schools have begun to utilise simpler data. The downside of this approach has been the difficulty in maintaining continuity and the way that some aspects of the current system are partly based on earlier approaches (e.g. 2006 is a transition year for Performance Tables, with CVA used for secondary schools but the earlier VA measure used for primary schools). There continues to be discussion on which models to use for various purposes and how to present the results, for example in the new School Profiles and the new school improvement software system, RAISEonline.

2.5 Objectives of the current system

This brief outline has already indicated that there have been several objectives in developing value added analyses and value added scores. The main uses of the current system are summarised below.
(1) Performance Tables

The objectives of the tables remain to provide consistent, accessible national data on the performance of schools, to inform parents and the public more generally, and to ensure that schools are accountable for their results. The tables are resource intensive to produce accurately every year and are deliberately kept to a limited range of key indicators. They therefore do not, for example, provide results or value added for every subject taken at Key Stage 4. Users are directed to Ofsted inspection reports for a fuller picture of a given school. They are also told that value added measures represent a better estimate of school effectiveness than raw results, which take no account of prior attainment. As noted above, the new School Profiles, aimed at parents as a new source of general information on specific schools, will also include the Performance Tables value added measures.

(2) Data for school improvement and inspections

RAISEonline will provide a more extensive range of data than the Performance Tables, including value added for a wider range of outcome measures and for subgroups of pupils within the school. The main objective of RAISEonline is to provide all schools with a free software product that allows them to analyse their own data and compare it against national patterns or the results and value added achieved by high performing schools. Schools will use RAISEonline as part of the self-evaluation and target setting process that they undertake with the help of School Improvement Partners. The data will also be available to Ofsted's inspectors for use in judging the extent to which the school is improving or has the capacity to improve. The statistics will not be made available to the public more generally. It remains the case that schools and Local Authorities use a variety of data sources for school improvement in addition to the DfES and Ofsted products, from simple Excel spreadsheets to the analyses of specialists in value added like the Fischer Family Trust (Kirkup et al, 2005).

(3) Selection of schools for particular initiatives

Although value added is not used directly in funding schools, it has been used as a way of selecting particular schools: for example, some schools are currently designated as High Performing and given additional responsibility for helping weaker local schools or engaging in other projects [9].
The data can also be used to look at specific issues, e.g. underperformance in particular subjects, which can then be targeted for additional support from the National Strategies (consultants employed to undertake activities to raise attainment standards nationally).

(4) Monitoring policy initiatives

The availability of value added results also provides useful information for monitoring progress in groups of schools subject to specific policies or administrative arrangements. Results of this kind using the linked key stage data were published by the Department in a statistical bulletin (DfES, 2002) and have subsequently been provided regularly as value added scores for groups of schools. External researchers have also used the data to construct regression models showing the relative progress made by pupils in certain types of school (examples are the results in NAO, 2003). However, there are limitations to this approach and it is debatable how far relatively simple cross-sectional results can be seen as robust estimates of the impact of a policy.

[9] For more information see the section on High Performing Specialist Schools on the DfES website.
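To make the kind of regression model mentioned under (4) concrete, a minimal sketch is given below. It regresses a pupil outcome on prior attainment and a school-type indicator; the file and column names are invented for illustration, and this is not the Department's or the NAO's actual specification.

    # Illustrative cross-sectional model of relative progress in one type of
    # school. Column names (ks4_points, ks2_aps, in_initiative) are invented.
    import pandas as pd
    import statsmodels.formula.api as smf

    pupils = pd.read_csv("pupil_level_extract.csv")  # hypothetical extract

    # ks4_points: outcome; ks2_aps: prior attainment;
    # in_initiative: 1 if the pupil's school is in the policy group, else 0.
    model = smf.ols("ks4_points ~ ks2_aps + in_initiative", data=pupils).fit()
    print(model.summary())

    # The coefficient on in_initiative estimates the average difference in
    # progress for the policy group but, as the text notes, a simple
    # cross-sectional estimate of this kind is not a robust policy impact.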
3. Data sources for the calculation of value added

Summary

The value added models use national linked data on tests and various pupil characteristics. Tests are administered by an external body, the QCA, who are responsible for maintaining standards and ensuring that the levels of attainment are correctly specified. Pupil characteristics data are taken each January from all schools in England. The advantages and disadvantages of the data can be summarised as follows:

Advantages
- Near universal coverage of pupils, allowing the data to be used extensively in school accountability and improvement.
- The testing framework has a clear system of education levels and point scores that can be used in measuring value added.
- Contextual information is now collected on every pupil and only a very small proportion of pupils and schools have missing data.
- Pupil postcode data can be matched to a range of local area indicators.
- The system of unique pupil numbers facilitates accurate matching and has led to the creation of a national database which can be accessed by researchers.

Disadvantages
- The system of high stakes national testing may be seen as having potential disadvantages in terms of its effects on pupils and teaching.
- Only data on the core subjects are collected at Key Stages 1-3.
- Some privately educated pupils are excluded.
- Contextual information does not cover all the background factors that would be relevant, e.g. there is no direct measure of social class.
- There is no means of linking results to teachers or subject departments.

This section only covers the data sources used in the two generations of value added models used by the Department. It would in theory be possible to include additional variables from other existing data sources in these models, especially at school level; this is discussed further in Section 4. Although there are data on the school as a whole, there is no means of linking results directly to individual teachers or subject departments.
3.1 Test result data

This section describes some of the features of the test data and points to some general issues relevant to the development of value added systems more generally:
- Should there be a national testing system?
- Who should administer and collect the test data?
- How reliable are the tests?
- Is data available for every school? Which pupils' results are excluded?
- For which subjects should data be collected?
- Should test data be used exclusively or should teacher assessments also be included?
- How should different attainment levels be scored relative to each other?
- Does the test data capture any other information (e.g. gender) which could be used in a value added model?

Examinations at age 16 have a long history, but the tests at ages 7, 11 and 14 are recent developments. The tests themselves measure attainment in the curriculum subjects and do not measure other related but different qualities like aptitude or general intelligence. Without the collection of national data for these age groups it would not be possible to calculate comparable value added scores for all schools. However, it is worth bearing in mind that the tests were set up to provide outcome measures at the end of the different stages and were not designed specifically with the aim of calculating value added scores.

The introduction of testing was a major undertaking and one that met with concerns about an over-emphasis on test results at the expense of broader educational aims. There was also concern about the effects of testing on younger pupils. On balance the government has judged the advantages of testing, both within schools and for wider accountability, to outweigh the disadvantages. However, the Welsh National Assembly voted in 2004 to phase out the Key Stage tests in Welsh schools.

The English Department for Education and Skills as an organisation does not administer the testing system directly. This is done instead by external agencies, the QCA and NAA, partly to ensure the independence of the system (since rising pass rates are used by the Department as a success measure).
Information on the processes for developing tests (which, being public, have to be different each year) and on the ways marks are converted into levels can be found on the QCA website. As part of their remit, the QCA undertake regular analysis of the tests' reliability and quality; for example, they provide Cronbach's alpha measures for the Key Stage tests: in 2005 these ranged from 0.86 to 0.92 at Key Stage 2 [10]. The NAA also survey schools for their views: for example in 2005, 94% of schools said that the KS2 maths tests were a very or fairly accurate reflection of pupils' abilities; the figures for science and English were 95% and 83% (SMSR, 2005).

There have been different systems for collecting and publishing the test results at each key stage. Currently the Key Stage 1 assessments are collected from schools by the 150 Local Authorities and then from these by the Department. For Key Stages 2 and 3 there is an external contractor to organise the external marking and collate the results. For Key Stage 4 the awarding bodies (five main ones and many smaller organisations) send data to an external contractor who assembles a database for the Department.

Data are collected and published for almost all schools and pupils, including pupils in state maintained special schools (for pupils with special educational needs). The system is such that there would be no advantage in producing results for just a sample of schools. However, coverage of the tests is not universal. At Key Stage 2, for example, many private independent schools do not take the tests and value added scores can only be produced for a subset [11]. Rather than include only those independent schools for which data are available, the KS2 Performance Tables excluded all independent schools. There has been a slightly different approach at Key Stage 4, where independent schools have been included and given the choice of whether to have their value added scores published when these are available. As explained in Annex A, test data for Key Stages 1-3 are only collected for the core subjects defined in the National Curriculum.

[10] See the report on the QCA website. QCA's explanation of the Cronbach's alpha measure is that it "measures how far the test is measuring a single concept such as spelling or reading or science. If the test were perfect, the result would be 1.00. If some of the questions are measuring something else (for example if a lot of the items in the mathematics tests demanded high level skills in reading before you could understand what was required) the Cronbach's alpha would be low. In the national curriculum tests, most have coefficients above 0.80, and some over 0.90, so they do appear to be measuring what they claim."
[11] In 2005 only about half the independent school pupils had data that could allow value added calculations at Key Stage 2.
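For reference, Cronbach's alpha has the standard form below for a test of k items; this is the textbook definition rather than anything specific to the QCA report:

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right) \]

where \( \sigma^{2}_{Y_i} \) is the variance of scores on item i and \( \sigma^{2}_{X} \) is the variance of total test scores. A value close to 1.00 indicates that the items are measuring a single underlying concept.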
Clearly other educational systems might collect data on a different range of subjects. The balance of subjects in the group used for either the input or output measure in a value added model will affect the results; for example, more emphasis on reading in a group of subjects would tend to mean higher results for girls for that overall group [12]. At Key Stage 4, by contrast, there are no core subjects: all qualifications can contribute to the outcome measure.

At Key Stage 4 all the qualifications are externally assessed. However, at Key Stages 2 and 3 national data are collected on both test results and teacher assessments. Either or both could be used in value added models, but so far test data has been preferred because it is externally verified and therefore in theory more robust and consistent. At Key Stage 1, tests were not externally marked and there have been concerns about the robustness of the data (see Tymms and Dean, 2004, who quote some head teachers' views, although no firm evidence of biases). Since 2005 all Key Stage 1 levels are based on teacher assessment. Whilst this might introduce potential for bias (in contrast to a more objective test), there are good reasons for supposing it will make the data more robust (since teacher assessment draws on a range of evidence rather than one test sat by 7 year old pupils).

Some of the issues relating to the scoring system can be illustrated in relation to Key Stage 2. Pupils taking Key Stage 2 tests (aged 11) are given marks in each subject. These marks are converted into overall levels by the QCA. At Key Stage 2 there are basically three main levels. Level 4 is seen as the level expected of pupils at age 11 (when the levels were set up, the median pupil was half way through the Level 4s; since then targets have been set to try to increase the number achieving at least a Level 4). Level 5 is a higher level of attainment than Level 4; Level 3 is lower. Table 3.1 shows the current distribution for Mathematics, taken from DfES (2006a). Clearly the main levels cover a broad range of pupils and the system does not distinguish very finely between different pupils (for a discussion of the use of marks data in disaggregating the levels, see Section 4.2). It is also apparent from this table that there is potential for ceiling effects, since the maximum level is 5 and some of the 33% of pupils achieving this may have been able to achieve higher levels if these existed.

[12] In the PISA 2000 and 2003 studies, girls achieved higher marks for reading in all countries: see OECD (2004).
Table 3.1: Key Stage 2 test levels and points

    KS2 test outcome                             Points        Distribution (%) for
                                                               Mathematics in 2006
    Level 5                                      33            33
    Level 4                                      27            -
    Level 3                                      21            -
    Compensatory N (not awarded a test level)    15            2
    B (working below the level of the test)      15            4
    Disapplied                                   Disregarded   0
    Absent                                       Disregarded   1

Table 3.1 also shows the way levels have been assigned a point value (each level is worth six additional points), for convenience in calculating averages across the three core subjects (English, mathematics and science). So, for example, a pupil achieving two Level 4s and a Level 5 would be given (27 + 27 + 33)/3 = 29 points. Pupils below Level 3 all receive 15 points [13]. Pupils who were absent, or disapplied because they could not access the test [14], are not assigned points and are disregarded from the calculations. However the aim is to disregard as few pupils as possible so, for example, pupils with special educational needs (disabilities or learning difficulties) are included, even though some may not reach Level 3. The average point scores (APS) are published at school level in Performance Tables and have been used as the input measures in the simple median method value added models.

The levels at each key stage are aligned by QCA to measure equal levels of attainment across subjects, over time and even across key stages. This means, for example, that a Level 5 achieved in mathematics at Key Stage 2 in 2002 is broadly equivalent to a Level 5 achieved in science at Key Stage 3 in another year. A system of this kind has numerous practical advantages, but the task of maintaining these equivalencies is difficult and there is research that indicates that in some cases there may have been misalignment (Massey et al, 2003). Nevertheless, it is worth bearing in mind that none of these equivalencies is essential for a value added system. As long as all schools have taken the same tests, they can be compared against each other within a given year.

[13] Pupils whose marks are relatively close to the Level 3 boundary receive a Level 2; others are given an N. Some pupils (often with special educational needs) are not entered for the test because they are known to be working below the Level 3 standard (B). All three categories are given 15 points. It could be argued that the Bs and Ns should have lower point scores than 15; decisions on these detailed aspects of the scoring system will have a small impact on the eventual value added models.
[14] An example might be a pupil who is blind but unable to read braille (the tests are available in braille).
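The conversion from levels to the point scores in Table 3.1 is mechanical enough to state precisely. A minimal sketch follows; the function and level codes are illustrative, not the Department's actual implementation:

    # Points for KS2 test outcomes (Table 3.1). "N" and "B" both score 15;
    # absent and disapplied pupils have no points and are disregarded.
    KS2_POINTS = {"5": 33, "4": 27, "3": 21, "N": 15, "B": 15}

    def ks2_average_point_score(levels):
        """Average points across the three core subjects for one pupil,
        e.g. two Level 4s and a Level 5 -> (27 + 27 + 33) / 3 = 29.0."""
        if any(level not in KS2_POINTS for level in levels):
            return None  # absent/disapplied: disregarded from the calculation
        return sum(KS2_POINTS[level] for level in levels) / len(levels)

    assert ks2_average_point_score(["4", "4", "5"]) == 29.0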
It would of course be possible to move away from the point score system of Table 3.1, or to rescale the results (e.g. normalise them) so that, for example, the English and mathematics results have the same distribution within a year. However, the Department's models preserve the levels used in the tests because they have a concrete meaning in terms of what can be achieved. They are also well understood by schools and parents.

Table 3.2: The new point scoring system at Key Stage 4

    Qualification                Grades and points
    GCSE                         A* 58; A 52; B 46; C 40; D 34; E 28; F 22; G 16; U/X 0
    Vocational GCSE              A*A* 116; AA 104; BB 92; CC 80; DD 68; EE 56; FF 44; GG 32; U/X/Q 0
    GCSE Short Course            A* 29; A 26; B 23; C 20; D 17; E 14; F 11; G 8; U/X/Q 0
    GNVQ Full Intermediate       D 220; M 196; P 160; U/X 0
    GNVQ Part 1 (Intermediate)   D 110; M 98; P 80; U/X 0
    GNVQ Full Foundation         D 136; M 112; P 76; U/X 0
    GNVQ Part 1 (Foundation)     D 68; M 56; P 38; U/X 0

At Key Stage 4 (16 year olds) there is a different qualification structure and point score framework. Table 3.2 gives an indication of the complexity of the system at this age: it only includes the main types of qualification but still suggests how wide the range of possible outcomes can be. For example, a pupil gaining an A in one of his/her GCSEs would get 52 points for that qualification, whilst a B in a GCSE short course would contribute 23 points.
An average point score across all the subjects can be calculated. The table illustrates the need for an elaborate framework of point scores and equivalency rules when all the less common qualifications available are included, as is now the case in England.

In addition to the test levels, marks and exam grades, information is also available from the test data on the pupil's name, date of birth and gender. Names may reflect aspects of social background or ethnicity but could not reasonably be used in a value added model. Even though pupils normally sit the key stage tests at a given age, their age within the year is correlated with outcomes and so can be used in a value added model (see Section 4). Obviously gender can also be used in value added modelling.

At present there are no plans for additional national tests. There are optional tests available for intermediate years, administered by QCA, but these are not taken in all schools. There are also P scales designed to measure very low levels of attainment for pupils with special educational needs: these are used to inform teacher assessment but are not yet part of the statutory national data collection system. The Foundation Stage Profile is completed by schools for all 5 year olds, but only a 10% sample is collected nationally, so value added measures for individual schools could not currently be based on this data.

3.2 The Pupil Level Annual School Census

PLASC was introduced in 2002 with the aim of collecting contextual data from schools' administrative records on all pupils annually (i.e. not just at the end of each key stage). The main variables in the PLASC data, all of which are used in the contextualised value added model, either directly or transformed in some way, are discussed below. The numbers of pupils falling into some of the groups (ethnic minorities, special educational needs etc.) are shown in the context of Section 4's discussion of the model (Table 4.4). This section focuses on the key variables and does not discuss more general problems in collecting data for every pupil, e.g. duplicate records or pupils without a Unique Pupil Number. These kinds of error are relatively rare and are not significant enough to be a concern for the production of value added scores. As will be apparent, PLASC has continued to evolve since its introduction in 2002, with new variables and new classifications, and it will continue to change, offering new possibilities for value added modelling.
This year a termly count has been introduced, although most of the variables here will still only be collected once a year. Data will also be collected on absences and exclusions from school. It would not be appropriate to include these directly in a value added model because, although they may well improve the model's explanatory power, schools should to some extent be responsible for these factors. However, data will be available on the reason for absence and so it may be possible to include a flag for pupils who have, for example, been sick for a long period.

Gender

Gender was available before the introduction of PLASC but is now collected as part of PLASC. There are no particular issues with this variable, although given the millions of pupil records collected there are inevitably a few anomalies (0.1% of pupils are recorded as having a different gender in 2006 compared to 2005).

Entitlement to Free School Meals

Children whose parents receive the social welfare benefit Income Support, and some related benefits, are entitled to claim free school meals (FSM). To become entitled to FSM the parents have to indicate a wish for their child to have a school meal and give proof of benefit receipt. FSM is the only direct measure on PLASC that relates to the pupil's family income. It is a useful variable, but cannot be considered an accurate proxy for social class or income more generally because it is a simple binary flag. The 85% of pupils who are not entitled to FSM vary considerably in their home background and socio-economic status, and there is also variation in circumstances within the FSM group itself (see for example Table 5.7 in DfES, 2006b).

Even as a simple proxy measure for deprivation, FSM has disadvantages. The fact that parents have to register an interest in their children having school meals may discourage some from applying. Those not applying but eligible may be an unrepresentative group, e.g. choosing not to register for cultural reasons or on the basis of dietary preferences that may correlate with attitudes to education. The benefit rules may also exclude some families who could be considered deprived, e.g. some low paid workers. Clearly a pupil's entitlement to a free meal may change when family circumstances change and this may not always get recorded, although it is possible to trace movements in and out of FSM status on PLASC (see DfES (2006b), Appendix D).
There is also a specific problem currently with using data from one Authority, Hull, where a healthy eating campaign introduced in 2004 gave all primary school pupils, regardless of income, free school meals. More generally, the propensity of parents to register as entitled to FSM may vary in response to the quality of meals, which may vary from school to school or between Local Authorities [15].

Ethnicity

PLASC collects data for 18 main ethnic groups, with a 19th code available for "unclassified", since provision of this data is voluntary. The unclassified group represented 3.4% of the pupils included in the Key Stage 2-4 contextualised value added model. The proportion of unclassified pupils has fallen since the introduction of PLASC, although the current proportion is only a slight reduction on the 2005 figures. The unclassified pupils are not nationally representative: they tend to have relatively low attainment. In 2006 there were two schools that declined to return any data on ethnicity [16].

Even with 18 ethnic groups, the codes obviously have to cover pupils with very different characteristics. For example, the Black African category covers pupils who may or may not speak English, or who may come from recent or long established immigrant communities. PLASC includes extended codes which Authorities can use, that record, for example, specific African countries of origin. However these extended codes are not used in the value added model: they are not universally collected and they would add considerably to the complexity of the model. Between 2002 and 2003 the available ethnicity codes changed (see Godfrey, 2004) and although no further changes are planned, it is worth noting that any change of this nature would affect the specification and consistency of value added models.

The source of the data on ethnicity could be the school, parents or the pupils themselves: this question of who is best placed to supply the data is something that needs to be considered separately for each variable. There is no firm guidance on ethnicity, although parents are now the main source and would be considered the most reliable supplier for young pupils in particular. In 2006, across all pupils in PLASC in secondary and primary schools, 86% had an ethnicity code provided by their parents and almost 70% of schools had 90% or more of their pupils' ethnicity codes supplied by parents.

[15] Recent criticism of the quality of school meals in the media has been cited by some schools as a reason for the fall in their reported proportion with FSM in 2006.
[16] One of these is a Jewish school which objects to the ethnicity codes because Jewish is not one of the possible codes.
There would be some concerns about the quality of data in the small number of schools (especially primary schools) where the majority of data appears to come from pupils. There are also a few schools that supply all the information themselves, rather than asking pupils or parents. Clearly a change in source for a given pupil may result in an ethnicity code changing. For PLASC as a whole, 2.3% of pupils changed ethnic category between 2005 and 2006. Ideally we would want ethnicity to be fixed and stable, but some of the changes are actually from the unclassified or other status to a specific ethnic group, and these represent improvements in the quality of the data (Table 3.3).

Table 3.3: Improvements to ethnicity coding between 2005 and 2006. [Table: percentages of pupils of compulsory school age and over (note 1) as at January 2005 and January 2006, matched on pupil UPN, cross-tabulating pupils' codes in 2005 (specific ethnic category; other ethnic category; unclassified (note 2)) against the same three categories in 2006. The cell percentages are not preserved in this transcription.]
Note 1: Pupil ethnicity is only collected for those pupils aged 5 or over as at 31 August preceding the start of the academic year.
Note 2: Includes both pupils where ethnicity has not been collected and pupils who have refused to give their ethnicity.

Special Educational Needs

Special Educational Needs (SEN) covers a wide range of needs that are often interrelated, as well as specific needs that usually relate to particular types of impairment. Children with SEN will have needs and requirements which may fall into at least one of four areas: communication and interaction; cognition and learning; behaviour, emotional and social development; and sensory and/or physical needs.

The SEN Code of Practice sets out a graduated response to meeting children's SEN:
- School Action, where the class teacher or SEN Co-ordinator provides interventions that are additional to or different from those provided as part of the school's usual differentiated curriculum and strategies;
- School Action Plus, where the intervention at School Action has not resulted in improvement and external advice is sought;
- a statement of SEN setting out the child's needs, issued by the Local Authority when School Action Plus has not resulted in improvement or the child's needs are particularly complex.

Funding for SEN can differ in each Authority depending on the funding formula agreed with their schools. A child with very similar needs may be at School Action Plus in one Authority, but have a statement in another. For the purposes of the Department's modelling, the two interventions which involve external advice and support (School Action Plus and a statement of SEN) are combined.

More detailed data is collected on the type of primary SEN need for pupils with a statement and those at School Action Plus. The four areas of need are further split to include autistic spectrum disorder, type of learning difficulty, and visual and hearing impairment. Pupils identified as having learning difficulties are more likely to be low attainers than those with a physical or other impairment without additional learning needs (DfES, 2005). This detailed data could in theory be used in the modelling but has not been so far, partly so as not to complicate the model, but partly because there may be inconsistencies in the reporting of this information across Authorities. In addition there are still about 5% of pupils who are not recorded as having one of the specific primary need categories.

In PLASC overall, 91% of pupils had the same SEN status in 2006 as they had in 2005. The overall percentage of pupils with SEN has been under 20% for the last five years. Where pupils with SEN are educated will be affected by Local Authority changes to their schools. For example, the closure or contraction of a special school (which caters for pupils with a statement of SEN) would lead to more SEN pupils in other local schools, both mainstream and special. Conversely, the opening of a special school, or of a special needs unit or resource base within one of the maintained schools, could reduce the number of pupils with SEN in surrounding schools.

First Language

This data item aims to collect data on the first language of pupils: English or other than English (in addition, some pupils are coded as "not known but believed to be English" or "not known but believed to be other than English"). First language is defined as the first language to which the child was initially exposed during early development; if they were exposed to more than one language including English, the variable should be coded as English. It does not distinguish different languages other than English.
As with ethnicity, there are some small changes year-on-year in this variable, even though it ought to remain constant: a small proportion of pupils in all schools were recorded differently in 2006 than in 2005.

Looked-After Children

This variable counts pupils who are in the care of their Local Authority. These children may be living with foster parents or prospective adopters, placed in children's homes or some other form of residential care, or placed at home with their parents. It is a relatively small proportion of pupils, but an important group who are relatively likely to be vulnerable and educationally disadvantaged. There have been some concerns that the numbers of these pupils are under-counted in PLASC, partly because schools may not know whether pupils are in the care of their Authority (which may not be the same Authority in which they go to school). In 2007 the plan is to try to merge data from the Local Authority directly onto PLASC, rather than rely on schools to supply this information. This is an example of how the range of contextual factors can be extended by bringing in other administrative data, but it has been necessary to overcome both technical difficulties and data protection issues.

Date of entry

PLASC records the date of entry into the school for each pupil. This information can be used to obtain measures of pupil mobility: the current method is to flag pupils who joined at non-standard times, i.e. in any month other than July, August or September. It is also possible to treat pupils differently depending on how recently they joined the school. The precise definitions used in the CVA model are described in Section 4. There is obviously considerable variation at school level in the numbers of pupils joining at non-standard times. The PLASC checking process picks up a few obvious errors, e.g. six primary schools which may have submitted a date of entry that corresponds to the introduction of new software rather than the real dates.

Another way of measuring mobility is simply to see which pupils had moved schools between one year's PLASC and the previous one (after taking into account schools which cover unusual age ranges). This method has been used in Machin et al (2006). In future, with more regular school census returns, it will be possible to trace pupils from school to school each term (i.e. three times a year) rather than once a year.
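The current mobility flag is a simple rule and can be expressed directly. A minimal sketch, assuming that only the month of the recorded date of entry matters:

    from datetime import date

    def joined_at_non_standard_time(date_of_entry: date) -> bool:
        """Flag pupils who joined the school in any month other than
        July, August or September."""
        return date_of_entry.month not in (7, 8, 9)

    assert joined_at_non_standard_time(date(2005, 1, 10))       # mid-year entry
    assert not joined_at_non_standard_time(date(2004, 9, 1))    # standard entry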
Home postcode

Postcodes are the codes used in identifying postal addresses in the UK. There are about 1.8 million of them, typically covering about 15 houses each. At a higher level of aggregation there are postal districts, sectors and areas. These can in turn be linked to other geographical classifications, such as the Super Output Areas (for further information see Appendix C in DfES, 2006b). The postcodes or larger areas could be included in the model directly, but what the Department and other analysts have done is to link these postcodes to small area data which gives some indication of the characteristics of a pupil's local environment. This local data is not part of the schools' data collection process; it is derived from other sources.

Various possible local indicators have been tried, but at present the models use a measure called IDACI, the Income Deprivation Affecting Children Index. This is the percentage of a Super Output Area's children under 16 who are living in families in receipt of Income Support and Job Seekers Allowance, or in receipt of Working Families Tax Credit and whose equivalised income is below 60% of median income before housing costs. In other words, it is a measure of how many children in the local area are living in low income families. The index is calculated by another government department using data and a methodology which can be checked and is open to scrutiny. This makes it more suitable for an official value added system than the various commercial indices, such as ACORN or MOSAIC, which are widely used to classify small areas by their socio-economic characteristics [17].

Postcode data collected on PLASC is checked to ensure the postcode exists: only a tiny proportion of them are invalid. At present there are no validity checks on the distance between postcodes and schools; some pupils live a long way from the school they attend, so these kinds of checks would be time consuming.

3.3 Linking data

Before the introduction of unique pupil numbers in 1999 it was possible to link a proportion of the test data using pupils' names and dates of birth, but UPNs have made matching much faster and more accurate.

[17] ACORN has been used by the Fischer Family Trust in their value added models. For a discussion of ACORN and MOSAIC in the context of value added modelling, see Webber and Butler (2005).
It is not yet the case that every single pupil has a UPN; a few errors can occur when schools assign them. However, where UPNs are absent it is still possible to fall back on fuzzy matching methods, including the use of geographical data to narrow down the search for missing data. The result is a very complete match, enabling the calculation of robust VA estimates.

The development of UPNs has allowed the Department to construct a National Pupil Database, linking test data to the PLASC characteristics [18]. Among many other things, this database allows academics, Local Authorities and the Department to produce consistent value added analyses for current and future years. The diagram below shows the current structure of the database. The earliest date for which data has been loaded varies according to the key stage. The dotted lines trace the progress of a particular cohort from Key Stage 2 through to Key Stage 4; it is not yet possible to link over a longer period [19].

Figure 3.1: Datasets linked in the National Pupil Database in 2006. [Figure: a grid of datasets - PLASC (all ages), Key Stage 4 (Year 11, age 15), Key Stage 3 (Year 9, age 13), Key Stage 2 (Year 6, age 10) and Key Stage 1 (Year 2, age 6) - against academic years 1995/96 to 2004/05, with each cell marked as linkable data, data linkable in future, or data not linkable.]

[18] More information is at the PLASC/NPD User Group website.
[19] The Year 7 progress tests are an intermediate test that is not taken by all pupils, so it is not used generally for value added modelling.
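In outline, the construction of the linked data amounts to a join on UPN with a fall-back match on names and dates of birth for the few records without a valid UPN. A minimal pandas sketch; the file and column names are invented for illustration, and the real system uses fuzzier matching rules than an exact join:

    import pandas as pd

    tests = pd.read_csv("ks4_results.csv")   # hypothetical test data extract
    plasc = pd.read_csv("plasc.csv")         # hypothetical PLASC extract

    # Primary match: exact join on the Unique Pupil Number.
    matched = tests.merge(plasc, on="upn", how="inner")

    # Fall-back for records that did not match on UPN: join on surname,
    # forename and date of birth instead.
    remainder = tests[~tests["upn"].isin(matched["upn"])]
    fallback = remainder.merge(
        plasc, on=["surname", "forename", "date_of_birth"],
        how="inner", suffixes=("", "_plasc"))

    linked = pd.concat([matched, fallback], ignore_index=True)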
4. The value added models

4.1 Using prior attainment only: the median method

Summary

This method is based on matched pupil-level data and uses prior attainment only, comparing a single input measure with a single output measure. It can cover one or more key stages, depending on the availability of data. National information has been published in the form of charts showing the median outcome for each prior attainment point. This was adapted for the calculation of school scores, which are derived as the average for each school of the differences between each pupil's actual result and the national median result for pupils with their prior attainment score. The method was designed to be simple to calculate and understand. It also had to be easily integrated into the production cycle for Performance Tables. The pros and cons of the median method can be summarised as follows:

Advantages
- A simple method based on the idea of a median line, which has been used elsewhere for school improvement.
- Although it only uses prior attainment, this variable is easily the most important in explaining test results.
- Avoids using a regression model, but is able to cope with a non-linear relationship between prior attainment and outcomes.
- Use of the median makes the expected outcomes robust to the effect of outliers.

Disadvantages
- The approach does not make use of contextual information which can influence school effectiveness.
- Scores are not centred on an easily interpreted figure.
- There are some ceiling effects that affect value added for the high achievers.
- The VA scores for small schools will be relatively unstable.

This section describes the median method and some of its advantages and disadvantages, including points that have been picked up by external commentators.
Naturally, one of the possible criticisms is that it does not take sufficient account of other contextual variables (see Tymms and Dean, 2004) 20. However, as contextual factors (including pupil mobility) are addressed in the contextualised value added model, they are not discussed in this section. The presentation of VA scores and the use made of them is addressed in Section 5.
4.1.1 The calculation of school value added scores
The median method was designed to be simple for schools to understand and to allow them to calculate their own value added scores with reference to information on the national expected results. Rather than use a regression model, the method was based on the median lines familiar to schools from the Autumn Package 21. In this system, a school can look at the prior attainment of each pupil and compare it to the median line, the difference being that pupil's contribution to the value added score. Figure 4.1 provides an example using a median line. One pupil has made 50 points more than expected at Key Stage 4, given their prior attainment at Key Stage 2; another has made 50 points less than expected. The sum total of the vertical distances to the median line, divided by the total number of pupils, is the school's value added score.
Figure 4.1 Example median line and the calculation of value added
[Chart: a median line of Key Stage 4 capped point score against Key Stage 2 average point score, with two example pupils marked 50 points above and 50 points below the line.]
20 An example of the effect of excluding contextual factors is that school VA scores are correlated with school prior attainment levels, i.e. schools with more low attainers in their intake tend to have lower VA scores. Since school prior attainment levels are included in the contextualised value added model, this is no longer a feature of the new models.
21 The method was actually first piloted in 1998, the same year as the first Autumn Package, but the kind of median lines shown in the 1998 Autumn Package had appeared previously in information on GCSE to A-level value added. By 2002, when the first Performance Tables value added appeared, the Autumn Package was well established.
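The calculation just described is simple enough to express directly. The sketch below is illustrative only: it assumes the national median line is supplied as a lookup table from prior attainment score to median outcome, which is the information the Autumn Package provided in chart form.

```python
def school_value_added(pupils, national_median, offset=1000):
    """School VA score under the median method.

    pupils: list of (prior_attainment, outcome) pairs for one school.
    national_median: dict mapping each prior attainment score to the
        national median outcome for pupils with that score.
    The score is the mean of each pupil's difference from the median
    line, plus the presentational offset (100 or 1000, see below).
    """
    diffs = [outcome - national_median[prior] for prior, outcome in pupils]
    return offset + sum(diffs) / len(diffs)

# Example: two pupils with the same prior attainment, one 50 points
# above the line and one 50 below, give a school score equal to the offset.
medians = {27.0: 300.0}
print(school_value_added([(27.0, 350.0), (27.0, 250.0)], medians))  # 1000.0
```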
There is one final step in the calculation of a school's value added score, which is done for presentation purposes. Schools have consistently said over the years that they do not want to see value added scores that are negative numbers, because this may be misinterpreted as meaning that pupils are going backwards in terms of their progress. The decision was therefore taken to add the number 100 to all value added figures, so a school score of -3 points became 97 and a score of +3 points became 103. Whilst this adjustment makes the figures less directly interpretable, it remains possible to say that the school with 103 adds 6 more points to pupil progress than the school with 97. When the Key Stage 4 point score system became more complex (see Table 3.2) it was decided to use 1000 rather than 100 for this purpose (it has the advantage of looking less like an index number than the previous system, although centring round 100 is currently still being done for the other key stages).
4.1.2 Input and output measures
In designing the median method, a key decision was the precise definition of the input and output measures. The aim was to have one overall measure per key stage rather than separate measures for different subjects. It was also decided to use the points achieved in the tests rather than a simpler indicator of pass or fail relative to a given threshold (an example would be whether or not a pupil achieves five or more GCSEs at grade C or above). The advantage of this is that points provide more information for individual pupils in schools and therefore give value added models that take account of attainment at all ability levels. However, it can be argued that a threshold based measure is easier for schools to understand.
At Key Stages 1-3, subjects were given equal weight and combined in a simple average (as noted already, the subject mix was determined by the core subjects at each key stage). At Key Stage 4, pupils do not take a set number of qualifications. Here total points rather than average points per subject was used, to encourage schools to enter pupils for as many examinations as possible. However, a cap of the best eight was also imposed, to discourage schools from entering pupils for too many subjects, with the possible risk of damaging the quality of their education. This cap of eight is to some extent arbitrary (an alternative figure like ten could have been chosen) but it was arrived at after consultation and had the support of the teaching trade unions.
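The capped output measure is easy to state in code. A minimal sketch follows; the example point values are illustrative of the GCSE points scale rather than an authoritative tariff.

```python
def capped_ks4_points(qualification_points, cap=8):
    """Total points from a pupil's best `cap` KS4 qualifications.

    Pupils entered for no qualifications score zero, so that schools
    cannot raise their VA by withholding low-attaining pupils.
    """
    best = sorted(qualification_points, reverse=True)[:cap]
    return sum(best)

# A pupil with ten results: only the best eight count.
print(capped_ks4_points([52, 46, 40, 40, 34, 34, 28, 28, 22, 16]))  # 302
print(capped_ks4_points([]))  # 0 for a pupil with no entries
```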
Table 4.2 Input and output measures for the median method

Key Stage | Input | Output
1 to 2 | Average points from reading, writing and maths | Average points from English, maths and science
2 to 3 | Average points from English, maths and science | Average points from English, maths and science
3 to 4 | Average points from English, maths and science | Total points from up to eight KS4 qualifications (best eight)

The reason average rather than total points is used at Key Stages 1-3 is so as not to disadvantage a pupil who was absent from one of the tests (e.g. through illness). At Key Stage 4, which covers a range of subjects, many based on coursework, there is no specific equivalent to being absent on the day of the test. However, an important decision in defining the Key Stage 4 output was to include pupils who are not entered for any tests and give them zero points. This is intended to discourage schools from not entering pupils simply because they are concerned they will get low marks.
4.1.3 Skewness in the value added scores
Figure 4.3 Distribution of Key Stage 4 outcomes
[Chart: the number of pupils in state maintained schools at each Key Stage 4 total capped point score in 2005, showing a roughly bell-shaped distribution with a pronounced spike at zero.]
One consequence of including pupils with no entries at Key Stage 4 is that it gives a spike to the distribution (Figure 4.3). This departure from a simple bell curve poses potential problems for the calculation of value added models 22. In the median method, some pupils who would be expected to get points based on the national median end up with zero, and this tendency skews the pupil level value added distribution (Figure 4.4). Since school scores are calculated as the mean of the pupil value added scores, they will be weighted down by pupils who scored zero. As a result, more than half the schools have had mean differences below zero, and therefore value added scores below 1000. The same applies at other levels of aggregation: for example in 2005, only 17 of the 149 Local Authorities with maintained secondary schools had mean value added scores over 1000 (DfES, 2006c).
Figure 4.4 Distribution of Key Stage 2-4 value added scores
[Chart: the number of pupils in state maintained schools at each Key Stage 2-4 pupil value added score in 2005, showing the skew produced by pupils with zero points.]
Critchlow and Coe (2003) recommended moving to a consistent method to eliminate this problem, using either medians or means throughout. However, there is another reason why the value added scores are not guaranteed to centre on 1000. The national median line is calculated on data before it has been checked by schools and is not subsequently revised in light of the checked data. Since on balance the revisions tend to be upwards, the original median line will be slightly lower than the corrected medians.
22 The value added models of the Fischer Family Trust exclude the zero entries altogether. With a regression approach, this kind of distribution could be treated with a Tobit model, but this would add to the complexity of the VA system.
The comparison of unamended medians with amended results will therefore give more negative than positive differences and tend to reduce value added scores.
These issues have particularly affected Key Stage 4 value added. The distributions are less skewed at other key stages (see DfES (2004), Part 2). Also, the spike at zero for Key Stage 4 is less prominent now than in earlier years, because of the low scoring courses now included in the calculations, which enable pupils to obtain basic skills in vocational or other subjects. The fact that school value added scores are not centred on 1000 does not affect their usefulness as a measure of relative effectiveness, but it does make the figures more difficult to explain. The Performance Tables website has given information on the distributions of VA scores, but the possibility remains that a school with above average value added could have seen themselves as below average if their score was below 1000.
4.1.4 The median method compared to a simple regression approach
As noted above, the reason for choosing the median method over a regression approach was to make the system simple and to retain consistency with the school improvement Autumn Package. Both of these factors also had resource implications: a simple method that could be used once for different purposes required less time and staff to implement. Schools can easily calculate or check their own value added scores with reference to the expected results along a national median line. However, the same could also be said of a simple OLS regression model, which provides a formula for the calculation of expected results. The main difference in complexity is therefore in understanding how the expected results were calculated. The median is simply the result of the typical pupil. The regression formula can be described in terms of a summary of the line of best fit, but its calculation method (OLS) is not obvious to a non-statistical audience.
An additional factor in choosing the median line method was that the relationship between prior attainment and outcomes is non-linear. The median method deals with this in a simple way because the median line is not straight (see Figure 4.1 for example). A regression model would need to include a non-linear term (adding a layer of complexity for the non-statistician audience) but even this would not guarantee a good fit at the ends of the distribution.
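The contrast between the two approaches can be seen in a short sketch. The data here is simulated purely for illustration; in the real system the median line is built from the national cohort, and the OLS formula is the one published to schools.

```python
import numpy as np

rng = np.random.default_rng(0)
prior = rng.integers(15, 34, size=10_000).astype(float)   # illustrative KS2 scores
outcome = 8 * prior + rng.normal(0, 40, size=prior.size)  # illustrative KS4 points

# Median method: the expected outcome at each prior attainment point is
# the national median for pupils with that score. No functional form is
# imposed, so non-linearity is handled automatically and outliers do
# not pull the line about.
median_line = {p: float(np.median(outcome[prior == p])) for p in np.unique(prior)}

# Simple OLS alternative: expected results come from a fitted straight
# line; a curved relationship would need a quadratic or spline term.
slope, intercept = np.polyfit(prior, outcome, 1)

pupil_prior, pupil_outcome = 27.0, 250.0
print(pupil_outcome - median_line[pupil_prior])           # median-method difference
print(pupil_outcome - (intercept + slope * pupil_prior))  # OLS residual
```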
If the relationship between prior attainment and output changed at different points of the range, it might be necessary to partition prior attainment or introduce spline functions. Another advantage of the median line is that it is not influenced by outliers in the data. A mean method or a regression method could result in expected outcomes that were unduly affected by some unusual results. However, given the number of pupils involved in the calculations, this would only be an issue for the extremes of the distribution, where there are relatively few pupils.
4.1.5 Stability of the value added scores
Value added is designed to estimate school effectiveness and we would not expect this to fluctuate widely from year to year 23. There is a potential concern therefore in using VA scores for schools which do show large changes. A statistical analysis of the year-on-year volatility in results and value added scores for secondary schools was published in DfES (2004). For example, 1% of all schools moved from the upper quartile to the lower quartile of Key Stage 3 to 4 value added scores between 2002 and 2003. Volatility is not only a problem for the median method: it is an issue for any annual set of value added scores, and the smaller the school, the more likely results are to vary considerably from year to year. There is clearly a trade-off between statistical reliability and the desire to include data on as many schools as possible. For Performance Tables, the number of schools is maximised by only excluding (i) primary schools if they have 10 or fewer pupils; and (ii) any school where fewer than half of the pupils have matched data with which to calculate VA. The Value Added National Project (Fitz-Gibbon, 1997) recommended a minimum cut-off of 30 pupils 24. A more recent report (Tymms and Dean, 2003) has suggested 50, but acknowledges that this would exclude most primary schools and imply that value added for primary schools should not in fact be published in Performance Tables at all.
Another option for dealing with small cohorts is to combine data for more than one year, e.g. a three year average, but this inevitably means that results will be less up-to-date. Volatility can also be dealt with by warning the reader about the reliability of results for small schools; the notes accompanying the Performance Tables do this.
23 A useful discussion of issues in measuring and interpreting trends in value added scores will be published in Thomas et al (2007, in press).
24 The report also suggested a more sophisticated decision rule based on the calculated reliability of the value added score; see Annex E in Fitz-Gibbon (1997).
4.1.6 Possible ceiling effects
It is in the nature of testing systems that they can restrict the full range of attainment demonstrable by very high or very low attainers. Ceiling effects may lead, for example, to a few high ability pupils not being able to demonstrate their true attainment level and thus contribute their full share to a value added score. Tymms and Dean (2004) suggest that a ceiling effect will introduce a bias into value added between Key Stage 1 and 2. The highest score now possible at Key Stage 1 is Level 3 in all subjects (21 points). The median KS2 points score for these pupils in 2005 was 33, which is the maximum obtainable at KS2. Hence nobody with Level 3s in all subjects at KS1 in 2001 could achieve a positive value added score (or, since 100 is added, a value added score greater than 100) 25. Pupils with lower levels of prior attainment can make a small positive contribution to their school's value added, but may also be subject to some ceiling effect. This remains a problem for the median method in its current form. However, this issue has been addressed in designing a contextualised model, as will be discussed below.
4.2 Contextual value added: a multilevel model 26
Summary
This method uses a more complex definition of prior attainment and a range of contextual variables to predict attainment. The choice of contextual variables was based on statistical, educational and practical criteria. As with the median method, value added scores for each school are derived from the difference between predicted and observed attainment. After consultation with academics and practitioners it was decided to use a simple version of a multilevel model (MLM) to calculate these figures; MLM takes into account the fact that pupils are grouped into schools. The value added scores for many schools would be similar if an Ordinary Least Squares model had been used. However, a significant feature of MLM is the application of shrinkage, where the value added scores for small schools tend to be closer to the national mean, making it less likely that extreme value added scores would be recorded for these schools.
25 In 2001 it was actually still possible to get up to 27 points at KS1, but only 0.2% of pupils achieved more than 21 points. For those 0.2% there would also be a ceiling.
26 For an introduction to multilevel models see Kreft & De Leeuw (1998). Goldstein (2003) gives a more advanced treatment. Harvey Goldstein was one of the academics who provided advice to the DfES.
The model has deliberately been kept relatively simple: it could in theory have more levels of analysis, and more explanatory variables, both in the fixed and random parts of the model.
Advantages
- Contextual factors are taken into account.
- The hierarchical structure of the data is taken into account through multilevel modelling.
- The modelling framework gives external experts more confidence in the approach (although not all agree that MLM is appropriate for the purpose of providing value added scores).
- VA scores for small schools are less volatile (but also less likely to reveal differences from the mean).
Disadvantages
- A multilevel model may be hard to explain to schools, but so far the consultation process has been positive.
- Scores are still not centred on an easily interpreted figure, due to the requirements of the Performance Tables which mean that the model has to be calculated on early unamended data.
- It is difficult to get a good fit to the data at the extreme ends of the range.
- There are still some ceiling effects, which are being mitigated with special adjustments.
Section 3 described the data sources for the contextualised value added (CVA) model. This section describes the model specification and the decisions made on the way the variables were defined and used. It focuses on one of the CVA models developed so far, estimating school effectiveness for Key Stage 2 to 4 in 2005. This model has been piloted for use in Performance Tables and used in the information provided by Ofsted and the School Improvement Partners. A slightly different version was discussed initially with schools a year earlier (October 2004) and some further amendments are being planned for later this year. These future developments and the issues that have emerged so far with the current model are discussed further below. As in Section 3.1, the model here is described in terms of the objective of providing school value added scores. Other uses, like allocating schools to particular initiatives, are discussed briefly in Section 5.
4.2.1 Outline of the 2005 Key Stage 2-4 CVA model
Table 4.3 shows the KS2-4 CVA multilevel model for 2005. The choice of fixed effects (equivalent to the standard regression coefficients) will be discussed shortly. In a multilevel model, the residual variance is partitioned: here the partition is into two levels, the pupil (Level 1) and the school (Level 2). These are the model's random effects. Within an education system it is possible to have other levels. For example, within schools, pupils are grouped into classes, but as there is no national data on teaching groups, this level cannot be modelled here. Above the level of the school there are Local Authorities, which fund and provide services to the maintained schools in their area. Researchers have often included these Authorities as a third level and this might be a sensible future amendment to the current model. However, to begin with the aim was to keep the model relatively simple.
Table 4.3 The 2005 Key Stage 2-4 CVA regression model
Multilevel model to predict capped Key Stage 4 points in 2005 (n = 548,222 pupils). The original table gives estimates, standard errors and p-values for each term; ** and * mark statistically significant coefficients, and (n.s.) marks non-significant ones.
- Intercept (**)
- Prior attainment: KS2 student APS (**); KS2 APS (using fine grades) squared (**); KS2 English PS deviation (**); KS2 Maths PS deviation (**)
- Deprivation: does student have FSM? (**)
- Deprivation of pupil's local area: deprivation indicator, IDACI score (**)
- Special Educational Needs: does student have SEN - Action Plus? (**); does student have SEN - School Action? (**)
- Mobility: student joined other than Jul/Aug/Sep? (**); student joined within last 2 yrs? (**)
- Gender: is student female? (**)
- Age: age within year (**)
- Language: is English not the student's first language? (**)
- Ethnic group: White Irish (n.s.); White Irish traveller (**); White Gypsy/Roma (**); White other (**); Mixed White/Black Caribbean (n.s.); Mixed White/Black African (*); Mixed White/Asian (**); any other Mixed ethnic group (**); Indian (**); Pakistani (**); Bangladeshi (**); any other Asian ethnic group (**); Black Caribbean (**); Black African (**); any other Black ethnic group (**); Chinese (**); any other ethnic group (**); unclassified ethnic group (**)
- In care: has the student ever been in care at this school? (**)
- Level of school prior attainment: school KS3 APS (using fine grades) for CVA (**)
- Spread of school prior attainment: school standard deviation of KS3 APS for CVA (**)
Random components: between school variance and within school variance; variance partition coefficient = 0.07.
This multilevel model was run in MLwiN, a software package that calculates the estimates in Table 4.3 and outputs the Level 1 and Level 2 residuals 27. Level 1 residuals show variation in pupils' outcomes in relation to their schools. The Level 2 residuals show schools' outcomes in relation to the national expected results, given the factors measured by the fixed effects. These Level 2 residuals are the value added scores.
Another way in which this model has been kept simple is that there are no explanatory variables for the random part of the model. It assumes that a school is uniformly more or less effective for all its pupils, and that this can be encapsulated in a single number, the value added score. A more complex approach is to assume schools vary in their effectiveness, e.g. between levels of prior attainment or for different ethnic groups. This could produce a range of measures for each school, or alternatively a set of charts showing the school's value added compared to national expected levels. Although there are significant differences in, for example, the slope of attainment outcomes relative to prior attainment, this has not been taken into account for value added in Performance Tables, where the aim was to provide a single indicator. Value added information for school improvement (e.g. in the forthcoming RAISEonline) is differentiated within schools according to whether pupils have low, medium or high prior attainment.
The model mainly consists of pupil-level variables but also includes two school composition variables, i.e. variables that take the same value for every pupil in a given school.
27 For more on MLwiN see
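A random intercept model of this kind can be sketched outside MLwiN. The example below uses Python's statsmodels library on simulated data; the variables, sample sizes and effect sizes are illustrative assumptions, not the Department's model, and the point is only to show how the Level 2 residuals emerge.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated pupils nested in schools, with a school-level random
# intercept built into the data-generating process.
rng = np.random.default_rng(1)
n_schools, n_pupils = 100, 50
school = np.repeat(np.arange(n_schools), n_pupils)
school_effect = rng.normal(0, 5, n_schools)[school]
ks2 = rng.normal(27, 4, school.size)
female = rng.integers(0, 2, school.size)
ks4 = 8 * ks2 + 6 * female + school_effect + rng.normal(0, 40, school.size)
df = pd.DataFrame({"ks4": ks4, "ks2": ks2, "female": female, "school": school})

# Random intercept model: fixed effects for prior attainment and gender,
# residual variance partitioned into school (Level 2) and pupil (Level 1).
model = smf.mixedlm("ks4 ~ ks2 + female", df, groups=df["school"]).fit()

# Variance partition coefficient: school variance as a share of the total.
vpc = model.cov_re.iloc[0, 0] / (model.cov_re.iloc[0, 0] + model.scale)
print(f"variance partition coefficient: {vpc:.2f}")

# The Level 2 residuals (posterior school intercepts) play the role of
# the value added scores; they are already shrunken estimates.
va_scores = {g: re["Group"] for g, re in model.random_effects.items()}
```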
Most of the pupil-level variables are simple flags relating to the PLASC data discussed above. Table 4.4 shows how many pupils were coded 1 for each of these flags in the 2005 KS2-4 model. Most of these categories applied to thousands of pupils nationally, although of course many individual schools have no pupils from some of these groups.
Table 4.4 The prevalence of different pupil characteristics
[Table: for each binary variable in the 2005 KS2-4 CVA model - FSM, ever in care at this school, SEN School Action, SEN Action Plus, the two mobility flags, female, English not first language, and each of the ethnic group flags - the number and percentage of pupils coded 1.]
4.2.2 The choice of variables
The decisions over which variables to include in the model were based on a mixture of statistical, educational and practical considerations. Given the need to provide value added information for every school, it was necessary to restrict the choice to information for which there is national data. Since the aim was to generate value added residuals, the explanatory fixed effect variables needed to cover factors that are outside the school's control. Prior attainment and pupil characteristics are included because they are facts about the inputs to a school. Information on, for example, attendance levels, would not be included because attendance can be seen as, to some extent, an output of the school (even though poor attendance could affect results).
With CVA, the variables relating to prior attainment are still the most important in explaining results. There were three important changes to the way prior attainment was treated compared to the earlier median method:
- The move to a more complex methodology offered the chance to reconsider the use of a simple combined-subject indicator for prior attainment. The average point score (APS) was still included, but two extra terms were used, measuring the difference between the English and maths results and this overall APS.
- Another important addition was the quadratic term, reflecting the fact that the relationship between KS4 outcomes and KS2 prior attainment is non-linear. With the median method this non-linearity was taken into account because the median line did not need to be straight.
- The APS prior attainment measure was based on the levels achieved in the three subjects. However, the level distribution for Key Stage 2 does not offer a very fine grained distinction between pupils (see Table 3.1). Clearly the underlying marks data give more information and would improve the fit of a value added model. As these marks are collected centrally, they are available for use. They were not used in the first median method model, partly to keep the model simple and partly because a pupil's actual marks are not officially recognised in the way that a level is (also, the quality of the marks data was less reliable in the earlier years). A compromise between marks and levels has therefore been developed, with levels divided into sublevels using marks data. This retains the well-understood and widely recognised level structure but utilises the more detailed information on pupils' performance available from the marks.
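These prior attainment terms are straightforward to derive from the subject point scores. The sketch below is illustrative: the mark-boundary values used to construct fine grades are hypothetical placeholders, not the published thresholds.

```python
def fine_grade(level: int, mark: float, thresholds: dict) -> float:
    """Split a KS2 level into sublevels using the underlying marks.

    `thresholds` maps each level to its (lower, upper) mark boundaries;
    the pupil's position within the band gives a fraction of a level.
    Boundary values supplied by the caller are assumptions here.
    """
    lo, hi = thresholds[level]
    return level + (mark - lo) / (hi - lo)

def prior_attainment_terms(eng_ps: float, maths_ps: float, sci_ps: float) -> dict:
    """The four prior attainment terms used in the CVA fixed effects."""
    aps = (eng_ps + maths_ps + sci_ps) / 3
    return {
        "ks2_aps": aps,
        "ks2_aps_squared": aps ** 2,        # captures the non-linearity
        "eng_deviation": eng_ps - aps,      # English relative to overall APS
        "maths_deviation": maths_ps - aps,  # maths relative to overall APS
    }
```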
In deciding how many additional contextual variables to include, there has not been time to test exhaustively every possibility offered by the national data discussed in Section 3. The starting point was a consideration of the most obvious possibilities, taking into account what is known from previous internal and external research about factors that explain variation in test results. In deciding on the specific definition of the variables, some exploratory analysis was necessary: for example, it was decided to include two mobility variables to capture (i) the general effect and (ii) the additional impact of being mobile just before the exams 28. Overall, the criteria of simplicity and intelligibility to schools were not abandoned; it was felt that the model still needed to be kept as simple as possible. This is why, for example, the initial CVA model discussed here has no interaction terms (although the 2006 models will have a limited number of interactions). The model was built up stepwise and the stability of coefficients checked. With so many observations, almost all variables are statistically significant and adding extra variables tends to improve the fit of the model.
As an example of a more limited model, Table 4.5 shows a version which has the prior attainment variables only. The -2 log-likelihood statistic (which is used in MLM to compare models) is significantly larger for this restricted model, which implies that the more complex CVA model is a better fit to the data. Prior attainment nevertheless provides most of the explanatory power of the CVA fixed effects. In an OLS context (using the R-squared figures), the equivalent prior attainment only model explains 49% of the variance, whereas the CVA version with all the additional variables explains 57%.
Table 4.5 A 2005 Key Stage 2-4 prior attainment only model
Multilevel model to predict capped Key Stage 4 points in 2005 (n = 548,222 pupils). As in Table 4.3, the estimates and standard errors are given in the original; ** and * mark statistically significant coefficients, and (n.s.) marks non-significant ones.
- Intercept (*)
- Prior attainment: KS2 student APS (**); KS2 English PS deviation (**); KS2 Maths PS deviation (n.s.); KS2 APS (using fine grades) squared (**)
Random components: between school variance and within school variance; variance partition coefficient = 0.11.
28 For KS2-4, there is one flag for pupils joining the school after September of Year 10 (i.e. close to their KS4 exams) and another for pupils joining at a non-standard time: not in July, August or September of Years 7, 8 or 9.
In looking at the range of possible explanatory variables, some significant ones were not included if it was felt that they would add complexity without greatly enhancing the quality of the model. For example, it would have been possible to include FSM status information from previous years, since these categories are not stable over time and pupils with FSM every year tend to get slightly lower results than pupils who move in and out of FSM status. Conversely, a couple of non-significant ethnic categories were included because it was felt that for practical and presentational reasons it would be better to include all the categories rather than combine two of them with other groups on the basis of the 2005 data (including these variables makes very little difference to the overall model).
The degree of correlation among the variables was checked: if high correlation were to lead to unstable or odd looking estimates, it would make the model harder for schools to accept. With so many observations (n = 548,222) it is not surprising that many of the variables are statistically significantly correlated, although most correlations (Pearson correlation coefficients) are less than 0.2. Prior attainment is correlated with some of the variables, unsurprising given that these variables are being considered as explanatory factors for attainment at a subsequent key stage. There are correlations above 0.2 for some ethnic groups and first language, and some ethnic groups and FSM status. There is also, unsurprisingly, a reasonable level of correlation between FSM status and local deprivation measured by the IDACI variable (0.356); but it was agreed that both should be included in order to use as much information as possible about the social background of the pupil.
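A screen of this kind is simple to implement. The sketch below is one plausible way to flag pairs of candidate variables above the 0.2 threshold mentioned above; it assumes the variables are held as numeric columns of a DataFrame.

```python
import pandas as pd

def high_correlations(df: pd.DataFrame, threshold: float = 0.2):
    """Return variable pairs whose absolute Pearson correlation exceeds
    the threshold - the kind of collinearity screen described above."""
    corr = df.corr(method="pearson")
    cols = list(corr.columns)
    return [
        (a, b, round(float(corr.loc[a, b]), 3))
        for i, a in enumerate(cols)
        for b in cols[i + 1:]
        if abs(corr.loc[a, b]) > threshold
    ]
```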
The decision to move to a CVA model meant accepting that levels of deprivation, special educational needs and so on should be taken into account when assessing value added. It would have been possible to choose a subset of factors for this purpose, e.g. if it was felt important to take into account deprivation but not ethnicity, but there were no strong reasons for making an exception of any variables on educational grounds. The coding and description of variables that could involve political sensitivities has been checked with relevant policy experts: for example, combining similarly performing ethnic groups may be acceptable on statistical grounds but would not necessarily be appropriate from the point of view of presenting the model to schools and parents. For similar reasons, there may be concerns with some of the names used to classify local areas in a measure like MOSAIC, which is used for other purposes like advertising (the Department's CVA model uses IDACI, which simply gives areas a number between 0 and 1).
In designing a value added system it is important to avoid perverse incentives or unintended consequences. A potential problem with some contextual variables is that schools could manipulate them to affect their value added score. For example, the unclassified ethnicity variable has a negative coefficient, i.e. the more unclassified pupils a school has, the lower will be its predicted results and thus, potentially, the higher will be its value added. This gives schools a possible disincentive to find out or report pupils' ethnicity. A similar problem relates to SEN and the School Action category. It is up to schools to assign pupils to this type of SEN: if they assign lots of pupils it will depress their predicted results and boost their value added. In general these kinds of disincentive effects can be overcome if the data items are used elsewhere. For example, schools with more ethnic minority pupils have in the past attracted higher funding, and the SEN figures are published independently in Performance Tables (where a school might not wish it to be seen that they have lots of pupils with SEN and in School Action).
The last two variables in the model are at school level. The overall level of prior attainment has long been acknowledged as a factor in explaining pupil progress (the median method VA data was benchmarked by school prior attainment levels in the Autumn Package). The CVA model not only includes the level of prior attainment, it also includes the spread (as measured by the standard deviation), on the basis that schools with a narrow ability range have an advantage and can more easily appear effective. In practice, the schools with narrow ability ranges tend to have pupils with relatively high overall prior attainment; there is a strong correlation (-0.806). However, both variables have been included so as to reduce the bias that would otherwise occur in favour of schools with a more restricted intake.
These two school level variables relate to the prior attainment of the cohort whose progress is being modelled. This means that the variables describe these pupils' direct peers but do not measure the overall characteristics of the school's pupils (which could impact on the learning environment).
An alternative would be to define school level variables based on all pupils in the school at the time of the KS4 results, or even all pupils in the school during the whole time the cohort of interest spent in the school.
Many other school level variables could be derived from the pupil level variables, e.g. the percentage with first language other than English, the percentage of girls, the percentage of SEN pupils, or the percentage of FSM pupils. Some of these variables have been used in other value added models (e.g. NAO, 2003). An advantage of these variables is that they can proxy pupil level characteristics that are not measured. For example, among the undifferentiated group of non-FSM pupils, it might be expected that those in schools with high levels of FSM would tend to be from relatively less affluent backgrounds (although the inclusion of IDACI already helps to differentiate the non-FSM pupils). Once again, the main motivation for not including more school level variables has been to keep the model as simple as possible. There are also grounds for minimising the number of school level context variables because they can be correlated with their pupil-level equivalents. There may also be a case for not taking factors like the percentage of SEN into account if it is felt that schools should not be given higher value added scores when they have more of these disadvantaged pupils (although this is to some extent a version of the general argument against contextual factors, whether at pupil or school level). School level context variables can also be hard to interpret and explain to schools. For example, at Key Stage 4 progress is relatively greater for pupils in schools with both low and high numbers of FSM pupils. This could be reflected in the CVA model, but expecting more progress from these two groups of schools and less progress from schools in the middle of the FSM range would need a clear rationale in educational terms.
Finally, it is worth considering the degree of similarity between different models. Most value added models, for different key stages, for different subjects 29 or for different years, would be similar: i.e. most of the variables discussed here could be included on statistical and educational grounds and could be specified in the same way. If, say, a particular ethnic category was statistically insignificant for one particular subject or key stage, there may be grounds for still including it in order to maintain consistency. However, some differences in specification may be appropriate: for example, in the Key Stage 1-2 CVA model, school level prior attainment variables are not used because the smaller cohort sizes make them harder to estimate.
29 Models for different subjects are not included in Performance Tables, where the aim is to have one model for each key stage, but are used elsewhere for school self-evaluation.
In general, year-on-year consistency in the models would be reassuring for schools and help in interpreting changes over time. However, models can be open to amendment if one of the explanatory variables altered in some way or a useful new variable became available 30.
4.2.3 The choice of regression method
Once it was decided to include various contextual factors, there was no longer a possibility of using a simple non-regression approach like the median method. Given the need for a multiple regression model taking into account prior attainment and contextual factors, the two basic options were Ordinary Least Squares or Multilevel Modelling. Multilevel modelling offers a more complex set of modelling options which take into account the structuring of the data, i.e. the fact that pupils are organised into schools. OLS is the standard regression method, widely familiar to those with statistical training, although obviously not familiar to most teachers or parents.
An OLS version of the CVA model is shown in Table 4.6. Before discussing the differences between OLS and MLM it is worth noting the R² figure here of 0.57 (there is no direct equivalent for the multilevel model; see Kreft & De Leeuw (1998), p115). This shows that the prior attainment and contextual variables only explain just over half the variance in Key Stage 4 results. This may seem relatively low, but it needs to be borne in mind that this model measures value added over a five year period. In most models, the majority of variance in outcomes is explained by the prior attainment term, and here the model is trying to predict results from the full range of KS4 qualifications taken by 16 year olds (including vocational courses, art and design, physical education etc.), from prior attainment data that only covers English, maths and science from the tests at age 11. Other CVA models have higher R² values.
30 Changes to the model for 2006 mean that two versions of the 2005 CVA estimates will have to be used this year: the original estimate and an equivalent based on a model amended to the new methodology.
Table 4.6 Ordinary Least Squares version of the CVA model
OLS model to predict capped Key Stage 4 points in 2005 (R² = 0.57, n = 548,222 pupils). The explanatory variables are the same as in Table 4.3, with estimates and standard errors given in the original: the intercept (**); the four prior attainment terms (KS2 student APS **, KS2 English PS deviation **, KS2 Maths PS deviation n.s., KS2 APS (using fine grades) squared **); FSM (**); IDACI score (**); SEN Action Plus (**) and SEN School Action (**); the two mobility flags (**); female (**); age within year, corrected (**); English not first language (**); the ethnic group flags (White Irish *, Mixed White/Black Caribbean n.s., all others **); ever in care at this school (**); school KS3 APS (using fine grades) for CVA (**); and school standard deviation of KS3 APS for CVA (**).
One of the main reasons for using an MLM approach is that OLS will tend to underestimate the standard errors for the estimates of the model parameters. Here the MLM standard errors are wider for the school level composition variables (as would be expected once school differences are explicitly modelled as a level), although the estimates for these two coefficients are still significant under MLM.
For the pupil-level variables, some of the standard errors are wider in the MLM model, others are narrower. Both OLS and MLM produce unbiased estimates for the regression coefficients. A comparison of the OLS model with the MLM model in Table 4.3 reveals small differences in most of the coefficients. Some differences are almost negligible (for example, the effect of FSM is very similar in both). In other cases there are differences that would have a minor impact on some schools' value added: e.g. the coefficient for Black Caribbean pupils is 6 points lower in MLM, which means that each Black Caribbean pupil is predicted a grade in one subject less in the MLM model compared to the OLS model. Overall, therefore, the models are similar. In the multilevel CVA model, the intra-school correlation (or variance partition coefficient) is 0.07. The main differences between OLS and MLM model coefficients would tend to occur for relatively high correlations, i.e. when the data is highly structured.
Given the similarity of the models, the main difference to the results in using MLM rather than OLS has been the way that MLM shrinks the value added estimates (shrinkage is explained in Annex B). The degree of shrinkage depends on the size of the school: smaller schools are shrunk towards the national mean. Figure 4.5, based on the shrinkage factor for the KS2-4 CVA model, shows how some example raw residuals are brought closer to the national average (zero).
Figure 4.5 Illustration of the effect of shrinkage in the CVA model
[Chart: shrunk CVA residual plotted against size of school for several example raw school residuals, showing smaller cohorts pulled further towards zero.]
A school with a raw residual of +10 starts to be affected by shrinkage for cohorts of less than about 50. For a school with a more extreme raw residual, +50, some shrinkage occurs even in quite large schools, but the main impact would again be for particularly small schools. Table 4.7 provides some information on the numbers of schools where shrinkage has an impact. Here a shrinkage factor of, for example, 0.9 means that the value added score is reduced to 90% of its raw size, moving slightly closer to the national average. Only 14% of secondary schools had a shrinkage factor below 0.9. Even among primary schools, 86% had a CVA score that was at least three quarters the size of the raw residual. Primary schools with cohort sizes of 10 or fewer are currently excluded from the Performance Tables; schools with 11 pupils, the smallest cohorts included, therefore have the lowest shrinkage factors 31.
Table 4.7 Impact of shrinkage on CVA scores for schools
[Table: the number and percentage of primary (KS1-2, 2006) and secondary (KS2-4, 2005) schools falling into each band of shrinkage factor; almost no secondary schools fall into the lowest bands, and the large majority of schools in both phases have factors close to 1.]
There is no easy solution to the problem of interpreting value added for small schools (see the earlier discussion of volatility in Section 4.1.5). In OLS, a small school's value added is based on the pupils it has, with a confidence interval implying a wide margin of error. Multilevel modelling provides a different approach, where the point estimate is calculated from both the school's pupils and the national school distribution (with narrower confidence intervals around the resulting estimates). The national distribution is used to modify the estimate when robust information on the school is limited because of its size 32. This shrinkage process could be seen as problematic because it prevents a genuinely successful or ineffective small school from registering a significantly high or low value added score. However, in such circumstances, there is no way of telling from one year's data whether these small schools' raw residuals are good estimates of effectiveness, or whether their results have been subject to random error of some kind. The advantage of MLM is that it explicitly allows for this uncertainty in calculating the value added score.
31 These primary school statistics relate to new 2006 data rather than the 2005 data quoted elsewhere, but the pattern would be similar from year to year. Primary schools with fewer than 11 pupils have been given CVA scores in the information provided for school improvement.
32 This approach is reflected in the fact that the shrunken residuals are sometimes known as the empirical Bayes estimates or posterior mean estimates.
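The dependence of shrinkage on cohort size can be made concrete with the standard random intercept formula (the kind of calculation set out in Annex B). The variance components below are illustrative values, not the published 2005 estimates.

```python
def shrinkage_factor(n_pupils: int, between_var: float, within_var: float) -> float:
    """Shrinkage factor for a school's raw mean residual in a random
    intercept model: between_var / (between_var + within_var / n).

    The raw residual is multiplied by this factor, so smaller cohorts
    are pulled further towards the national average of zero.
    """
    return between_var / (between_var + within_var / n_pupils)

# Illustrative variance components (assumed, not the DfES figures):
between, within = 280.0, 3700.0
for n in (11, 30, 100, 200):
    print(n, round(shrinkage_factor(n, between, within), 2))
```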
Views differ on the appropriateness of using shrunken residuals in the context of a system for providing value added scores. MLM residuals may not be ideal for ranking schools (Kreft & De Leeuw (1998), p52) but there would also be drawbacks in using other methods for this purpose (and it needs to be borne in mind that the Performance Tables do not explicitly provide such a ranking). As noted already, the Value Added National Project concluded in 1997 that for most purposes simple regression models would be sufficient. However, many of the experts consulted by the DfES more recently favour MLM and it was decided to move ahead on this basis. In terms of intelligibility for a lay audience, there is a big difference between the median line method and a CVA regression model, but not too much difference between regression models based on OLS and on MLM.
Ofsted have recognised some of the issues around shrinkage by providing their inspectors with information on how to unshrink (i.e. divide by the school's shrinkage factor) so as to see what the raw residual would have looked like. This raw residual gives some additional information which can be used when the inspector considers the overall performance of the school. The inspector may be able to judge from other data and impressions of the school the extent to which the raw residual appears to be an accurate reflection despite the possible uncertainty associated with it, e.g. that the school really is as effective as the raw residual implies.
4.2.4 The process of calculating value added scores
Figure 4.6 sets out the process planned for the calculation of CVA scores in 2006. Ideally, the process of providing school value added scores would involve collecting the data, running the modelling software and distributing the results. However, this is complicated by the need to allow schools access to provisional CVA data before it is published. This is done in the interests of transparency and the desire to retain schools' confidence in the methods, as well as giving schools the chance to check the contextual variables that have been used in the model. If they want to, schools can check how the actual calculation of value added scores works using their own data and a ready reckoner provided on the Department's website 33.
33 See
Figure 4.6 The process of calculating CVA for the Performance Tables

Process | Organization
Test data collected | QCA / examining bodies
Matched to prior attainment and PLASC | Contractor
Multilevel modelling: CVA coefficients calculated | DfES
Adjustments for ceiling effects specified | DfES
Calculation of CVA scores and other indicators | Contractor
Results and scores checked | Schools
Amended data calculated | Contractor
Performance Tables published | Contractor

Any changes to the test or contextual data made by schools at this checking stage would ideally feed through into a revised model (and then potentially a further round re-notifying schools of the results). But this would be difficult, given the need to publish results as soon as possible. The relatively small numbers of amendments should not change the model significantly, which is why the model is fixed at the stage of the unamended data.
School residuals are then calculated from the amended data, with reference to this unamended model. As with the median method, this means that the school value added scores differ slightly from those that would be obtained if the whole model had been run on the amended data. This is not ideal, but it is a practical solution given the agreement with schools that they are shown provisional CVA scores in advance of publication.
The upshot of this approach is that although multilevel modelling is used to produce the initial equation (Table 4.3 above), the final value added scores and their associated confidence intervals are not direct outputs from a multilevel model. They need to be calculated from the model equation and the formula for shrinkage. The specification for calculating the residuals is given to the Performance Tables contractors (who therefore do not actually need to have multilevel modelling software or expertise). The contractors also calculate the initial unamended results in the same way, using a formula that uses the fixed effects and variance terms estimated by the Department (working jointly with Ofsted).
The process shown in Figure 4.6 also includes an adjustment to impose a ceiling on value added scores so that the predicted KS4 results for pupils are not higher than the theoretical maximum. These small alterations are implemented by the contractors according to standard rules and are another reason why the value added scores do not come directly from MLM software. The advantage of this approach is that the adjustment is easily understood and programmed. A neater solution would involve changing the model so that these adjustments are not necessary, but this could add considerably to the model's complexity (ceiling effects are discussed further below).
The initial multilevel modelling is carried out within the Department and Ofsted using MLwiN 34. MLwiN is used quite widely in the UK and therefore has the advantage of being supported by an infrastructure of training courses and advice. The Department is also looking into whether SPSS can be used for the relatively simple type of multilevel model being used here; this would have the advantage of being a package familiar to a wider range of analysts. The pros and cons of other packages have not been tested, so the fact that MLwiN is currently used does not necessarily mean that it is the best package for running a large-scale annual exercise in calculating value added scores.
4.2.5 Issues for development
The Key Stage 2-4 CVA model was piloted for use in Performance Tables in autumn 2005 35. Feedback from the pilot schools was very positive and there were no widespread objections to the use of particular contextual variables. However, there was a call for investigation of some interaction terms, and as a result of subsequent analysis it has been decided to augment the model with (1) first language interacted with prior attainment and (2) ethnic group interacted with the FSM measure of deprivation. Both these interactions provide statistically significant improvements to the model and, while the impact on school estimates was negligible for most schools, some school estimates changed moderately. Another interaction, between ethnic group and first language, was investigated but found not to improve the model enough to warrant its inclusion in addition to (1) and (2).
A further adjustment to the model for 2006 will be to address the ceiling effect problems. As noted already, some ceiling effect is inevitable given the practical constraints of a testing framework that cannot cater for every possible level of attainment. However, the CVA model in 2005 had an additional problem: the mean observed result was generally in line with the predicted result except at the extremes of the attainment distribution. This meant that, for example, the model tended to overestimate the predicted results for high achievers, leaving them with little scope to achieve positive value added, with knock-on effects for a few schools with relatively large numbers of these pupils. The model already has an adjustment to ensure predicted results cannot exceed the theoretical maximum. This has been amended so that, at the extremes, predicted results that lie above the mean actual result (for a small bin range of attainment outcomes) are reduced to the level of the mean.
35 Information on the pilot is on the Performance Tables website: As mentioned in Section 2, the model was used more extensively during 2005 and 2006 as part of the New Relationship with Schools, for inspections and school improvement.
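The post-model adjustment just described amounts to a simple capping rule. A minimal sketch follows; the treatment of "extreme bins" here is an assumption about how the rule might be applied, not the contractors' actual specification.

```python
def adjust_prediction(predicted: float, theoretical_max: float,
                      bin_mean_actual: float | None = None) -> float:
    """Post-model ceiling adjustment.

    The prediction is first capped at the theoretical maximum of the
    points scale. For pupils in the extreme bins of the attainment
    range (where bin_mean_actual is supplied by the caller), a
    prediction above the bin's mean actual result is reduced to that
    mean, so high attainers retain some scope for positive VA.
    """
    adjusted = min(predicted, theoretical_max)
    if bin_mean_actual is not None:
        adjusted = min(adjusted, bin_mean_actual)
    return adjusted
```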
This new adjustment will continue to be looked at: it may be preferable to try to improve the fit of the original model, even at the expense of adding further complexity to the model equation. Another option would be to standardise residuals at each prior attainment point, stretching small differences in the actual point scale for high and low attaining pupils so that they have more chance of affecting the school VA scores (Schagen, 2006). However, this approach would involve changing to a new metric rather than using the National Curriculum and Key Stage 4 points.
There are various further issues that can be investigated for future versions of the model. For example, more school level variables, like the percentage of minority ethnic pupils, will be considered. Pupil mobility is another issue that could be addressed in more depth in the future, now that there will be termly school censuses. Although the CVA model includes two mobility measures as factors that may explain test results, it still ascribes the pupil's CVA to the school in which the pupil took the test. An alternative approach would be to apportion progress between schools, although this would add an additional layer of complexity.
5. The use and presentation of value added models
Summary
The Performance Tables include a limited range of statistics on schools: value added data is presented alongside facts about overall attainment and school context. For school improvement and inspection, a wider range of value added measures and charts has been given. Both school value added scores and other types of value added analysis have been used elsewhere: in publishing information for parents and schools, in selecting schools for particular purposes and as part of the approach to target setting.
General issues in the presentation of value added are:
- How should schools and other stakeholders be consulted on the development of the models?
- How should value added scores and analysis be described for different audiences?
- Which graphical presentations are the most appropriate?
- Can the media be encouraged to present value added accurately?
More specific issues are:
- When and how should confidence intervals be given?
- Should schools be measured as significant in relation to the national average?
- Should schools' ranks be given as well as their value added score?
- Should schools be grouped and then labelled (e.g. high performing, or Category 1) or colour coded?
5.1 Consultation and presentation of pilot value added models
As will be clear from Sections 2 and 4, the development of value added models has involved a combination of: (1) internal development work; (2) advice from external experts (both commissioned and more generally through discussion); and (3) consultation with schools and Local Authorities. Consultation has centred on a series of pilot conferences where models and results are discussed with representatives from schools and Authorities (Table 5.1 below relates to CVA; there were similar conferences for the original VA measures).
There has also been an evaluation questionnaire which all schools included in the pilots were invited to complete (a 40% response rate for the 2005 secondary CVA pilot). Feedback is evaluated and used as the basis for further model development (an example being the request to introduce interaction terms into the CVA model) and for improvements to the presentation and documentation of value added scores in the Performance Tables. Ofsted have also consulted on the use of value added, both with Inspectors and with Local Authorities.
Table 5.1 Contextualised value added consultation conferences

Date | Model | Audience | Location
05 October | CVA prototype | Schools | Birmingham
07 October | CVA prototype | Schools | London
08 October | CVA prototype | Schools | York
24 January | Secondary CVA | Schools | Birmingham
26 January | Secondary CVA | Schools | London
31 January | Secondary CVA | Schools | York
02 February | Secondary CVA | Schools | London
02 February | Secondary CVA | Authorities | London
21 February | Secondary CVA | Authorities | Leeds
27 June | Primary CVA | Schools | Leeds
29 June | Primary CVA | Schools | London
11 July | Primary CVA | Schools | York

The presentations used at the conferences in Table 5.1 are available and give an indication of how CVA has been presented to schools and Authorities in these conferences. Figure 5.1 shows one example slide, comparing VA and CVA scores and demonstrating the effect of contextualisation and the change to a multilevel modelling framework. In general, the approach in the conferences has been to explain technical aspects as simply as possible, but without skipping important points like the shape of the prior attainment distribution (non-linear and modelled using a quadratic term), the role of the shrinkage factor and the calculation of confidence intervals.
Figure 5.1 Value added v contextualised value added: example slide from the consultation presentation

5.2 The presentation of value added scores in Performance Tables

Performance Tables data are published both online and in booklets for each Local Authority. Figure 5.2 shows how the 2005 VA scores, based on the median method and prior attainment only, appeared on screen for an example secondary school 36. The result is included alongside raw results and some contextual information. Here the school's KS2-4 score indicates that its pupils achieved, on average, 10.2 points less than the median pupils for each prior attainment level. However, this school actually had slightly higher than average value added, since in 2005 the mean school VA score was lower still (see the earlier discussion in Section 4.1.3).

36 See
Figure 5.2 Screenshots showing VA on the website for Performance Tables
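To make the interpretation of a median-method score concrete, the sketch below uses hypothetical data and column names (it is not the Department's implementation): each pupil's value added is the gap between their outcome and the national median outcome for pupils with the same prior attainment, and the school score is the average of these gaps.

```python
# Minimal sketch of a median-method VA score (illustrative data and
# column names; not the DfES production code).
import pandas as pd

pupils = pd.DataFrame({
    "school":  ["A", "A", "A", "B", "B"],
    "ks2_aps": [27.0, 27.0, 29.0, 27.0, 29.0],  # KS2 average point score
    "ks4_pts": [310, 290, 355, 340, 360],       # capped GCSE points
})

# The national "median line": median KS4 outcome at each prior attainment level.
median_line = pupils.groupby("ks2_aps")["ks4_pts"].median()

# Pupil VA = outcome minus the median for pupils with the same prior attainment.
pupils["pupil_va"] = pupils["ks4_pts"] - pupils["ks2_aps"].map(median_line)

# School VA = mean of its pupils' VA scores; the published figures were
# then re-centred before appearing in the Tables.
school_va = pupils.groupby("school")["pupil_va"].mean()
print(school_va)
```

In practice the median line is computed from the full national matched dataset, not from the school's own pupils as in this toy example.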
The Performance Tables website has guidance on how to use and interpret the VA figures. For example, the 2005 site includes the message reproduced below, designed to reinforce the idea that VA is a better measure of school effectiveness than raw results. The reference to significance is needed because the VA scores are not accompanied by confidence intervals: instead the website gives guidance on the range of scores that can be considered 'broadly average' depending on the size of school.

What a school's value added measures tell you
The value added measures give the best indication in these Tables of schools' overall effectiveness. But the significance that can be attached to any particular school's value added measure depends, among other things, on the number of pupils included in the value added calculation. The smaller the number of pupils, the less confidence can be placed on the value added measure as an indicator of whether the effectiveness of a school is significantly above or below average.

Although VA scores are published on-line, many parents will first become aware of them through the media. There is no space here to discuss the media treatment of value added in detail, but a few examples can be given. Figure 5.3 is an extract from The Guardian (19/1/06), which, in common with the other broadsheet newspapers, published the school figures for each Authority in alphabetical order (although note that the title is 'League Tables'). These newspapers also provide a key explaining the figures, based on information on the Performance Tables website. The Times (19/1/06) did give one league table where schools were ranked (Figure 5.4), showing the schools with the highest Key Stage 2-4 value added (many of which were small independent schools). The tabloid press did not report the value added scores. Practice will have varied among local newspapers, but the emphasis is normally still on raw results 37.

37 An example report for the Sussex Evening Argus is here:
Figure 5.3 Extract from The Guardian showing value added and other data

Figure 5.4 Extract from The Times showing a value added league table
The contextualised value added scores have not yet been published in Performance Tables. Figure 5.5 shows two views of a mock-up for the 2006 Tables booklet (the website is not yet designed). The coverage of each score is given as usual, but now for the first time upper and lower bounds of a 95% confidence interval are presented alongside the figures, derived from the multilevel modelling.
The confidence intervals are interpretable as the range of uncertainty around each school's estimate of school effectiveness. Although all pupils, not just a sample, are used to calculate CVA, confidence intervals are useful because the pupils' results are still effectively a sample of the possible outcomes that could have been achieved by pupils with these characteristics in these schools 38. Note that the confidence intervals do not say directly whether a given School X is significantly different from another School Y.

Figure 5.5 Proposed presentation of CVA scores for the 2006 results
(The mock-up shows, for each school: the CVA measure centred on 1000; the upper and lower limits of the KS2-KS4 CVA confidence interval; coverage, i.e. the percentage of pupils at the end of KS4 included in the CVA calculation; and the average number of qualifications taken by pupils in the calculation.)

38 The Department will not be putting confidence intervals around the raw scores because these are seen less as effectiveness estimates and more as simple statements of fact about the results in a given year.
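As a rough sketch of how such intervals are used (illustrative values and function names; in practice the standard errors come from the fitted multilevel model), the code below builds a 95% interval, checks it against the national average of 1000, and shows why comparing two schools needs the standard error of the difference rather than a glance at overlapping intervals.

```python
# Sketch: 95% confidence intervals around CVA scores (illustrative values;
# the standard errors would come from the fitted multilevel model).
import math

CENTRE = 1000.0  # KS2-4 CVA scores are centred on 1000

def ci95(score: float, se: float) -> tuple[float, float]:
    """Lower and upper limits of a 95% confidence interval."""
    half_width = 1.96 * se
    return score - half_width, score + half_width

def differs_from_average(score: float, se: float) -> bool:
    """Significantly above/below the national average if the CI excludes 1000."""
    lo, hi = ci95(score, se)
    return hi < CENTRE or lo > CENTRE

def schools_differ(score_x: float, se_x: float,
                   score_y: float, se_y: float) -> bool:
    """Comparing School X with School Y needs the SE of the *difference*
    (assuming independent estimates), not the two separate intervals."""
    se_diff = math.sqrt(se_x**2 + se_y**2)
    return abs(score_x - score_y) > 1.96 * se_diff

print(ci95(1012.3, 5.1))                  # roughly (1002.3, 1022.3)
print(differs_from_average(1012.3, 5.1))  # True: interval excludes 1000
```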
5.3 The presentation of value added scores for school improvement

Whilst the Performance Tables focus on a few key indicators, information for school improvement, first in the Autumn Package and now via RAISEonline, is more extensive and detailed. The data in RAISEonline will not be made public: it is meant for use by the school and contains data on each individual pupil. Local Authorities and independent providers like the CEM Centre at Durham University have treated their school improvement data as confidential to the school, to encourage an open dialogue and accurate self-evaluation. The difference between these sources and RAISEonline is that results in the latter are available to the Department and Ofsted inspectors.

The main method of presenting value added information at a national level in the Autumn Package has already been discussed in relation to the median line method. This information is still being provided each year and the illustrations below are taken from the 2005 Key Stage 4 data 39. Figure 5.6 shows the median line and interquartile range of the KS4 outcomes from different KS2 starting points. Figure 5.7 relates to one particular band of prior attainment and shows the frequency of different KS4 outcomes. The website explains that the median line graphs 'enable schools to compare the progress of their pupils with progress achieved nationally taking into account prior performance. The Progress Charts provide information to support schools in raising their expectations of pupil achievement and can be used in setting realistic but challenging targets.'

Figure 5.6 Value added line chart from the Autumn Package 2005 (GCSE and equivalent capped value added line: total points score against Key Stage 2 average points score)

39 See
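The calculation behind a chart like Figure 5.6 can be sketched as follows (hypothetical frame and column names; the real charts are built from the national matched KS2-KS4 dataset): group pupils by prior attainment and take the quartiles of their outcomes.

```python
# Sketch of the data behind a median/interquartile chart such as Figure 5.6
# (assumes a pupil-level frame with KS2 prior attainment and KS4 points).
import pandas as pd

def median_line_table(pupils: pd.DataFrame) -> pd.DataFrame:
    """Lower quartile, median and upper quartile of KS4 points by KS2 level."""
    return (pupils
            .groupby("ks2_aps")["ks4_pts"]
            .quantile([0.25, 0.50, 0.75])
            .unstack()
            .rename(columns={0.25: "lower_quartile",
                             0.50: "median",
                             0.75: "upper_quartile"}))
```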
Figure 5.7 Value added progress chart from the Autumn Package (frequency of 2005 average GCSE and equivalent grades per entry, U to A*, for pupils with a Key Stage 2 average points score between 29 and 30)

RAISEonline 40 will replace the Autumn Package and will contain data on specific pupils as well as on schools. At the time of writing the system is not yet complete, but it will be available later in the autumn of 2006. The Autumn Package charts shown above will be part of RAISEonline, as will some of the new features and reports that featured in the Ofsted PANDAs compiled for each school for use during the 2005/06 academic year 41. This new information included both VA and CVA, for the school as a whole and for pupil subgroups, with figures colour coded according to whether they are significantly above or below average. There are also graphical presentations: snake plots showing where the school CVA score fits in the national distribution (Figure 5.8), plots that compare the CVA with raw attainment (Figure 5.9) and plots showing CVA scores for subgroups within the school (Figure 5.10). RAISEonline will have CVA and VA scores for the last three years and shows whether these have increased or declined significantly.

40 For more information see
41 PANDAs for example schools are available from Ofsted here:
Figure 5.8 Example CVA snake plot

Figure 5.9 Example plot comparing CVA and attainment

Figure 5.10 Example chart showing CVA for pupil subgroups
This paper has not so far discussed the calculation of value added scores for subgroups within schools, as shown in Figure 5.10. One option would have been to include the various subgroups as random effects in the model, e.g. allowing each school to have a different modelled effect for FSM pupils, or for boys, or for boys on FSM. However, this has so far been difficult to implement and so a simpler method was chosen which calculates the school's subgroup value added scores as averages relative to the national CVA model. The estimates are shrunk based on the size of the group and confidence intervals are calculated. The approach is therefore similar to the method for calculating the school's overall CVA score, and is sketched below. One disadvantage of this method is that at a national level the school scores for some of the subgroups do not average out at 100 (or 1000 for Key Stage 4); the addition of more school level factors would improve this fit. The general point this illustrates is that if a model is to be used for more detailed modelling than the provision of school scores, it may be necessary to increase its complexity.
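A minimal sketch of that simpler subgroup method, under stated assumptions (pupil-level residuals from the national CVA model and its estimated variance components are available; names, values and the standard error approximation are illustrative, not the Department's code):

```python
# Sketch of shrunken subgroup VA scores (illustrative only).
import math

def shrunken_group_score(residuals: list[float],
                         var_between: float,   # estimated between-school variance
                         var_within: float,    # estimated pupil-level variance
                         centre: float = 1000.0):
    """Shrunken mean residual for a subgroup, with a 95% confidence interval."""
    n = len(residuals)
    raw_mean = sum(residuals) / n
    c = var_between / (var_between + var_within / n)  # shrinkage factor
    score = centre + c * raw_mean
    se = math.sqrt(c * var_within / n)  # one common approximation for the SE
    return score, (score - 1.96 * se, score + 1.96 * se)

# e.g. the FSM pupils in one school, residuals in capped-points units:
score, ci = shrunken_group_score([-12.0, 4.5, -7.3, 1.1],
                                 var_between=60.0, var_within=1500.0)
```

Because the shrinkage depends only on the size of the group, small subgroups are pulled strongly towards the national average, which is the intended protection against over-interpreting a handful of pupils.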
5.4 Other uses of the value added scores

Having described the two main vehicles for value added information, others can be discussed more briefly. For example, the School Profiles are another example of the presentation of value added data to parents. They have recently been launched and will be amended in light of initial feedback 42. The current plan is to include in them the Performance Tables value added information, presented as part of a snake plot like Figure 5.8. School value added data is also included in the London Families of Schools publication (DfES, 2006d). This is produced as part of the London Challenge policy and it defines families of similar schools based on prior attainment and deprivation, measured by FSM and IDACI. The book provides a range of comparative data for London schools, including their VA and CVA scores (no confidence intervals or value added charts are included).

Value added has not so far been extensively used in allocating funding. One exception is the High Performing Specialist Schools policy 43. This identifies schools which are given additional funding to support work in providing assistance to neighbouring schools and in various special activities like a concentration on vocational learning or on pupils with special educational needs. The criteria that need to be met are based on value added measures at different Key Stages, for the latest three years. The VA measures are benchmarked by school prior attainment, i.e. high performing schools need to have high VA relative to schools with similar levels of prior attainment. Some additional criteria are also included, e.g. a minimum absolute level of achievement in the most recent year. This policy illustrates the way that several different indicators need to be used in combination to define high performance.

42 See
43 See

Value added scores can be used to identify schools for particular attention in internal discussion or for additional support from consultants in the National Strategies. In connection with this there has been discussion of various ways of segmenting the school population, e.g. into schools that are improving or declining in terms of value added. However, as it is not possible to isolate clear clusters of schools (due in part to the volatility of value added scores), any such segmentation would just be a first step towards further investigation. In England various initiatives have been targeted at groups of schools, e.g. the Specialist Schools programme, the Leadership Incentive Grant, the Excellence in Cities policy and others. As noted in Section 2, value added scores can be used as information to monitor policy initiatives of this kind, although by themselves they do not measure the impact of the policy. In addition to providing information on overall value added, the school scores show how much between-school variation there is within the policy.

Target setting in schools is an important part of the school improvement process in England. Targets are set in relation to test results rather than value added (value added targets would not be straightforward: since value added scores are calculated relative to the national average, not all schools can improve). Value added scores are not used directly to set targets, but the value added approach, taking into account pupil prior attainment, underpins guidance on targets. Care is taken to encourage the setting of stretching targets for pupils, schools and Local Authorities that are not simple extrapolations of previous performance. There are various ways of doing this but the general approach has been to supply information on the sort of results that would be expected in the future if a school (for example) improved its value added to the level of similar schools, in terms of average prior attainment, that currently have high value added. This approach is being used in RAISEonline. There is an ongoing debate about the inclusion of more contextual variables in value added for target setting.
The risk of this is that low expectations may be built in for pupils who currently make less progress on average, e.g. pupils entitled to free school meals. On the other hand, schools with high prior attainment levels but few FSM pupils could be set more stretching targets if the FSM data were factored in.

This section has discussed the use of value added scores for schools. There are many other ways of presenting linked data on progress between key stages, whether at national level, for schools or Local Authorities, or for individual pupils. Moreover, the regression models used to calculate value added scores can be extended for a variety of purposes, e.g. to look at the value added for groups of schools or the impact of funding on results. Some of this analysis is done within the Department but an increasing amount of value added-related quantitative research is being undertaken by external experts and academics using the National Pupil Database.
References

Note that DfES statistical bulletins and first releases are available on-line. Web links here and in the main text were accessed in September and October 2006.

Critchlow, J. and Coe, R. (2003) 'Serious flaws arising from the use of the median in calculating value added measures for School Performance Tables in England', Paper presented to the 29th International Association for Educational Assessment (IAEA) Annual Conference

DfE (1995) Value Added in Education: A Briefing Paper from the Department for Education. London: Department for Education

DfEE (1998a) The Autumn Package. London: Department for Education and Employment

DfEE (1998b) 1998 Value Added Pilot: Supplement to the Secondary School Performance Tables. London: Department for Education and Employment

DfES (2002) Pupil Progress in Secondary Schools by School Type in England: 2001, Statistics of Education Bulletin 05/02. London: Department for Education and Skills

DfES (2004) Variation in Pupil Progress 2003, Statistics of Education Bulletin 02/04. London: Department for Education and Skills

DfES (2005) The Characteristics of Low Attaining Pupils, Statistics of Education Bulletin 02/05. London: Department for Education and Skills

DfES (2006a) National Curriculum Assessments at Key Stage 2, and Key Stage 1 to Key Stage 2 Value Added Measures for England 2004/2005 (Final), Statistical First Release SFR22/2006

DfES (2006b) Trends in Attainment Gaps: 2005, Statistics of Education Bulletin. London: Department for Education and Skills

DfES (2006c) GCSE and Equivalent Results and Associated Value Added Measures in England 2004/05 (Final), Statistical First Release SFR26/2006

DfES (2006d) Families of Schools: May 2006 Secondary Schools. London: Department for Education and Skills

DfES (2006e) Schools and Pupils in England: January 2006 (Final), Statistical First Release SFR38/2006

Fitz-Gibbon, C.T. (1997) The Value Added National Project Final Report: Feasibility Studies for a National System of Value-Added Indicators. London: School Curriculum and Assessment Authority

Godfrey, R. (2004) Changes in Ethnicity Codes in the Pupil Level Annual Schools Census. London: Department for Education and Skills

Goldstein, H. (2003) Multilevel Statistical Models: Third Edition. London: Arnold

Goldstein, H., Rasbash, J., Yang, M., Woodhouse, G., Pan, H., Nuttall, D. & Thomas, S. (1993) 'A Multilevel Analysis of School Examination Results', Oxford Review of Education 19(4)

Kirkup, C., Sizmur, J., Sturman, L. and Lewis, K. (2005) Schools' Use of Data in Teaching and Learning, DfES Research Report 671. London: Department for Education and Skills

Kreft, I. & De Leeuw, J. (1998) Introducing Multilevel Modelling. London, Thousand Oaks and New Delhi: Sage Publications

Machin, S., Telhaj, S. & Wilson, J., The Mobility of English School Children, CEE Discussion Paper CEEDP0067. London: Centre for the Economics of Education

Massey, A., Green, S., Dexter, T. and Hammet, L. (2003) Comparability of national tests over time: KS1, KS2 and KS3 standards between 1996 and 2001. Final Report to the QCA of the Comparability Over Time Project. London: Qualifications and Curriculum Authority

Miliband, D. (2004) Personalised Learning: Building a New Relationship with Schools, Speech by David Miliband, Minister of State for School Standards, to the North of England Education Conference, Belfast, 8th January 2004

Mortimore, P., Sammons, P., Stoll, L., Lewis, D. and Ecob, R. (1988) School Matters: The Junior Years. Wells: Open Books

National Audit Office (2003) Making a Difference: Performance of maintained secondary schools in England. London: The Stationery Office

OECD (2004) Learning for Tomorrow's World: First Results from PISA 2003. Paris: Organization for Economic Cooperation and Development

O'Donoghue, C., Thomas, S., Goldstein, H. & Knight, T. (1997) 1996 Study on Value Added for 16-18 year olds in England, DfEE Research Studies RS52. London: Department for Education and Employment

QCA (1998) 1996 and 1997 LEA Benchmark Compendium. London: Qualifications and Curriculum Authority

Sammons, P., Thomas, S., Mortimore, P., Owen, C. & Pennell, H. (1994) Assessing School Effectiveness: Developing Measures to put School Performance in Context. London: Office for Standards in Education

Saunders, L. (2000) 'Understanding schools' use of value added data: the psychology and sociology of numbers', Research Papers in Education 15(3)

SCAA (1994) Value Added Performance Indicators for Schools. London: School Curriculum and Assessment Authority

Schagen, I. & Hutchison, D. (2003) 'Adding Value in Educational Research: the marriage of data and analytical power', British Educational Research Journal, Vol. 29, No. 5, October 2003

SMSR (2005) Evaluation of the National Curriculum Tests: 2005 Test Evaluation: Key Stage 2 English, mathematics and science. Final Report, report for the National Assessment Agency by Social and Market Strategic Research Ltd

Thomas, S., Peng, W-J. & Gray, J. (2007, in press) 'Value added trends in English secondary school performance over ten years', Oxford Review of Education 33(3)

Tymms, P. & Coe, R. (2003) 'Celebration of the Success of Distributed Research with Schools: The CEM Centre, Durham', British Educational Research Journal, Vol. 29, No. 5, October 2003

Tymms, P. and Dean, C. (2004) Value Added in the Primary School League Tables: A Report for the National Association of Head Teachers, May 2004. Durham: CEM Centre, University of Durham

Webber, R. and Butler, T. 'Classifying pupils by where they live: how well does this predict variations in their GCSE results?', CASA Working Paper Number 99. London: Centre for Advanced Spatial Analysis, University College London
Annex A The National Curriculum and Key Stage tests

The English value added models relate to the Key Stages described below. The national curriculum and Key Stage tests are maintained by the QCA (Qualifications and Curriculum Authority) 44. Independent schools do not need to operate the national curriculum or the Key Stage tests.

44 For more information see

Foundation stage (age 3-5)
This covers children in nurseries and the reception year at primary school. In 2002 the national curriculum was extended to cover this age group with six areas of learning 45. The Foundation Stage Profile was introduced into schools and settings in 2002/3, with 13 summary scales covering the six learning areas. Attainment on these scales is assessed by the teachers for each child receiving government-funded education by the end of the pupil's time in the foundation stage.

45 These are: personal, social and emotional development; communication, language and literacy; mathematical development; knowledge and understanding of the world; physical development; creative development.

Key Stage 1 (age 5-7)
This covers Year 1 and Year 2 in primary schools, with pupils assessed at the end of Year 2 when most are 7 years old. The national curriculum specifies learning across a range of subjects such as history, art and information technology, but the three core subjects are English, mathematics and science. Pupils take tests in reading, writing and mathematics, but since 2005 these tests have only been used to inform overall teacher assessments: the marks have not been collected centrally.

Key Stage 2 (age 7-11)
Twice the length of Key Stage 1, this takes pupils from Year 3 to Year 6, up to the age of 11, which is usually seen as the end of 'primary education': the following year most pupils in maintained schools move to secondary schools. At the end of the four years pupils are assessed by teachers and take tests in
English, mathematics and science.

Key Stage 3 (age 11-14)
This covers the first three years of secondary schooling (Year 7 to Year 9). Again there is a national curriculum across a range of subjects, with teacher assessment and tests in English, mathematics and science.

Key Stage 4 (age 14-16)
This covers the final period (Years 10 and 11) of compulsory schooling, during which pupils are working towards a range of academic and vocational qualifications, partly assessed via coursework. Most of the assessment is at the end of Year 11. The qualifications are set by various independent awarding bodies. The main qualification is the GCSE: there are currently over forty academic subjects on offer and eight vocational subjects. However, there is also a very wide range of other qualifications which can be taken by this age group, for which the QCA has established equivalences (i.e. established that a given qualification is worth, say, half of a GCSE). The number of subjects taken by pupils will vary (up to eight subjects are included in the value added models discussed here).

Key Stage 5 or Post-16
After the age of 16 many, but not all, pupils stay on in full-time education. Those that do study a range of academic and vocational qualifications, both in schools and separate colleges. The main academic qualification taken after two years (i.e. mainly by 18 year olds) is the A-level; the AS level is similar but is equivalent to half an A-level.
Annex B MLM residuals and the shrinkage factor

Consider a simple two level model as in (1) below. The $y_{ij}$ could be the capped GCSE results for the $i$th pupil in the $j$th school, as in the DfES CVA model. For simplicity, the example here has just one explanatory variable, $x_{ij}$, which could, for example, be the Key Stage 2 prior attainment average point score used in the median method approach. The unexplained variance is partitioned between $u_{0j}$, for schools, and $e_{0ij}$, for each pupil.

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{0j} + e_{0ij} \qquad (1)$$

The standard OLS approach assumes no systematic variance at school level, i.e. it assumes that there is no need for the $u_{0j}$ term because $\sigma^2_{u0}$ is zero. Hence with OLS, the true unexplained variance in results is estimated with the simple raw residuals, which are the differences between observed and predicted results:

$$r_{ij} = y_{ij} - (\hat{\beta}_0 + \hat{\beta}_1 x_{ij})$$

Since in OLS all schools are treated as if they were simply the sum of their pupils rather than having a different systematic effect, the value added scores would be the mean of these $r_{ij}$ residuals, which can be written $\bar{r}_j$.

In the MLM context we want to estimate not just the overall residual for each pupil, but the component school and pupil parts, on the basis that $\sigma^2_{u0}$ is not zero. To do this, MLM uses an iterative process to produce maximum likelihood estimators for both the regression coefficients and the variance of the residuals at school and pupil level. In this process, the $\bar{r}_j$ figures do not provide efficient estimators for the school level residuals $\hat{u}_{0j}$. If instead the $\hat{u}_{0j}$ are defined as a function of the raw residuals,

$$\hat{u}_{0j} = c_j \bar{r}_j$$

it can be shown that the expected squared difference between the true and estimated school level residuals, $E[(\hat{u}_{0j} - u_{0j})^2]$, is actually minimised when

$$c_j = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + \sigma^2_{e0} / n_j} \qquad (2)$$

The solution to the multilevel model provides sample estimates for these variance terms $\sigma^2_{u0}$ and $\sigma^2_{e0}$. Hence it is possible to calculate a constant $c_j$ for each school. The multilevel school residuals $\hat{u}_{0j}$ are equal to the raw residuals $\bar{r}_j$ that would be obtained from the model's estimated fixed effects, adjusted by the values of $c_j$. The constant $c_j$ is called the shrinkage factor because it is bounded by 0 and 1, so the $\hat{u}_{0j}$ residuals will be smaller in magnitude (whether negative or positive) than the $\bar{r}_j$. As the size of the school, $n_j$, increases, $c_j$ becomes closer to 1. In other words, for large schools, the MLM and OLS value added residuals will be similar. For small schools, on the other hand, the shrinkage factor will have an impact. And the impact is larger when the within-school variation is large relative to the between-school variation (which is the case for the CVA model discussed in Section 4.2).
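To make the algebra concrete, the sketch below applies equation (2) with illustrative variance values, showing how the shrinkage factor pulls small schools' raw mean residuals towards zero while leaving large schools almost untouched.

```python
# Shrinkage factor from Annex B, equation (2): illustrative numbers only.
def shrinkage(n_j: int, var_u0: float, var_e0: float) -> float:
    """c_j = var_u0 / (var_u0 + var_e0 / n_j); bounded by 0 and 1."""
    return var_u0 / (var_u0 + var_e0 / n_j)

var_u0, var_e0 = 60.0, 1500.0  # between-school and pupil-level variances

for n_j, raw_mean_residual in [(10, 8.0), (50, 8.0), (200, 8.0)]:
    c_j = shrinkage(n_j, var_u0, var_e0)
    u_hat = c_j * raw_mean_residual  # shrunken MLM school residual
    print(f"n={n_j:3d}  c_j={c_j:.2f}  shrunken residual={u_hat:.2f}")
# A school of 10 is pulled strongly towards zero (c_j about 0.29), while a
# school of 200 keeps most of its raw residual (c_j about 0.89).
```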