Survey Weighting and the Calculation of Sampling Variance




PISA 2012 TECHNICAL REPORT, OECD 2014

Chapter contents: Survey weighting; Calculating sampling variance

Survey weights are required to facilitate analysis of PISA data, calculation of appropriate estimates of sampling error and making valid estimates and inferences of the population. The international contractor calculated survey weights for all assessed, ineligible and excluded students, and provided variables in the database that permit users to make approximately unbiased estimates of standard errors, conduct significance tests and create confidence intervals appropriately, given the complex sample design for PISA in each individual participating country.

SURVEY WEIGHTING

While the students included in the final PISA sample for a given country were chosen randomly, the selection probabilities of the students vary. Survey weights must therefore be incorporated into the analysis to ensure that each sampled student represents the appropriate number of students in the full PISA population.

There are several reasons why the survey weights are not the same for all students in a given country:

- A school sample design may intentionally over- or under-sample certain sectors of the school population: in the former case, so that they could be effectively analysed separately for national purposes, such as a relatively small but politically important province or region, or a sub-population using a particular language of instruction; and in the latter case, for reasons of cost, or other practical considerations, such as very small or geographically remote schools. [Note 1]
- Information about school size available at the time of sampling may not have been completely accurate. If a school was expected to be large, the selection probability was based on the assumption that only a sample of students would be selected from the school for participation in PISA. But if the school turned out to be small, all students would have to be included. In this scenario, the students would have a higher probability of selection in the sample than planned, making their inclusion probabilities higher than those of most other students in the sample. Conversely, if a school assumed to be small actually was large, the students included in the sample would have smaller selection probabilities than others.
- School non-response, where no replacement school participated, may have occurred, leading to the under-representation of students from that kind of school, unless weighting adjustments were made. It is also possible that only part of the PISA-eligible population in a school (such as those 15-year-old students in a particular grade) was represented by its student sample, which also requires weighting to compensate for the missing data from the omitted grades.
- Student non-response, within participating schools, occurred to varying extents. Sampled students who were PISA-eligible and not excluded, but did not participate in the assessment for reasons such as absences or refusals, will be under-represented in the data unless weighting adjustments were made.
- Trimming the survey weights to prevent undue influence of a relatively small subset of the school or student sample might have been necessary if a small group of students would otherwise have much larger weights than the remaining students in the country. Such large survey weights can lead to estimates with large sampling errors and inappropriate representations in the national estimates. Trimming survey weights introduces a small bias into estimates but greatly reduces standard errors (Kish, 1992).
- In countries that participated in the financial literacy study, additional students were selected in the schools eligible for the financial literacy assessment. Since the financial literacy sample was also designed to represent the full PISA student population, the weights for the sampled students were adjusted to account for this. Different adjustment factors applied to each student's weight, depending on whether the student was sampled for financial literacy or not.

The procedures used to derive the survey weights for PISA reflect the standards of best practice for analysing complex survey data, and the procedures used by the world's major statistical agencies. The same procedures were used in other international studies of educational achievement such as the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS), which were all implemented by the International Association for the Evaluation of Educational Achievement (IEA). The underlying statistical theory for the analysis of survey data can be found in Cochran (1977), Lohr (2010) and Särndal, Swensson and Wretman (1992).
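To make concrete why unequal selection probabilities call for weights, here is a minimal Python sketch comparing an unweighted and a weighted mean. The scores, the weights and the function name `weighted_mean` are invented for illustration and are not taken from the PISA database.

```python
def weighted_mean(values, weights):
    """Weighted mean: each sampled student counts for `weight` population students."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical data: the third student comes from an under-sampled sector,
# so that student stands in for four times as many students as the others.
scores = [400.0, 500.0, 600.0]
final_weights = [10.0, 10.0, 40.0]

unweighted = sum(scores) / len(scores)            # ignores the sample design
weighted = weighted_mean(scores, final_weights)   # respects the sample design
print(unweighted, weighted)
```

With these made-up numbers the unweighted mean understates the contribution of the under-sampled group, which is the kind of distortion the survey weights are meant to remove.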

Weights are applied to student-level data for analysis. The weight, $W_{ij}$, for student $j$ in school $i$ consists of two base weights, the school base weight and the within-school base weight, and five adjustment factors, and can be expressed as:

\[ W_{ij} = t_{2ij} \, f_{1i} \, f^{A}_{1ij} \, f_{2ij} \, t_{1i} \, w_{2ij} \, w_{1i} \tag{8.1} \]

where:

- $w_{1i}$, the school base weight, is given as the reciprocal of the probability of inclusion of school $i$ into the sample;
- $w_{2ij}$, the within-school base weight, is given as the reciprocal of the probability of selection of student $j$ from within the selected school $i$;
- $f_{1i}$ is an adjustment factor to compensate for non-participation by other schools that are somewhat similar in nature to school $i$ (not already compensated for by the participation of replacement schools);
- $f^{A}_{1ij}$ is an adjustment factor to compensate for schools in some participating countries where only 15-year-old students who were enrolled in the modal grade for 15-year-old students were included in the assessment;
- $f_{2ij}$ is an adjustment factor to compensate for non-participation by students within the same school non-response cell and explicit stratum, and, where permitted by the sample size, within the same high/low grade and gender categories;
- $t_{1i}$ is a school base weight trimming factor, used to reduce unexpectedly large values of $w_{1i}$; and
- $t_{2ij}$ is a final student weight trimming factor, used to reduce the weights of students with exceptionally large values for the product of all the preceding weight components.

The school base weight

The term $w_{1i}$ is referred to as the school base weight. For the systematic sampling with probability proportional-to-size method used in sampling schools for PISA, this weight is given as:

\[ w_{1i} = \begin{cases} \dfrac{I_g}{MOS_i} & \text{if } MOS_i < I_g \\ 1 & \text{otherwise} \end{cases} \tag{8.2} \]

The term $MOS_i$ denotes the Measure of Size given to each school on the sampling frame. Despite country variations, $MOS_i$ was usually equal to the estimated number of 15-year-old students in the school, if it was greater than the predetermined target cluster size (TCS), which was 35 students for most countries that did not participate in the financial literacy study, and 43 for most countries that did. If small schools were under-sampled without the use of an explicit stratum for small schools, then if the enrolment of 15-year-old students was less than the TCS, but greater than TCS/2, $MOS_i = TCS$. If the enrolment was between 3 and TCS/2, $MOS_i = TCS/2$, and if the enrolment was 1 or 2, $MOS_i = TCS/4$. These different values of the measure of size are intended to minimise the impact of small schools on the variation of the weights.

The term $I_g$ denotes the sampling interval used within the explicit sampling stratum $g$ that contains school $i$ and is calculated as the total of the $MOS_i$ values for all schools in stratum $g$, divided by the school sample size for that stratum.

Thus, if school $i$ was estimated to have one hundred 15-year-old students at the time of sample selection, $MOS_i = 100$. If the country had a single explicit stratum ($g = 1$) and the total of the $MOS_i$ values over all schools was 150 000 students, with a school sample size of 150, then the sampling interval is $I_1 = 150\,000 / 150 = 1\,000$ for school $i$ (and others in the sample), giving a school base weight of $w_{1i} = 1\,000 / 100 = 10.0$. Thus, the school can be thought of as representing about ten schools in the population. In this example, any school with 1 000 or more 15-year-old students would be included in the sample with certainty, with a base weight of $w_{1i} = 1$, as its $MOS_i$ is at least as large as the sampling interval.
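The measure-of-size rules and equation 8.2 above can be sketched in Python as follows. This is a minimal illustration, not the contractor's implementation: the function names are hypothetical, `measure_of_size` covers only the stated case of small schools under-sampled without an explicit small-school stratum, and the treatment of an enrolment exactly equal to the TCS is an assumption (both readings give the same MOS).

```python
def measure_of_size(enrolment, tcs=35):
    """MOS for a school, following the small-school rules described above."""
    if enrolment >= tcs:
        return float(enrolment)   # MOS is the estimated 15-year-old enrolment
    if enrolment > tcs / 2:
        return float(tcs)         # between TCS/2 and TCS
    if enrolment >= 3:
        return tcs / 2            # between 3 and TCS/2
    if enrolment >= 1:
        return tcs / 4            # enrolment of 1 or 2
    raise ValueError("the text gives no rule for schools without eligible students")


def sampling_interval(total_mos, n_sampled_schools):
    """I_g: total MOS in the explicit stratum divided by its school sample size."""
    return total_mos / n_sampled_schools


def school_base_weight(mos, interval):
    """Equation 8.2: I_g / MOS_i if MOS_i < I_g, otherwise 1 (certainty selection)."""
    return interval / mos if mos < interval else 1.0


# Worked example from the text: one stratum, total MOS 150 000, 150 sampled schools.
interval = sampling_interval(150_000, 150)
print(interval, school_base_weight(100, interval), school_base_weight(1_000, interval))
```

The printed values reproduce the worked example: a sampling interval of 1 000, a base weight of 10 for the school with a MOS of 100, and a base weight of 1 for a certainty school.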

The school base weight trimming factor

Once school base weights were established for each sampled school in the country, verifications were made separately within each explicit sampling stratum to determine if the school base weights required trimming. The school trimming factor, $t_{1i}$, is the ratio of the trimmed to the untrimmed school base weight, and for most schools (and therefore most students in the sample) is equal to 1.0000.

The school-level trimming adjustment was applied to schools that turned out to be much larger than was assumed at the time of school sampling. Schools were flagged where the actual 15-year-old student enrolment exceeded $3 \times \max(TCS, MOS_i)$. For example, if the TCS was 35 students, then a school flagged for trimming had more than 105 (= 3 × 35) PISA-eligible students, and more than three times as many students as was indicated on the school sampling frame. Because the student sample size was set at TCS regardless of the actual enrolment, the student sampling rate was much lower than anticipated during the school sampling. This meant that the weights for the sampled students in these schools would have been more than three times greater than anticipated when the school sample was selected. These schools had their school base weights trimmed by having $MOS_i$ replaced by $3 \times \max(TCS, MOS_i)$ in the school base weight formula.

The within-school base weight

The term $w_{2ij}$ is referred to as the within-school base weight. With the PISA procedure for sampling students, $w_{2ij}$ did not vary across students within a particular school. That is, all of the students within the same school had the same probability of selection for participation in PISA. This weight is given as:

\[ w_{2ij} = \frac{enr_i}{sam_i} \tag{8.3} \]

where $enr_i$ is the actual enrolment of 15-year-old students in the school on the day of the assessment (and so, in general, is somewhat different from the $MOS_i$), and $sam_i$ is the sample size within school $i$. It follows that if all PISA-eligible students from the school were selected, then $w_{2ij} = 1$ for all eligible students in the school.
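The trimming rule and equation 8.3 above can be sketched as follows. This is an illustrative reading of the text under stated assumptions: the function names and the example school (MOS of 40 on the frame, 200 students actually enrolled, sampling interval of 1 000) are invented.

```python
def base_weight(mos, interval):
    """School base weight: interval / MOS if MOS < interval, otherwise 1."""
    return interval / mos if mos < interval else 1.0


def school_trimming_factor(mos, actual_enrolment, interval, tcs=35):
    """t_1i: ratio of the trimmed to the untrimmed school base weight."""
    threshold = 3 * max(tcs, mos)
    if actual_enrolment <= threshold:
        return 1.0                               # not flagged, no trimming
    trimmed = base_weight(threshold, interval)   # MOS replaced by 3 * max(TCS, MOS)
    return trimmed / base_weight(mos, interval)


def within_school_base_weight(enrolment, sample_size):
    """Equation 8.3: actual 15-year-old enrolment over within-school sample size."""
    return enrolment / sample_size


# Hypothetical school that turned out five times larger than the frame indicated.
t1 = school_trimming_factor(mos=40, actual_enrolment=200, interval=1_000.0)
w2 = within_school_base_weight(enrolment=200, sample_size=35)
print(round(t1, 4), round(w2, 4))
```

In this example the untrimmed base weight is 1 000/40 = 25 and the trimmed one is 1 000/120, so the trimming factor is one third; a school whose enrolment stays within the threshold keeps a factor of 1.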
For all other cases $w_{2ij} > 1$, as the selected student represents other students in the school besides themselves.

In the case of the grade sampling option, for direct sampled grade students, the sampling interval for the extra grade students was the same as that for the PISA students. Therefore, countries with extra direct-sampled grade students (Iceland and some of the grade students in Switzerland) have the same within-school student weights for the extra grade students as those for PISA-eligible students from the same school. For Switzerland's other grade sampled students, these had weights of 1. For Slovenia, a separate sample size was specified for the non-PISA grade students and so their weights differed from those of the PISA students in the same school.

Additional weight components were needed for the grade students in Chile and Germany. For these two countries, the extra weight component consisted of the class weight for the selected class(es) (all students were selected into the grade sample in the selected class[es]). In these two countries, the extra weight component resulted in the necessity of a second weighting stream for the extra grade students.

The school non-response adjustment

In order to adjust for the fact that those schools that declined to participate, and were not replaced by a replacement school, were not in general typical of the schools in the sample as a whole, school-level non-response adjustments were made. Several groups of somewhat similar schools were formed within a country, and within each group the weights of the responding schools were adjusted to compensate for the missing schools and their students. The compositions of the non-response groups varied from country to country, but were based on cross-classifying the explicit and implicit stratification variables used at the time of school sample selection. Usually, about 10 to 15 such groups were formed within a given country depending upon school distribution with respect to stratification variables. If a country provided no implicit stratification variables, schools were divided into three roughly equal groups, within each explicit stratum, based on their enrolment size.

It was desirable to ensure that each group had at least six participating schools, as small groups could lead to unstable weight adjustments, which in turn would inflate the sampling variances. Adjustments greater than 2.0 were also flagged for review, as they could have caused increased variability in the weights and would have led to an increase in sampling variances. It was not necessary to collapse cells where all schools participated, as the school non-response adjustment factor was 1.0 regardless of whether cells were collapsed or not. However, such cells were sometimes collapsed to ensure that enough responding students would be available for the student non-response adjustments in a later weighting step. In either of these situations, cells were generally collapsed over the last implicit stratification variable(s) until the violations no longer existed. In participating countries with very high overall levels of school non-response after school replacement, the requirement for school non-response adjustment factors to all be below 2.0 was waived.

Within the school non-response adjustment group containing school $i$, the non-response adjustment factor was calculated as:

\[ f_{1i} = \frac{\sum_{k \in \Omega(i)} w_{1k} \, enr(k)}{\sum_{k \in \Gamma(i)} w_{1k} \, enr(k)} \tag{8.4} \]

where the sum in the denominator is over $\Gamma(i)$, which are the schools within the group (originals and replacements) that participated, while the sum in the numerator is over $\Omega(i)$, which are those same schools, plus the original sample schools that refused and were not replaced. The numerator estimates the population of 15-year-old students in the group, while the denominator gives the size of the population of 15-year-old students directly represented by participating schools. The school non-response adjustment factor ensures that participating schools are weighted to represent all students in the group. If a school did not participate because it had no PISA-eligible students enrolled, no adjustment was necessary since this was considered neither non-response nor under-coverage.

Figure 8.1 shows the number of school non-response classes that were formed for each country/economy and the variables that were used to create the cells.
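Equation 8.4 can be illustrated with a short Python sketch. The function name, the representation of each school as a `(w1, enr)` pair and the numbers are assumptions made for the example; the 2.0 review threshold is the one stated above.

```python
def school_nonresponse_factor(participating, refused_not_replaced):
    """Equation 8.4 for one non-response adjustment group.

    Each school is a (w1, enr) pair: school base weight and 15-year-old enrolment.
    `participating` plays the role of Gamma(i); adding the refusing,
    unreplaced schools gives Omega(i).
    """
    represented = sum(w1 * enr for w1, enr in participating)
    if represented == 0:
        raise ValueError("group has no participating schools; collapse it first")
    missing = sum(w1 * enr for w1, enr in refused_not_replaced)
    return (represented + missing) / represented


# Hypothetical group: three participating schools and one unreplaced refusal.
participating = [(10.0, 100), (8.0, 120), (12.0, 80)]
refused = [(10.0, 90)]

f1 = school_nonresponse_factor(participating, refused)
needs_review = f1 > 2.0   # adjustments above 2.0 were flagged for review
print(round(f1, 4), needs_review)
```

When every school in a group participates, the numerator and denominator coincide and the factor is 1.0, which is why such cells did not need collapsing for this step.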
The grade non-response adjustment

Because of perceived administrative inconvenience, individual schools may occasionally agree to participate in PISA but require that participation be restricted to 15-year-old students in the modal grade for 15-year-old students, rather than all 15-year-old students. Since the modal grade generally includes the majority of the population to be covered, such schools may be accepted as participants rather than have the school refuse to participate entirely. For the part of the 15-year-old population in the modal grade, these schools are respondents, while for the rest of the grades in the school with 15-year-old students, such a school is a refusal. To account for this, a special non-response adjustment can be calculated at the school level for students not in the modal grade (and is automatically 1.0 for all students in the modal grade).

No countries had this type of non-response for PISA 2012, so the weight adjustment for grade non-response was automatically 1.0 for all students in both the modal and non-modal grades, and therefore did not affect the final weights.

If the weight adjustment for grade non-response had been needed (as it was in earlier cycles of PISA in a few countries), it would have been calculated as follows. Within the same non-response adjustment groups used for creating school non-response adjustment factors, the grade non-response adjustment factor for all students in school $i$, $f^{A}_{1i}$, is given as:

\[ f^{A}_{1i} = \begin{cases} \dfrac{\sum_{k \in C(i)} w_{1k} \, enra(k)}{\sum_{k \in B(i)} w_{1k} \, enra(k)} & \text{for students not in the modal grade} \\ 1 & \text{otherwise} \end{cases} \tag{8.5} \]

The variable $enra(k)$ is the approximate number of 15-year-old students in school $k$ but not in the modal grade. The set $B(i)$ is all schools that participated for all eligible grades (from within the non-response adjustment group with school $i$), while the set $C(i)$ includes these schools and those that only participated for the modal responding grade.
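As a hedged numeric illustration of equation 8.5 (grade non-response did not occur in PISA 2012, so every value below is invented), suppose a non-response adjustment group contains two schools that participated for all eligible grades and one school that participated for the modal grade only:

```latex
% Hypothetical values of (w_{1k}, enra(k)):
%   (10, 20) and (8, 30) for the two schools in B(i),
%   (12, 25) for the modal-grade-only school, which is in C(i) but not in B(i).
\[
f^{A}_{1i}
  = \frac{10 \cdot 20 + 8 \cdot 30 + 12 \cdot 25}{10 \cdot 20 + 8 \cdot 30}
  = \frac{740}{440}
  \approx 1.68
\]
```

Under these assumed numbers, students outside the modal grade would have their weights multiplied by about 1.68, while students in the modal grade would keep a factor of 1.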

Figure 8.1 [Part 1/2] Non-response classes

| Country/Economy | Implicit stratification variables | Number of original cells (2012) | Number of final cells (2012) |
|---|---|---|---|
| Albania | ISCED2/Mixed (2) | 18 | 8 |
| Argentina | Funding (2); Education type (3); Education level (9); Urbanicity (2); Secular/Religious (2) | 83 | 14 |
| Australia | Geographic Zone (3); School Gender Composition (3); School Socio-economic Level (6); Numeracy Achievement Level (6); ISCED Level (3) | 455 | 84 |
| Austria | School Type (4); Region (9); Percentage of Girls (5) | 191 | 22 |
| Belgium | Grade Repetition - Flemish Community and French Community (5), German Community (1); Percentage of Girls - Flemish Community and French Community (4), German Community (1); School Type - French Community (4), German Community and Flemish Community (1) | 224 | 36 |
| Brazil | Admin (3); DHI Quintiles (6); ISCED level (4); Urbanicity (2) | 420 | 118 |
| Bulgaria | Type of School (8); Size of Settlement (5); Funding (3) | 131 | 28 |
| Canada | Urbanicity (3); Funding (2); ISCED Level (4) | 194 | 57 |
| Chile | % Girls (6); Urbanicity (2); Region (4) | 156 | 22 |
| Colombia | Urbanicity (2); Funding (2); Weekend school or not (2); Gender (5); ISCED Programme Orientation (4) | 113 | 26 |
| Costa Rica | Programme (2); Urbanicity (2); Shift (2); Region (27); ISCED Level (3) | 93 | 11 |
| Croatia | Gender (3); Urbanicity (3); Region (6) | 78 | 27 |
| Cyprus [1, 2] | Language (2); ISCED Level (3) | 14 | 10 |
| Czech Republic | School Size (3); Region for Programmes 3, 4, 5, 6 (15); School Gender Composition (3) | 194 | 39 |
| Denmark | School Type (8); ISCED Level (4); Urbanicity (6); Region (6) | 164 | 42 |
| Estonia | School Type (3); Urbanicity (2); County (15); Funding (2) | 81 | 24 |
| Finland | School Type (7) | 52 | 20 |
| France | School Type for small school strata (4); Funding (2) | 19 | 8 |
| Germany | State for other schools (17); School Type (6) | 79 | 34 |
| Greece | School Type (3); Funding (2) | 44 | 15 |
| Hong Kong-China | Student Academic Intake (4) | 11 | 8 |
| Hungary | Region (7); Mathematics Performance (6) | 122 | 31 |
| Iceland | Urbanicity (2); ISCED Level (2) | 41 | 14 |
| Indonesia | Province (32); Funding (2); School Type and Level (5); National Exam Result (3) | 148 | 27 |
| Ireland | Socio-Economic Status Category (5); School Gender Composition Category (5) | 93 | 25 |
| Israel | ISCED Level (4); Group Size (3); SES (4); District (3) | 67 | 17 |
| Italy | Funding (2) | 152 | 69 |
| Japan | Levels of proportion of students taking University/College Entrance Exams (4) | 16 | 13 |
| Jordan | Urbanicity (2); Gender (3); Level (2); Shift (2) | 52 | 25 |
| Kazakhstan | Urbanicity (2); ISCED Level (3); ISCED Programme Orientation (2); Funding (2) | 128 | 35 |
| Korea | Urbanicity Level (3); School Gender Composition (3) | 24 | 13 |
| Latvia | School Type/Level (5) | 18 | 9 |
| Liechtenstein | Funding (2) | 2 | 2 |
| Lithuania | Funding (2) | 21 | 12 |
| Luxembourg | School Gender Composition (3) | 8 | 8 |
| Macao-China | Gender (3); School Orientation (2); ISCED Level (2) | 19 | 13 |
| Malaysia | School Type (16); Urbanicity (2); State (16); Gender (3); ISCED Level (2) | 61 | 17 |
| Mexico | School Level (2); School Programme (7); Funding (2); Urbanicity (2) | 610 | 124 |
| Montenegro | Gender (3) | 17 | 15 |
| Netherlands | Programme Category (7) | 14 | 6 |
| New Zealand | School Decile (4); Funding (2); School Gender Composition (3); Urbanicity (2) | 37 | 16 |
| Norway | None | 8 | 4 |
| Peru | Region (26); Gender (3); School Type (7) | 104 | 27 |
| Poland | School Sub-type (2); Funding (2); Locality (4); Gender Composition (3) | 36 | 7 |
| Portugal | ISCED Level (3); Funding (2); Urbanicity (3) | 101 | 31 |
| Qatar | Gender (3); Language (2); Level (5); Funding (2); Programme Orientation (3) | 42 | 13 |
| Romania | Language (2); Urbanicity (2); LIC Type (3) | 10 | 5 |
| Russian Federation | Location/Urbanicity (9); School Type (8); School Sub-type (5) | 193 | 43 |
| Serbia | Region (5); Programme (7) | 38 | 18 |
| Shanghai-China | Urbanicity (2); Funding (2); Vocational School Type (4) | 19 | 17 |
| Singapore | Gender (3) | 6 | 4 |
| Slovak Republic | Sub-Type (6); Language (3); Grade Repetition Level (25); Exam (11) | 96 | 34 |
| Slovenia | Location/Urbanicity (5); Gender (3) | 146 | 43 |

1. Note by Turkey: The information in this document with reference to "Cyprus" relates to the southern part of the Island. There is no single authority representing both Turkish and Greek Cypriot people on the Island. Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable solution is found within the context of the United Nations, Turkey shall preserve its position concerning the "Cyprus issue".
2. Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus is recognised by all members of the United Nations with the exception of Turkey. The information in this document relates to the area under the effective control of the Government of the Republic of Cyprus.

Figure 8.1 [Part 2/2] Non-response classes

| Country/Economy | Implicit stratification variables | Number of original cells (2012) | Number of final cells (2012) |
|---|---|---|---|
| Spain | None | 129 | 105 |
| Sweden | Geographic LAN (22); Responsible Authority (4); Level of Immigrants (5); Income Quartiles (5) | 114 | 33 |
| Switzerland | School Type (28); Canton (26) | 144 | 36 |
| Chinese Taipei | County/City area (22); School Gender (3) | 125 | 41 |
| Thailand | Region (9); Urbanicity (2); Gender (3) | 118 | 27 |
| Tunisia | ISCED Level (3); Funding (2); % Repeaters (3) | 85 | 23 |
| Turkey | School Type (18); Gender (3); Urbanicity (2); Funding (2) | 128 | 27 |
| United Arab Emirates | School Level (3); School Gender (3) | 128 | 59 |
| United Kingdom | Gender (3); School Performance - England and Wales (6), Northern Ireland (1); Local Authority - England (151), Wales (22), Northern Ireland (1); Area Type - Scotland (6) | 339 | 66 |
| United States | Grade Span (5); Urbanicity (4); Minority Status (2); Gender (3); State (51) | 223 | 32 |
| Uruguay | Location/Urbanicity (4); Gender (4) | 33 | 16 |
| Viet Nam | Economic Region (8); Province (63); School Type (6); Study Commitment (2) | 142 | 28 |

This procedure gives, for each school, a single grade non-response adjustment factor that depends upon its non-response adjustment class. Each individual student has this factor applied to the weight if he/she did not belong to the modal grade, and 1.0 if belonging to the modal grade. In general, this factor is not the same for all students within the same school when a country has some grade non-response.

The within-school non-response adjustment

The final level of non-response adjustment was at the student level. Student non-response adjustment cells were created by forming four cells within each school, by cross-classifying gender with grade, dichotomised into high and low categories. The definition as to which grades were high and which were low varied across explicit strata, with the aim of making the two groups as equal in size as possible.
In general the cells formed in this way were too small for the formation of stable non-response adjustment factors (sometimes such cells even contained no responding students). Thus cells were collapsed to create final student non-response adjustment cells. Initially the collapsing was across schools, within school non-response adjustment classes. Then as necessary either grade or gender was collapsed. The student non-response adjustment $f_{2i}$ was calculated as:

\[ f_{2i} = \frac{\sum_{k \in X(i)} f_{1i} \, w_{1i} \, w_{2ik}}{\sum_{k \in \Delta(i)} f_{1i} \, w_{1i} \, w_{2ik}} \tag{8.6} \]

where $\Delta(i)$ is all assessed students in the final student non-response adjustment cell; and $X(i)$ is all assessed students in the final student non-response adjustment cell, plus all other students who should have been assessed (i.e. who were absent, but not excluded or ineligible).

As mentioned, the high and low grade categories within each explicit stratum in each country were defined so as to each contain a substantial proportion of the PISA population.

In most cases, this student non-response factor reduces to the ratio of the number of students who should have been assessed to the number who were assessed. In some cases, where it was necessary to collapse cells together, the more complex formula shown above was required. Additionally, an adjustment factor greater than 2.0 was not allowed, for the same reasons noted under school non-response adjustments. If this occurred, the cell with the large adjustment was collapsed with the closest cell within grade and gender combinations in the same school non-response cell.

Some schools in some countries had extremely low student response levels. In these cases it was determined that the small sample of assessed students within the school was potentially too biased as a representation of the school to be included in the final PISA dataset. For any school where the student response rate was below 25%, the school was treated as a non-respondent, and its student data were removed. In schools with between 25 and 50% student response, the student non-response adjustment described above would have resulted in an adjustment factor of between 2.0 and 4.0, and so the grade-gender cells of these schools were collapsed with others to create student non-response adjustments. [Note 2]

For countries with extra direct grade sampled students (Iceland, Slovenia and Switzerland), care was taken to ensure that student non-response cells were formed separately for PISA students and the extra non-PISA grade students. No procedural changes were needed for Chile and Germany since a separate weighting stream was needed for the grade students.

Trimming the student weights

This final trimming check was used to detect individual student weights that were unusually large compared to those of other students within the same explicit stratum. The sample design was intended to give all students from within the same explicit stratum an equal probability of selection and therefore equal weight, in the absence of school and student non-response. As already noted, poor prior information about the number of eligible students in each school could lead to substantial violations of this equal weighting principle. Moreover, school, grade, and student non-response adjustments, and, occasionally, inappropriate student sampling could, in a few cases, accumulate to give a few students in the data relatively large weights, which adds considerably to the sampling variance. The weights of individual students were therefore reviewed, and where the weight was more than four times the median weight of students from the same explicit sampling stratum, it was trimmed to be equal to four times the median weight for that explicit stratum.

The student trimming factor, $t_{2ij}$, is equal to the ratio of the final student weight to the student weight adjusted for student non-response, and therefore equal to 1.0 for the great majority of students.
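The student non-response adjustment and the student weight trimming described above can be sketched together in Python. The function names and all numbers are hypothetical, and the sketch assumes that the stratum median is taken over the untrimmed, non-response-adjusted weights, which the text does not state explicitly.

```python
from statistics import median


def student_nonresponse_factor(weights_should_be_assessed, weights_assessed):
    """Ratio of summed weights: all students who should have been assessed (X)
    over those actually assessed (Delta), within one final adjustment cell."""
    return sum(weights_should_be_assessed) / sum(weights_assessed)


def trim_student_weights(weights, cap_multiple=4.0):
    """Cap each weight at `cap_multiple` times the stratum median.

    Returns the trimmed weights and the trimming factors t_2
    (trimmed weight divided by untrimmed weight).
    """
    cap = cap_multiple * median(weights)
    trimmed = [min(w, cap) for w in weights]
    factors = [t / w for t, w in zip(trimmed, weights)]
    return trimmed, factors


# Hypothetical cell: ten eligible students with equal weight 50, eight assessed.
f2 = student_nonresponse_factor([50.0] * 10, [50.0] * 8)

# Hypothetical stratum in which one student has an unusually large weight.
trimmed, t2 = trim_student_weights([100.0, 110.0, 120.0, 130.0, 600.0])
print(f2, trimmed[-1], round(t2[-1], 2))
```

With equal weights inside the cell the factor reduces to the simple count ratio 10/8, matching the remark in the text; in the trimming example only the weight of 600 exceeds four times the median of 120, so it is capped at 480 and every other student keeps a trimming factor of 1.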
The final weight variable on the data file is the final student weight that incorporates any student-level trimming. As in PISA 2000, PISA 2003, PISA 2006 and PISA 2009, minimal trimming was required at either the school or the student levels.

The financial literacy adjustment factor

The financial literacy weighting adjustment factor was applied to all students in the schools sampled for the financial literacy study. Despite differences in TCS values and in the number of financial literacy sampled students, the factors were the same for almost all countries. The financial literacy booklet was applied at a rate of 43/8, which then became the adjustment factor for students who received the financial literacy booklet. For the remaining students, the factor was 43/35, the rate at which non-financial literacy booklets were applied. Alternative factors were used for whole schools using the une heure booklet (UH) to reflect the slightly different rate at which the financial literacy UH booklet was applied in those schools (16/3 for students receiving a financial literacy booklet and 16/13 for students not receiving a financial literacy booklet) (see Chapter 2 for further details on the UH booklet).

Weighting for computer-based assessment

No non-response adjustments were made for schools or students sampled for computer-based assessment (CBA) which did not participate. Since CBA was being treated as a minor domain like mathematics and reading, absent CBA students were treated in the same manner as a student not assigned a booklet containing items in the mathematics or reading domain. Plausible values were generated for these CBA students, as well as for all other students who had not been subsampled for CBA. For most countries, CBA final sample sizes are therefore identical to sample sizes of paper-based tests. Sample weights and replicate weights can be used without any modification. The school subsampling for CBA for Brazil, Italy and Spain needed to be accounted for in weighting through an additional weight component.
Thus, schools subsampled for CBA for Brazil, Italy and Spain had their own weighting stream, separate from the weighting stream for the large national samples in these countries. Once in their own weighting stream, weighting procedures for these CBA subsampled schools and students were the same as the weighting procedures used for all countries.

CALCULATING SAMPLING VARIANCE

A replication methodology was employed to estimate the sampling variances of PISA parameter estimates. This methodology accounted for the variance in estimates due to the sampling of schools and students. Additional variance due to the use of plausible values from the posterior distributions of scaled scores was captured separately as measurement error. Computationally the calculation of these two components could be carried out in a single program, such as WesVar 5.1 (Westat, 2007). SPSS and SAS macros were also developed. For further detail, see PISA Data Analysis Manual (OECD, 2009).
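As a rough illustration of how the two variance components might be brought together in one program, the sketch below applies the standard multiple-imputation combining rule to a set of plausible-value estimates. This rule is an assumption here, since this section does not state the formula. The function name `combine_pv_variance`, the choice of five plausible values and all numbers are hypothetical.

```python
import numpy as np

def combine_pv_variance(pv_estimates, pv_sampling_variances):
    """Combine per-plausible-value results into one estimate and total variance.

    Assumed rule (standard multiple-imputation combining rule):
      final estimate   = mean of the per-PV estimates
      sampling var     = mean of the per-PV sampling variances
      measurement var  = variance of the per-PV estimates (divisor M - 1)
      total variance   = sampling var + (1 + 1/M) * measurement var
    """
    est = np.asarray(pv_estimates, dtype=float)
    samp = np.asarray(pv_sampling_variances, dtype=float)
    m = est.size
    final_estimate = est.mean()
    sampling_var = samp.mean()
    measurement_var = est.var(ddof=1)
    total_var = sampling_var + (1.0 + 1.0 / m) * measurement_var
    return final_estimate, total_var

# Hypothetical results: one estimate and one sampling variance per plausible value.
pv_estimates = [500.0, 502.0, 498.0, 501.0, 499.0]
pv_sampling_variances = [9.0, 9.0, 9.0, 9.0, 9.0]
estimate, total_var = combine_pv_variance(pv_estimates, pv_sampling_variances)
print(estimate, total_var, np.sqrt(total_var))
```

The per-plausible-value sampling variances passed in would themselves come from the replication method described in the remainder of this chapter.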

The Balanced Repeated Replication variance estimator

The approach used for calculating sampling variances for PISA estimates is known as Balanced Repeated Replication (BRR), or balanced half-samples; the particular variant known as Fay's method was used. This method is similar in nature to the jackknife method used in other international studies of educational achievement, such as TIMSS, and it is well documented in the survey sampling literature (see Rust, 1985; Rust and Rao, 1996; Shao, 1996; Wolter, 2007). The major advantage of the BRR method over the jackknife method is that the jackknife is not fully appropriate for use with non-differentiable functions of the survey data, most notably quantiles, for which it does not provide a statistically consistent estimator of variance. This means that, depending upon the sample design, the jackknife variance estimator can be unstable, and despite empirical evidence that it can behave well in a PISA-like design, theory is lacking. In contrast, the BRR method does not have this theoretical flaw. The standard BRR procedure can become unstable when used to analyse sparse population subgroups, but Fay's method overcomes this difficulty, and is well justified in the literature (Judkins, 1990).

The BRR method was implemented for a country where the student sample was selected from a sample of schools, rather than all schools, as follows:

Schools were paired on the basis of the explicit and implicit stratification and frame ordering used in sampling. The pairs were originally sampled schools, except for participating replacement schools that took the place of an original school. For an odd number of schools within a stratum, a triple was formed consisting of the last three schools on the sorted list.

Pairs were numbered sequentially, 1 to H, with pair number denoted by the subscript h. Other studies and the literature refer to such pairs as variance strata or zones, or pseudo-strata.

Within each variance stratum, one school was randomly numbered as 1, the other as 2 (and the third as 3, in a triple), which defined the variance unit of the school.
Subscript j refers to this numbering.

These variance strata and variance units (1, 2, 3) assigned at school level were attached to the data for the sampled students within the corresponding school.

Let the estimate of a given statistic from the full student sample be denoted as $X^*$. This was calculated using the full sample weights.

A set of 80 replicate estimates, $X^*_t$ (where t runs from 1 to 80), was created. Each of these replicate estimates was formed by multiplying the survey weights from one of the two schools in each stratum by 1.5, and the weights from the remaining schools by 0.5. The determination as to which schools received inflated weights, and which received deflated weights, was carried out in a systematic fashion, based on the entries in a Hadamard matrix of order 80. A Hadamard matrix contains entries that are +1 and −1 in value, and has the property that the matrix, multiplied by its transpose, gives the identity matrix of order 80, multiplied by a factor of 80. Details concerning Hadamard matrices are given in Wolter (2007).

In cases where there were three units in a triple, either one of the schools (designated at random) received a factor of 1.7071 for a given replicate, with the other two schools receiving factors of 0.6464, or else the one school received a factor of 0.2929 and the other two schools received factors of 1.3536. How these particular factors came to be used is explained in Appendix 12 of the PISA 2000 Technical Report (Adams and Wu, 2002).

To use a Hadamard matrix of order 80 requires that there be no more than 80 variance strata within a country, or else that some combining of variance strata be carried out prior to assigning the replication factors via the Hadamard matrix. The combining of variance strata does not cause bias in variance estimation, provided that it is carried out in such a way that the assignment of variance units is independent from one stratum to another within strata that are combined.
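The assignment of replicate factors from a Hadamard matrix, described above, can be sketched as follows. The text does not say which order-80 matrix was used, so the construction below (a Paley matrix of order 20 combined with a Sylvester matrix of order 4 through a Kronecker product) is only one possible choice and is an assumption. The helper names `paley_hadamard` and `replicate_factor`, the convention that a +1 entry inflates variance unit 1, and the treatment of unit 1 as the designated school in a triple are likewise hypothetical. The triple factors are written as 1 ± 1/√2 and 1 ∓ 1/(2√2), which agree with the quoted values to four decimals; the derivation itself is in the cited appendix.

```python
import numpy as np

def paley_hadamard(q):
    """Hadamard matrix of order q + 1 via the Paley construction (prime q, q % 4 == 3)."""
    squares = {(x * x) % q for x in range(1, q)}

    def chi(a):
        # Quadratic character mod q: 0 for 0, +1 for squares, -1 otherwise.
        a %= q
        return 0 if a == 0 else (1 if a in squares else -1)

    jacobsthal = np.array([[chi(i - j) for j in range(q)] for i in range(q)])
    s = np.zeros((q + 1, q + 1), dtype=int)
    s[0, 1:] = 1
    s[1:, 0] = -1
    s[1:, 1:] = jacobsthal
    return np.eye(q + 1, dtype=int) + s

# One possible Hadamard matrix of order 80 = 4 x 20 (an assumed construction).
h2 = np.array([[1, 1], [1, -1]])
h80 = np.kron(np.kron(h2, h2), paley_hadamard(19))

def replicate_factor(entry, unit, units_in_stratum=2):
    """Replicate weight factor for one school, given a Hadamard entry of +1 or -1."""
    if units_in_stratum == 2:
        inflate = (unit == 1) == (entry == 1)
        return 1.5 if inflate else 0.5
    # Triple: unit 1 is taken to be the randomly designated school (assumption).
    r = 1.0 / np.sqrt(2.0)
    if entry == 1:
        return 1.0 + r if unit == 1 else 1.0 - r / 2.0
    return 1.0 - r if unit == 1 else 1.0 + r / 2.0

# Defining property: the matrix times its transpose is 80 times the identity.
print(np.array_equal(h80 @ h80.T, 80 * np.eye(80, dtype=int)))
# Factors for variance stratum 0 (row) in replicate 5 (column), pair and triple cases.
print([replicate_factor(h80[0, 5], u) for u in (1, 2)])
print([round(replicate_factor(h80[0, 5], u, 3), 4) for u in (1, 2, 3)])
```

Treating rows as variance strata and columns as replicates is a convention chosen for this sketch. Within any pair the two factors sum to 2, and within any triple the three factors sum to 3, so the total weight of a variance stratum is preserved in every replicate before non-response adjustments are recomputed.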
To keep the assignment independent across combined strata, variance units must be assigned before any variance strata are combined; this approach was used for PISA.

The reliability of variance estimates for important population subgroups is enhanced if any combining of variance strata that is required is conducted by combining variance strata from different subgroups. Thus in PISA, variance strata that were combined were selected from different explicit sampling strata and also, to the extent possible, from different implicit sampling strata.

In some countries, it was not the case that the entire sample was a two-stage design, of first sampling schools and then sampling students within schools. In some countries/economies, for part of the sample (and for the entire samples for Cyprus,³ Iceland, Liechtenstein, Luxembourg, Macao-China and Qatar), schools were included with certainty in the sample, so that only a single stage of student sampling was carried out for this part of the sample. In these cases, instead of pairing schools, pairs of individual students were formed from within the same school (and if the school had an odd number of sampled students, a triple of students was formed). The procedure of assigning variance units and replicate weight factors was then conducted at the student level, rather than at the school level.

In contrast, in one country, the Russian Federation, there was a stage of sampling that preceded the selection of schools. The procedure for assigning variance strata, variance units and replicate factors was then applied at this higher level of sampling. The schools and students then inherited the assignment from the higher-level unit in which they were located.

Procedural changes were in general not needed in the formation of variance strata for countries with extra direct grade sampled students (Iceland, Slovenia and Switzerland), since the extra grade sample came from the same schools as the PISA students. However, if there were certainty schools in these countries, students within the certainty schools were paired so that PISA non-grade students were together, PISA grade students were together and non-PISA grade students were together.

No procedural changes were required for the grade students for Chile and Germany, since a separate weighting stream was needed in these cases.

The variance estimator is then:

$$ V_{BRR}(X^*) = 0.05 \sum_{t=1}^{80} \left\{ \left( X^*_t - X^* \right)^2 \right\} \tag{8.7} $$

The properties of the BRR method have been established by demonstrating that it is unbiased and consistent for simple linear estimators (i.e. means from straightforward sample designs), and that it has desirable asymptotic consistency for a wide variety of estimators under complex designs, and through empirical simulation studies.

Reflecting weighting adjustments

This description does not detail one aspect of the implementation of the BRR method. Weights for a given replicate are obtained by applying the adjustment to the weight components that reflect selection probabilities (the school base weight in most cases), and then re-computing the non-response adjustment replicate by replicate.
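Formula 8.7 can be sketched in a few lines. With G replicates and a Fay factor k, the multiplier is 1/(G(1 − k)²); for G = 80 and k = 0.5 (replicate factors of 1.5 and 0.5) this gives the 0.05 in the formula. The example below is a toy under stated assumptions: it uses only four variance strata and a Hadamard matrix of order 4 instead of order 80, it applies the factors directly to the full-sample weights without recomputing any non-response adjustment, and the names `fay_brr_variance` and `weighted_mean` and all data values are hypothetical.

```python
import numpy as np

def fay_brr_variance(full_estimate, replicate_estimates, fay_k=0.5):
    """Fay-BRR variance: sum of squared deviations divided by G * (1 - k)^2."""
    reps = np.asarray(replicate_estimates, dtype=float)
    return np.sum((reps - full_estimate) ** 2) / (reps.size * (1.0 - fay_k) ** 2)

def weighted_mean(y, w):
    return np.sum(w * y) / np.sum(w)

# Toy data: 4 variance strata, 2 variance units each, one record per unit.
h2 = np.array([[1, 1], [1, -1]])
h4 = np.kron(h2, h2)                      # Hadamard matrix of order 4
stratum = np.array([0, 0, 1, 1, 2, 2, 3, 3])
unit = np.array([1, 2, 1, 2, 1, 2, 1, 2])
y = np.array([480.0, 520.0, 455.0, 470.0, 510.0, 530.0, 495.0, 465.0])
w = np.full(8, 10.0)                      # full-sample weights

# Factor 1.5 or 0.5 per record and replicate; unit 1 is inflated when the entry is +1.
sign = np.where(unit == 1, 1.0, -1.0)
factors = 1.0 + 0.5 * sign[:, None] * h4[stratum, :]
rep_w = w[:, None] * factors              # shape (8 records, 4 replicates)

n_reps = rep_w.shape[1]
mean_full = weighted_mean(y, w)
mean_reps = [weighted_mean(y, rep_w[:, t]) for t in range(n_reps)]
var_mean = fay_brr_variance(mean_full, mean_reps)

total_full = np.sum(w * y)
total_reps = [np.sum(rep_w[:, t] * y) for t in range(n_reps)]
var_total = fay_brr_variance(total_full, total_reps)

print(mean_full, var_mean, np.sqrt(var_mean))
```

For a weighted total, the result equals the sum over variance strata of the squared difference between the two units' weighted values, which is the behaviour expected of a balanced design for a linear estimator. With the replicate weights on the data file, the same function would be called with the 80 replicate estimates.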
Implementing this approach required that the international contractor produce a set of replicate weights in addition to the full sample weight. Eighty such replicate weights were needed for each student in the data file. The school and student non-response adjustments had to be repeated for each set of replicate weights.

To estimate sampling errors correctly, the analyst must use the variance estimation formula above, by deriving estimates using the t-th set of replicate weights. Because of the weight adjustments (and the presence of occasional triples), this does not mean merely increasing the final full sample weights for half the schools by a factor of 1.5 and decreasing the weights from the remaining schools by a factor of 0.5. Many replicate weights will also be slightly disturbed, beyond these adjustments, as a result of repeating the non-response adjustments separately by replicate.

Formation of variance strata

With the approach described above, all original sampled schools were sorted in stratum order (including refusals, excluded and ineligible schools) and paired. An alternative would have been to pair participating schools only. However, the approach used permits the variance estimator to reflect the impact of non-response adjustments on sampling variance, which the alternative does not. This is unlikely to be a large component of variance in any PISA country, but the procedure gives a more accurate estimate of sampling variance.

Countries and economies where all students were selected for PISA

In Iceland, Liechtenstein, Luxembourg, Macao-China and Qatar, all PISA-eligible students were selected for participation in PISA. It might be unexpected that the PISA data should reflect any sampling variance in these countries/economies, but students have been assigned to variance strata and variance units, and the BRR method does provide a positive estimate of sampling variance for two reasons. First, in each country/economy there was some student non-response, and, in the case of Iceland and Qatar, some school non-response.
Not all PISA-eligible students were assessed, giving sampling variance. Second, the intent is to make inference about educational systems and not particular groups of individual students, so it is appropriate that a part of the sampling variance reflect random variation between student populations, even if they were to be subjected to identical educational experiences. This is consistent with the approach that is generally used whenever survey data are used to try to make direct or indirect inference about some underlying system.

Notes

1. Note that this is not the same as excluding certain portions of the school population. This also happened in some cases, but cannot be addressed adequately through the use of survey weights.

2. Chapter 11 describes these schools as being treated as non-respondents for the purpose of response rate calculation, even though their student data were used in the analyses.

3. Note by Turkey: The information in this document with reference to Cyprus relates to the southern part of the Island. There is no single authority representing both Turkish and Greek Cypriot people on the Island. Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable solution is found within the context of the United Nations, Turkey shall preserve its position concerning the Cyprus issue.
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus is recognised by all members of the United Nations with the exception of Turkey. The information in this document relates to the area under the effective control of the Government of the Republic of Cyprus.

References

Adams, R.J. and M.L. Wu (2002), PISA 2000 Technical Report, PISA, OECD Publishing, Paris.
Cochran, W.G. (1977), Sampling Techniques, Third edition, John Wiley and Sons, New York.
Judkins, D.R. (1990), Fay's Method of Variance Estimation, Journal of Official Statistics, No. 6(3), pp. 223-239.
Kish, L. (1992), Weighting for Unequal Pi, Journal of Official Statistics, No. 8(2), pp. 183-200.
Lohr, S.L. (2010), Sampling: Design and Analysis, Second edition, Pacific Grove, Duxbury.
OECD (2009), PISA Data Analysis Manual: SPSS, Second edition, PISA, OECD Publishing, Paris. http://dx.doi.org/10.1787/9789264010666-en
OECD (2005), PISA 2003 Data Analysis Manual: SAS, PISA, OECD Publishing, Paris. http://dx.doi.org/10.1787/9789264010642-en
Rust, K. (1985), Variance Estimation for Complex Estimators in Sample Surveys, Journal of Official Statistics, No. 1, pp. 381-397.
Rust, K.F. and J.N.K. Rao (1996), Variance Estimation for Complex Surveys Using Replication Techniques, Survey Methods in Medical Research, No. 5, pp. 283-310.
Shao, J. (1996), Resampling Methods in Sample Surveys (with Discussion), Statistics, No. 27, pp. 203-254.
Särndal, C.-E., B. Swensson and J. Wretman (1992), Model Assisted Survey Sampling, Springer-Verlag, New York.
Westat (2007), WesVar 5.1 Computer Software and Manual, Author, Rockville, MD (also see http://www.westat.com/wesvar/).
Wolter, K.M. (2007), Introduction to Variance Estimation, Second edition, Springer, New York.