The construction and use of sample design variables in EU-SILC. A user's perspective

Report prepared for Eurostat

Abstract: Currently, EU-SILC is the single most important data source on income and living conditions in the European Union. It is widely used to inform policy makers and it constitutes a major resource for social research on the social situation in the European Union and its member states. As EU-SILC consists of samples in every EU Member State, estimates based on EU-SILC should be accompanied by proper standard errors and confidence intervals to indicate the precision of the estimates. In order to compute accurate standard errors, it is essential to take account of the sample design. This report discusses the quality of the variables which should make this possible, as well as the documentation of the sample design in every EU Member State. The report lists the most important problems and shortcomings of these variables and develops practical recommendations to improve their quality and usefulness. Furthermore, some suggestions are made for better documenting the sample design in the national quality reports. The main conclusions are: (1) especially in countries with complex sample designs, the sample design variables do not always reflect the real sample design; (2) the sample design should be better documented in the national quality reports in order to facilitate the evaluation of the sample design variables.
The main recommendations are: (1) the responsible persons at the national statistical offices should be informed on the use of the sample design variables; additionally they should receive more guidance in the construction of the sample design variables and the description of the sample design in the national quality reports; (2) a discussion should be organised about the organisation of sample designs in the future: in some countries they seem to be unnecessarily complex, lacking an adequate sampling frame and so prohibiting the correct computation of sampling variances; (3) a research-based discussion should be organised about the possibilities to disclose all sample design variables in the UDB.

Tim Goedemé
PhD Fellow of the Research Foundation Flanders (FWO)
Herman Deleeck Centre for Social Policy
University of Antwerp
St. Jacobstraat 2 (M479)
2000 Antwerp
Belgium
T.: +32 3 265 55 55
tim.goedeme@ua.ac.be

Contents
1 Introduction
2 The importance of the sample design and the construction of the sample design variables
2.1 Sample design
2.2 Guidelines and recommendations with regard to the construction of the Primary Sampling Unit and Primary Strata variables
3 Documenting the sample design
4 Main problems with current variables
4.1 Overview
4.2 Country-specific overview
5 Conclusion
References

1 Introduction

The Community Statistics on Income and Living Conditions (EU-SILC) are the principal data source for cross-national comparative research on income and living conditions in the European Union (EU). Furthermore, the data serve for the construction of the Laeken indicators as well as the Europe 2020 indicators for poverty and social exclusion. As EU-SILC is composed of samples in all EU Member States, sampling and non-sampling errors can seriously affect the accuracy of all estimates based on EU-SILC. However, until now, Eurostat has refrained from consistently publishing standard errors and confidence intervals alongside the official poverty indicators (e.g. Eurostat, 2010; Wolff, 2010). Unfortunately, this is not a feature unique to Eurostat publications. It seems to be rather common practice to ignore confidence intervals for descriptive (poverty) statistics in scientific journals as well.

Although confidence intervals do not address all kinds of survey errors, estimating them can save money, time and effort: by showing which differences between point estimates have a high probability of being due to random error, they indicate which differences are not worth investigating further and do not merit the attention of policy makers. However, they can only serve this purpose if standard errors have been estimated accurately. In order to do so, it is necessary, among other things, to take account of the sample design. Previous studies which focused on design effects in the case of poverty measures generally found strong effects of the sample design on the standard error (e.g. Rodgers and Rodgers, 1993: 43; Howes and Lanjouw, 1998: 107; Jolliffe et al., 2004: 563). Many countries covered by EU-SILC employ complex sample designs involving multiple stages of selection, including stratification and clustering.
In general, two different types of information are necessary to take account of the sample design: an accurate description of the implemented sample design and adequate variables in the dataset to take account of clustering and stratification. In the case of the EU-SILC user database (UDB), one or both types of information are lacking for many participating countries. The problem is more limited in the EU-SILC dataset available to Eurostat, although some problems remain in that case as well. There are thus two distinct issues involved here: (1) Eurostat lacks some crucial information about sampling designs; (2) Eurostat cannot or does not disclose all information in the UDB. This note addresses the first issue and aims at giving an overview of the main problems encountered when estimating standard errors using the complete EU-SILC dataset available to Eurostat. However, I come back to the second issue in the conclusion. The main problems are a lack of accurate information on the sample design in some quality reports and mistakes in the construction of the EU-SILC sample design variables DB050 and DB060. Several concrete recommendations are made to improve the situation in the future.

Section two sets out some general principles for estimating sampling variances and derives general guidelines and recommendations for constructing sample design variables. Section three highlights how the documentation of the sample design can be improved. The fourth section gives an overview of the problems encountered with the current sample design variables and describes how these problems can be overcome in the future. The fifth section concludes.

This note is the result of a short stay at Eurostat during the last week of October 2010. As the EU-SILC UDB contains very partial information on the EU-SILC sample design, the aim of my stay was to find out which variables are best used in the UDB to estimate standard errors.
In order to do so, the standard errors obtained using the UDB have been compared to those obtained when using the more complete data available to Eurostat. The result of this research can be found in Goedemé (forthcoming). Section 2.1 in particular draws on this paper. I am genuinely grateful to Fabienne Montaigne and Pascal Wolff for offering me the opportunity to run my Stata programmes at Eurostat and to Karel Van den Bosch and Guillaume Osier for comments and suggestions. All opinions expressed in this note as well as any remaining errors and shortcomings are my own responsibility.

2 The importance of the sample design and the construction of the sample design variables

There are several approaches to the estimation of standard errors and the computation of confidence intervals. In the case of linearization, formulae are derived which can be used to compute the standard errors, which in turn enable the computation of confidence intervals assuming a certain sampling distribution. A completely different approach is based on drawing a large number of resamples from the original sample in order to derive a sampling distribution empirically (e.g. jackknife repeated replication or the bootstrap). Subsequently, standard errors and confidence intervals are computed on the basis of this empirical sampling distribution (cf. Mooney and Duval, 1993 for an introduction; and Biewen, 2002; Trede, 2002; Van Kerm, 2002; Davidson and Flachaire, 2007; and del Mar Rueda and Muñoz, 2009 for an application to poverty and inequality measures). There are various methods which combine features of the approaches mentioned (see Efron and Tibshirani, 1998: 53-56) and each of these methods has its advantages and shortcomings (for a more detailed account and background information, see also Eurostat, 2002). This section will not go into the details of the various approaches to variance estimation. Rather, it aims at explaining the general principles which should always be taken into account when computing standard errors.
Whichever approach is used, getting the standard errors right requires replicating as closely as possible the entire procedure of drawing the sample and calculating the statistic one is interested in. There are four main ingredients to this: sample design, imputation, weighting and computation. In section 2.1 I briefly elaborate on how the sample design affects the sampling variance. For a discussion of the importance of weighting, imputation and the complexity of the estimators, see Goedemé (forthcoming). Section 2.2 contains the main recommendations with regard to the construction of the variables containing the information on the sample design.

2.1 Sample design

The sample design can seriously affect the standard error of estimates. Multi-stage designs involve several stages of sampling and sub-sampling. They start from the random selection of clusters of elements (e.g. municipalities, census sections, dwelling blocks), i.e. primary sampling units (PSUs). If the design consists of several stages, the selection proceeds by drawing a subsample within each selected cluster. The advantage of (geographical) clustering is that interviewers can collect the interviews in a limited number of geographical areas, reducing the costs of the survey (e.g. Sturgis, 2004: 1). However, a major disadvantage is that clustering can seriously increase the standard error if, for the variable of interest, the variance within clusters is small compared to the between-cluster variance.

Stratification has the opposite effect. Stratification serves the purpose of increasing the representativeness of the sample and decreasing the risk that some parts of the population remain unrepresented. In order to do so, the population is divided into mutually exclusive groups (strata). Subsequently, an independent sample is drawn within each of these strata.
Since the distribution of the sample across strata is fixed, differences between strata in the variable of interest do not contribute to the sampling variance of sample estimates. Especially if the variance between strata is large with respect to the relevant variable, stratification contributes to decreasing the standard error. Stratification can be applied at every stage of the sample selection process. Strata at the first stage are called primary strata. Usually, the effect of stratification is larger in the case of a clustered sample (cf. Kish, 1965; Kalton, 1983; Howes and Lanjouw, 1998; Lee and Forthofer, 2006: 9) [1]. The effect of the sample design can differ strongly from one variable to another: clusters or strata may differ strongly with respect to one variable and be rather heterogeneous with respect to another.

Every stage in the sampling design adds an additional source of random error. However, a crucial finding of sampling theory is that if the ratio of selected clusters at the first stage to the total number of clusters in the population is small, stages other than the first add relatively little to the sampling variance (for a mathematical elaboration, see Kish, 1965; Cochran, 1977). As this ratio is usually relatively small, only the first stage is taken into account for calculating standard errors in standard software packages such as Stata. In this situation, only accurate information about the first stage of the sample design is needed. However, for an exact computation of the standard error, the entire sample design should be taken into account. In other words: for an estimation of standard errors, variables identifying the primary sampling units and the strata used at the first stage normally suffice. No information on the secondary sampling units is necessary.

However, in some countries there are complications worth noting. First, some countries consider that some PSUs, e.g. the biggest cities, should always be included, and thus select them with a probability of one. Such a primary sampling unit is called autorepresentative (i.e. it is by definition included in the sample). Statistically it is a primary stratum rather than a PSU.
Therefore, in the case of autorepresentative PSUs, the secondary sampling units should be included in the PSU variable as if they were primary sampling units. In addition, simulations show that accurate information on PSUs is in many cases much more important than full information on stratification (Goedemé, forthcoming). Second, it is important that households retain the PSU and stratum identifiers they had at the moment of selection, even if they afterwards move to a place which belongs to another PSU/stratum. Third, if PSUs are drawn with replacement, each sampled PSU must receive a unique identification number, even if the same PSU has been drawn several times.

[1] In fact, whether stratification reduces the sampling variability depends on the sample sizes in each stratum. If these are not proportional to the population sizes and the finite population correction factor cannot be ignored, stratification may increase the standard error (see also Kalton, 1983: 24-26).
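To make the first-stage approximation described in this section concrete, the following is a minimal sketch in Python (standard library only) of a linearised variance estimator for a weighted mean that uses nothing but a stratum identifier and a PSU identifier. It is an illustration under stated assumptions, not the programme used for this note: the function name `weighted_mean_se` and the record layout are hypothetical, PSUs are treated as if drawn with replacement, and no finite population correction is applied.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_se(records):
    """Weighted mean and its linearised standard error.

    records: iterable of (stratum, psu, weight, y) tuples (hypothetical layout).
    Only first-stage information (strata and PSUs) enters the variance.
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / total_w

    # Linearised variable of the ratio sum(w*y)/sum(w), added up per PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            # A stratum with one PSU contributes no between-PSU variation;
            # the design variables have to be checked before going on.
            raise ValueError(f"stratum {stratum!r} contains a single PSU")
        mean_h = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((t - mean_h) ** 2 for t in totals.values())
    return mean, sqrt(variance)


# Toy data: four observations in two PSUs that are internally homogeneous.
clustered = [("A", 1, 1.0, 0.0), ("A", 1, 1.0, 0.0),
             ("A", 2, 1.0, 1.0), ("A", 2, 1.0, 1.0)]
# Same observations, but each one treated as its own PSU (clustering ignored).
unclustered = [("A", i, w, y) for i, (_, _, w, y) in enumerate(clustered)]

print(weighted_mean_se(clustered))
print(weighted_mean_se(unclustered))
```

With these toy data the standard error is 0.5 when the two PSUs are taken into account and about 0.29 when clustering is ignored, which illustrates why the PSU variable matters when the variance within clusters is small. The error raised for single-PSU strata is a design choice of this sketch: it forces the user to decide whether such a PSU is autorepresentative before any variance is computed.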

2.2 Guidelines and recommendations with regard to the construction of the Primary Sampling Unit and Primary Strata variables

• The data should contain one variable which uniquely identifies all sampled PSUs (DB060) and one variable which uniquely identifies all primary strata (DB050).
• If PSUs are selected several times, the selected PSU should receive a separate, unique identification number at each selection.
• The information on PSUs and primary strata should be included in the data for all rotational panels, even if some of them have been selected during a previous wave.
• Households/persons retain the same PSU and stratum number during the entire period they are in EU-SILC, even if they move to a place which belongs to another PSU/stratum.
• PSUs and primary strata should contain information about the first moment of real selection, i.e. the first stage in which the probability of being selected is not equal to 1. More concretely, this means that:
  o Whenever dwellings/addresses are selected, the PSU variable should identify all households living in the dwelling by one unique identification number (currently, clustering at the level of dwellings is ignored in, for instance, Austria).
  o Whenever the sampling frame consists of a sample itself (e.g. Germany), the primary sampling units and primary strata are those of the sampling frame (in the case of Germany: the PSUs and primary strata of the Mikrozensus), not those used for selecting households (PSUs) from the sampling frame.
  o In the case of autorepresentative PSUs (e.g. Italy), the PSU and primary strata variables should contain identification numbers of the next stage in the sample design (DB062 should be included in DB060 if the PSU is autorepresentative): if a municipality/city is selected with a probability of 1, the PSU is not the municipality in question, but the clusters selected within that municipality. The municipality itself is a stratum rather than a PSU. If clusters of households are selected within the municipality, these clusters are the real PSUs which should be included in the PSU variable. If the second stage is stratified, these strata should be included in the primary strata variable. Moreover, in the case of autorepresentative PSUs, DB050 and DB060 should contain a flag indicating that the PSU (stratum) refers to an autorepresentative PSU.
• PSU identification numbers should be unique across primary strata, such that the correct number of PSUs can still be obtained in case the primary strata variable is incomplete or missing.

The second and third recommendations have also been stressed in the Guidelines for EU-SILC 2008.

In the case of systematic samples, an additional variable is required which identifies the order of selection of the PSUs. If the PSUs are drawn without replacement and the finite population correction factor cannot be ignored (which is admittedly seldom the case), this variable is required to compute the finite population correction. If the fraction of selected PSUs is very large, precise variance estimation requires another variable which contains information on the fraction of selected PSUs, even if the sample has been drawn with replacement. Furthermore, in these cases information on the subsequent stages of the sample design is also required.

3 Documenting the sample design

Accurate information on the sample design is essential. Currently, much of this information is reported in the national quality reports as well as in the comparative quality report. However, in some cases important information is lacking if one wants to compute sampling variances using the available data. As explained in section two, information on the first stage of the sample design in particular should be complete. This includes information on the number and kind of real primary sampling units and primary strata, as well as the number of PSUs in each stratum. In addition, for strata containing one PSU it should be clearly indicated whether this is because only one PSU out of many has been selected (or has responded), or because the PSU is autorepresentative, such that the real PSUs are to be found in the next stage of the sample selection scheme. Therefore, it would be helpful if information were provided on the number of selected and the number of responding units at each stage of the selection process, and in particular at the first stage.

As an example, table 1 presents the number of PSUs and observations for each stratum, as well as the total number of PSUs and strata, in the case of the Irish EU-SILC. Such a table acts directly as a check on the description of the sample design in the national quality report (by comparing the number of selected with the number of included observations, PSUs and strata), as well as a check on whether the PSU and strata variables have been computed correctly. It is indispensable for researchers who want to check the degree to which the sample design variables in the UDB correctly reflect the real sample design.
Additionally, the table could be extended to also provide information on other aspects of the sample design, such as the number of selected households [2] or, in the case of a three-stage design, information on secondary sampling units.

[2] E.g. one could add three columns with information on the minimum, mean and maximum number of households per PSU in each stratum and change the columns on the number of observations to the minimum, mean and maximum number of observations per household.
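An overview of this kind can be derived mechanically from the microdata. The sketch below, in plain Python, counts PSUs and observations per stratum and marks strata that contain a single PSU. The function name `design_overview` and the input layout (one `(stratum, psu)` pair per observation, e.g. taken from DB050 and DB060) are assumptions made for illustration only.

```python
from collections import defaultdict


def design_overview(records):
    """Summarise the first stage of a sample design per stratum.

    records: iterable of (stratum, psu) pairs, one pair per observation.
    Returns a dict mapping each stratum to its number of PSUs, number of
    observations, and the min/mean/max number of observations per PSU.
    """
    obs_per_psu = defaultdict(lambda: defaultdict(int))
    for stratum, psu in records:
        obs_per_psu[stratum][psu] += 1

    overview = {}
    for stratum, counts in obs_per_psu.items():
        sizes = list(counts.values())
        overview[stratum] = {
            "n_psus": len(sizes),
            "n_obs": sum(sizes),
            "min": min(sizes),
            "mean": sum(sizes) / len(sizes),
            "max": max(sizes),
            # Single-PSU strata need documentation: autorepresentative or not?
            "single_psu": len(sizes) == 1,
        }
    return overview


# Toy example: stratum 1 has two PSUs, stratum 2 has only one.
toy = [(1, "a")] * 3 + [(1, "b")] + [(2, "c")] * 5
for stratum, row in sorted(design_overview(toy).items()):
    print(stratum, row)
```

The `single_psu` marker only signals that a stratum needs further documentation; whether the PSU is autorepresentative cannot be inferred from the data alone.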

Table 1: Overview of primary sampling units, primary strata and number of observations in the Irish EU-SILC available to Eurostat.

Stratum    | #PSUs | #Obs  | #Obs per PSU: Min | Mean | Max
1          | 10    | 47    | 1                 | 4.7  | 9
2          | 7     | 25    | 1                 | 3.6  | 5
3          | 5     | 24    | 1                 | 4.8  | 8
4          | 199   | 1283  | 1                 | 6.4  | 17
5          | 91    | 737   | 1                 | 8.1  | 19
6          | 5     | 57    | 7                 | 11.4 | 15
7          | 10    | 92    | 5                 | 9.2  | 15
8          | 1*    | 5     | 5                 | 5    | 5
9          | 1*    | 8     | 8                 | 8    | 8
10         | 33    | 295   | 1                 | 8.9  | 15
11         | 5     | 62    | 3                 | 12.4 | 16
12         | 19    | 180   | 1                 | 9.5  | 18
13         | 2     | 18    | 8                 | 9    | 10
14         | 1*    | 10    | 10                | 10   | 10
15         | 52    | 449   | 1                 | 8.6  | 18
16         | 23    | 114   | 1                 | 5    | 15
17         | 1*    | 12    | 12                | 12   | 12
18         | 40    | 316   | 1                 | 7.9  | 16
19         | 3     | 9     | 1                 | 3    | 5
20         | 2     | 21    | 7                 | 10.5 | 14
(part of table not shown)
130        | 2     | 15    | 7                 | 7.5  | 8
131        | 10    | 62    | 3                 | 6.2  | 12
132        | 19    | 113   | 2                 | 5.9  | 13
133        | 17    | 100   | 1                 | 5.9  | 11
134        | 4     | 30    | 4                 | 7.5  | 12
135        | 1*    | 6     | 6                 | 6    | 6
136        | 4     | 30    | 3                 | 7.5  | 16
137        | 7     | 42    | 4                 | 6    | 9
138        | 9     | 84    | 5                 | 9.3  | 16
Total: 138 | 1723  | 12551 | 1                 | 7.3  | 22

Source: EU-SILC 2008.

If, as in the case of the Irish EU-SILC, strata contain only one PSU, this will be immediately visible to both statistical officers and data users. In that case, more precise information should be provided on which PSUs are autorepresentative and which PSUs have been selected among a larger number of eligible PSUs. As mentioned earlier, DB050/DB060 should contain a flag indicating which PSUs are autorepresentative and which variable contains the relevant clusters/strata information of the subsequent stage of the sample design. In addition, it would be preferable to construct the primary strata/primary sampling unit variables in such a way that autorepresentative strata/PSUs are directly replaced by the relevant strata/units, with flags indicating that these units belong to a higher order autorepresentative PSU.
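The replacement of autorepresentative PSUs described above can be sketched as follows. This is only an illustration in plain Python: the keys `stratum`, `psu`, `ssu` and `autorep` are hypothetical stand-ins for DB050, DB060, DB062 and a flag that the current data do not contain, and the sketch assumes that no further stratification has been applied within autorepresentative PSUs.

```python
def recode_autorepresentative(rows):
    """Build design variables in which autorepresentative PSUs act as strata.

    rows: iterable of dicts with the hypothetical keys 'stratum', 'psu',
    'ssu' (second-stage unit) and 'autorep' (True if the PSU was selected
    with probability 1). Returns new dicts with two added keys,
    'design_stratum' and 'design_psu'.
    """
    recoded = []
    for row in rows:
        if row["autorep"]:
            # The autorepresentative PSU becomes a stratum of its own and
            # the units selected within it become the PSUs.
            stratum = ("AR", row["stratum"], row["psu"])
            psu = ("AR", row["stratum"], row["psu"], row["ssu"])
        else:
            stratum = ("REG", row["stratum"])
            psu = ("REG", row["stratum"], row["psu"])
        recoded.append({**row, "design_stratum": stratum, "design_psu": psu})
    return recoded


toy_rows = [
    {"stratum": 1, "psu": 10, "ssu": 1, "autorep": True},
    {"stratum": 1, "psu": 10, "ssu": 2, "autorep": True},
    {"stratum": 2, "psu": 20, "ssu": 1, "autorep": False},
    {"stratum": 2, "psu": 21, "ssu": 1, "autorep": False},
]
for row in recode_autorepresentative(toy_rows):
    print(row["design_stratum"], row["design_psu"])
```

Encoding the identifiers as tuples keeps them unique across strata and carries the flag ("AR" versus "REG") inside the identifier itself; in a real dataset a numeric coding with a separate flag variable would probably be more practical.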

4 Main problems with current variables

4.1 Overview

When analysing the sample design in EU-SILC using the variables DB060 (PSUs) and DB050 (primary strata), several problems must be faced. First of all, in some countries DB060 is entirely lacking although the sample has been clustered at a more aggregate level than the household. This is the case for Belgium (2008), even though the PSU variable should have been provided to Eurostat in variable DB061. This is also the case for countries where dwellings/addresses rather than households have been selected (Austria, Finland). Although clustering at the household level will most probably account for most of the increase in sampling variance compared to a sample of persons, it is not entirely accurate. More worryingly, this is also the case in some countries where use is made of a so-called master sample. In that case the primary strata and PSU variables should refer to those of the master sample (cf. section two). At least in Germany there is a problem in this regard: Germany provides DB050, but this variable most probably does not refer to the real primary strata [3]. In several countries DB060 is partially lacking. This is the case for France (2007), Hungary and Latvia (only 81 cases). If at the first stage the sample is clustered in some parts of the country and is a sample of households in other parts, it is recommended to provide flags which inform the user whether DB060 is missing or whether another variable should be used in these cases.

Second, as noted earlier, where PSUs are autorepresentative, the primary strata and PSU variables should contain information on the subsequent stage of the sample selection for these PSUs. Currently, it is not well documented in which cases PSUs are autorepresentative, so that one can only guess whether or not DB062 or household ID should be used.
Strata containing only one sampling unit after using DB050 have been found in the case of Austria, France (2008), Spain, Ireland, Italy, Hungary and the UK. A flag should indicate whether the PSU is autorepresentative or not.

Third, at least in Belgium PSU codes are not unique across selections: several postal zones have been drawn several times, yet each postal zone receives the same identification number at every draw. However, for the estimation of the sampling variance it is necessary that the PSU receives a unique ID with each draw. As a result, currently only 243 PSUs can be discerned in the case of Belgium (2007) even though in reality 275 PSUs have been drawn. It could be that the same problem occurs in other countries, especially when the identified number of PSUs is below the one reported in the national quality reports. However, this should be clarified by the national statistical institutes.

Fourth, in some countries DB050 is lacking entirely (Belgium 2007, 2008, France 2007) or partially (Spain, France 2008). In these cases, DB040, which identifies the region at the moment of the interview, could be used as a substitute. However, in doing so PSUs could be split if some households moved from one region to another between the moment of selection and the moment of interview. In fact, this happens in a relatively large number of cases. Therefore (as is also the case in the UDB), when DB040 is used, split PSUs have to be regrouped. In the case of Spain and France it is highly recommended that DB050 directly contains all strata, so that DB040 and DB050 do not have to be combined to replicate the stratification.

Fifth, in some countries (Bulgaria, Hungary, Poland and Slovenia) DB060 is not unique across strata [4]. If DB050 and DB060 are constructed adequately, this poses no problem when using the dataset available to Eurostat, as is the case for Bulgaria. However, in the other countries the resulting number of PSUs is different from that reported by the national statistical offices in the national quality reports or through personal communication. The problem is even more severe when DB050 is not available, as is the case in the UDB: in that case both the number of strata and the number of PSUs are severely underestimated. Although for some countries relatively accurate estimates of the sampling variance can be made, this is much less the case in others. In any case it is strongly recommended that DB060 contains a unique identification number for each PSU.

[3] Until 2008, at least a part of the German sample was based on quota sampling. In that case the computation of standard errors is somewhat inappropriate as the sample is not a regular probability sample. Note also that no national quality report in English is publicly available in the case of Germany. DESTATIS (2009: 4) reports a design effect of 1.3 as a result of clustering in the case of the Mikrozensus. However, it is not reported with respect to what variable this design effect has been computed. In the Mikrozensus, PSUs are groups of dwellings, dwellings or parts of large dwellings (DESTATIS, 2006: 5). If the first stage of selection from the DSP ('Master Sample') is a selection of households (instead of entire PSUs from the Mikrozensus), the effect of clustering can be expected to be limited once one controls for clustering at the household level (variance is most limited within households, and there is a high chance that few households in EU-SILC belong to the same Mikrozensus PSU). It is not entirely clear whether DB050 refers to the strata as originally applied for drawing the Mikrozensus or to the strata applied for drawing households from the DSP.
From a user's perspective it is necessary to identify whether PSUs are split because households moved to another region or because the correct number of PSUs can only be obtained after using DB050 as stratification variable. With the current information at hand, it is not possible to know for all countries which diagnosis is the correct one. The solution lies in making PSUs unique across strata. In that case users can be sure that PSUs are split because respondents moved between regions.

Even with the complete information at hand, the number of PSUs and strata does not always correspond to the number reported in the national quality reports or mentioned in personal communication with the statistical office. As far as stratification is concerned, the problems are relatively limited. The United Kingdom reports that there are 30 explicit strata in EU-SILC, although DB050 identifies 31 different strata. In the case of Hungary the number of strata is somewhat underestimated. In the case of Spain the correct number of strata could be reconstructed, but not in the case of France (2008). DB050 identifies 174 strata in the case of Germany, but as mentioned earlier they probably do not identify the real primary strata. As discussed previously, wider differences between the EU-SILC data and the reported information can be found in the case of DB060.

In general it is strongly recommended that Eurostat consistently checks all sample design variables and compares them to the actual sample designs. Many of the problems mentioned in this section could be resolved relatively easily if member states are asked (and are willing) to do so. Furthermore, for some countries it seems useful to organise a short workshop in which the importance, use and method of construction of the sample design variables are explained.

[4] This is also the case for 2 PSUs in the case of the Czech Republic, but probably, unlike in the case of the other countries, this is a mistake.
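The check on the uniqueness of PSU identifiers discussed in this section could be sketched as follows in plain Python. The function names `psu_diagnostics` and `unique_psu_id` and the input layout are hypothetical; `stratum` stands for DB050 (or DB040 where DB050 is missing) and `psu` for DB060.

```python
from collections import defaultdict


def psu_diagnostics(records):
    """Compare PSU counts with and without the stratum identifier.

    records: iterable of (stratum, psu) pairs, one pair per observation.
    Returns (number of distinct PSU codes ignoring strata,
             number of distinct stratum/PSU combinations,
             dict of PSU codes that occur in more than one stratum).
    """
    strata_by_psu = defaultdict(set)
    for stratum, psu in records:
        strata_by_psu[psu].add(stratum)

    n_ignoring_strata = len(strata_by_psu)
    n_within_strata = sum(len(strata) for strata in strata_by_psu.values())
    shared = {psu: sorted(strata)
              for psu, strata in strata_by_psu.items() if len(strata) > 1}
    return n_ignoring_strata, n_within_strata, shared


def unique_psu_id(stratum, psu):
    """One possible way to make a PSU code unique across strata."""
    return f"{stratum}-{psu}"


# Toy example: PSU codes 1 and 2 are reused in both strata.
toy_design = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3)]
print(psu_diagnostics(toy_design))
print(unique_psu_id(2, 1))
```

A PSU code that occurs in several strata is reported in the third element of the result, but the data alone cannot tell whether this reflects codes that are reused across strata or households that moved to another region. This is exactly the ambiguity described above, and it disappears only if the identifiers are made unique at the source.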

Table 2: Observed number of single PSUs, primary strata and primary sampling units in EU-SILC and as reported by national statistical offices

The first four data columns give the numbers as identified in the EU-SILC dataset; the last two columns give the numbers reported by the national statistical office.

Country | # single PSUs | # primary strata | # PSUs | # observations | Reported # primary strata | Reported # PSUs
AT      | 3             | 247              | 5,711  | 13,631         | 247                       | 5,711
BE07    |               | 11               | 243    | 15,493         | 11                        | 275
BE08    |               | 11               | 6,300  | 15,108         | 11                        | 275
BG      |               | 56               | 1,415  | 12,191         | 56                        | 1,415
CY      |               | 9                | 3,355  | 10,025         | 9                         | 3,355
CZ      |               | 53               | 2,364  | 26,933         | 53                        | 2,362
DE      |               | 1                | 13,312 | 28,904         | ?                         | ?
DK      |               | 1                | 5,778  | 14,836         | 1                         | 5,778
EE      |               | 3                | 4,744  | 13,032         | 3                         | 4,744
ES      | 1             | 93               | 1,994  | 35,970         | 93                        | 2,000
FI      |               | 26               | 10,472 | 26,481         | 26                        | 10,472
FR07    |               | 22               | 9,017  | 25,907         | 86                        | 349
FR08    | 15            | 87               | 349    | 25,510         | 86                        |
GR      |               | 90               | 1,064  | 16,869         | 90                        | 1,056
HU      | 66            | 526              | 5,639  | 22,363         | 529                       | 4,184
IE      | 21            | 138              | 1,723  | 12,551         | 138                       | 1,747
IS      |               | 1                | 2,887  | 8,644          | 1                         | 2,887
IT      | 110           | 288              | 749    | 52,433         | 288                       | 912
LT      |               | 7                | 4,823  | 12,150         | 7                         | 4,823
LU      | 1             | 160              | 3,779  | 10,147         | 160                       | 3,779
LV      |               | 4                | 912    | 13,039         | 4                         | 930
MT      |               | 1                | 3,368  | 9,591          | 1                         | 3,368
NL      |               | 40               | 462    | 25,448         | 40                        | 463
NO      |               | 1                | 5,553  | 14,216         | 1                         | 5,553
PL      |               | 211              | 5,093  | 41,200         | 211                       | 5,912
PT      |               | 7                | 541    | 11,786         | 7                         | 542
RO      |               | 88               | 779    | 19,131         | 88                        | 780
SE      |               | 1                | 7,452  | 18,825         | 1                         | 7,452
SI      |               | 6                | 1,672  | 28,958         | 6                         | 2,799
SK      |               | 48               | 5,450  | 16,546         | 48                        | 5,450
UK      | 1             | 31               | 1,014  | 21,043         | 30                        | 1,065

Notes: DB060 has been used to identify PSUs; where it is missing, household ID has been used instead. In some countries it could be that DB062 should be used instead of household ID. Single PSUs: PSUs in strata containing no other PSU. Stratification using DB050, in France (2008) and Spain in combination with DB040, in Belgium and France (2007) using only DB040. In the case of Germany DB050 contains 174 strata, but they do not refer to the primary strata. One stratum means that no stratification has been applied. A comparison with the sample design as it can be identified in the UDB can be found in Goedemé (forthcoming).

4.2 country specific overview country Primary Strata PSUs Further comments Households instead of dwellings, real number of Austria changed selection process in 2008. As a result, stratification cannot be accurately replicated: Strata are not exclusive. PSUs Several single PSUs, no information on this issue in quality AT OK unknown report. BE07 DB040 instead of DB050 DB060 not unique across selections (243 instead of 275 PSUs) DB040-> split PSUs due to moving of households BE08 DB040 instead of DB050 BG OK OK CY OK OK DB060 missing (DB061?) 2 too much after stratification with DB050 CZ OK probably DB050 does not refer to primary strata DE in Mikrozensus missing DK OK OK EE OK OK DB050 contains only part of stratification information, must be combined with modified less PSUs DB040 than ES variable) reported FI FR07 OK DB050 missing Households instead of dwellings, real number of PSUs unknown DB060 partially missing DB060 not unique across DB050 - > incorrect number of PSUs in UDB The sample frame can be problematic as: it is based on another sample; only consists of households who have indicated they are willing to participate in another survey. In any case the primary strata and PSUs of the Mikrozensus should be included in DB050 and DB060. DB040 -> split PSUs due to moving of households; One single PSU, no information provided It should be checked whether stratification has been applied for creating the master sample or for selecting SILC respondents from the master sample. Additionally, if I understood it correctly, for each wave (of two waves) strata receive a unique ID although they refer to the same strata. Therefore, in principle, for variance estimation they should be regrouped together such that the number of strata is 13 instead of 26. (currently ignored in Goedemé (forthcoming)). DB040 -> split PSUs due to moving of households

| Country | Primary strata | PSUs | Further comments |
|---|---|---|---|
| FR08 | DB050 contains only part of stratification information, must be combined with modified DB040 variable | OK | DB040 -> split PSUs due to moving of households; single PSUs: no information provided |
| GR | OK | more PSUs than reported | |
| HU | 3 strata fewer than reported | DB060 partially missing | DB060 partially missing, not known whether DB062 or household ID should be used. Many single PSUs, not reported whether autorepresentative. Hungary has changed the sample design several times, making accurate variance estimation very complex. |
| IE | OK | number of PSUs less than reported | No information on single PSUs |
| IS | OK | OK | |
| IT | OK | OK | Many single PSUs. The quality report states that there are autorepresentative PSUs. However, it is not clear whether this applies to all 110 single PSUs and whether DB062 should be used instead of DB060 in these cases. No information on subsequent stratification within autorepresentative PSUs. |
| LT | OK | OK | |
| LU | OK | OK | No information on single PSU |
| LV | OK | fewer PSUs than reported | In 81 cases no information on DB050 and DB060 |
| MT | OK | Households instead of dwellings (?), real number of PSUs unknown | Change in sample frame. Only quality report of 2007 operation publicly available. |
| NL | Actual number of strata not clear | 1 PSU less than reported | The sample frame can be problematic as: it is based on another sample; only consists of households who have participated in several waves of the Labour Force Survey. Nevertheless, DB050 and DB060 probably correctly refer to the real primary strata and PSUs of the LFS. In the intermediate national quality report it is mentioned that primary strata are a crossing of COROP region (40 regions) and interviewer region. However, DB050 contains only 40 regions. |
| NO | OK | OK | |
| PL | OK | fewer PSUs than reported | Changed criteria for implicit stratification between waves. Furthermore, it is not entirely clear whether for the 'second' part of the sample (for households containing no person younger than 65) the same PSUs have been used as for the first part. DB060 not unique across DB050 -> incorrect number of PSUs in UDB. It could be that the actual number of PSUs is different from 5,912, which is the number of PSUs for the first wave of EU-SILC. However, normally the total number of PSUs is kept more or less constant. |

| Country | Primary strata | PSUs | Further comments |
|---|---|---|---|
| PT | OK | 1 less than reported | |
| RO | OK | 1 less than reported | |
| SE | OK | OK | |
| SI | OK | many fewer PSUs than reported | Implicit stratification has been applied. |
| SK | OK | OK | |
| UK | one stratum more than reported | fewer PSUs than reported | |
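Several entries in the overview concern PSU codes that are not unique across strata. A minimal sketch of how a user might flag such codes is given below. It assumes the (DB050, DB060) values have already been read into Python tuples; the function name and the toy data are hypothetical. The check cannot tell whether a repeated code points to re-used identifiers or to a PSU split by households that moved, so it only flags cases for further inspection.

```python
from collections import defaultdict

def psus_in_multiple_strata(pairs):
    """Return the PSU codes that occur in more than one stratum.

    `pairs` is an iterable of (stratum, psu) tuples, one per household,
    for instance the (DB050, DB060) values of each record.
    """
    strata_by_psu = defaultdict(set)
    for stratum, psu in pairs:
        if psu is not None:  # missing PSU codes are skipped here
            strata_by_psu[psu].add(stratum)
    return {psu: sorted(s) for psu, s in strata_by_psu.items() if len(s) > 1}

# Toy data: PSU code 10 appears in strata 1 and 2.
pairs = [(1, 10), (1, 11), (2, 10), (2, 12), (3, 13), (3, None)]
repeated = psus_in_multiple_strata(pairs)
print(repeated)  # {10: [1, 2]}
```

If the repeated codes turn out to be distinct PSUs, one option is to identify PSUs by the combination of stratum and PSU code rather than by the PSU code alone.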

5 Conclusion

As EU-SILC consists of samples in every EU Member State, indicators based on EU-SILC should be accompanied by appropriate estimates of their precision and statistical reliability. An important aspect in the estimation of sampling variance is the sample design. Therefore, sample designs must be fully and precisely documented (especially the first stage of the sample), and the database must contain adequate variables to replicate the sample design. EU-SILC contains several variables relating to the sample design. In this report the quality of variables DB050 and DB060 and of their documentation has been evaluated. For many countries, especially those with a simple sample design, the information in the EU-SILC database is adequate, as is the documentation. However, for many other countries the sample design variables are not accurate and information is lacking.

Several guidelines have been described and recommendations proposed for improving the situation in the future. If Eurostat takes the lead, in many cases these guidelines and recommendations can be implemented with limited effort from both Eurostat and the national statistical offices. The first step should be to clarify the remaining questions and problems identified in this report. The second step consists of informing and training the responsible persons in the national statistical institutes so that they construct the sample design variables correctly and improve the documentation of the sample design. The third step is for Eurostat to perform a consistent quality check on both the documentation and the sample design variables as part of its regular validation exercise for each wave of EU-SILC in each country. Additionally, the disclosure of all sample design variables within the UDB merits a proper discussion, possibly disclosing the information at least for those countries which permit this.
Researchers working with the EU-SILC UDB need to be able to estimate correct standard errors and make correct statistical inferences. Not providing the sample design information limits the usefulness and use of EU-SILC. If information on PSUs and strata is properly anonymised, it is hard to see how disclosing it would present real issues of confidentiality or privacy.

Finally, I believe it is worthwhile to continue general discussions on how to organise statistically efficient and cost-efficient household surveys in the EU and how to improve the quality of sampling frames. Random samples enable researchers to compute indicators relatively cheaply, as relatively reliable estimates for millions of individuals can be made on the basis of information on several thousand persons. The strength of a properly drawn random sample is precisely that it can provide, together with the estimate, an indication of the precision and statistical reliability of that estimate. If, however, the sample design is so complex that it becomes impossible, or nearly impossible, to estimate the sampling variance accurately, the sample loses much of its usefulness as soon as its users realise that results based on a limited number of observations are being generalised to much larger populations. Therefore, when designing sample selection schemes, cost efficiency and statistical efficiency should be well balanced. In doing so, two sides of statistical efficiency should be kept in mind: one refers to its usual meaning in terms of limiting the sampling variance introduced by clustering (by drawing many small and/or very heterogeneous clusters); the other refers to limiting complexity so that all users of the survey can estimate the sampling variance of the point estimates with a reasonable effort. Especially in the case of a rotational panel design it is important to keep complexity in check, among other things by limiting changes of selection schemes between panels and waves as much as possible.
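To illustrate why strata and PSU identifiers matter for inference, the sketch below computes a weighted mean and a linearised standard error from between-PSU variation within strata. This is one common approximation (with-replacement sampling of PSUs, no finite population correction), not necessarily the method used elsewhere in this report; the function name, the tuple layout and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt

def mean_and_se(data):
    """Weighted mean and its linearised standard error.

    `data` is a list of (stratum, psu, weight, y) tuples. The variance is
    based on the variation between PSU totals within each stratum.
    """
    total_w = sum(w for _, _, w, _ in data)
    mean = sum(w * y for _, _, w, y in data) / total_w

    # Linearised variable for a ratio mean, summed to PSU totals.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in data:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            # A stratum with a single PSU has no between-PSU variation;
            # how to treat it is left to the analyst.
            raise ValueError(f"stratum {stratum!r} has a single PSU")
        z_bar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())
    return mean, sqrt(variance)

# Toy data: two strata with two PSUs each; values are similar within PSUs.
data = [
    ("A", 1, 1.0, 10.0), ("A", 1, 1.0, 12.0),
    ("A", 2, 1.0, 30.0), ("A", 2, 1.0, 32.0),
    ("B", 3, 1.0, 11.0), ("B", 3, 1.0, 13.0),
    ("B", 4, 1.0, 31.0), ("B", 4, 1.0, 33.0),
]
mean, se = mean_and_se(data)

# Ignoring the design: one stratum, every observation its own PSU.
naive = [("all", i, w, y) for i, (_, _, w, y) in enumerate(data)]
_, naive_se = mean_and_se(naive)
print(mean, round(se, 2), round(naive_se, 2))  # 21.5 7.07 3.8
```

In this toy example the standard error that ignores clustering is roughly half the design-based one, which is the kind of understatement that missing or incorrect design variables can cause.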

References

Biewen, M. (2002), 'Bootstrap Inference for Inequality, Mobility and Poverty Measurement' in Journal of Econometrics, 108(2): 317-342.
Cochran, W. G. (1977), Sampling Techniques, New York: John Wiley & Sons.
Davidson, R. and Flachaire, E. (2007), 'Asymptotic and Bootstrap Inference for Inequality and Poverty Measures' in Journal of Econometrics, 141(1): 141-166.
del Mar Rueda, M. and Muñoz, J. (2009), 'Estimation of poverty measures with auxiliary information in sample surveys' in Quality and Quantity, online first: 1-14.
DESTATIS (2006), Mikrozensus, Wiesbaden: Statistisches Bundesamt.
DESTATIS (2009), Gemeinschaftsstatistik über Einkommen und Lebensbedingungen, Wiesbaden: Statistisches Bundesamt, 8p.
Efron, B. and Tibshirani, R. J. (1998), An Introduction to the Bootstrap, Boca Raton: Chapman & Hall/CRC, 436p.
Eurostat (2002), Monographs of official statistics. Variance estimation methods in the European Union, Luxembourg: Office for Official Publications of the European Communities, 63p.
Eurostat (2010), Combating poverty and social exclusion. A statistical portrait of the European Union 2010, Luxembourg: Publications Office of the European Union, 111p.
Goedemé, T. (forthcoming), How much confidence can we have in EU-SILC? Complex sample designs and the standard error of the Europe 2020 poverty indicators, CSB Working Paper Series, Antwerp: Herman Deleeck Centre for Social Policy, University of Antwerp.
Howes, S. and Lanjouw, J. O. (1998), 'Does Sample Design Matter for Poverty Rate Comparisons?' in Review of Income & Wealth, 44(1): 99-109.
Jolliffe, D., Datt, G. and Sharma, M. (2004), 'Robust Poverty and Inequality Measurement in Egypt: Correcting for Spatial-price Variation and Sample Design Effects' in Review of Development Economics, 8(4): 557-572.
Kalton, G. (1983), Introduction to Survey Sampling, Quantitative Applications in the Social Sciences, Sage University Paper No. 35, Beverly Hills: Sage Publications, 96p.
Kish, L. (1965), Survey Sampling, New York: John Wiley & Sons, 643p.
Lee, E. S. and Forthofer, R. N. (2006), Analyzing Complex Survey Data. Second Edition, Quantitative Applications in the Social Sciences, 71, Thousand Oaks: Sage Publications, 91p.
Mooney, C. Z. and Duval, R. D. (1993), Bootstrapping: A Nonparametric Approach to Statistical Inference, Quantitative Applications in the Social Sciences, Sage University Paper No. 95, Newbury Park: Sage Publications, 72p.
Rodgers, J. R. and Rodgers, J. L. (1993), 'Chronic Poverty in the United States' in The Journal of Human Resources, 28(1): 25-54.
Sturgis, P. (2004), 'Analysing Complex Survey Data: Clustering, Stratification and Weights' in Social Research Update, 43: 1-6.
Trede, M. (2002), 'Bootstrapping inequality measures under the null hypothesis: Is it worth the effort?' in Journal of Economics, 77(Supplement 1): 261-282.
Van Kerm, P. (2002), 'Inference on Inequality Measures: A Monte Carlo Experiment' in Journal of Economics, 77(Supplement 1): 283-306.
Wolff, P. (2010), 17% of EU Citizens were at risk of poverty in 2008, Statistics in Focus, 9/2010: Eurostat, 8p.