The electoral register as a sampling frame




Kate Foster

1. Introduction

The Postcode Address File (PAF) and the electoral register (ER) are the most complete and accessible national frames of residential addresses in Great Britain, and both are extensively used for drawing random samples for general population surveys. Although an ideal sampling frame would cover the whole population of interest, it is well known that both of these frames are, in practice, incomplete. Among survey practitioners there is considerable interest both in monitoring changes in the coverage of the PAF and the ER and in defining the characteristics of those who are omitted, since such omissions may be a source of bias in survey results.

The coverage of both sampling frames is currently being assessed by Social Survey Division (SSD) using the sample of households selected for the Electoral Register Check, which was carried out by SSD in conjunction with the Census Validation Survey. This paper reports on the coverage of the 1991 electoral register as a sampling frame of the private household population and updates a similar analysis presented as part of the report on electoral registration in 1981 [1]. The coverage of the PAF will be dealt with in a later paper.

Although there is general interest in monitoring changes over time in the coverage of sampling frames, there was particular concern that the electoral register's coverage might have suffered over recent years because some individuals wanted to avoid registration for the community charge and hence did not register as electors. The check on the 1991 register reported in this paper showed that 94.7% of households were in addresses listed on the register and that 95.4% of the usually resident adult population were in listed addresses. These results indicate a slight deterioration in the coverage of the frame since 1981, when 96.4% of households and 96.5% of adults were in listed addresses.
The register's coverage was lower among single-adult households, those in privately rented accommodation, those in London (especially Inner London) and, elsewhere, in non-metropolitan areas of Great Britain. Coverage among adults was lower for individuals who had moved in the previous 12 months, those in the 20-29 age range, and among non-white ethnic groups. An assessment of the deterioration of the register as a frame of households over time gave similar results to the 1981 study, suggesting that coverage might decrease by around 1% over the year in which the register was available for use.

2. The use of the electoral register as a sampling frame

The electoral register is compiled as a list of all people eligible to vote in the United Kingdom; this includes citizens of the Commonwealth and the Irish Republic as well as of Great Britain and Northern Ireland who are aged 18 or over, or who will become 18 during the life of the register. The register is compiled on 10 October each year and is in force from 16 February of the following year for a period of one year. Because of the time required to bring together in one place the different parts of the new register, it is normally available for sampling purposes roughly from April of the year in which it comes into force through to March of the following year (i.e. from April 1991 to March 1992 for the 1991 register).

The register is known to be an incomplete list even of electors, and it will obviously not list adults who are not eligible to vote. The 1991 Electoral Register Check [2] showed that, for Great Britain as a whole, 7.1% of eligible people who were recorded in the Census were not included on the register, but that the non-registration rate varied by the individual's age, length of residence at that address, ethnic origin and region of residence.
Although the register is primarily a list of adults, it is preferable to use it as a sampling frame of addresses because the coverage of addresses is known to be more complete than the coverage of named electors. This is primarily because an address is listed so long as at least one elector is registered there, but coverage of addresses may also be improved because of the practice in some areas of including the same information as on the previous year's register if no form has been returned by the household.

3. The method of assessing coverage

The usual method of assessing the coverage of a sampling frame is to identify a representative sample of the target population drawn from an independent source and to check whether the sample members are covered by the frame. In 1991, as in 1981, a suitable sample for an assessment of the electoral register as a sampling frame was provided by the sample for the Electoral Register Check (ERC).

SMB 33 7/93

This survey was carried out alongside the Quality Check element of the Census Validation Survey (CVS), which used a sample of private households drawn from census records. The sample design for the CVS was a multi-stage probability sample to select 6,000 households in Great Britain that had returned a census form. The sampled households contained about 11,300 usually resident adults aged 17 or over on 15 February 1991. Visitors staying at the sample of addresses on census night (21 April 1991) were excluded from the analysis as they were also listed on the census form for their usual residence. Since the CVS oversampled areas where enumeration was expected to be difficult, the achieved samples of households and of adults were weighted in the analysis to reflect their different probabilities of selection. The tables in this paper give weighted bases only.

On the Electoral Register Check, interviewers transcribed census information onto the questionnaire before checking the entry for the household on the electoral register. This information was therefore available for all cases regardless of their response to the ERC interview. Further items of information relating to informants' eligibility for inclusion on the register were collected in the interview, including the previous address of adults who had moved since the qualifying date in October.

The results in this paper are based on households which returned a census form, although the sample of adults also includes any people in that sample of households who were identified by the CVS as not having been enumerated in the Census. The undercount for the Census is estimated to be 2% of the resident population (around one million people), about one fifth of whom were identified by the CVS as being missed from enumerated households. Thus about 1.6% of the resident population were missed both in the Census and in this element of the CVS.
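The weighting for unequal selection probabilities described above can be sketched as a simple design-weighted estimate, where each sampled case receives a weight equal to the inverse of its selection probability. The function and the toy probabilities below are illustrative only, not the actual CVS design:

```python
def weighted_rate(covered, sel_prob):
    """Estimate a population coverage rate from a sample drawn with
    unequal selection probabilities, weighting each case by 1/p."""
    weights = [1.0 / p for p in sel_prob]
    hits = sum(c * w for c, w in zip(covered, weights))
    return hits / sum(weights)

# Toy example: the last two cases come from an oversampled
# (hard-to-enumerate) stratum with selection probability 0.25
# instead of 0.5, so they are weighted down in the estimate.
covered = [1, 0, 1, 1]            # 1 = address found on the register
sel_prob = [0.5, 0.5, 0.25, 0.25]

rate = weighted_rate(covered, sel_prob)
```

Weighting down the oversampled stratum stops hard-to-enumerate areas from dominating the estimate: the unweighted rate here would be 3/4, while the weighted rate is 10/12.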
Insofar as there was deliberate evasion of both the Census and the CVS, it is likely that the ERC sample will tend to underestimate the level of non-registration of individuals on the electoral register. It is, however, probable that this under-enumeration has much less effect on the register's coverage of addresses than of named individuals.

4. The coverage of households and of adults

Although the electoral register is mainly used as a sampling frame of addresses, surveys are generally concerned with households or adults. The adequacy of the electoral register as a sampling frame is therefore assessed by the proportion of households and of adults that are included in listed addresses. Both coverage rates are shown in Table 1.

The study showed that 94.7% of all private households that were occupied on census night were at addresses listed on the electoral register. This coverage rate for households compares with a figure of 96.4% for the 1981 register, so there is evidence of a slight deterioration in the register as a sampling frame for households. The coverage rate for adults is based on those who were defined as being usually resident at the ERC sample of addresses. Some 95.4% of this sample of adults were in addresses listed on the register. Most of the sample of adults (88.0%) were themselves listed on the register at their April address, and a further 7.4% lived in addresses that were listed even though they themselves were not. The difference between the coverage rates for adults and households reflects the variation in rates of coverage by household size, as shown in Table 2.
Table 1: The registration of adults and their addresses in April 1991

                                             Households           Adults
                                             %        Number      %        Number
Person on the register at April address      n/a      n/a         88.0     8720
Person not on the register, but April
  address is listed                          n/a      n/a         7.4      736
Total in addresses on the register           94.7     4862        95.4     9456
Address not on the register                  5.3      271         4.6      451
Total (weighted bases)                       100      5133        100      9907*

* The sample base for adults is all aged 17 years or over on 15.2.91 who were usually resident on census night in the sampled households.
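The coverage rates in Table 1 are simple ratios of the weighted counts. A quick check (in Python, which is not part of the original paper) reproduces the published figures:

```python
# Weighted counts taken from Table 1
households_listed, households_total = 4862, 5133
adults_listed, adults_total = 9456, 9907
adults_self_listed = 8720   # themselves on the register at their April address
adults_addr_only = 736      # address listed, but the person is not

household_coverage = 100 * households_listed / households_total  # 94.7%
adult_coverage = 100 * adults_listed / adults_total              # 95.4%

# Adults in listed addresses split into "self listed" and "address-only" cases
self_rate = 100 * adults_self_listed / adults_total              # 88.0%
addr_only_rate = 100 * adults_addr_only / adults_total           # 7.4%
```

Note that the two components for adults (88.0% and 7.4%) sum to the 95.4% coverage rate, as the table requires.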

Table 2: Households and adults on the register, by number of usually resident adults in the household

Number of adults          Households                      Adults
usually resident          % in listed     Base=100%       % in listed     Base=100%
in the household          addresses                       addresses
One                       92.2            1552*           93.2            1527
Two                       95.5            2698            95.5            5396
Three                     97.4            615             97.4            1844
Four or more              95.7            268             95.3            1140
All households/adults     94.7            5133            95.4            9907*

* Includes a small number of households with no usually resident adults under census definitions.

5. Variation in the coverage of households and adults

We now look at variation in the register's coverage by selected characteristics of households and of individuals. The conclusions reached are broadly similar to those reported in the 1981 assessment of the electoral register, but some new analyses are also presented.

Characteristics of the household

The register's coverage of households was lowest (92.2%) among those comprising only one adult. Coverage increased with household size up to 97.4% for households with three usually resident adults, but was slightly lower (95.7%) for households with four or more adults (Table 2). An improvement in coverage with increasing household size is to be expected, since households comprising more adults will generally contain more electors and hence there is a greater chance that one elector is listed. The fact that this improvement in coverage did not extend to the largest households, comprising four or more adults, suggests that such households may differ from smaller ones in other significant respects.

The register's coverage of households and adults varied according to region of residence, as shown in Table 3, which also gives summaries by country. At the national level the electoral register gave slightly better coverage of households, and also of adults, in Scotland than in England and Wales; 96.3% of households in Scotland were in addresses listed on the register compared with 94.6% in England and Wales. Within England, London had the lowest coverage rate, but this was markedly worse for households in Inner London (87.3%) than in Outer London (93.7%). Outside London, coverage tended to be better in metropolitan areas: 97.1% of households in metropolitan areas (excluding London) were in listed addresses compared with 94.4% of households in non-metropolitan areas.

Table 3: Households and adults on the register, by region and country

Region/country              Households                    Adults
                            % in listed     Base=100%     % in listed     Base=100%
                            addresses                     addresses
North                       95.1            292           96.2            537
Yorkshire & Humberside      97.3            483           97.3            939
North West                  96.9            563           97.2            1045
East Midlands               93.5            379           95.2            744
West Midlands               96.5            450           97.1            889
East Anglia                 96.3            191           96.3            387
South East (exc. London)    93.9            965           94.5            1886
South West                  93.6            457           94.8            909
London                      91.2            622           91.8            1171
  Inner London              87.3            234           87.9            426
  Outer London              93.7            388           94.0            745
Regions exc. London:
  Metropolitan              97.1            1012          97.6            1930
  Non-Metropolitan          94.4            3047          95.2            5939
England                     94.7            4402          95.3            8505
Wales                       93.0            280           94.9            534
England & Wales             94.6            4682          95.3            9039
Scotland                    96.3            451           97.1            868
Great Britain               94.7            5133          95.4            9907

Table 4 shows coverage by housing tenure group. The strongest pattern to emerge is the lower rate of coverage for households in the privately rented sector, which includes those renting accommodation with their job or business as well as those living in furnished or unfurnished rented accommodation. Around 83% of households and of adults in this tenure group were in addresses listed on the register, compared with more than 95% for each of the other major tenure groups: those who owned their accommodation outright, those buying with a mortgage and those living in local authority rented accommodation.

Table 4: Households and adults in addresses on the register, by housing tenure

Housing tenure              Households                    Adults
                            % in listed     Base=100%     % in listed     Base=100%
                            addresses                     addresses
Owned outright              97.9            1214          98.5            2234
Buying with a mortgage      95.7            2168          96.3            4710
Local authority rented      96.7            1051          97.1            1793
Other rented                82.5            670           82.8            1144
All households/adults       94.7            5133*         95.4            9907*

* Includes a few cases where housing tenure was not known.

Table 5: Adults whose April address was on the register, by how recently they had moved to the address

Whether had moved in the               On register at      April address     Base=100%
12 months before census                April address (%)   listed (%)
Had moved in previous 12 months        31.7                76.1              993
Of whom:
  moved in the 6 months since
  the qualifying date                  5.6                 69.8              415
  moved in the 6 months before
  the qualifying date                  81.9                93.0              356
Had not moved in previous 12 months    94.3                97.6              8914
All adults                             88.0                95.4              9907

Table 6: Adults whose April address was on the register, by age

Age              On register at      April address     Base=100%
                 April address (%)   listed (%)
17               71.1                95.3              154
18-19            81.6                94.5              346
20-24            67.7                89.2              964
25-29            75.9                91.7              1007
30-49            89.5                95.2              3503
50 and over      96.1                98.3              3930
All adults       88.0                95.4              9907*

* Includes a few individuals whose age was not known.

Characteristics of individuals

We now turn to variation in the register's coverage of adults by selected characteristics of the individuals involved. The tables also show, for reference, the proportion of adults in the different categories who were themselves listed on the register.

Table 5 looks at the likelihood of adults being listed on the register, or of living at an address that was listed, by whether and when they had moved in the previous year. As would be expected, only a very small proportion (5.6%) of those adults who had moved in the 6 months since the qualifying date for the register, that is between 10 October and 21 April, were themselves listed on the register. Those who had moved in the six months before the qualifying date were also less likely than non-movers to be listed; 81.9% were on the register compared with 94.3% of non-movers. With respect to the use of the register as a sampling frame, about three quarters (76.1%) of those who had moved in the previous year, and 69.8% of those who had moved in the previous six months, were in addresses that were listed on the register, compared with 97.6% of non-movers. There are a variety of reasons why the April addresses of movers might not have been listed: the addresses may not have existed or may have been unoccupied at the qualifying date, or they may have been occupied by people who were either ineligible for inclusion on the register or eligible but not listed.

As found in previous checks, adults under the age of 30 were not only less likely than older people to be listed themselves on the register but were also less likely to live at addresses that were listed. Table 6 shows that the proportion of individuals who were themselves listed was lowest for the 20-24 age group (67.7%), rising to 75.9% among the 25-29 age group, but most of those who were themselves not listed lived in addresses that were listed: 89.2% and 91.7% respectively of adults in these age groups lived in addresses that were listed on the register. Although only a relatively small proportion (71.1%) of 17 year olds were themselves listed on the register, presumably because it was the first year in which they were eligible for inclusion, they were no less likely than all adults to live at addresses that were listed.

Table 7: Adults whose address was on the register, by age and whether they had moved in the previous year

Age           Proportion    % in addresses on the register    Base=100%
              of movers     Not moved   Moved    All          Not moved   Moved   All
17            10%           98.4        (10)     95.3         139         (15)    154
18-19         16%           98.6        72.7     94.5         291         55      346
20-24         28%           94.3        76.1     89.2         694         270     964
25-29         23%           95.7        77.7     91.7         781         227     1007
30-49         9%            97.0        75.3     95.2         3205        298     3503
50 and over   3%            98.9        77.7     98.3         3803        127     3930
All adults    10%           97.6        76.1     95.4         8914*       993*    9907*

* Totals include a few individuals whose age was not known.

Table 7 explores whether the lower coverage of younger age groups is related to their greater mobility. The second column of the table gives the proportion of adults in each age group who had moved in the previous 12 months and shows clearly that adults in the 20-24 age group (28%) and those aged 25-29 (23%) were the most likely to have moved; 10% of all adults had moved in that period. As we have already seen, the coverage rate for those who had moved in the 12 months before the census was much lower than for non-movers (76.1% compared with 97.6%), and Table 7 shows the coverage rates by age for these two groups. There was little variation with age in the coverage of movers, while coverage rates for non-movers were only slightly lower for adults in their twenties. Thus most of the under-representation of adults in their twenties can be explained by the group's higher mobility, although non-movers in these age groups were also less likely than non-movers in other age groups to be in listed addresses.

Finally, Table 8 looks at variation in the electoral register's coverage of adults according to their ethnic group. Eligibility for inclusion on the register as an elector is defined by an individual's citizenship rather than their ethnic group, but it is, of course, likely that there is an association between these two attributes.
The census question gave a choice of nine ethnic groups which, because of the relatively small number of ethnic minority households in the survey sample, have been grouped into four categories. These are White, Black (including Black-Caribbean and Black-African), Indian (including Pakistani and Bangladeshi), and Other (including Chinese). Those classified as White were the most likely group to be living at addresses listed on the register, but there was little difference in the coverage of those classified as Black, Indian or Other. Some 95.9% of White adults were in addresses that were listed, compared with 84%-88% of the other groups.

Table 8: Adults whose April address was on the register, by ethnic group

Ethnic group    On register at      April address     Base=100%
                April address (%)   listed (%)
White           88.8                95.9              9380
Black           69.6                88.3              170
Indian          79.9                87.1              274
Other           64.8                84.2              82
All adults      88.0                95.4              9907

6. Movement and deterioration of the frame

The register will deteriorate as a frame of household addresses to the extent that new addresses become occupied over time; such addresses may either be new buildings or existing addresses which were unoccupied at the time that the list was drawn up. The coverage rate may also be affected if some addresses that were occupied at the time that the list was drawn up become unoccupied over time, and hence ineligible for inclusion in the sampling frame. This will result in an increase in the proportion of deadwood among listed addresses, but such changes will only cause a deterioration in the coverage of the frame if listed addresses are more likely than unlisted addresses to be affected. The main effect over time on the register's coverage of occupied addresses is, therefore, the extent to which newly constructed and previously vacant addresses are (re-)occupied. It was not possible to pursue analyses of this type using the ERC data-set, since that survey did not collect information on the history of the addresses occupied by recent movers.

Some relevant information is available from the Department of the Environment (DOE), which collects data on house construction, although not on the proportion of dwelling units which are temporarily vacated or re-occupied over a given period. The DOE data show that house construction resulted in a 0.9% increase in the dwelling stock over the year in which the 1991 register was in use as a sampling frame (April 1991 to March 1992).

Deterioration in the coverage of individuals

The ERC data do, however, enable us to make a rough estimate of the deterioration over time in the frame's coverage of adults, since they include information on the previous address and date of move of adults who had moved into their census address during the previous year. In general it would be expected that the movement of individuals during the lifetime of the register would have a much greater effect on the accuracy of the register (i.e. whether adults are listed at their latest address) than on its coverage (i.e. whether they are living at an address which is listed on the register at all).

The deterioration in the register's coverage of adults over time was estimated with reference to those adults who had moved between the qualifying date for the register (in October 1990) and the date of the census (in April 1991). The frame's coverage of adults will deteriorate to the extent that these adults moved from listed addresses to unlisted addresses, but this will be offset to the extent that adults moved from unlisted to listed addresses. An estimate of the net percentage change in coverage can be obtained by expressing these two groups as percentages of all adults in the sample.

There were some problems in carrying out this analysis on the 1991 ERC database due to the large number of cases in which either the date of moving or the October address was missing. Analysis of those cases for which complete information was available gave similar results to those obtained in the 1981 study, which suggested a deterioration of about 0.4% in the frame's coverage of adults over a six-month period. If movement is assumed to have continued at the same rate over the period in which the register was available for use, then this would imply a deterioration in the coverage of adults of about 1% over the 12-month period from April 1991.

References

1. Todd J and Butcher B. Electoral registration in 1981. OPCS (1982).
2. Smith S. Electoral registration in 1991. OPCS (1993).
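The net coverage-change calculation in section 6 can be sketched as follows. The counts below are illustrative only (the paper does not publish the underlying ERC mover counts); the logic is simply the difference between the two mover flows, expressed as a percentage of all adults and then scaled from the observed window to the register's full period of use:

```python
def net_coverage_change(listed_to_unlisted, unlisted_to_listed, n_adults,
                        months_observed=6, months_in_use=12):
    """Net change in frame coverage (percentage points), extrapolated
    from an observed window to the register's full period of use."""
    net = 100.0 * (unlisted_to_listed - listed_to_unlisted) / n_adults
    return net * months_in_use / months_observed

# Hypothetical counts: 50 adults moved from a listed to an unlisted address,
# 10 moved the other way, out of 10,000 adults observed over six months.
change = net_coverage_change(listed_to_unlisted=50, unlisted_to_listed=10,
                             n_adults=10_000)
# change is -0.8: coverage falls by roughly 0.8 percentage points over
# the 12 months the register is in use, mirroring the paper's "about 0.4%
# per six months, hence about 1% per year" reasoning.
```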

The use of substitution in sampling

Dave Elliot

1. Background

On a number of occasions recently, the issue has arisen of whether and how to use substitution to make up for a shortfall in sample numbers, from whatever cause. Many people in SSD have a knee-jerk reaction at the mere suggestion, believing it to be inextricably associated with quota sampling and massaging response rates, and therefore not the sort of method that a good survey organisation should ever contemplate using. In this paper I take a different view: that substitution for non-respondents, when used with proper controls, may sometimes be a useful addition to the survey sampler's toolkit. However, on other occasions, especially its superficially more innocuous use in replacing ineligible units, the method may sometimes result in significant biases. By substitution I mean the replacement of some specific unit in the set sample which fails to yield a usable response with another unit from the population. I shall illustrate this issue with four recent examples before moving on to generalities.

2. Four examples

2.1 Survey of the homeless

The first comes (indirectly) from the planned survey of Psychiatric Morbidity, as part of which OPCS plan to include the homeless. This mobile group is particularly problematic to sample for a number of obvious reasons, and the planned design draws heavily on the lessons learnt in a pioneering survey of single homeless people undertaken by SCPR [1]. In discussing the details of the methods used in sampling in short-stay hostels, Lynn describes how the establishments were sampled and then a random sample of beds was used in the selected hostels. Substitution was used at both stages: hostels that declined to co-operate were substituted (twice in some cases), and sampled beds that were unoccupied on the date of the survey were substituted. Occupants of sampled beds who refused the interview or who could not be located were, however, not substituted. Likewise, respondents who were not eligible for the survey were not substituted. In justification of this procedure, Lynn writes: "These strict probability sampling methods were deemed necessary in order to ensure the accuracy of the survey results. Allowing for refusals or non-contacts would have biased the sample towards more co-operative and more available respondents." In all cases the substitute was randomly selected using methods similar to those used in selecting the initial sample.

Another part of the sample consisted of users of day centres. In this case people entering the centres were selected using a constant interval; ineligibles and non-respondents were noted but disregarded, and the sampling was continued until the set sample size was achieved. Despite the statement that "No substitutes were selected to replace refusals or people who were screened out", the procedure described can be interpreted as substitution by another name.

2.2 Sampling institutions

A second example concerns some advice I gave on sampling for a planned survey of children. The sample design has three stages: children within schools within a stratified sample of local authority areas. The plan is to select just one secondary and a number of primary schools in the selected areas and then seek the co-operation of the schools in selecting and interviewing children. If a secondary school declines to take part in the survey, I advised selecting a substitute, but the project officer was not happy with this advice, believing that the substitution method is fundamentally flawed.

2.3 Sequential sampling within institutions

My third example concerns a design we suggested in response to an invitation to tender for a survey of fees paid to private residential homes. The specification suggested a design in which 5 eligible residents were selected from each of a sample of institutions. Since eligibility could not always be easily determined prior to selection, the suggested method was to sample residents one at a time, determine their eligibility, and continue sampling until the target number was achieved. This is an example of sequential sampling and mirrors closely the method of sampling visitors to day centres described in the first example. It was particularly problematic in this case because the primary aim of the survey was to produce grossed estimates of total expenditure and there was no reliable independent measure of the size of the eligible population. Consequently it was essential to know and control the selection probabilities in order to gross up

the survey means. With the method suggested, these probabilities could not be determined nor even estimated without bias, which in turn would bias the grossed estimates. It may be useful to run through the argument for this assertion in a simple case before moving on to a more general discussion of the effects of substitution in different situations.

Suppose we need a sample of 5 eligible residents from a home with 10 residents, exactly 5 of whom are eligible. We are aiming to produce an estimate of the probability of sampling any eligible resident. The true value of this is 1, as we are taking a sample of 5 eligible residents from a population of only 5. So, using the sequential sampling method suggested, each eligible resident must eventually be selected. The sample could be achieved in a number of ways, the two most extreme of which are that the first 5 selected residents are all eligible or that the first 5 are all ineligible. In the first case we would estimate the selection probability as only 1/2, as we would assume that the non-selected residents are similar to those selected and are also eligible. In the second case we should estimate the selection probability as 1, as we will have actually selected all 10 residents and will know that only 5 are eligible, and that all of these are bound to be chosen in the sample. Obviously in no case should we estimate a probability greater than 1. Thus averaging over all the possible sequences would produce a mean value less than one, and so the estimated probability is biased downwards in this case. The effect of this underestimation of the selection probabilities is that population totals will inevitably be overestimated (as they are obtained by dividing the sample totals by the erroneous probabilities).

The extent of the bias depends on several factors.

i. The average number of eligible residents per institution: as this increases, the bias reduces. In this case we knew that the average number of residential places per home was just 17, but with a wide variation around this figure.

ii. The ineligibility rate. With zero ineligibility there is no bias. As the ineligibility rate increases, so does the bias. A previous feasibility study had suggested between 5% and 15% ineligibility overall (although this estimate was based on a purposive sample and is therefore not reliable), but ineligibility rates in different homes will inevitably vary greatly around the overall value.

iii. The eligible sample size per home. This bias problem is particularly serious for small sample sizes.

iv. Any positive correlation between survey variables (average fees) and the number of eligible residents will tend to increase the bias. A negative correlation will reduce it.

Table 1 below shows the % bias in estimates of the total eligible population for varying population sizes and ineligibility rates, for a fixed eligible sample size of 5 residents. In the absence of any correlation (see (iv) above) the same bias will occur in all survey estimates.

Table 1 Percentage bias in eligible population estimates by population size and ineligibility rate

Population in home   Ineligibility rate (%)   % Bias
10                   10                         1.9
20                   10                         1.8
50                   10                         1.7
10                   20                         3.8
50                   20                         3.6
10                   30                         6.0
50                   30                         5.6
10                   40                         8.3
50                   40                         7.7
10                   50                        10.8
50                   50                        10.0
50                   60                        12.5
50                   80                        18.3

2.4 PAF used address procedure

Substitution is currently used within the PAF sampling system developed in OPCS. PAF addresses that have been selected for one OPCS survey are normally tagged and excluded from reselection in any other OPCS survey for a fixed time period (currently three years for most surveys). The way this is implemented is that such addresses are left on the file and so are liable to be reselected on another survey. When this occurs they are immediately substituted by a neighbouring address. The rationale is that, as the first sample which marked them as a used address was random, the substitutes that happen to have such addresses as neighbours can also be regarded as a random sample of the population. In fact, insofar as the ordering of addresses within the PAF places addresses with similar characteristics close together, systematic sampling will act as a kind of implicit stratification and we could expect some efficiency gains as a consequence. This stratification effect will be preserved by substituting a neighbouring address rather than simply boosting the initial sample size to compensate for these special non-respondents.
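The downward bias in the estimated selection probabilities discussed in section 2.3 can be computed exactly rather than by simulation: the number of draws m needed to reach the n-th eligible resident follows a negative hypergeometric distribution, and the grossed-up eligible-population estimate at the stopping point is nN/m. The following sketch (my own addition, not part of the original article; the function name `pct_bias` is mine) enumerates that distribution and reproduces the figures in Table 1.

```python
from math import comb

def pct_bias(N, E, n):
    """Exact % bias of the grossed-up eligible-population estimate under
    sequential sampling: residents are drawn one at a time until n eligibles
    are found, and the eligible population is estimated as n*N/m, where m
    (the number of draws needed) is negative hypergeometric."""
    total = comb(N, E)
    expected_estimate = 0.0
    for m in range(n, N - E + n + 1):      # all possible stopping points
        p_m = comb(m - 1, n - 1) * comb(N - m, E - n) / total
        expected_estimate += p_m * (n * N / m)
    return 100 * (expected_estimate - E) / E

# A few rows of Table 1 (population per home, ineligibility rate):
for N, rate in [(10, 10), (10, 20), (10, 50), (50, 50)]:
    E = round(N * (1 - rate / 100))        # eligible residents in the home
    print(f"N={N}, {rate}% ineligible: bias {pct_bias(N, E, 5):.1f}%")
```

The first three rows evaluate to 1.9%, 3.8% and 10.8%, matching Table 1, and the bias vanishes when the ineligibility rate is zero, as point (ii) states.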

3. Substituting for non-respondents

Substitution is being used or considered for use in two quite different contexts in these four examples: replacing initial non-respondents (both refusals and non-contacts) and replacing selected units which are later discovered to have been ineligible for the survey. In both cases the main aim is identical: to recover the number of cases that have been lost from the sample and hence boost precision. However the effect is different in the two cases.

Under a simple model for non-response, all members of the population will either respond or fail to respond to a survey, and different samples will pick up these two groups in different proportions by chance. If the mean for respondents differs from that for non-respondents, the normal survey estimate, excluding the non-respondents, will be biased. If we substitute the initial non-respondents with a further random sample from the population, the mean of the combined sample of the two groups of respondents will be biased to exactly the same extent as the mean of the first group of respondents, but the sample size will undoubtedly be larger and so the estimate will be more precise. Clearly we could extend the substitution procedure by continuing to select and approach people until we achieve a set target of interviews. So long as the substitutes are randomly selected, the procedure clearly does not affect the bias in either direction.

Moving now to a slightly more realistic model of the non-response mechanism, suppose that the tendency to respond to the survey differs amongst different groups of people in the population and that once again the means of respondents and non-respondents differ within the groups. Then on average any random sample will select people from these groups in the proportion that they occur in the population, and survey estimates will again be biased. If the initial sample is selected with equal probability, then the remaining population will have exactly the same means as the full population and so an additional sample taken to substitute for the initial non-respondents will not affect the bias of sample estimates. If no substitutions are made and post-stratification by these groups is used to reduce the non-response bias, the effect is artificially to boost the size of the groups with the lowest response rates by giving them larger weights. This will often reduce but not eliminate the bias. An alternative which can be used if the groups can be identified on the sampling frame would be to boost the size of these groups directly by substituting the non-respondents. In this case the effect on the bias is identical to that of post-stratification.

A problem with the approach occurs when the units are not being selected with equal probability. The most likely situation occurs when one is selecting aggregates such as institutions, where these are often selected with probability proportional to size. In this case the residual population of institutions, having selected a sample, will have a different mean from the total population. The only way to deal satisfactorily with this situation in general is to take a larger sample than is needed initially and hold part of it in reserve to be used as substitutes. In most cases, this should be the preferred method of implementing substitution even if the units are being selected with equal probabilities.

The design in example 2.1 seems inconsistent, as substitution was allowed for non-cooperating hostels but not for non-responding individuals, and the basis of Lynn's argument against substitution of individuals is unclear. However the apparent inconsistency might be due to concerns about the effect on interviewer motivation and response rates if substitution of individuals had been allowed. This is discussed further in section 5, below.

4. Substituting for ineligibles

In the example in 2.3 above, substitution for ineligibles would have made the estimation of selection probabilities, and hence of any survey estimates, particularly problematic. As the discussion above makes clear, the bias is likely to be most serious when ineligibility rates are high and target sample sizes are small. However it does not disappear entirely in other cases, whereas the most straightforward alternative to substitution, boosting the initial sample size in line with overall expected ineligibility, is unbiased. The bias arises because of the necessity of estimating the different selection probabilities. Consequently substitution for ineligibles will only be unproblematic when the units involved are being selected with equal probabilities at that particular stage in the sampling, or when the true ineligibility rate for the sampling unit is known. Although this may be true at the final stage of some multi-stage samples, the widespread (and highly desirable) use of pps sampling means that such examples will be rare and that consequently substitution for ineligibles cannot be recommended in anything other than simple random samples.
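The claim in section 3, that randomly selected substitutes leave the non-response bias untouched while restoring the set sample size, is easy to check by simulation. Below is a minimal sketch of my own under assumed figures (a population whose non-respondents have a higher mean than its respondents); none of the numbers come from the article.

```python
import random
random.seed(42)

# Hypothetical population: 70% respond (values centred on 10), 30% never
# respond (values centred on 20), so the true mean is about 13 and any
# respondent-only estimate is biased towards 10.
population = [(random.gauss(10, 2), True) if random.random() < 0.7
              else (random.gauss(20, 2), False)
              for _ in range(50_000)]

def survey_mean(n, substitute):
    """Mean from one simulated survey with set sample size n."""
    responses = [y for y, responds in random.sample(population, n) if responds]
    while substitute and len(responses) < n:
        # a randomly drawn substitute, approached like any initial sample member
        y, responds = random.choice(population)
        if responds:
            responses.append(y)
    return sum(responses) / len(responses)

no_sub   = [survey_mean(200, substitute=False) for _ in range(400)]
with_sub = [survey_mean(200, substitute=True) for _ in range(400)]

# Both averages sit near 10 (the respondent mean), well below the true
# mean of about 13: substitution restores the sample size, not the bias.
print(sum(no_sub) / len(no_sub), sum(with_sub) / len(with_sub))
```

With substitution the achieved sample is always the full 200, so the estimate is more precise, yet its expectation is unchanged, exactly as the simple model in section 3 predicts.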

5. Other considerations

Bias and precision should never, of course, be the sole criteria in determining sampling procedures. We should also consider their impact on OPCS staff and particularly on interviewers. Substitution, or something very like it, is widely used in quota and other non-random sampling methods, and we must beware of giving interviewers (or anyone else involved with the survey) the impression that any informant is as good as the one we initially selected.

There are two separate risks involved here. First, that interviewers will try less hard to secure a high response rate if they know that a substitute will always be provided: the discussion in Section 3 assumes that the methods we use to produce a high response rate will continue to be used on both the initial set sample members and the substitutes. If the result of permitting substitution is a reduction in response rates, then we may be more prey to non-response biases and the argument used above on the absence of any change in non-response bias will fail. Secondly, there is a risk that if we permit substitution in some cases, interviewers may make their own non-random substitutions in other cases to boost their apparent response rates.

There is also a third, rather less tangible, risk: that by introducing a method which interviewers may associate with lower quality research, they might start to feel less confident in our own commitment to quality methods, which could in turn affect their motivation to maintain high standards.

6. Conclusion

Substitution of non-respondents with randomly selected alternatives, while not in any way reducing non-response bias, in principle does not increase it either. Its use would increase the sample size more efficiently than boosting the set sample, since it would fix the final sample size. However the argument of the last section, on the potential psychological impact on interviewers of permitting some substitution when none has been allowed before, I believe sways the argument against its widespread introduction in SSD. However, in those situations where its use does not impinge on interviewers, for example in replacing non-cooperating institutions, it appears to offer some advantages. This is especially true when the substitute can be selected from the same population group as the initial non-respondent, when its effect is akin to post-stratification, ie it may reduce non-response bias. Substitution of ineligible units, although not affecting interviewers in any obvious way, may introduce biases in certain cases and should in general be avoided.

Reference

1. Lynn, P. Survey of single homeless people, Technical Report. SCPR.

Characteristics of non-responding households on the Family Expenditure Survey¹

Sarah Cheesbrough

Response rates on the FES are normally in the range of 68-72%. These figures are lower than on most SSD surveys due to the demand placed on respondents to complete a long questionnaire on income and expenditure and then keep a detailed diary of expenditure for the two weeks following the interview. Non-response is a problem in all sample surveys of the general population; on the FES the rather high rate of non-response means that there is a danger of under-representing important groups of spenders. Researchers need to seek ways of improving response and, at the same time, to develop methods of compensating for non-response after the event. It is the latter approach which is the subject of this paper.

Methods of weighting the data to compensate for non-response are being investigated. Every 10 years the comparison of FES data with Census records provides the most accurate analysis of the characteristics of non-respondents. The variables compared could then provide a source to re-weight for non-response. This and other methods of re-weighting survey data are described and discussed in a recent monograph in SSD's New Methodology Series². One drawback of using Census data is the time lag of up to 10 years between the Census measures and the survey data that are being re-weighted. This project was set up to consider a method of collecting information on non-responding households alongside the main survey. The emphasis of this exercise was to evaluate the feasibility of interviewers gathering information direct from non-responders on the doorstep.

Other studies of non-respondents have looked at the use of a basic question that is asked of non-responding households in order to identify a key characteristic of these households that is relevant to the subject of the survey³. For the FES a pool of basic questions was considered in terms of their importance as re-weighting variables and how practical it would be to collect this information without affecting the main survey response rate or antagonising members of the public.

1. Method

The study of non-responding households was carried out for one fieldwork month (January 1993) in conjunction with the normal FES quota. A very short Response Characteristics Questionnaire (RCQ) was completed for every household, whatever the outcome code. On return to the office the RCQs were keyed into a Blaise CADI program. This process allowed the comparison of responding and non-responding households to be completed separately from the normal keying and editing timetable of the FES.

The RCQ covered basic information about household members such as age and sex, relationships within the household and working status. Additionally there were questions on the type and tenure of the accommodation and on car ownership. A second section required the interviewer to record his/her observations of the ethnic group and main language spoken by the household. Finally, there were three questions at which the interviewer's impressions of the household were entered, covering any ill health in the household that might affect response, wealth of the household and any other relevant information. Interviewers were briefed on various methods of introducing the RCQ; a flexible approach in gaining the information was the key to not damaging normal FES response rates.

2. Results

2.1 Reception by informants

Completed RCQs were available for a total of 772 eligible households. Some forms were not returned in time for analysis, but the outcome codes for the RCQ exercise were in exact proportion to the figures for the total of 819 households that were eligible in January.

The January 1993 FES response rate of 73.3% was below that of the previous January (74.6%), when an increased incentive payment for co-operating households had just been introduced; but it was above that of any of the previous six calendar months and slightly above the monthly average for the whole of 1992 (72.8%). It seems safe to conclude, then, that the exercise did not damage the FES response rate. There was also no evidence, from interviewers' comments or complaints to OPCS from members of the public, that people were antagonised by the small amount of extra probing for information from non-responders that the exercise required.

Table 1 Main source of information for non-responding households

                             Refusal before/    Refusal at    Non-
Main source of answers       during interview   diary stage   contacts   Total
Member of household          105                14             0         119
Neighbour etc                  7                 0             4          11
Interviewer observation       58                 0             8          66
Total number of households   170                14            12         196

2.2 Introducing the Response Characteristics Questionnaire

Once it was clear that at least one member of the household had refused to co-operate with the survey, there were two distinct methods used by interviewers to collect the information. This varied both according to the interviewer and the type of household. The first type of introduction explained the exercise briefly to the household member: "People refuse for all types of reasons and we are interested in seeing whether we are losing similar groups of the population, so if you could just spare me a few moments I'd be very grateful if I could just ask...". Alternatively it was often more appropriate to use indirect methods to obtain the answers to the RCQ questions. An interviewer reported: "I never asked the questions as questions but tried to ask them as part of general conversation. Someone who is telling you about how disgraceful this big brother attitude is is hardly going to turn round and tell you how many bedrooms they have."

Interviewers found it easier to gain co-operation where either only one member of the household had refused to participate or where it was only the income section of the survey that the household objected to. Although interviewers were given the options of using the questionnaire or a small prompt card on the doorstep, the majority found it easier to memorise the questions so that there were no physical interruptions to the primary task of converting the household to a response.

2.3 Source of information

Interviewers were asked to report on the methods they had used to collect the information on non-respondents. In table 1 the methods used are shown against the type of non-response at the household. In the few cases of non-contact, interviewers still sometimes managed to obtain some information from neighbours. Encouragingly, 62% of households who refused to participate in any part of the survey did give basic information for the RCQ.

2.4 The quality of information

It was a particular concern of the project to evaluate whether the information obtained about non-responding households was of a high enough quality to compare with main FES data. The refusing cases were examined and results are reported according to the method by which the information was obtained.

Questions directed to a member of the household

In general, if the co-operation of a member of the household was gained, the information was very accurate. As shown in table 2, basic demographic information was readily given, whilst information about the accommodation and vehicles was harder to obtain.

Table 2 Proportions of refusing households where no information available

Question                                                % of households
Sex of HOH                                               1
Age of HOH (no exact age or band)                        8
Marital status of HOH                                    1
Working status of HOH                                    4
Sex of other household members                           1
Age of other household members (no exact age or band)    8
Marital status of other household members                2
Working status of other household members                5
Number of bedrooms in accommodation                     15
Household tenure                                        16
Car or van available                                    16
Age of vehicle                                          19

Interviewers' observations

Two questions on the RCQ required the interviewer to observe the ethnic group and the main language spoken by members of the household. This did not present any problems for interviewers, but obviously the results are the opinion of the interviewer rather than the respondent.

Interviewers' impressions

The interviewers were asked to give their impressions of the health and wealth of the household. Although the questions were nearly always answered, many interviewers commented on how their experience had shown how misleading initial impressions could be.

3. Comparison of responding and non-responding households

In the following analysis only non-responding households who refused directly to the interviewer, either before or during any interviewing or later at the diary stage, are included.

3.1 Information about individuals

Previous studies comparing responding and non-responding households have matched addresses selected for the FES sample to Census records⁴. The sample size for the response characteristics exercise would be too small for any comparable significance tests of differences between the responding and non-responding groups. However, using the variables found to be significant in the 1981 Census comparison as a basis, some distributions for particular questions were compared. With the emphasis on household non-response, analysis concentrated on information about the head of household (HOH).

Age of HOH

The 1981 Census comparison found that response declined with increasing age of HOH: young adults in general might be hard to contact or reluctant to co-operate, but response was high where a young adult was actually HOH. For this study, in households where the age of the HOH was established, it was apparent that a larger proportion of non-respondents fell into older age brackets. Overall the mean age of HOH for non-responding households was 54 years (n=127) compared to 50 years (n=566) for responding households. Figure 1 shows the distribution of the age of the HOH in the responding and non-responding households. The graph shows that age groups more likely to have dependent children form a greater proportion of responding households, whilst the non-responding group contains a larger proportion of households with an older HOH.

[Figure 1 Distribution of age of HOH for responding and non-responding households]

The 1981 Census comparison found a positive association between households with dependent children and survey response. The results from the RCQ confirmed this finding. Whilst 33% (n=566) of responding households contained one or two adults with at least one child under 16 years old, this was the case for only 19% (n=185) of non-responding households.

Employment status

A larger proportion of heads of household in the non-responding group were economically inactive. Interviewers were very successful in determining the employment status of those who were working. Many reported that the nature of the non-respondent's job was often mentioned in any explanation for refusal. If a person was not working, it was more difficult to clarify whether they were economically active or not. Not surprisingly, the non-responding group contained a larger proportion of self-employed people. In table 3 results are shown beside those for the FES in 1991⁵. The lower level of economic activity for non-responding households is consistent with a higher proportion of HOHs that were of retirement age. The lower proportion of unemployed HOHs in the non-responding group could also be a result of the higher average age of the group. However, there were an additional 8 non-responding households where it was not clear whether the HOH was unemployed or economically inactive.

Table 3 Economic activity of HOH

                        Non-responding    Responding      1991 FES
                        households %      households %    %
Economically active     60                63              62
of which:
  Employed              45                48              48
  Self-employed         11                 7               9
  Unemployed             5                 7               5
Economically inactive   40                37              38
Total                  100               100             100
Base = 100%            169               566            7056

3.2 Information about the household

Accommodation type

Interviewers were able to observe the type of accommodation for all refusing households and then ask some non-respondents how many bedrooms there were within the household. With this small sample there were no clear differences between the groups.

Table 4 Tenure of responding and non-responding households

                                           Non-responding %   Responding %
Owned, including with a mortgage           61                 67
Rented from a Local Authority, New Town,   27                 23
  Housing Association etc.
Rented from private landlord               12                  9
Total                                     100                100
Base                                      155                563
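The re-weighting idea that motivated this exercise can be sketched with the tenure split in Table 4: if refusals are taken as the known non-respondents, each responding household is weighted by the inverse of its tenure group's estimated response rate, so groups that refuse more often are restored to their share of the full sample. This is my own illustration, not a calculation from the article; the group labels and the decision to weight by tenure alone are assumptions.

```python
# Tenure percentages from Table 4 (bases: 563 responding, 155 refusing
# households); labels are mine.
responding_pct     = {"owner occupied": 67, "social rented": 23, "private rented": 9}
non_responding_pct = {"owner occupied": 61, "social rented": 27, "private rented": 12}

# Approximate counts per group, recovered from the percentages and bases.
resp_n    = {g: p / 100 * 563 for g, p in responding_pct.items()}
nonresp_n = {g: p / 100 * 155 for g, p in non_responding_pct.items()}

# Inverse response-rate weight per group: weighted respondents then
# reproduce each group's share of the combined (responding + refusing) sample.
weights = {g: (resp_n[g] + nonresp_n[g]) / resp_n[g] for g in resp_n}
for g, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{g}: weight {w:.2f}")   # owner occupied 1.25 ... private rented 1.37
```

Private renters, the group most under-represented among respondents, receive the largest weight, which is exactly the "artificially boost the size of the groups with the lowest response rates" effect described in the substitution article earlier in this issue.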

Table 5 Number of vehicles available to household

                          Non-responding %   Responding %   1991 FES %
No car or van             36                 30             32
One car or van            40                 50             45
Two or more cars or vans  23                 19             23
Total                    100                100            100
Base                     156                566           7056

Tenure

Interviewers were successful in ascertaining tenure at 84% of refusing households. A larger proportion of responding than non-responding households in January were owner occupiers.

Vehicles

The 1981 Census comparison found that response was lowest amongst multi-car households, possibly reflecting non-response among those with high income. Information was available for nearly 85% of refusing households. However, interviewers felt this was the most unreliable question unless it could be asked directly of household members. Table 5 shows the RCQ figures beside FES 1991 results. Non-responding households do appear more likely to have two or more cars available, and this group also tends to be in a higher income bracket. Most notably, 80% of HOHs from this group are over 40, 28% of HOHs are self-employed and 87% of households are owner-occupiers. At the other end of the scale, the higher proportion of non-responding households without a car reflects the larger proportion of elderly people in this group.

Recording the age of the car is not normal practice for the FES. For this study interviewers were asked to collect this additional detail. 45% of responding households had cars which were less than 5 years old, compared to 41% of non-responding households. However there was more variation in age of car for the non-responding group, which seemed to reflect the proportions of types of household in the group; whilst pensioner households tended to have older cars, the higher income refusing households tended to have very recently registered cars.

Ethnic group and main language of household

Interviewers were required to record their impressions of the ethnic group and main language of the household. Non-responding households consisted of a slightly higher proportion of ethnic minority households (5% compared to 4% responding, n=181 and 565 respectively). Two of the non-responding households had no members who spoke English, compared to only two of all the responding households. Information on this group of non-responders was more limited than average and inconclusive.

Health and wealth of household

As mentioned earlier, interviewers often found ill health discussed when reasons were given for refusing the survey. Interviewers noted that at 25% (n=172) of non-responding households there was some or much ill health, compared to 17% (n=564) of responding households. However, this information was very closely related to the fact that non-respondents included many elderly households.

A scale of wealth was given to rank the household approximately from 1 as the very poorest to 6 as the very richest household. Although interviewers were instructed to record their initial impressions of all households before any interviewing, RCQs were, in general, completed after the main interview and could well be coloured by more detailed knowledge that would not be possible with non-respondents. This information has therefore not been used for comparisons. In future tests, the use of income bands on a show card for use on the doorstep could be considered.

Sarah Cheesbrough Non-responding households on the FES 4. Conclusions 4.1 Feasibility of the fieldwork Despite some initial reservations, the fieldwork was very successful. The interviewers reports made it clear that any exercise of this nature should use a very small number of key questions that are easy to memorise. This allows the interviewer to adapt to the situation on the doorstep and prevents distraction from the main task of persuading the household to co-operate with the survey. 4.2 Distinguishing variables Questions about the household Where the interviewer obtained the information from a household member the quality of the information was very high. Interviewers often commented that obtaining household grid details forms a natural part of their doorstep introduction. The information on age, household composition and occupation in particular distinguished between responders and non-responders. Questions on accommodation and car ownership were (a) also successfully completed and provided useful (b) information on non-responders characteristics. Interviewers observations Recording the ethnic group of the household presented (c) no problems and there appeared to be some slight difference in response rate that could interact with other variables. Interviewers impressions Long term ill health in the household was relatively easy to record but was highly correlated with the age of the head of household. Overall impression of wealth presented the greatest difficulty to interviewers and did not clearly distinguish between responders and nonresponders. Future work using records from the 1991 Census should provide the accurate information on variables which distinguish between responding and nonresponding households. 
4.3 Proposals for future work The following recommendations are made:- Partial interviews The FES should consider accepting information from partial interviews with:- (a) HOH and partner from the household, when nondependant children or other household members refuse to con-operate; (b) elderly people who consent to the interview but fail to complete the diary. In a future repeat of this exercise the following basic questions should be retained: Household level - type of accommodation - number of rooms (if converted property) - number of bedrooms - tenure - number of cars - ethnic group - main language of household Individual level - age - sex - marital status - relationship to HOH - employment status - personal outcome on FES Basic questions would be useful in targeting:- elderly households put off the full FES by its length; higher-income households, often with HOH in late middle age, who have reservations about the financial nature of the survey or object to the invasion of privacy; self-employed persons in all age groups who object to questions that probe for details of financial arrangements. People who refuse for more general reasons, such as dislike of all surveys or the government, may also give some valuable information. Implementation using computer assisted interviewing The practicalities of collecting non-response information must be considered in the context of the transferring of the FES to computer assisted interviewing (CAI) in 1994. Currently, on CAI trials, the interviewers record household and personal outcome records in a section of the interview program known as the administration block. This is separate from the main interview and always completed when the interviewer is at home. If a household has refused the interviewer is required to give any reasons for refusal at both the household and individual level. This provides a greater level of detail than is currently recorded on the calls and outcome records for the paper survey. 
A non-response exercise carried out using CAI could include some basic questions, built into the administration block, that would appear if a refusal code is used.

Notes

17 SMB 33 7/93

1. This article is based on a paper sent to the Central Statistical Office in March 1993. It forms part of ongoing investigations into the use of re-weighting techniques to compensate for unit non-response on the Family Expenditure Survey (FES).

2. Elliot, D. (1991). Weighting expenditure and income estimates from the UK Family Expenditure Survey to compensate for non-response. Survey Methodology Bulletin, No. 28, pp.45-54.

3. Kersten, H.M.P. & Bethlehem, J.G. (1984). Exploring and reducing non-response bias by asking the basic question. Statistical Journal of the UN ECE, 2, pp.369-380.

4. Redpath, R. (1986). Family Expenditure Survey: a second study of differential response, comparing Census characteristics of FES respondents and non-respondents. Statistical News, Vol. 72, pp.13-16.

5. Central Statistical Office (1992). Family Spending, a report on the 1991 Family Expenditure Survey. London: HMSO.

The use of standardisation in survey analysis
Kate Foster

1. Introduction

Survey analysts are often interested in comparing the rate for some event or characteristic across different subgroups of a population, or for the same population over time. Comparison of the overall rates or proportions is not a problem if the populations are similar with respect to factors associated with the measure concerned, such as age, sex or marital status. When this is not the case, a direct comparison of overall rates may be misleading.

One commonly used solution is to present three-way (or more) tables to control for other confounding variables which are associated with the measure of interest and also with the main independent variable. An example would be to take account of age when looking at the relationship between cigarette smoking and social class by tabulating prevalence of cigarette smoking by social class for a number of different age groups. The resulting tables may, however, be difficult to interpret and suffer from small cell sizes. In addition, they do not provide a single summary measure which is suitable for comparison between groups. A more statistically sophisticated solution is to model the data which, for categorical survey data, would normally involve the use of log-linear modelling. However, this approach may not be the most appropriate where the requirement is to produce simple summary measures across a large number of analyses that are suitable for tabulation and can be readily interpreted.

An alternative approach to the problem which provides output in recognisable tabular format is standardisation. The technique allows direct comparison between rates or proportions measured for populations which differ in a characteristic known to affect the rate being measured by, in effect, holding the confounding variable constant.
In the example mentioned above, it would provide a measure of cigarette smoking prevalence for each social class group having adjusted for the effects of age.

This paper gives some background to the use of standardisation in Social Survey Division (SSD) and presents the results of recent work on the estimation of standard errors for age-standardised ratios. Standardisation has most commonly been used within SSD in relation to health indicators, which often show a strong relationship with age. The technique provides a way of comparing health indicators between different subgroups of the sample after making allowance for differences in age structure between the groups, and provides a single numerical summary of the age-specific rates for each subgroup of interest.

2. Methods of standardisation

There are two commonly-used ways of deriving a summary of age-specific rates, known as direct and indirect standardisation. These methods are illustrated below with some comments about their limitations and advantages. Examples and commentary can also be found in Marsh (1988) 1 and Fleiss (1981) 2.

Standardisation, by whichever method, is not a substitute for comparison of the age-specific rates for the subgroups of interest. Even when the technique is used it is advisable to look at the relevant three-way table and, in particular, at whether the relationship between the health measure and the characteristic of interest varies with age. If there are interactions in the data, for example where the percentages of people with the characteristic of interest in two subgroups are lower for some age bands but higher for others, then standardisation will tend to mask these differences. In these circumstances the results of the standardisation may be misleading and should be treated with caution.

2.1 Direct standardisation

Direct standardisation is widely used in medical statistics and the output is normally a rate (proportion) for each subgroup.
The method applies the observed age-specific rates for each subgroup to a standard population distribution, often that of the total sample, and the standardised rate is obtained by summing these values across all strata (age groups) in the subgroup. This is given by the equation:

(1)  Standardised rate for subgroup j = Σ_i r_ij w_i

where r_ij is the observed rate (proportion) for the cell defined by the ith stratum and jth subgroup, and w_i is the number of cases in the ith stratum (age group) as a proportion of the total sample.

The use of direct standardisation is illustrated by the example at Table 1. The example uses data from the

1991/92 General Household Survey on the proportion of men and women in each of three marital status groups with a long-standing illness or disability, which is an indicator of chronic sickness. For simplicity the example uses only three age bands, but the technique can readily be applied to more strata.

Table 1 Example of direct standardisation: reported longstanding illness by marital status, age and sex

Percentage reporting longstanding illness (r_ij), proportion of the standard population in each stratum (w_i), and expected rates (r_ij w_i, in brackets):

Men
Age group          Married/Cohabiting   Single        Widowed/Div/Sep   Proportion in stratum (w_i)
16-44              23.5 (12.2)          21.0 (10.9)   32.1 (16.7)       0.52
45-64              41.7 (12.5)          47.3 (14.2)   43.0 (12.9)       0.30
65 or over         60.1 (10.8)          57.1 (10.3)   66.3 (11.9)       0.18
All men (observed) 37.0                 25.2          51.6              1.00
Standardised rate  35.5                 35.4          41.5

Women
Age group            Married/Cohabiting   Single        Widowed/Div/Sep   Proportion in stratum (w_i)
16-44                21.8 (10.9)          24.8 (12.4)   30.0 (15.0)       0.50
45-64                38.8 (10.9)          43.0 (12.0)   50.7 (14.2)       0.28
65 or over           55.2 (12.1)          64.0 (14.1)   61.1 (13.4)       0.22
All women (observed) 32.4                 30.0          52.3              1.00
Standardised rate    33.9                 38.5          42.6

The age-standardised proportion of chronically sick in each marital status category is the proportion which would result if a standard population (given here by the total sample of men or women) were to experience the age-specific rates observed by that subgroup. From the observed percentages in Table 1 we see that men and women who were widowed, divorced or separated were much more likely than those in other groups to have reported a long-standing illness. Also, among men only, the observed rate was lower for those who were single than for the married or cohabiting group.
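The arithmetic behind Table 1 is simple enough to sketch in a few lines of code. The following Python fragment is an illustrative sketch, not part of the original paper (the variable names are my own); it reproduces the men's directly standardised rates by applying equation (1) to the figures in the table.

```python
# Direct standardisation, equation (1): standardised rate_j = sum_i r_ij * w_i.
# Rates and weights are the men's figures from Table 1.

weights = [0.52, 0.30, 0.18]          # w_i: proportion of all men in each age band
rates = {                             # r_ij: % reporting longstanding illness
    "married/cohabiting": [23.5, 41.7, 60.1],
    "single":             [21.0, 47.3, 57.1],
    "widowed/div/sep":    [32.1, 43.0, 66.3],
}

def direct_standardised_rate(r_j, w):
    """Apply a subgroup's age-specific rates to the standard age distribution."""
    return sum(r * wi for r, wi in zip(r_j, w))

for group, r_j in rates.items():
    print(f"{group}: {direct_standardised_rate(r_j, weights):.1f}")
# married/cohabiting: 35.5, single: 35.4, widowed/div/sep: 41.5,
# matching the standardised rates in Table 1
```

The same function applied to the women's rows would reproduce 33.9, 38.5 and 42.6.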
These results may be misleading since there is a strong association between marital status and age as well as between the incidence of chronic sickness and age: single people are on average younger than others, while those who are widowed, divorced or separated are on average older than the married or cohabiting group. The direct standardised rates for the marital status groups are shown in the right hand part of Table 1. As would be expected, once age has been taken into account there was less variation between the subgroups in the percentage chronically sick. The directly standardised rates were, however, still higher for men and women who were widowed, divorced or separated indicating that this group had higher rates of chronic sickness even after allowing for their age distribution. There is also some evidence that married or cohabiting women reported lower rates of chronic sickness than would be expected on the basis of their age distribution. These results can be seen to be consistent with the observed age-specific rates for the subgroups. For most of the age-sex strata observed rates of long-standing illness were higher among informants who were widowed, divorced or separated, and married women in each age band had lower observed rates of chronic sickness than single women. Although direct standardisation is initially attractive because the resulting statistic is a rate (proportion), the method has more rigorous data demands than does indirect standardisation. The major requirement is that the age-specific rate for the measure under investigation must be known for each population subgroup being considered. In most survey contexts this level of detail is available in the data but the sample size for each cell in the cross-tabulation may be too small to give reliable measures and hence the resulting standardised rates may be unstable. 
The other requirement in order to calculate a directly standardised rate is that the age structure of the standard population is known. In cross-sectional surveys the age distribution for the total sample (of men

Sarah Cheesbrough Non-responding households on the FES or women) is used. When using the method to compare rates over time it is necessary to decide on the composition of a standard population which is then applied to calculate the standardised rates for each repeat of the survey. 2.2 Indirect standardisation Indirect standardisation is often thought to be more appropriate than direct standardisation for the survey data and is the method generally used within SSD in the analysis of cross-sectional surveys. It has less rigorous data requirements than direct standardisation since it uses the age-specific rates for the population as a whole, rather than for each subgroup, so avoiding the use of rates based on small sample sizes. Thus, in the survey context, ratios calculated by indirect standardisation are generally preferred to directly standardised rates because they have greater numerical stability and are less sensitive to changes in agespecific rates within subgroups. The calculation also uses the age distribution and the overall rate (number of occurrences) for the subgroup. The expected number of occurrences for the subgroup is given by: (2) Expected Count = r i n ji i where r i is the observed rate (proportion) for the age band and n ij is the cell sample size. Table 2 Example of indirect standardisation. 
Reported longstanding illness by marital status, age and sex

Sample sizes (n_ij), total-sample proportion reporting longstanding illness (r_i), and expected counts (r_i n_ij):

Men
Age group    Sample size (n_ij)                     r_i      Expected count (r_i n_ij)
             M/Cohab   Single   Wid/Div/Sep                  M/Cohab   Single   Wid/Div/Sep
16-44        2615      1635     168                 0.229    599       374      38
45-64        2123      169      221                 0.422    896       71       93
65 or over   1087      98       350                 0.613    666       60       215
All men                                             0.356    2161      505      346
Observed percentage:                37.0      25.2      51.6
Observed count:                     2154      479       381
Standardised ratio (Obs/Exp x 100): 100       95        110

Women
Age group    Sample size (n_ij)                     r_i      Expected count (r_i n_ij)
             M/Cohab   Single   Wid/Div/Sep                  M/Cohab   Single   Wid/Div/Sep
16-44        3100      1344     410                 0.233    722       313      96
45-64        2124      114      513                 0.412    875       47       211
65 or over   852       164      1122                0.590    503       97       662
All women                                           0.362    2100      457      969
Observed percentage:                32.4      30.0      52.3
Observed count:                     1971      487       1069
Standardised ratio (Obs/Exp x 100): 94        106       110
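The expected counts and standardised ratios in Table 2 can be reproduced in the same way. The sketch below is illustrative rather than taken from the paper (variable names are my own); it applies equation (2) to the men's figures and then forms the observed/expected ratio.

```python
# Indirect standardisation: expected count_j = sum_i r_i * n_ij, and
# SR = 100 * observed / expected. Figures are the men's rows of Table 2.

overall_rates = [0.229, 0.422, 0.613]   # r_i for men aged 16-44, 45-64, 65+
sample = {                              # (n_ij per age band, observed count)
    "married/cohabiting": ([2615, 2123, 1087], 2154),
    "single":             ([1635, 169, 98], 479),
    "widowed/div/sep":    ([168, 221, 350], 381),
}

for group, (n_ij, observed) in sample.items():
    expected = sum(r * n for r, n in zip(overall_rates, n_ij))
    sr = 100 * observed / expected
    # Expected counts agree with Table 2 to rounding; SRs are 100, 95, 110
    print(f"{group}: expected {expected:.0f}, SR {sr:.0f}")
```

Dividing the expected count by the subgroup sample size would instead give the (less usual) indirectly standardised rate mentioned in the text.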

The number of observed occurrences is then compared with the expected number for the subgroup to give a standardised ratio (SR):

(3)  SR = (Observed count / Expected count) x 100

The resulting standardised ratios are highly dependent on the age structure of the specific population for which they are constructed and are sensitive to variation in the subgroup sample distribution. It is also possible, although less usual, to express the results of indirect standardisation as rates by dividing the expected count for the subgroup by its sample size.

Table 2 illustrates the use of indirect standardisation on the same data as shown in Table 1. The resulting standardised ratios compare the observed prevalence of chronic sickness in each marital status group with the rate that would be expected if the age-specific rates in the total sample were applied to the age distribution observed for that subgroup. The results of the standardisation are interpreted by looking at the deviation of the standardised ratios from 100, which is the implied ratio for a standard population (ie. for the total sample shown in a specific table). A ratio of more than 100 implies that the subgroup is more likely to display the characteristic than would be expected on the basis of its age distribution if its members were similar to the sample as a whole. Conversely, a ratio of less than 100 implies that the group is less likely to display the characteristic than would be expected from its age distribution.

From Table 2 we see that men and women who were widowed, divorced or separated were more likely than expected from their age distribution to have reported chronic sickness (ratios of 110 for both sexes). Ratios of less than 100 were recorded for single men (95) and married or cohabiting women (94), although these ratios were both relatively close to 100.

Since standardised ratios are calculated from survey data they are subject to sampling error, and a more precise assessment of their deviation from 100 would involve use of the standard error of the ratio in a conventional test of statistical significance. The next section goes on to look at estimates of the standard errors associated with various standardised ratios.

3. Assessing the results of indirect standardisation

Since standardised ratios are complex statistics the calculation of the associated standard error is not straightforward. It has therefore been usual to apply various rule-of-thumb methods to assess whether observed ratios are likely to differ significantly from 100 and hence to decide which ratios should be commented on.

Two methods of estimating standard errors for standardised ratios have recently been investigated on a number of examples. 3 The first method discussed below makes the simplifying assumption of a simple random sample design, whereas the second provides an estimate of true standard errors taking account of the complex sample design. The standard errors resulting from either method of estimation can be used to test whether an individual standardised ratio is significantly different from 100, but not to test for significant differences between two ratios.

3.1 Standard errors assuming a simple random sample design

Initially we look at estimated standard errors for standardised ratios under the simplifying assumption of the survey having a simple random sample (srs) design. The formula for this calculation 4 was derived using the Taylor series approximation; further details of the method of derivation are given in Wolter (1985). 5 The calculations were carried out by means of a spreadsheet which required only that the total sample sizes and the observed number of cases with the attribute were entered for each cell defined by the strata and the subgroups.
The first part of Table 3 shows the estimated standard errors for the example of indirect standardisation used above (Table 2). The age-standardised ratios of 110 for widowed, divorced and separated men and women were both statistically significant at the 5% level; the standard errors for these groups were 3.50 on a subgroup sample size of 739 men and 1.84 on a subgroup sample size of 2045 women. On the basis of these standard errors, the ratio of 94 for married or cohabiting women was also found to be significantly different from 100.

Since many surveys involve much smaller total sample sizes or smaller subgroup samples, the second part of Table 3 shows the effect on interpretation of the ratios of reducing the sample size by a factor of 5 whilst keeping the age-specific rates and sample distributions for the subgroups the same as in the original example. With this smaller sample size the standardised ratios

for married or cohabiting women and for previously married women were still significantly different from 100, but the ratio for widowed, divorced or separated men was not significantly higher than 100. As with standard errors for other survey statistics, the standard error of a standardised ratio is inversely related to the square root of the sample size, so reducing the sample size by a factor of 5 results in an increase of about 2.24 (the square root of 5) in the magnitude of the standard error.

Table 3 Estimated standard errors for standardised ratios: reported longstanding illness by marital status and sex, with varying sample size

Men's marital status
                      Married/Cohab   Single   Wid/Div/Sep   All men
Observed percentage   37.0            25.2     51.6          35.6
(a) Full sample
Sample size           5825            1902     739           8466
Standardised ratio    100             95       110*
Standard error        0.86            3.06     3.50
(b) One fifth sample
Sample size           1165            380      148           1693
Standardised ratio    100             94       110
Standard error        1.92            6.83     7.86

Women's marital status
                      Married/Cohab   Single   Wid/Div/Sep   All women
Observed percentage   32.4            30.0     52.3          36.2
(a) Full sample
Sample size           6076            1622     2045          9743
Standardised ratio    94*             106      110*
Standard error        1.02            3.39     1.84
(b) One fifth sample
Sample size           1215            325      409           1949
Standardised ratio    94*             107      110*
Standard error        2.27            7.56     4.11

* Significantly different from 100 (p<0.05)

Similar calculations were made for a number of standardised ratios taken from a recent survey carried out by SSD with the aim of reaching some general conclusions on the size of standard errors. The examples are shown in Table 4 and are all based on a total sample size of around 1500 men or women, which is perhaps a typical sample size for a small survey. Subgroup sizes in the examples are between 35 and 1119, with most in the range 200 to 600 cases. For each example the table shows the observed proportion, the subgroup sample size, the standardised ratio and the estimated standard error. An asterisk beside a ratio indicates that, based on the estimated standard error, the ratio was significantly different from 100 (p<0.05).
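The significance test and the square-root effect of sample size can both be sketched briefly. The fragment below is illustrative only (not from the paper) and uses the widowed, divorced or separated men's figures from Table 3.

```python
# (i) Conventional two-sided test of an SR against 100 at the 5% level;
# (ii) the effect of cutting the sample size by a factor of 5 on the SE.
import math

def differs_from_100(sr, se, z=1.96):
    """Is the standardised ratio significantly different from 100?"""
    return abs(sr - 100) / se > z

# Full sample: SR 110 with SE 3.50 -> significant
print(differs_from_100(110, 3.50))    # True
# One-fifth sample: SR 110 with SE 7.86 -> no longer significant
print(differs_from_100(110, 7.86))    # False

# SE is inversely proportional to sqrt(n), so a fifth of the sample
# inflates the SE by about sqrt(5):
print(round(math.sqrt(5), 2))         # 2.24
print(round(3.50 * math.sqrt(5), 2))  # 7.83, close to the 7.86 in Table 3
```

The small discrepancy between 7.83 and the tabulated 7.86 reflects rounding in the published figures.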
The following patterns are revealed by the examples.

i. The size of standard errors is strongly influenced by the overall proportion of the sample with the attribute: standard errors increase substantially as the proportion decreases. Thus, the standard errors associated with the ratios in example (c), where the overall prevalence is 78%, are lower than those in examples (a) or (f), where the overall percentages are 24% and 11% respectively.

ii. From the result at (i) it is clear that the size of standard errors can be reduced simply by running an analysis for the inverse of a low percentage. For example, if 20% of the sample has a particular attribute then lower standard errors will result if the standardisation is run for the percentage of informants who do not have the attribute, ie. 80% overall. This use of the inverse percentage does not affect the eventual interpretation of the results since the ratios are more likely to diverge from 100 when the overall percentage is low. An example of the effect of inverting an analysis is given in examples (c) and (d). With an overall percentage of 78% the standardised ratios are between 86 and 110, standard errors range from 1.5 to 3.8, and the ratios for groups 1, 3 and 4 are significantly different from 100. Repeating the analysis for the inverse percentage (22%) gives individual ratios in a much wider range, from 60 to 147, and larger standard errors, from 5.5 to 14.0, but the ratios for the same subgroups are significantly different from 100.

iii. Within a single table standard errors may vary widely for different groups, and this was investigated further by looking at the contribution of different terms in the calculation of the standard error to the total. This revealed that standard errors are larger for subgroups with larger values of the standardised ratio, smaller values for the percentage, or smaller sub-sample sizes. Since a large ratio is usually

Table 4 Estimated standard errors for a variety of examples of standardised ratios

                       Sample subgroup
                       1      2      3      4      Total sample
(a) Percentage         20%    27%    27%    25%    24%
    Sample size        586    139    456    274
    Standardised ratio 90     116    105    103
    Standard error     5.5    13.2   5.9    8.9
(b) Percentage         19%    36%    56%    69%    25%
    Sample size        1119   191    78     35
    Standardised ratio 87*    119    138*   153*
    Standard error     2.9    10.3   13.2   19.4
(c) Percentage         88%    81%    72%    67%    78%
    Sample size        585    139    456    274
    Standardised ratio 110*   103    94*    86*
    Standard error     1.5    3.8    2.1    3.1
(d) Percentage         12%    19%    28%    33%    22%
    Sample size        585    139    456    274
    Standardised ratio 60*    90     121*   147*
    Standard error     5.5    14.0   6.8    11.0
(e) Percentage         35%    34%    45%    46%    40%
    Sample size        586    139    456    275
    Standardised ratio 91*    87     107*   113*
    Standard error     3.8    9.2    4.4    6.4
(f) Percentage         11%    12%    9%     11%    11%
    Sample size        586    139    453    275
    Standardised ratio 107    113    85     104
    Standard error     9.3    24.1   10.5   15.8
(g) Percentage         4%     13%    12%    13%    12%
    Sample size        177    339    347    877
    Standardised ratio 34*    92     147*   106
    Standard error     11.5   11.0   17.7   6.1
(h) Percentage         26%    21%    10%           18%
    Sample size        422    489    521
    Standardised ratio 111    98     85
    Standard error     6.8    6.4    9.1
(i) Percentage         23%    15%    3%            11%
    Sample size        333    569    770
    Standardised ratio 168*   96     49*
    Standard error     13.2   6.8    7.9

associated with a large column percentage in comparison with those of other subgroups, the first two situations will not exist for the same subgroup. Example (h), where the subgroups are of roughly equal size and the ratios are relatively close to 100, suggests that a small percentage may have a marked effect on standard errors. The largest standard error is recorded for group 3, for which the percentage value is 10% compared with at least 21% for the other subgroups.

iv. Subgroup size has a particularly strong effect on the comparative size of the standard error for different subgroups in the same analysis. As would be expected, standard errors increase as subgroup size decreases, although the relationship cannot be simply defined because of the influence of other variables. Most of the examples in Table 4 indicate the larger standard errors associated with smaller subgroup sizes, although this effect is sometimes difficult to separate from the similar influence of a higher ratio (examples (b) and (i)) or, more rarely, a lower percentage (as perhaps seen in example (e)). In example (f), where the percentages are reasonably similar across the subgroups and the ratios are not in a very broad range, the size of the standard errors increases as subgroup size decreases; the largest standard error (24.1) is recorded for the smallest subgroup (n=139) and the smallest standard error (9.3) for the largest subgroup (n=586). Example (g) is unusual in that the largest standard error is not seen for the smallest subgroup but for the group with the largest standardised ratio.

Although some general patterns in the size of standard errors can be identified, it is difficult to quantify the relationships and use them to predict the likely size of standard errors in individual analyses because of the complex interaction of the various factors involved.
The examples perhaps illustrate the danger of using rule-of-thumb methods to assess the deviation of ratios from 100. With the subgroup sizes used in Table 4 a possible rule might have been to concentrate on deviations from 100 of +/-15. This approach would have correctly identified nine ratios that were significantly different from 100 on the basis of their estimated standard errors, but would also have identified four that were probably not significant and would have failed to identify six statistically significant ratios. The examples shown indicate that there are difficulties in judging whether individual standardised ratios differ significantly from 100 without estimating standard errors. However, standardised ratios can still offer useful insights into the relationships under investigation when used in conjunction with detailed tables, especially where there are strong associations in the data. It is also apparent that carrying out the calculations for some examples of standardised ratios on a specific survey may give a general indication of the size of standard errors for the size of subgroups being considered; this may then be used as a guide in other analyses.

3.2 Making allowance for the complex survey design

The estimated standard errors discussed so far and shown in Table 4 are calculated on the simplifying assumption of a simple random sample design. Since most surveys carried out by SSD use a multi-stage probability sample design, involving both stratification and clustering, standard errors calculated assuming a simple random sample will tend to underestimate the true values. A method of estimating true standard errors for standardised ratios, allowing for the complex sample design, has been developed using Epsilon, SSD's in-house package for calculating sampling errors. The method first requires that a set of relatively complex variables is derived from the raw data; for the following examples these variables were derived in SPSS.
True standard errors were calculated for a small number of examples from a survey in which the primary sampling units were postcode sectors and with an achieved sample of about 20 households per cluster. Table 5 compares the resulting true standard errors with the estimated standard errors for the same ratios assuming a simple random sample design (srs). The comparison of results from the two methods of estimation will, of course, vary according to the nature of the variables of interest and the specific details of the sample design.

The results in Table 5 show that the two estimates of standard error were, in most cases, very close. As expected, the standard error assuming a simple random sample was usually slightly smaller than the estimate of the true standard error, although there were examples where the reverse was true. In four fifths of the cases shown, the ratio of the estimated true standard error to the srs standard error was between 0.9 and 1.1, and in only one case (5% of the total) was the ratio as great as 1.2. With differences of this order of magnitude it is unlikely that use of the srs standard error rather than the true standard error would affect the interpretation of results.

Table 5 Standard errors for standardised ratios: comparison of estimates assuming a simple random sample and estimates allowing for complex design

                        Sample subgroup
                        1      2      3      4
(a) Percentage          40%    31%    58%
    Sample size         1028   334    130
    Standardised ratio  98     98     113
    srs standard error  1.9    6.7    7.5
    True standard error 2.0    6.7    6.9
(b) Percentage          38%    29%    53%
    Sample size         1074   286    389
    Standardised ratio  98     93     106
    srs standard error  2.2    7.4    4.0
    True standard error 2.5    7.6    4.1
(c) Percentage          88%    81%    72%    71%
    Sample size         585    139    456    274
    Standardised ratio  109    102    93     90
    srs standard error  1.6    3.9    2.1    3.2
    True standard error 1.6    3.8    2.1    3.9
(d) Percentage          83%    78%    68%    66%
    Sample size         598    256    442    382
    Standardised ratio  108    107    92     92
    srs standard error  1.7    3.1    2.4    2.8
    True standard error 1.6    3.3    2.6    2.9
(e) Percentage          12%    20%    14%    27%
    Sample size         199    240    488    552
    Standardised ratio  57     94     110    111
    srs standard error  10.1   10.4   9.4    5.3
    True standard error 9.2    11.7   9.4    5.4
(f) Percentage          4%     13%    12%    13%
    Sample size         177    339    347    877
    Standardised ratio  33     100    130    107
    srs standard error  11.2   12.3   16.3   6.2
    True standard error 12.5   11.2   15.3   5.9

In general terms the effect of a complex sample design on the accuracy of survey estimates may be measured by the design factor (deft), which is the ratio of the estimated true standard error to the standard error assuming a simple random sample. If summary values of deft can be identified for specific topic areas within a survey, then the appropriate deft value may be applied as a multiplier to the srs standard error to give an approximate estimate of the true standard error. This approach might also be used for standard errors of standardised ratios and would reduce still further the chance that a ratio might be identified as significantly different from 100 when the difference could be explained by sampling error.
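The deft values underlying the discussion above are straightforward ratios. The following sketch (illustrative only, not from the paper) computes defts from the example (a) and (b) figures in Table 5; values near 1 indicate that the srs approximation is adequate for those ratios.

```python
# deft = true standard error / srs standard error, from Table 5.
pairs = [  # (srs standard error, true standard error)
    (1.9, 2.0), (6.7, 6.7), (7.5, 6.9),   # example (a)
    (2.2, 2.5), (7.4, 7.6), (4.0, 4.1),   # example (b)
]

defts = [true / srs for srs, true in pairs]
print([round(d, 2) for d in defts])

# An approximate true standard error for other ratios on the same survey
# could then be obtained as deft * srs standard error.
```

Most of these defts lie between 0.9 and 1.1, consistent with the text's observation that the two estimates are usually very close.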
The method of calculating true standard errors is relatively complex and time-consuming, even with access to a customised package for calculating standard errors, and researchers would not usually be in a position to adopt this approach for a large number of analyses using standardisation. A more realistic approach is to carry out the simpler calculations for

standard errors assuming a simple random sample and then perhaps to adjust by a suitable value of deft to allow for the complex sample design.

4. Conclusions

Indirect standardisation is a useful technique in survey analysis to summarise relationships after taking account of the confounding effects of a third variable. In SSD it has been used particularly to control for the effects of age when investigating associations with various health measures. The method should not, however, be seen as an alternative to consideration of the full detail of the appropriate three-way tables.

Full interpretation of the resulting standardised ratios depends on having an estimate of the standard errors associated with the ratios. Calculation of true standard errors, which take account of the complex sample design, is a relatively complicated procedure and is not likely to be feasible on most surveys. Calculation of standard errors under the assumption of an srs design can be effected by means of a simple spreadsheet and might be considered a more realistic option. The examples presented above suggest that the results of this simpler calculation are, in general, a useful guide to the true standard errors, although they tend to be slightly smaller. Even where standard errors are not calculated, standardised ratios can provide a useful summary of relationships when used in conjunction with detailed tables. If standardisation is being used for a number of analyses on the same survey then it may be possible to judge which ratios are sufficiently different from 100 to be worthy of comment; estimating standard errors for a small number of examples may help to provide more specific guidance. The examples presented in this paper show that standard errors tended to be larger where the overall percentage of cases with the attribute was small.
For the categories included in a specific analysis, standard errors tended to be larger where the subgroup sample size was relatively small, and where the standardised ratio was large or the observed percentage was small in comparison with other groups.

References and notes

1. Marsh, C. (1988). Exploring data. Polity.
2. Fleiss, J.L. (1981). Statistical methods for rates and proportions. Wiley.
3. Acknowledgements to Dave Elliot in SSD's Methods and Sampling Branch, who carried out the theoretical work on the estimation of standard errors.
4. The variance of a standardised ratio (R_j) is given by the following formula: where p is the observed proportion, w is the number of cases as a proportion of the total sample size N, i represents a stratum (age band) and j represents a subgroup.
5. Wolter, K.M. (1985). Introduction to variance estimation. Springer-Verlag.

A clinical experiment in a field setting: the design of the Dinamap calibration study
Keith Bolling, Joy Dobbs and Charles Lound

1. Introduction

One of the primary aims of the Health Survey for England is to measure changes in blood pressure levels in the population over time, and to produce comparisons of blood pressure between sub-groups. The Dinamap 8100 automatic blood pressure measuring device was selected for use in the survey because it is robust, portable, easy to use, and has been shown to produce valid results. However, a small clinical study (O'Brien et al., 1992) suggested that, compared with the more traditional mercury sphygmomanometer, the Dinamap tends to overestimate systolic blood pressure in the lower BP ranges and underestimate systolic blood pressure in the higher BP range. Diastolic blood pressure tends to be underestimated at all BP levels.

In order to calibrate the blood pressure results produced on the main Health Survey it was decided that a larger study should be carried out under field conditions to find the relationship between the Dinamap 8100 and a mercury sphygmomanometer. This paper examines the issues involved in setting up and carrying out the Dinamap study over a four month period. It concentrates on two design aspects of the study: the sample and the fieldwork.

2. Measuring blood pressure

Various devices are now available for measuring blood pressure, all of which give slightly different readings. Until recently not much has been known about how different devices compare. Before selecting the Dinamap 8100 for use on the Health Survey, careful consideration was given to the merits of various measuring devices in an epidemiological setting. The traditional device for measuring blood pressure is the mercury sphygmomanometer, commonly used in doctors' surgeries and hospitals.
It is based on the auscultatory principle, whereby the observer deflates the cuff at about 2mmHg per second while listening for Korotkoff sounds through a stethoscope placed over the brachial artery. At the first regular appearance of sounds (Korotkoff phase I) the systolic pressure is read off the column of mercury. Cuff deflation continues until the sounds finally disappear (Korotkoff phase V), at which point the diastolic pressure is read off the column of mercury. The main disadvantages of the mercury sphygmomanometer in epidemiological studies such as the Health Survey are observer variance, movement artefacts, and background noise. The Dinamap 8100 is one of a number of modern, automatic devices now available on the market. This device is based on the oscillometric principle and measures vibrations in the artery. Cuff deflation is controlled automatically and readings are recorded and displayed on a digital screen. Unlike monitors based on the auscultatory principle, the Dinamap 8100 allows, to a large degree, the standardisation of the procedure and the minimisation of observer variance. Previous studies in both the U.S. and the U.K. (Ornstein et al., 1988; Whincup et al., 1992) have concluded that Dinamap models give blood pressure readings which can be compared across devices and time.

3. Sample design
The twin constraints of nurse availability and time/costs meant that the study was restricted to eight areas of the country. The areas selected were spread between the Regional Health Authorities and covered different types of area (e.g. inner city, rural, suburban). One area was specifically selected because it had a high proportion of retired people. Given the nature of the study and the desire to minimise travelling time, the areas were more clustered than on most surveys, with approximately 1 in 20 addresses in each study area being sampled.
Since the main Health Survey is particularly interested in monitoring the proportion of people in the higher blood pressure ranges, the study had to ensure sufficient numbers of respondents with high blood pressure. Analysis of the 1991 Health Survey data indicates that only one in twenty adults has a systolic blood pressure over 180mmHg, and that most of these are aged over 55. With a target of 1000 adults it was clear that a simple random sample would not produce enough respondents with higher blood pressure. Consequently it was decided to use age as the best surrogate for higher blood pressure and to oversample those aged 55 and over.

3.1 Method of oversampling
Some very simple rules were devised to ensure oversampling of those aged 55+. Because the interviewers had to visit a large number of addresses in a short period of time, it was important that these rules could be applied easily and quickly in the field.

Table 1 Households in 1991 GHS by number of adults aged 16-54 and number of adults aged 55+

Number of adults       Number of adults in household aged 55+
in household
aged 16-54           0      1     2    3   All 1-3   Row total      %
0                    0   1425   793    9     2227        2227    22.4
1                 1630    422    65    1      488        2118    21.3
2                 3943    137    13    0      150        4093    41.1
3                 1010     33     0    0       33        1043    10.5
4+                 464     10     0    0       10         474     4.7
Column total      7047   2027   871   10     2908        9955
%                 70.8   20.4   8.7  0.1               100.0

The rules were as follows:
- If anyone in the household was aged 55 or over then all the adults in the household would be included in the sample (irrespective of age).
- If nobody in the household was aged 55 or over then the interviewer would apply a simple selection procedure to decide whether that household should be included in the sample, so that only 1 in r households containing nobody aged 55+ would be sampled.

3.2 Sampling Fraction
Figures from the 1991 GHS provided an estimate of the number of households in Great Britain and their composition (see Table 1). From the table it can be seen that 71% of households in GB contain only adults in the age range 16-54. Such households have an average of 2.06 adults. The remaining 29% of households contain at least one adult aged 55 or over and have an average of 1.31 adults in the older age group and 0.32 adults in the younger age group. Using these figures as a base, we would expect households containing only the younger age group to yield a sample of:

(0.71 x n x 2.06/r) younger adults

where n is the sample size and r is the sampling fraction.
Similarly, we would expect households containing at least one adult aged 55 or over to yield a sample of:

(0.29 x n x 1.31) older adults and (0.29 x n x 0.32) younger adults

Combining these two would produce a sample as follows:

adults aged 16-54 = n[(0.71 x 2.06/r) + (0.29 x 0.32)]
adults aged 55+ = n(0.29 x 1.31)

As the sampling fraction r varies, so the sample size and the proportion of adults in the lower age group vary. Table 2 shows what would be expected given different values of r.

Table 2 Sample size per household sampled and percentage of adults aged 55+

r   No. of adults per household sampled   % of adults aged 55+
1                 1.934                            20
2                 1.205                            32
3                 0.962                            40
4                 0.840                            45
5                 0.767                            50
6                 0.719                            53

These calculations suggested that a sampling fraction of 2 or 3 was most appropriate. Although a sampling fraction of 3 yields a higher proportion of respondents in the 55+ age group, it also yields a considerably smaller number of adults, on average, per household, thus requiring a larger number of initial contacts to obtain the overall target of 1000 adults. Therefore, in selecting the sampling fraction, the importance of having a large number of respondents over 55 had to be balanced against the workload.
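The Table 2 figures can be reproduced (to within rounding of the published inputs) directly from the 1991 GHS composition figures quoted above. This is an illustrative sketch; the function name is ours, not from the original study.

```python
# Expected yield per household contacted, using the 1991 GHS figures:
# 71% of households contain only adults aged 16-54 (2.06 adults on
# average); the remaining 29% average 1.31 older and 0.32 younger adults.
def expected_yield(r):
    """Adults per household contacted, and % aged 55+, when
    younger-only households are subsampled at 1 in r."""
    younger_only = 0.71 * 2.06 / r      # adults from younger-only households
    aged_55plus = 0.29 * 1.31           # adults 55+ from households with someone 55+
    aged_16_54 = 0.29 * 0.32            # younger adults from those same households
    total = younger_only + aged_55plus + aged_16_54
    pct_55plus = 100 * aged_55plus / total
    return total, pct_55plus

for r in range(1, 7):
    total, pct = expected_yield(r)
    print(f"r={r}: {total:.3f} adults per household, {pct:.0f}% aged 55+")
```

The trade-off in the text is visible in the output: as r grows the 55+ share rises, but the overall yield per household contacted falls.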

Table 3 Initial sample size required to achieve a sample of 1000 adults

Initial hhld   Eligible hhld   No. of resp.        No. of persons   Persons    Persons
sample         sample          hhlds           r   sampled          aged 55+   aged 16-54
1420           1250             840            2   1012             321        543
1490           1310             880            2   1060             336        572
1560           1370             920            2   1109             352        593
1625           1430             960            2   1157             367        620
1690           1490            1000            2   1205             382        646
1690           1490            1000            3    962             421        528
1760           1550            1040            3   1000             398        549
1840           1610            1080            3   1039             414        570
1900           1670            1120            3   1077             429        591

3.3 Initial Sample Size
Having examined the impact of different sampling fractions, a few further assumptions had to be made to determine the initial sample size. First, an ineligibility rate of 12% was assumed, in line with most other surveys. Second, a response rate of 67% was assumed, slightly lower than the 71% response for measuring blood pressure on the main Health Survey. There were several reasons for this cautious estimate of response: unlike the Health Survey, no interview was carried out prior to the measurements; it was unclear how respondents would react to six blood pressure measurements being taken, particularly as no pilot work had been carried out to assess people's reactions; and the sample contained a higher proportion of older people. Taking together the assumptions on household composition and response rates, Table 3 illustrates the number of people we would expect to achieve given different initial sample sizes and sampling fractions. Using these figures it was decided that an initial sample size of 1840 households was most appropriate, using a sampling fraction of 1 in 3. Given a response rate of 67% this would be expected to yield 1040 adults, of which 40% should be in the 55+ age group.

4. Field Design
The previous small study (O'Brien et al. 1992) was carried out in a controlled laboratory setting with a clinical, rather than an epidemiological, approach.
In adopting a field study the challenge was to come up with a design which replicated clinical conditions as far as possible in the field. Thus, from the start the emphasis was on procedures which controlled and regulated the work done by the nurses to ensure a high degree of accuracy and consistency.

4.1 Role of Interviewers
During the study, interviewers were involved at the beginning of each field period. Following an advance letter, interviewers had to make initial contact with households, select households for inclusion in the study according to the criteria outlined above, and make appointments for the nurses. Although consideration was given to using just Health Survey nurses in the study, the advantages of using interviewers were clear. First, interviewers are better trained in approaching members of the public and persuading them to participate in surveys. Second, it was felt that potential respondents were less likely to co-operate if approached initially by two people carrying equipment rather than by just one person. Third, nurses have no training or experience in collecting household information in a standardised way or in applying standard rules according to prespecified selection criteria. Finally, and probably most importantly, given the short time available for fieldwork, using interviewers to screen out ineligibles and refusals and to make appointments would reduce the amount of non-productive time nurses would have to spend in the field. Given interviewer availability and the nature and size of the task, three interviewers were involved in each of the study areas. This required careful co-ordination to ensure that different interviewers were not double-booking nurses. In each area interviewers were each allocated time slots by the Field Officer in which only they could make appointments. Additionally, one interviewer was designated to co-ordinate their work and to liaise with the nurses.

4.2 Role of nurses
In each area the role of the nurse team was to call at the participating households, take blood pressure measurements according to the measurement protocols, and collect some basic demographic information. Two nurses were used, with one taking only Dinamap measurements and the other taking only mercury sphygmomanometer measurements in any particular household. In all cases the aim was to ensure that each nurse did not know the measurements taken by their partner, to minimise the risk of one set of readings being contaminated by the other. A number of studies have shown considerable observer variation in blood pressure measurement using a mercury sphygmomanometer. The British Hypertension Society (BHS) protocol for evaluating blood pressure devices (O'Brien et al. 1990) places considerable emphasis on observer training prior to undertaking any validity test, particularly with respect to the mercury sphygmomanometer. Even if an observer has been trained in the past, re-training just prior to any test is considered essential. For this reason an important element of the study was a training course for the 16 nurses involved. Half a day was used to cover various aspects of the protocol, the schedules, the collection of industry and occupation information, and so on, while a whole day was devoted to retraining nurses in the use of mercury sphygmomanometers. This training was carried out by an external consultant with experience in clinical hypertension, and was held the week before the start of fieldwork.

4.3 Measurement Protocols
Three separate protocols were devised. The first related to the use of the Dinamap 8100 and was similar to the protocol used on the main Health Survey. The second related to the use of the mercury sphygmomanometer and was based mainly on guidelines set out in the BHS protocol.
The third protocol was a general one relating to the way in which the whole set of measurements should be done. In designing the general protocol, rules had to be established which ensured accurate and consistent measurements by all the nurses and were simple to apply in the field. Again, many of the points recommended in the BHS protocol were adopted. The main design factors were as follows:

i. It was decided that for each respondent three pairs of readings would be taken (six in total). Although all six readings were to be recorded, the first pair would be discarded in analysis to minimise the risk of obtaining falsely high readings. This is consistent with procedures on the main Health Survey and follows the recommendation of a BHS working group (Petrie et al. 1986) which concluded that during the first blood pressure reading the body exhibits a defence reaction which causes a temporary increase in blood pressure.

ii. The ideal recommended by the BHS is simultaneous same-arm measurement by the test device (Dinamap) and the standard (mercury sphygmomanometer). However, because the Dinamap 8100 has a rapid-deflation cuff, simultaneous measurement is not possible. Instead a system of sequential measurements was adopted, with all the measurements taken on the same arm. This meant that six separate measurements were taken, alternating between the devices, e.g. mercury, Dinamap, mercury, Dinamap, mercury, Dinamap.

iii. The ordering of measurements was randomised to ensure that any ordering effect was eliminated. A simple rule relating to the address serial number on the measurement schedule was devised to ensure systematic randomisation. If the address serial number of a particular household was even, the nurses would always start with a Dinamap reading and alternate accordingly, giving the sequence: 1 Dinamap, 2 mercury, 3 Dinamap, 4 mercury, 5 Dinamap, 6 mercury. If the address serial number was odd, the nurses would always start with a mercury reading, giving the sequence: 1 mercury, 2 Dinamap, 3 mercury, 4 Dinamap, 5 mercury, 6 Dinamap.

iv. The BHS protocol recommends that separate observers should measure blood pressure in approximately half the subjects to prevent any observer bias. For this reason it was decided to alternate which nurse carried out which measurements. This meant that over the period of the whole study each nurse had to carry out half the Dinamap measurements and half the mercury sphygmomanometer measurements. To achieve this, nurses simply alternated which measuring device they used from household to household.

v. When measuring blood pressure a range of cuff sizes can be used, depending upon arm circumference. Selecting the correct cuff size is very important since the

use of an inappropriate cuff size in relation to mid-arm circumference can produce inaccurate readings. Moreover, a change in cuff size creates a discontinuity in readings. To minimise this problem nurses were given only two different cuffs (compared with three on the main Health Survey) and very precise rules were laid down for which cuff should be used. Before blood pressure measurements were taken, one nurse measured the respondent's mid-arm circumference. If the mid-arm circumference was 31cm or less the adult cuff was to be used; if it was greater than 31cm the large adult cuff was to be used.

vi. Preparing the setting before taking measurements was an important consideration. Although it is clearly impossible to impose absolute conditions in a field setting, the ideal which the nurses should try to follow was laid out. Factors to be considered included making sure the mercury sphygmomanometer was always placed on a flat surface at eye level, trying to reduce background noise as much as possible (e.g. switching off the TV, asking people not to talk), and covering up the Dinamap's digital display so that it could not be seen by people in the room, especially the nurse taking the mercury measurements.

5. Conclusions
The design and fieldwork stages of the Dinamap study have now been successfully completed. With design work in December and January, briefing and training towards the end of January, and fieldwork during February and March, the total time between submission of the initial proposal and completion of the fieldwork was less than four months. Response rates have proved very encouraging and the target of 1000 adults has easily been met, with 52% of the respondents in the 55 and over age group. Analysis of the data is now being carried out and initial results are due towards the end of May.

References
O'Brien, E. et al.
(1990) The British Hypertension Society protocol for the evaluation of automated and semi-automated blood pressure measuring devices with special reference to ambulatory systems, Journal of Hypertension, Vol. 8, pp 607-19
O'Brien, E. et al. (1992) Accuracy of the Dinamap Portable Monitor, Model 8100 determined by the British Hypertension Society Protocol. Unpublished.
Ornstein, S. et al. (1988) Evaluation of the Dinamap blood pressure monitor in an ambulatory primary care setting, The Journal of Family Practice, Vol. 26, pp 517-21
Petrie, J. C. et al. (1986) Recommendations on blood pressure measurement, British Medical Journal, Vol. 293, pp 611-15
Whincup, P. H. et al. (1992) The Dinamap 1846SX automated blood pressure recorder: comparison with the Hawksley random zero sphygmomanometer under field conditions, Journal of Epidemiology and Community Health, Vol. 46, pp 164-69

A brief look at response to postal sifts, surveys and keeping in touch exercises from the viewpoint of the Sampling Implementation Unit
Tracie Goodfellow

In recent years the Sampling Implementation Unit (SIU) has been involved in the administration and management of a wide variety of postal sifts, surveys and keeping in touch exercises, referred to hereafter as postals. This paper does not aim to cover all of the postal work that has been undertaken. Instead six postals have been selected; these demonstrate all of the aforementioned types, as well as showing response to postals in general, response rates over time, and response at various times of year. Also included are some points about postals that have been identified during these surveys and may be useful to note for the future.

1. Explanation of Postal Sift, Survey, and Keeping in Touch Exercises

1.1 Postal Sift
A postal sift involves sending out sift forms to a sample of addresses selected from the Postcode Address File (PAF). These forms are fairly simple and normally ask for the household composition, including age and sex, plus whether the address contains one or more households. Once these forms are returned the information is keyed and a sample can be drawn from those cases showing the required characteristics. Those selected are then followed up with a full survey. Examples in this paper are the Toddlers Dietary Postal Sift and the Day Care Postal Sift.

1.2 Postal Survey
A postal survey would normally use address data from non-PAF sources, although a PAF based postal survey could also be undertaken. For this type of survey the full questionnaire, rather than a sift form, is sent to the address. Only the outcome is recorded by the SIU; all other processing is undertaken by Data Prep and Primary Analysis Branch (PAB). Examples in this paper are the Infant Feeding Survey, the Children's Dental Health Survey and the National Foundation for Educational Research (NFER) Survey.
1.3 Keeping in Touch Exercises
Keeping in touch exercises involve contacting those who were interviewed in a main survey to ensure that they can be contacted in a future follow up survey. Normally these people are contacted on a yearly basis. The form is once again simple and asks whether they have changed name or are likely to be moving and, if so, where to. The example in this paper is the Retirement Follow Up Survey.

2. General Mailing Procedures
In the majority of postal work the addresses are kept on a database. At present most postal work is conducted using the Postal Administration System (PAS), which resides on the VAX. The addresses reach the database either by a file transfer from the PAF, if a PAF based sample, or by the addresses being keyed by Data Prep, if a non-PAF based sample. All addresses keyed by Data Prep require serial numbering by the SIU first. The PAS allows for the production of address and serial number labels, address lists, reminder labels and reports, the production of a final sample if a sift is being used, and the updating of data where necessary. Most postal work has two reminders sent. These reminders are despatched to all who have not replied, at intervals of two weeks. Postals are always despatched and returned by first class mail since a fast turnaround is essential. Despatch normally takes place on a Thursday since the public seem to respond better if they receive the sift/questionnaire near to the weekend. Some postal sifts and surveys have all non-response followed up by interviewers calling. The SIU code the sift forms and questionnaires to show the outcome using a different set of codes to allow comparison of postal and interviewer response.

3. Response

3.1 Toddlers Dietary Sift
Type: Postal sift
Objective: To identify households with children aged 1.5 to 4.5 years of age by means of a one page sift document, asking the age and sex of all those in the household.
Source of sample: PAF based - clustered
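The despatch timetable described above (mail-outs on a Thursday, two reminders at two-week intervals) can be sketched as a small date calculation. This is an illustrative sketch only; the function name and start date are our assumptions, not part of the SIU's actual system.

```python
from datetime import date, timedelta

def despatch_schedule(earliest):
    """First Thursday on or after `earliest`, then two reminders
    despatched at two-week intervals (as described in section 2)."""
    days_to_thursday = (3 - earliest.weekday()) % 7   # Monday=0, Thursday=3
    first = earliest + timedelta(days=days_to_thursday)
    return [first, first + timedelta(days=14), first + timedelta(days=28)]

for d in despatch_schedule(date(1993, 2, 1)):
    print(d.strftime("%A %d %B %Y"))
```

With an early-February 1993 start, the two reminder dates produced by this rule (18 February and 4 March) happen to match the NFER reminder dates quoted later in this paper.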

Table 1 Toddlers Dietary Survey Postal Sift: replies by wave and stage

Wave   Resp. to orig. M/O   Resp. to 1st rem.   Resp. to 2nd rem.   Resp. to int. call
           No.      %           No.      %          No.      %          No.      %
1        2,660    38%         1,127    16%        1,419    20%        1,794    26%
2        2,415    35%         1,520    22%        1,502    21%        1,563    22%
3        2,922    42%         1,604    23%          884    12%        1,590    23%
4        2,632    38%         1,422    20%        1,352    19%        1,594    23%
Tot.    10,629    38%         5,673    20%        5,157    18%        6,541    23%

a) The base used for the calculations for each wave was the set sample of 7,000.
b) All postal response figures include: completed sift forms, Post Office (PO) returns, refusals.
c) The response to the interviewer call includes all non-contacts, hence each wave totals 100%.
d) M/O = mail out.

Size: At each wave 7,000 addresses
Timing: Wave 1 February-March 1992; Wave 2 May-June 1992; Wave 3 August-September 1992; Wave 4 November-December 1992
Reminders: Two at each wave, plus interviewer follow up to non-response

Points to note:
a) Tables 1 and 2 show the response to this sift.
b) The response at wave 1 was lower. This was a direct result of the decision not to send a sift form with the first reminder. All other waves had sift forms sent at both reminder stages and the response increased markedly (see Table 1).
c) Wave 3 generated a much higher response to the postal at the original mail out and first reminder stages than the other waves, although the final response to the postal was similar to that at waves 2 and 4 (see Table 2). This may have been due to the timing of this wave, although other factors, such as the areas selected, cannot be ruled out.

Table 2 Toddlers Dietary Survey Postal Sift: positive response rates and non-response rates

Wave   Postal resp.    Int. resp.     Non-resp.       Total
          No.     %      No.    %      No.    %       No.      %
1       4,518   65%    1,150  16%    1,332  19%     7,000   100%
2       4,940   71%    1,007  14%    1,053  15%     7,000   100%
3       4,934   71%      984  14%    1,082  15%     7,000   100%
4       4,890   70%      900  13%    1,210  17%     7,000   100%
Tot.   19,282   69%    4,041  14%    4,677  17%    28,000   100%

a) Response = completed sift form.
b) Non-response = PO returns, refusals, no reply to interviewer/postal, non-contact by interviewer.

3.2 Day Care
Type: Postal sift
Objective: To identify households containing children under the age of eight, using a one page

sift document asking the age and sex of members of the household.
Source of sample: PAF based sample - clustered
Size: Initially 27,200 addresses, boosted by a further 9,600
Timing: July-September 1990
Reminders: Two postal reminders, no interviewer follow up to non-response

Points to note:
a) Sift questionnaires were mailed out with all reminders.
b) Response by post was higher than for the Toddlers sift: 78% were returned by post for this sift compared with 69% on the Toddlers sift, whose response was boosted by interviewer follow up.

Table 3 Day Care Postal Sift: replies by wave and stage

Wave   Resp. to orig. M/O   Resp. to 1st rem.   Resp. to 2nd rem.   Total resp.
           No.      %           No.      %          No.     %          No.      %
1       13,080    48%         5,753    21%        2,510    9%       21,343    78%
2        4,712    49%         1,897    20%          897    9%        7,506    78%
Tot.    17,792    48%         7,650    21%        3,407    9%       28,849    78%

Figures quoted include completed returns, PO returns and refusals.

3.3 Infant Feeding, 1990 Great Britain sample
Type: Postal survey
Objective: To collect information about experiences of infant feeding from mothers of babies at three stages, from the age of six weeks to nine months, using a sixteen page questionnaire.
Source of sample: Draft birth registrations, clustered by registration sub-districts or groups of sub-districts
Size: At stage one 9,064 births; at stage two 7,950 babies; at stage three 7,139 babies
Timing: Stage 1 October-November 1990; Stage 2 January-February 1991; Stage 3 June-July 1991

Table 4 Infant Feeding Postal Survey, 1990 Great Britain sample: positive response rates and non-response rates

Stage   Postal resp.    Int. resp.     Non-resp.       Total
          No.     %      No.    %       No.    %       No.      %
1       7,150   79%      800    9%    1,114   12%     9,064   100%
2       6,336   80%      803   10%      811   10%     7,950   100%
3       5,577   78%        -    -     1,562   22%     7,139   100%
Tot.   19,063   80%    1,603    7%    3,487   14%    24,153   100%

Source: OPCS, 1990 Infant Feeding Survey report

a) Response = completed questionnaire.
b) Non-response = no baby, PO returns, refusals, no reply to interviewer/postal, non-contact by interviewer.
c) Response rates are calculated on the set sample at each stage.
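The internal consistency of Table 4 can be checked directly, since each stage's postal, interviewer and non-response counts should sum to the set sample for that stage. This is an illustrative sketch using the figures quoted in the table above.

```python
# Table 4 figures: (postal resp., interviewer resp., non-resp., set sample).
stages = {
    1: (7150, 800, 1114, 9064),
    2: (6336, 803, 811, 7950),
    3: (5577, 0, 1562, 7139),   # no interviewer follow up at stage 3
}

for stage, (postal, interviewer, nonresp, total) in stages.items():
    # Counts should account for the whole set sample at each stage.
    assert postal + interviewer + nonresp == total
    print(f"Stage {stage}: postal response {postal / total:.0%}")
```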

Table 5 Children's Dental Health Postal Survey: replies by stage

Resp. to orig. M/O   Resp. to 1st rem.   Resp. to 2nd rem.   Resp. to int. call
    No.      %           No.      %          No.      %          No.      %
  2,254    40%         1,328    23%          988    17%        1,096    19%

a) The base used for the calculations was the set sample of 5,666.
b) All postal response figures include: completed questionnaires, PO returns, refusals.
c) The response to the interviewer call includes all non-contacts, hence the total is 100%.

3.3 Infant Feeding (continued)
Reminders: Two at each stage, plus interviewer follow up to non-response at stages 1 and 2
Points to note:
a) The sample declined over time due to refusals at each stage. However, the actual postal response remained fairly stable.

3.4 Children's Dental Health
Type: Postal survey
Objective: To collect background information about children who had been dentally examined in school, via a ten page questionnaire.
Source of sample: Pupils on school registers in selected schools
Size: The set sample was 5,811; however, 145 dropped out of the sample because the school was told that the child had refused the dental examination or because the child had left school prior to the postal. All figures are therefore based on 5,666.
Timing: February-March 1993
Reminders: Two, plus interviewer follow up to non-response

Table 6 Children's Dental Health Postal Survey: positive response rates and non-response rates

Postal resp.    Int. resp.     Non-resp.       Total
  No.     %      No.    %       No.    %       No.      %
4,516   80%      682   12%      468    8%     5,666   100%

a) Response = completed questionnaire.
b) Non-response = PO returns, refusals, no reply to interviewer/postal, non-contact by interviewer.

3.5 National Foundation for Educational Research Personality Test
Type: Postal survey
Objective: To collect completed personality tests from a sample of adults. It was estimated that the test would take about one hour to complete.
Source of sample: Omnibus Survey; individuals aged 16-64 selected by the interviewer from those who had responded to the Social Survey Division Omnibus Survey. The test was left with respondents, who were asked to return it by post in a pre-paid envelope.
Size: 1,506
Timing: January-March 1993; interviewer placement 18-28 January 1993; 1st reminder despatched 18 February 1993

Reminders: Two (the second despatched on 4 March 1993), but due to interviewer placement a longer time than normal was allowed between the placement and the first reminder.

Table 7 NFER Postal Survey: positive response by stage and non-response

Resp. to left qu.   Resp. to 1st rem.   Resp. to 2nd rem.   Total resp.   Total non-resp.
   No.      %           No.     %           No.     %          No.     %      No.     %
 1,226    81%            61    4%            88    6%        1,375   91%     131    9%

a) Response = completed questionnaire.
b) All non-response were no replies.

Points to note:
a) A £5 incentive payment was offered to all who returned a completed questionnaire.
b) First contact was by an interviewer, who asked the Omnibus respondent if they would be willing to complete a further questionnaire. This meant that the element of ineligibility was ruled out and that those asked had already responded to the Omnibus Survey (80% of the set sample responded to the Omnibus).
c) Further NFER surveys using the Omnibus as first contact will be carried out shortly without the £5 incentive payment, although only to respondents in non-manual occupations. It will be interesting to see how response to these compares with the very high response rate achieved with the incentive payment.

3.6 Retirement Survey
Type: Keeping in touch exercise
Objective: To keep in touch with a sample of those aged 55 to 69 years of age who were interviewed on a survey of retirement plans. Each person was sent a one page questionnaire to complete.
Source of sample: Originally a postal sift, then full interview
Size: Original follow up size 3,543
Timing: Year 1 May-June 1991; Year 2 May-June 1992; Year 3 May-June 1993
Reminders: One

Points to note:
a) The 1993 survey is not yet complete, hence response rates are only those to date.
b) The number of movers found by this exercise was minimal, at 14.
c) Between years some of the sample were lost due to deaths, which were notified to the SIU largely by the National Health Service Central Register. The number of deaths on this survey was high due to the age of the population covered.

4. Summary

4.1 Postal Sifts
The Toddlers sift yielded a response rate of 69% on average over all four waves. This was boosted to 83% by the use of an interviewer follow up. There was an indication that response to the reminders was lowered if a further sift form was not included. The Day Care sift, which had no interviewer follow up, achieved a response rate of 78%; however, this figure includes PO returns and refusals.

4.2 Postal Surveys
The two postal surveys reported in this paper with two reminders and a questionnaire included at all stages, but with no incentive payment, yielded an average actual response rate of 80%.

Table 8 Retirement Keeping in Touch Exercise: positive response by year and stage

Year    Resp. to orig. M/O   Resp. to 1st rem.   Total resp.   Total non-resp.
            No.      %           No.      %         No.     %      No.     %
1991      2,087    59%           766    22%       2,853   81%     690   19%
1992      2,245    68%           444    13%       2,689   81%     621   19%
1993      1,941    63%           449    15%       2,390   78%     693   22%

a) Response = completed form.
b) Non-response = PO returns, refusals, no reply to postal, death.

The NFER survey yielded a 91% response rate, which was considerably higher and was due to a combination of first contact being made by an interviewer and the £5 incentive payment on completion of the questionnaire.

4.3 Keeping in Touch Exercise
The total response with one reminder only is 81%. However, this is from an elderly population who tend to respond more quickly and readily. A similar exercise with a younger population may vary considerably.

4.4 Future Postals
I hope to include a small paper in the next bulletin following up some of the issues raised in this paper and exploring the use of postals in constructing sampling frames.

Telephone ownership north of the Caledonian Canal
Andrea Cameron

The enhanced Labour Force Survey (LFS) use of a completely unclustered sample presents problems for the Northern Scotland area, north of the Caledonian Canal (NOCC). The geographical spread of the population means face-to-face interviewing is impractical. Rather than revert to a clustered sample, telephone interviewing was chosen as a way of maintaining relatively low costs together with the benefits of an unclustered sample. The main sampling frame for telephone surveys, the published directories, is incomplete because of the increasing incidence of ex-directory numbers. According to BT (1989)¹, 25% of telephone users were ex-directory in 1989, and this figure tends to be even higher in urban areas. Figures from the 1989 General Household Survey indicate that 89% of households are on the telephone. Interviewing by telephone, sampling from the published directories, therefore effectively excludes 33% of the sample population. In an attempt to counteract this bias for the LFS, a random digit dialling experiment was carried out in the NOCC area. This experiment achieved a low final response rate (72%) and low productivity, and therefore could not be considered a serious alternative to directory sampling for the LFS. The problems encountered in this experiment prompted the need for more information about telephone ownership in the NOCC area, in order to justify the use of an unclustered sample there. A postal survey of 1,000 addresses, randomly selected from the Post Office's Postcode Address File (PAF), was chosen as the quickest and most economical way of obtaining this information. As well as establishing the level and type of telephone ownership, this survey would also reveal more detailed information on the numbers of households listed, or expecting to be listed, in the published directories and on ex-directory households.
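The 33% exclusion figure quoted above follows directly from the two source statistics: directory coverage is the proportion of households on the telephone multiplied by the proportion of those that are listed. A minimal sketch:

```python
# Source figures quoted in the text: 89% of households on the telephone
# (GHS 1989); 25% of telephone users ex-directory (BT 1989).
phone_ownership = 0.89
ex_directory = 0.25

directory_coverage = phone_ownership * (1 - ex_directory)
excluded = 1 - directory_coverage
print(f"Covered by published directories: {directory_coverage:.1%}")
print(f"Effectively excluded:             {excluded:.1%}")
```

This gives coverage of about 66.8%, i.e. roughly a third of households effectively excluded, matching the figure in the text.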
Postal surveys have generally tended to attain low response rates in comparison with other survey methodologies. In order to maximise response, a number of measures were adopted from Scott (1961)[2]:

1. A short questionnaire was printed on the reverse side of the covering letter.
2. The questionnaire was kept anonymous.
3. All questionnaires were accompanied by a reply-paid envelope.
4. Two follow-up letters were sent, the second accompanied by a further questionnaire and reply-paid envelope.

Response Rates

1,000 letters were sent out to addresses taken from the Postcode Address File (PAF). Replies received from 77 of these indicated that the addresses were ineligible. It is, however, known that about 12% of PAF addresses are ineligible, and we have therefore taken 880 to be the effective sampling base. From this we received 761 usable replies, a response rate of 86%; 13% were non-contacts, and 4 replies were refusals to take part.

Level of Telephone Ownership

85% of responding households in the NOCC area were on the telephone at the time of this survey, compared with the 89% reported by the GHS in 1989. This figure adds support to earlier arguments (Collins, 1987)[3] that levels of telephone ownership tend to be lower in rural than in urban areas.

Of responding households who were on the telephone, 84% stated that they were listed in the current published directory at the time of our survey. Only 8% claimed to be ex-directory, compared with BT's national average of 25% in 1989. This supports Collins' argument that the incidence of ex-directory numbers will be higher in urban areas. Fifteen per cent of households were not listed in the published directories at the time of our survey: in addition to the 8% of ex-directory households, a further 7%, while not ex-directory, were due to appear in a directory not yet published. (One per cent did not know whether they were listed in the directory.)

This figure indicates that the bias introduced by ex-directory numbers when sampling from the published telephone directories is only part of the story: we should also take into account those households waiting to be listed in the directories. Interestingly, we found no difference in levels of telephone ownership between respondents who said they lived in a city or town, a village, or a rural area.
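The figures reported above are mutually consistent; a minimal sketch of the arithmetic (illustrative only), which also reproduces the approximately-28% exclusion figure given in the conclusions:

```python
# Illustrative arithmetic reproducing the survey figures reported above.

# Response to the postal survey.
issued = 1000            # letters sent to PAF addresses
ineligible_share = 0.12  # known proportion of ineligible PAF addresses
effective_base = issued * (1 - ineligible_share)  # taken as the sampling base
usable = 761             # usable replies received
refusals = 4

response_rate = usable / effective_base
non_contacts = (effective_base - usable - refusals) / effective_base
print(f"Effective base: {effective_base:.0f}")  # prints "880"
print(f"Response rate:  {response_rate:.0%}")   # prints "86%" (86.5% unrounded)
print(f"Non-contacts:   {non_contacts:.0%}")    # prints "13%"

# Coverage of a directory-based frame in the NOCC area.
on_phone = 0.85  # responding households on the telephone
listed = 0.84    # of those, listed in the current published directory
excluded = 1 - on_phone * listed
print(f"Excluded from the frame: {excluded:.1%}")  # prints "28.6%", roughly 28%
```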

Conclusions

The results of this survey indicate that telephone interviewing using the published directories as a sampling frame in the NOCC area will exclude approximately 28% of the population (including movers) from the effective sampling base. This figure, although lower than the UK average of 33% calculated from the BT and GHS sources above (which may not include movers), still introduces a serious amount of bias into any telephone survey. Until further advances are made in methods of random digit dialling, and more up-to-date information on the level and type of telephone ownership is readily available, other methods may have to be adopted to counteract this bias, such as weighting. At present, however, the telephone directory sample is being used in the LFS for the NOCC area, because it is judged that the benefit of an unclustered sample for this area outweighs the losses from bias.

References

1. British Telecom (1989). The code decoder.
2. Scott, Christopher (1961). Research on Mail Surveys (paper read before the Royal Statistical Society, February 1961). Published by Social Survey Division in the Methodological Series, M100, London.
3. Collins, M and Sykes, W (1987). The problems of non-coverage and unlisted telephone numbers in telephone surveys in Britain. Journal of the Royal Statistical Society A, Vol 150, No 3, pp 241-253.

Recent OPCS Publications

Survey Researcher's Guides

The following volumes are produced by the Social Survey Division of OPCS, and are useful for anyone planning social survey research.

Weighting for non-response: a survey researcher's guide
By Dave Elliott

This is a survey researcher's guide to procedures aimed at correcting the effects of non-response during the analysis and presentation of survey results. Some of the subjects covered are:

* Methods available to eliminate or reduce any bias consequent on non-response.
* A summary of the literature describing the characteristics of survey non-respondents, with particular emphasis on OPCS Social Survey Division's own checks on its continuous surveys.
* Procedures used in compensating for non-response.

Price £5.00 net. ISBN 0 904952 70 3.

This publication is available from Information Branch, OPCS, St Catherine's House, 10 Kingsway, London WC2B 6JP, telephone 071 242 0262 ext 2243 or 2208.

A Handbook for Interviewers
By Liz McCrossan

This manual of social survey practice and procedures for structured interviewing is intended for use by OPCS Social Survey Division interviewers, but is available to the public for purchase from HMSO. Topics covered include:

* Sampling
* Approaching the public
* Questionnaires
* Asking the questions
* Recording the answers
* Classification definitions
* Analysis of a survey

Price £6.75 net. ISBN 0 11 691344 4.

Available from HMSO bookshops or accredited agents, or HMSO Publications Centre, telephone 071 873 9090.

NEW METHODOLOGY SERIES

NM1  The Census as an aid in estimating the characteristics of non-response in the GHS. R Barnes and F Birch.
NM2  FES: a study of differential response based on a comparison of the 1971 sample with the Census. W Kemsley. Stats. News, No 31, November 1975.
NM3  NFS: a study of differential response based on a comparison of the 1971 sample with the Census. W Kemsley. Stats. News, No 35, November 1976.
NM4  Cluster analysis. D Elliot. January 1980.
NM5  Response to postal sift of addresses. A Milne. January 1980.
NM6  The feasibility of conducting a national wealth survey in Great Britain. I Knight. 1980.
NM7  Age of buildings: a further check on the reliability of answers given on the GHS. F Birch. 1980.
NM8  Survey of rent rebates and allowances: a methodological note on the use of a follow-up sample. F Birch. 1980.
NM9  Rating lists: practical information for use in sample surveys. E Breeze.
NM10 Variable quotas: an analysis of the variability. R Butcher.
NM11 Measuring how long things last: some applications of simple life table techniques to survey data. M Bone.
NM12 The Family Expenditure and Food Survey Feasibility Study 1979-1981. R Barnes, R Redpath and E Breeze.
NM13 A Sampling Errors Manual. R Butcher and D Elliot. 1986.
NM14 An assessment of the efficiency of the coding of occupation and industry by interviewers. P Dodd. May 1985.
NM15 The feasibility of a national survey of drug use. E Goddard. March 1987.
NM16 Sampling errors on the International Passenger Survey. D Griffiths and D Elliot. February 1988.
NM17 Weighting for non-response: a survey researcher's guide. D Elliot. 1991.
NM18 The use of synthetic estimation techniques to produce small area estimates. Chris Skinner. January 1993.
NM19 The design and analysis of a sample for a panel survey of employers. D Elliot. 1993.
NM20 Convenience sampling on the International Passenger Survey. P Heady, C Lound and T Dodd. 1993.
Prices:
NM1 to NM12 and NM14 to NM16: £1.50 UK, £2.00 overseas
NM13: £6.00 UK, £7.00 overseas
NM17: £5.00 UK and overseas
NM18 to NM20: £2.50 UK, £3.00 overseas

Orders to: New Methodology Series, Room 304, OPCS, St Catherine's House, 10 Kingsway, London WC2B 6JP