Does telephone number tracing reduce non-response bias in the EU-SILC? A comparison between sample units with and without registered telephone numbers in Iceland and Norway.

Anton Örn Karlsson, Statistics Iceland
Drífa Jónasdóttir, Statistics Iceland
Bengt Oscar Lagerstrom, Statistics Norway

In general, all surveys aim to represent the population as accurately as possible by controlling sources of error that can create bias in the survey estimates or increase their variance (Groves, 1989). As survey response rates are decreasing worldwide (e.g. de Leeuw and de Heer, 2002), more resources are being used to increase response rates and/or to deal with the consequences of non-response (e.g. Bethlehem, Cobben & Schouten, 2011). For surveys applying CATI (computer-assisted telephone interviewing), decreasing response rates increase the effort spent on manual tracing of phone numbers, as phone numbers are an unavoidable prerequisite for conducting CATI surveys. Although this practice can undoubtedly increase the response rate of CATI surveys, it is not entirely clear to what extent it affects the representativity of the final group of respondents or decreases non-response bias in the estimators, since the response rate is a poor measure of non-response bias (Groves, 2006; Groves & Peytcheva, 2008) and of representativity (Schouten, Cobben & Bethlehem, 2009).

Therefore, the aim of this analysis was to estimate to what extent the manual search for phone numbers of sample units in the European Union Survey on Income and Living Conditions (EU-SILC) affects the representativity of the final sample, and to measure whether manual tracing can rectify possible bias in the survey. To gain further knowledge on this matter and to increase generalizability, results are compared between Norway and Iceland.

Method

The EU-SILC is a longitudinal survey conducted in most of the EU member states as well as in Iceland and Norway, both of which have participated since 2004.
The data used in this analysis were drawn from the first wave of the 2012 EU-SILC. The survey is based on a four-year rotating panel in which data are collected annually and each sample unit participates in a yearly wave for four consecutive years. The target units of the survey are households in each participating country; in Iceland and Norway, individuals are sampled from the national registry and linked to the other members of their household. The main aim of the survey is to increase knowledge of poverty and social exclusion in Europe.
In both Iceland and Norway phone numbers are searched for with two different methods. First, an automatic search of registered numbers for all household members is conducted through a database link to a private firm which is able to provide all information regarding listed phone numbers in Iceland. Secondly, all households which still lack phone numbers are searched for manually by members of staff at Statistics Iceland and Statistics Norway, using every possible means of obtaining contact information (e.g. Facebook, information in registries, workplaces, etc.).

In this analysis, key variables and answers from sample units whose phone numbers were found either automatically or manually were used to examine the differences between the two groups, and more specifically to examine possible non-response bias. These results were then compared between Iceland and Norway, examining both similarities and differences between sample units found with the different tracing methods. All calculations were done using unweighted data from the first wave of the 2012 EU-SILC.

There were two main stages of analysis:

1) Potential bias reduction related to manual tracing was measured. In order to examine whether manual tracing had an effect on non-response bias in the survey, the distribution of answers to key variables was compared between the final group of respondents and respondents whose phone number was listed.

2) The representation of the final group of respondents, compared to respondents whose phone numbers were found automatically and manually, was assessed.

Three steps were taken to assess the representation of the sample with regard to the different methods of searching for phone numbers:

First: The proportion of eligible sample units by gender, age group, ethnicity, marital status and education was compared between those with listed and unlisted phone numbers.
Second: A logistic regression model was fitted in order to predict the likelihood of locating phone numbers, using gender, age group, ethnicity, marital status and education as independent variables.

Third: The R-indicator was calculated to compare the representation of the final group of respondents with that of respondents whose phone number was registered. The R-indicator is an index of the representation of the final group of respondents which ranges from 0 to 1 (1 = total representation), where representation refers to how closely the distribution of selected variables in the final group of respondents matches known distributions in the population.
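The definition behind the third step can be illustrated with a short sketch. Following Schouten, Cobben & Bethlehem (2009), the R-indicator is R = 1 - 2S, where S is the standard deviation of the estimated response propensities. The sketch below is only an illustration under simplifying assumptions and is not the procedure used for the results reported here: propensities are estimated as plain response rates within the categories of a single auxiliary variable rather than from a regression model, the population standard deviation is used, no weights are applied, and the function names and toy data are hypothetical.

```python
from statistics import pstdev


def group_propensities(groups, responded):
    """Estimate each unit's response propensity as the response rate
    observed in its group (an assumed, deliberately simple estimator)."""
    totals, hits = {}, {}
    for g, r in zip(groups, responded):
        totals[g] = totals.get(g, 0) + 1
        hits[g] = hits.get(g, 0) + int(r)
    return [hits[g] / totals[g] for g in groups]


def r_indicator(propensities):
    """R = 1 - 2 * S, with S the standard deviation of the propensities.
    Using the population standard deviation keeps R between 0 and 1."""
    return 1.0 - 2.0 * pstdev(propensities)


# Hypothetical toy sample: 6 native and 4 foreign-born units.
groups = ["native"] * 6 + ["foreign"] * 4
responded = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

rho = group_propensities(groups, responded)
print(round(r_indicator(rho), 3))
```

If every unit had the same propensity the indicator would equal 1, which is the "total representation" end of the scale; the more the propensities differ between groups, the closer it moves to 0.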
Results and discussion

Table 1 shows the total and first-wave response rates, refusal rates and contact rates for SILC 2012 in Iceland and Norway, according to AAPOR-RR1.

Table 1. Response rates in Iceland and in Norway.

                            Norway (%)  Iceland (%)
Total response rate               55,5         76,3
Response rate 1st wave            63,0         78,5
Total refusal rate                24,2         11,6
Refusal rate 1st wave             20,7         10,0
Total contact rate                86,6         91,8
Contact rate 1st wave             87,9         91,5

Comparing automatically found phone numbers for sample units with the total sample.

In the first wave of data collection, the automatic phone number search was successful for approximately 80% of the cases in Iceland and for 81% in Norway. Adding the manual phone search increased these figures to 98,2% of the sample in Iceland and 87,5% in Norway. Table 2 displays the proportions of different groups, by known auxiliary variables, for the total sample and for sample units with listed phone numbers.

Table 2. Differences between total and listed population variables in Iceland and Norway.

                                  Iceland                             Norway
                       Total (%)  Listed (%)  Difference   Total (%)  Listed (%)  Difference
All                          100         100                     100         100
Gender
  Male                      51,7        50,0        -1,7        51,4        51,3        -0,1
  Female                    48,3        50,0         1,7        48,6        48,7         0,1
Age group
  16-29                     22,8        20,0        -2,8        25,6        21,7        -3,9
  30-44                     26,4        23,8        -2,6        25,3        23,2        -2,1
  45-59                     26,8        29,2         2,4        23,8        25,7         1,9
  60 or older               24,0        27,0         3,0        25,3        29,3         4,0
Ethnicity
  Native                    87,5        92,8         5,3        76,8        80,9         4,1
  Foreign                   12,5         7,2        -5,3        23,2        19,1        -4,1
Marital status
  Married                   45,6        48,9         3,3        42,2        44,5         2,3
  Not married               54,3        51,1        -3,2        57,8        55,5        -2,3
Education
  Low                       39,0        38,8        -0,2        29,3        28,5        -0,8
  Middle                    33,4        33,4         0,0        40,2        42,8         2,6
  High                      22,7        24,3         1,6        30,1        28,6        -1,5
  Unknown                    5,3         3,6        -1,7         0,3         0,1        -0,2
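The Difference columns in table 2 are the listed percentage minus the total percentage, expressed in percentage points, so a positive value means that the group is over-represented among sample units with listed numbers. As a minimal sketch (the function name is hypothetical, and the inputs are the Icelandic ethnicity and youngest age group rows of table 2 written with decimal points), the calculation is:

```python
def pp_difference(total_pct, listed_pct):
    """Percentage-point difference: share among listed units minus
    share in the total sample, rounded to one decimal as in table 2."""
    return round(listed_pct - total_pct, 1)


# Rows taken from table 2 (Iceland): (total %, listed %)
rows = {
    "Native": (87.5, 92.8),
    "Foreign": (12.5, 7.2),
    "16-29": (22.8, 20.0),
}

for label, (total, listed) in rows.items():
    print(label, pp_difference(total, listed))
```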
As can be seen in table 2, in both countries the distribution of background variables differs considerably between the group with listed phone numbers and the total sample. The largest difference was for ethnicity: native-born sample units were more likely to be found by automatic tracing than sample units born abroad, in both Iceland and Norway. There were also substantial differences in both countries for marital status, where married sample units were more likely to have their number listed, and for age, where older sample units were more likely to have their phone number listed. Some differences between the countries emerged with regard to education: in Norway, sample units with a medium level of education were more likely to have their number registered, while in Iceland it was the highly educated sample units that were more likely to have their number registered.

Taken together, this suggests that in both samples the distribution of background variables differed between the total sample and the part of the sample whose phone numbers were listed. The level of representation of the final group of respondents might therefore have been lower without manual tracing. It also seems that highly educated sample units in Norway shy away from registering their phone numbers, something that is not apparent in Iceland and that might have a detrimental effect on the Norwegian response rate, as highly educated individuals are often more likely to respond to sample surveys (Stoop, 2005).

When the representativity of automatically traced and manually traced sampling units was compared, the R-indicator for sample units located by automatic search was 0,7 in Iceland and 0,71 in Norway. It rose to 0,98 in Iceland and 0,74 in Norway when manually traced sample units were added to the group, a large improvement for the Icelandic sample and a modest one for the Norwegian.
Another pattern was found in the representativity of respondents: the final group of respondents had an R-indicator of 0,88 in both Iceland and Norway, while respondents found automatically had an R-indicator of 0,72 in Iceland and 0,80 in Norway. This suggests that the group of respondents found automatically in Norway was more representative of the total sample than the same group in Iceland.

All of this indicates that manual tracing of phone numbers increases the representativity of the final group of respondents, both in Iceland and in Norway. Although the size of the improvement differed between the countries, the final result was the same, with both countries delivering final data with a high level of representativity.

Interestingly, the improvements in representativity were not equal when the effects of specific variables in the response model were examined, as can be seen in table 3. For the Icelandic sample, the representativity of gender, ethnicity and marital status increases when manually traced respondents are added to the final group of respondents. However, there is no discernible effect on representativity with regard to education, and the representativity of the age distribution in the final group of respondents was actually worse than among respondents whose phone numbers were found automatically. In Norway, the representativity increases for age, stays mostly the same for gender, marital status and education, and decreases somewhat for ethnicity. This suggests that manual tracing in Iceland is not effective in increasing representativity on the variable age. The same applies to Norway with regard to ethnicity.

Table 3. Partial R-indicators for Iceland and Norway.

                    Automatic search   Final respondents          Difference
                 partial R-indicator partial R-indicator
Variable           Iceland    Norway   Iceland    Norway   Iceland    Norway
Gender               0,021     0,000     0,006     0,004     0,015    -0,004
Age                  0,002     0,102     0,030     0,081    -0,029     0,021
Ethnicity            0,073     0,082     0,024     0,094     0,048    -0,012
Marital status       0,059     0,005     0,030     0,007     0,029    -0,002
Education            0,018     0,021     0,018     0,018     0,000     0,003

Modeling the likelihood of automatic tracing.

Table 4 shows the results of a logistic regression model used to predict whether the phone numbers of sample units were listed, for Iceland and Norway.

Table 4. Logistic regression model for automatic tracing of phone numbers.
                               Iceland    Norway
Gender (male)                    0,69*     1,00
Age                              1,01      1,05*
Marital status (married)         3,04*     1,12
Ethnicity (Native)               6,32*     4,47*
Education (Medium)               0,92      1,18
Education (High)                 1,25*     0,77*
* p < 0,05

As the results for Iceland show, subgroups differed in whether their phone numbers were generally listed. Males were less likely than females to register their number with the Icelandic telephone registry. Married sample units were about three times as likely as unmarried ones to be successfully located by automatic telephone search. Sample units with an Icelandic ethnicity were over six times more likely to be found by automatic search than other nationalities, and the highly educated were somewhat more likely to be found by automatic search than sample units with low levels of education.

For Norway the situation was different, as only one effect from the Icelandic model had the same direction and was significant: the likelihood of a successful automatic tracing in Norway was much higher for native sample units. Two other variables had a significant effect on the likelihood of a successful automatic tracing of phone numbers. Age had a positive effect, suggesting that automatic tracing was more likely to be successful for older sample units. Sample units with high education levels were less likely than units with lower levels of education to be located automatically, an effect that was in the opposite direction in Iceland. This is consistent with what was noted above about the registration of phone numbers among the highly educated in Norway, which can have a negative effect on the total response rate. In both countries, ethnicity was strongly related to the likelihood of automatic tracing, an effect that has often been found in other research (e.g. Stoop, 2005).

Differences in responses between automatically and manually traced respondents.

Finally, the distributions of responses to selected questions were examined in order to assess whether manual tracing had an effect on the responses and possibly on the non-response bias (see tables 5 and 6). Originally, 23 variables were examined for differences, but to save space only variables with a difference of more than 2 percentage points between groups (either within or between countries) are included in these results. Variables that showed little or no difference included health-related questions, questions on arrears, subjective labour status, etc.
Two variables on labour market status were sensitive (especially in Iceland) to non-response bias because of unlisted phone numbers:

First: Basic activity status, where unemployment was much lower for listed respondents than for unlisted respondents in Iceland, while in Norway the main differences were for retired persons and other inactive persons, of which the former were less likely to be found automatically.

Second: Job search, as listed respondents in Iceland were less likely than the final group of respondents to be actively looking for work, whereas in Norway there was no difference between the groups.

In Iceland, dwelling type and tenure status differed between the final respondents and respondents found by automatic tracing: the percentage living in detached houses decreased, as did the percentage owning their home, while the percentage of apartment dwellers and tenants rose when respondents with manually traced phone numbers were added to the group of respondents. For Norway, no discernible differences were found.
Table 5. Differences, in percentages, in responses to questions on labour status.

                                                                        Iceland                  Norway
                                                                   Total  Listed    Diff   Total  Listed    Diff
Basic activity status
  at work                                                           66,7    66,0     0,7    61,5    61,3    -0,2
  Unemployed                                                         4,5     2,4     2,1     0,3     0,3     0,0
  in retirement or early retirement or has given up business        13,7    15,3    -1,6    10,9    10,1    -0,8
  other inactive person                                             15,2    15,3    -0,1    27,3    28,2     0,9
Actively looking for a job
  Yes                                                               21,6    19,6     2,0     6,8     6,8     0,0
  No                                                                78,4    80,4    -2,0    93,2    93,2     0,0
Capacity to face unexpected financial expenses
  Yes                                                               64,6    68,7    -4,1    90,9    91,1    -0,2
  No                                                                35,4    31,3     4,1     9,1     8,9     0,2
Dwelling type
  detached house                                                    36,1    39,2    -3,1    62,1    62,4     0,3
  semi-detached or terraced house                                   17,6    18,7    -1,1    19,6    19,1    -0,5
  apartment or flat in a building with less than 10 dwellings       13,4    12,2     1,2     4,1     4,0    -0,1
  apartment or flat in a building with 10 or more dwellings         31,8    28,8     3,0    13,8    13,9     0,1
  some other kind of accommodation                                  11,2    10,3     0,9     0,4    0,04    -0,4
Tenure status
  Outright owner                                                    20,0    22,3    -2,3    24,7    25,6     0,9
  Owner paying mortgage                                             57,6    62,0    -4,4    60,7    60,4    -0,3
  Tenant or subtenant paying rent at prevailing or market rate      12,1     7,8     4,3     7,8     7,3    -0,5
  Accommodation is rented at a reduced rate (lower price than
    the market price)                                                8,7     6,6     2,1     0,5     0,4    -0,1
  Accommodation is provided free                                     1,5     1,2     0,3     6,4     6,1    -0,3

Table 6 shows the results for the quantitative variables used from the EU-SILC. For Iceland, the mean was overestimated for both variables among respondents whose telephone number was traced by automatic search, compared to the final group of respondents. For the mean number of hours worked per week in the main job, the overestimation was approximately 2%, while for the mean of the lowest monthly income needed to make ends meet the difference was approximately 4%. In Norway there was a slight underestimation of the number of hours worked (approximately 0,8%) and a slight overestimation of the lowest monthly income needed to make ends meet (approximately 0,8%).
As for the other variables, the differences between automatically traced respondents and the final group of respondents were larger for the Icelandic data than for the Norwegian.

Table 6. Differences in responses to quantitative questions in SILC.

                                                            Iceland                       Norway
                                                     Total    Listed      Diff     Total    Listed      Diff
Mean number of hours usually worked per week          42,5      43,2      -0,7      38,1      37,8       0,3
Mean low monthly income to make ends meet (1)       2449,5    2556,6    -107,1    3264,8    3290,9     -26,1

(1) In euros, based on the exchange rate on January 2nd 2012.
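The relative differences quoted for table 6 can be reproduced from the means in the table. The sketch below is a worked check rather than part of the original analysis: the function name is hypothetical, the values are copied from table 6 with decimal points, and the difference is expressed relative to the mean for the final group of respondents (an assumption that is consistent with the percentages given in the text).

```python
def relative_difference_pct(final_mean, listed_mean):
    """Difference between the listed-only mean and the final-group mean,
    as a percentage of the final-group mean. Positive values mean the
    listed-only group overestimates the final-group mean."""
    return 100.0 * (listed_mean - final_mean) / final_mean


# (final group mean, listed-only mean) from table 6
table6 = {
    ("Iceland", "hours worked"): (42.5, 43.2),
    ("Iceland", "lowest income"): (2449.5, 2556.6),
    ("Norway", "hours worked"): (38.1, 37.8),
    ("Norway", "lowest income"): (3264.8, 3290.9),
}

for key, (final_mean, listed_mean) in table6.items():
    print(key, round(relative_difference_pct(final_mean, listed_mean), 1))
```

Rounded to one decimal, this gives about 1,6% and 4,4% for Iceland and about -0,8% and 0,8% for Norway, in line with the approximate figures reported above.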
Conclusion

By comparing the sample on a set of demographic variables (gender, age, ethnicity, marital status and education), we have shown that manual tracing of telephone numbers increases the representativeness of the sample on these variables. This echoes findings presented by Wilhelmsen and Kleven (2011). Furthermore, we showed with the use of R-indicators that manual tracing improved the representativeness of the part of the sample with known telephone numbers before the start of data collection, although the improvement was only modest for Norway. However, when looking at the respondents in the net sample we found a large improvement in representativeness for both countries, especially for Iceland.

Our last examination was to check whether manual tracing affected the distribution of some key variables in the survey. In general the distribution changed, especially for Iceland. For Norway we found only minor changes. This is also in line with the analysis by Wilhelmsen and Kleven (2011).

Our analyses have given us more insight into how manual tracing of telephone numbers improves the representativity of the net sample on some demographics, but they have also raised further questions about what this means for the outcome of the survey. Do we have the right balance between quality and the effort and time spent on tracing telephone numbers? Statistics Iceland seems to have higher efficiency in its tracing procedures than Statistics Norway, since the impact on the final answer distribution is higher. It is not possible to say which distribution is closest to the truth, but tracing seems to be a better investment in quality at Statistics Iceland.

We also stress that we have only looked at the unweighted distributions. If post-stratification or calibration (or both) is done effectively, we should expect the prediction of the target variables to improve regardless of the bias introduced by non-response.
Still, our recommendation is to try to avoid bias through effective tools and procedures, in this case tracing procedures, before and during data collection. In the future we want to investigate the effect of the different tracing approaches and carry out an analysis of process efficiency (ANOPE) (Thomsen et al., 2007).

References

Bethlehem, J., Cobben, F., & Schouten, B. (2011). Handbook of nonresponse in household surveys. Wiley.

Groves, R. M. (1989). Survey Errors and Survey Costs. New York: John Wiley and Sons.
de Leeuw, E., & de Heer, W. (2002). Trends in household survey nonresponse: A longitudinal and international comparison. In R. M. Groves, D. A. Dillman, J. L. Eltinge, & R. J. Little (Eds.), Survey Nonresponse (pp. 41-54). New York, NY: John Wiley and Sons.

Schouten, B., Cobben, F., & Bethlehem, J. (2009). Indicators for the Representativeness of Survey Response. Survey Methodology, 35, 101-113.

Groves, R. M. (2006). Nonresponse rates and nonresponse bias in household surveys. Public Opinion Quarterly, 70, 646-675.

Groves, R. M., & Peytcheva, E. (2008). The Impact of Nonresponse Rates on Nonresponse Bias: A Meta-Analysis. Public Opinion Quarterly, 72, 167-189.

Stoop, I. A. (2005). The hunt for the last respondent: Nonresponse in sample surveys. Aksant Academic Pub.

Thomsen, I., Kleven, Ø., & Zhang, L-C. (2007). Dealing with non-sampling errors using administrative data. Paper presented at the IAOS Conference, Lisboa 2007.

Wilhelmsen, M., & Kleven, Ø. (2011). Telephone-listed or not Telephone listed, does it makes a difference? Paper presented at the 22nd International Workshop on Household Survey Nonresponse, Bilbao, 5th-7th September 2011.