ONS Methodology Working Paper Series No 4. Non-probability Survey Sampling in Official Statistics
Debbie Cooper and Matt Greenaway, June 2015
1. Introduction

Non-probability sampling is generally avoided in official statistics, often for good reasons: a lack of selection probabilities makes inference from the sample to the population extremely challenging; quality measures such as standard errors are difficult or impossible to calculate; and the official statistics context, in which a wide variety of users may use data for different purposes, does not fit well with non-probability methods. However, because of ever-increasing nonresponse rates, the costs associated with probability sampling, and the ease of carrying out web surveys, some survey researchers have shifted their attention to developing better non-probability sampling and estimation techniques. As there have been numerous developments in this domain, this paper endeavours to raise awareness amongst producers of official statistics of the challenges and developments relating to non-probability sampling.

This paper aims to achieve four main outcomes:
i. Provide a concise review of the types of non-probability samples
ii. Highlight the key challenges associated with non-probability sampling
iii. Increase awareness of techniques available to potentially overcome these challenges
iv. Provide guidance to help inform decision-making on whether a non-probability sample is justified

In order to achieve these outcomes we first identify the characteristics of non-probability sampling and discuss the growing interest in it. Following this is an overview of various types of non-probability sampling techniques. The key challenges associated with non-probability sampling are then highlighted. Next, techniques developed to overcome some of the challenges associated with non-probability sampling and estimation are discussed. This is followed by a section providing guidance on when the use of non-probability sampling is justified.
Finally, a set of recommendations regarding the use of non-probability sampling in official statistics is provided.
2. What constitutes non-probability sampling?

Non-probability sampling has two distinguishing characteristics:
i. one cannot specify the probability of selection for each unit that will be included in the sample
ii. it is not possible to ensure that every unit in the population has a nonzero probability of inclusion (Frankfort-Nachmias and Nachmias, 1996)

In probability sampling, the ability to calculate selection probabilities allows researchers to create design weights which result in an unbiased estimator. Probability samples also allow for representativeness, as each unit in the target population has a nonzero probability of selection, and allow for the estimation of sampling variability; these are crucial advantages. However, non-random nonresponse and undercoverage violate the assumptions of probability sampling, giving probability samples a non-probability element.

Various methods have been developed to deal with coverage and nonresponse issues in probability sampling. These include using multiple sampling frames, adjusting the weights for nonresponse (and, if relevant, attrition) based on sample characteristics, and calibrating to target population totals in order to produce more representative estimates. However, concerns about increasing nonresponse, coupled with the high costs associated with traditional probability sampling methods, have led some survey researchers to turn their attention to non-probability sampling. This growing interest also results from the fact that web data collection (most of which uses non-probability sampling) has become much easier to carry out. It is also much less costly than certain types of probability sampling.

Sometimes non-probability sampling is used because there is no other option available to the researcher. This may be because the target population is a hidden population (so that no sampling frame is available) or because resources are limited.
Section 6 provides guidance on deciding whether use of a non-probability sample is justified. If it is, it is essential to bear in mind the challenges associated with non-probability sampling (see Section 4) and to attempt to use techniques aimed at overcoming these challenges. Some of these techniques are described in Section 5.

3. Types of non-probability sampling

It is extremely difficult to categorise non-probability sampling techniques because the literature is inconsistent in its definitions and applications of the types of non-probability sampling methods. The blurred boundaries and differing interpretations should be borne in mind when interpreting the framework below, which attempts to identify the main categories of non-probability sampling.
Given the multitude of non-probability sampling techniques available, the aim of this section is not to provide a comprehensive review but rather to give a flavour of the types of techniques available. This will form the basis for discussing the challenges and limitations of non-probability sampling later on, as well as the methods that have been proposed to overcome some of these limitations. A review of the literature revealed four common categories for classifying non-probability sampling techniques; these are described below.

3.1 Convenience/accidental sampling

According to Baker et al. (2013), "Convenience sampling is a form of non-probability sampling in which the ease with which potential participants can be located or recruited is the primary consideration." Therefore, no formal sample design is used. Types of convenience sampling techniques include:
i. mall-intercept sampling: this is frequently used in market research and involves interviewers attempting to recruit passersby to participate in a survey.
ii. volunteer sampling (e.g. some types of online opt-in panels): this consists of people signing up to participate in research studies. Volunteer sampling is usually done online, whereby volunteers are put on a mailing list and receive invitations to participate in surveys.

3.2 Purposive sampling

This consists of researchers using their judgement and approaching only those people whom they decide are most appropriate to participate in the study, e.g. a sample of experts on a particular topic.

3.3 Sample matching

This involves selecting a sample that matches a set of population characteristics of interest (rather than bringing the sample and population into alignment after carrying out the survey, as is done with post-stratification). The most common type of sample matching is quota sampling (described in further detail in Section 5.1.1).
3.4 Chain referral methods

These tend to be used for researching rare or hard-to-reach populations. They usually involve obtaining an initial set of respondents (called seeds) from the population of interest and using their links to obtain further respondents from that population. Types of chain referral sampling include:
i. snowball sampling: there is a lot of confusion regarding the meaning of snowball sampling. In many texts it is described as a non-probability convenience method used to access hard-to-reach populations, whereby respondents from hidden populations are asked to recommend other respondents from the population of interest. However, this method was originally developed by researchers such as Coleman (1958) and Goodman (1961) to investigate social networks rather than as a means of finding participants to interview (Vogt et al., 2012).
ii. respondent-driven sampling (RDS): in response to the use of snowball sampling as a type of convenience sample, survey researchers focused on developing chain referral methods which could be used to produce good estimates. RDS refers to this method (Heckathorn, 2011). As RDS uses a more structured approach to sampling, convenience is not its primary consideration. RDS is described in further detail in Section 5.1.2.

4. Key challenges associated with non-probability sampling

The two main concerns when using non-probability sampling are:
i. There is a greater likelihood of selection bias. Consequently, the resulting sample may not be representative of the population
ii. It is impossible to utilise unbiased estimators and associated quality measures [1] (e.g. variance, standard errors and confidence intervals)

These two concerns are described in further detail in Sections 4.1 and 4.2 below.

4.1 Selection bias

One of the key challenges when using non-probability sampling is selection bias.
Selection bias is "the error introduced when the study population does not represent the target population" (Delgado-Rodriguez and Llorca, 2004). Selection bias occurs during the recruitment and retention of participants, and the most effective way of avoiding it is a well-designed study.

[1] Some researchers have focused on developing unbiased estimators for use with RDS. However, these require a number of assumptions to be made and should be used with caution. See Section 5.2.1.
Selection bias occurs in both probability and non-probability sampling. However, non-probability sampling is more prone to it. Below is a (non-exhaustive) list of causes of selection bias:
i. undercoverage: this occurs when some units in the target population have a zero probability of selection, thus making the sample unrepresentative of the population
ii. volunteer bias: many non-probability sampling techniques rely on units volunteering to participate in a study, and since volunteers may have different characteristics from those who have not volunteered, this may result in an unrepresentative sample
iii. interviewer/researcher unconscious bias: unconscious biases may incline interviewers/researchers to select participants with particular characteristics, e.g. people who look friendly or helpful or people who are more similar to themselves. This is particularly a problem with quota and purposive sampling, where selection of participants is left to the interviewer/researcher.

In all of the cases above, sampled individuals may differ systematically from non-sampled individuals on variables of interest, so use of the non-probability sample may result in biased estimates. Of the three causes of selection bias listed above, undercoverage may also be an issue in probability sampling. However, it is not usually as extensive or as common in probability sampling as it is in non-probability sampling.

4.2 Unbiased estimators and lack of quality measures

Standard practice in official statistics, and indeed in most large-scale social surveys, is to utilise probability sampling and design-based estimation, whereby a design weight is calculated as the inverse of the selection probability. This produces the Horvitz-Thompson estimator, which is unbiased for any design where all units have a non-zero probability of selection.
Frequently, additional auxiliary information is utilised to adjust these design weights, technically making the estimator model-assisted, although the term "design-based" is often still used whenever the estimator accounts for the survey design. This methodology has a number of advantages: the resulting estimator is unbiased regardless of the purpose for which it is used, and sampling variability can be estimated directly. Since design-based estimation is not suitable for most non-probability samples, these advantages will be lost if a non-probability sample is used.
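The design-based estimator described in Section 4.2 can be illustrated with a short Python sketch. It is only a minimal illustration, not a production estimator: the names `SampledUnit`, `design_weight` and `horvitz_thompson_total`, as well as the two-stratum example data, are hypothetical and do not come from the paper. The sketch assumes that every sampled unit has a known, non-zero selection probability, which is exactly the information that a non-probability sample lacks.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SampledUnit:
    value: float           # observed value of the study variable for this unit
    selection_prob: float  # known probability that the design selects this unit


def design_weight(selection_prob: float) -> float:
    """Design weight: the inverse of the selection probability."""
    if not 0.0 < selection_prob <= 1.0:
        raise ValueError("selection probability must lie in (0, 1]")
    return 1.0 / selection_prob


def horvitz_thompson_total(sample: Iterable[SampledUnit]) -> float:
    """Estimate a population total as the design-weighted sum of sampled values."""
    return sum(design_weight(u.selection_prob) * u.value for u in sample)


# Hypothetical stratified design (illustrative numbers only):
# 2 of 100 units are sampled in stratum A, 5 of 50 units in stratum B.
sample = (
    [SampledUnit(v, 2 / 100) for v in (10.0, 12.0)]
    + [SampledUnit(v, 5 / 50) for v in (3.0, 4.0, 5.0, 4.0, 4.0)]
)
estimated_total = horvitz_thompson_total(sample)
print(f"Estimated population total: {estimated_total:.1f}")
```

Each unit in stratum A stands for 50 population units and each unit in stratum B for 10, so the weights alone carry the design information. With a non-probability sample there is no defensible value to pass as `selection_prob`, which is why the estimator cannot be applied directly.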
5. Types of sampling and estimation methods developed to overcome issues associated with non-probability sampling

Recently, researchers have focused on developing methods for overcoming the challenges relating to non-probability sampling described above. The methods developed focus on both the sample selection and estimation stages. Some of these methods are described in Sections 5.1 and 5.2 below.

5.1 Overcoming challenges at the sampling stage

When using non-probability sampling, the main challenge at the sampling stage is obtaining a representative sample. Two popular non-probability sampling strategies developed to obtain a more representative sample are sample matching and respondent-driven sampling (RDS).

5.1.1 Sample matching

As described in Section 3.3, the most common type of sample matching is quota sampling. In quota sampling, the interviewers are asked to interview a certain number of people (or units) with particular characteristics so that the final sample mirrors the target population in terms of these characteristics. For this to be successful, good estimates of the population characteristics used for matching need to be available (e.g. from a good-quality probability sample or a census). By using quota sampling, researchers hope to achieve a more representative sample (for further details of sample matching see Rivers, 2007; Bethlehem, 2014). However, since the choice of whom to interview is still in the hands of the interviewer, there may still be a substantial amount of selection bias resulting from interviewers approaching certain people rather than others (because of the unconscious bias described earlier). For example, Mosteller et al. (1949) suggest in their review of the 1948 United States election poll results that the unconscious bias of interviewers may have contributed considerably to the incorrect prediction of the results, even though quota sampling was used.
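To make the quota-setting step concrete, the following Python sketch turns population counts for each quota cell into interviewer quotas that sum to the target sample size. It is an illustration under stated assumptions rather than a method taken from the paper: the function name `allocate_quotas`, the age-band cells and the counts are hypothetical, and the largest-remainder rounding rule is simply one reasonable way of producing whole-number quotas.

```python
from typing import Dict


def allocate_quotas(population_counts: Dict[str, int], sample_size: int) -> Dict[str, int]:
    """Split sample_size across cells in proportion to population_counts.

    Integer arithmetic plus largest-remainder rounding keeps every quota a
    whole number and guarantees that the quotas sum to sample_size.
    """
    total = sum(population_counts.values())
    if total <= 0 or sample_size < 0 or min(population_counts.values()) < 0:
        raise ValueError("counts must be non-negative with a positive total")

    quotas: Dict[str, int] = {}
    remainders: Dict[str, int] = {}
    for cell, count in population_counts.items():
        # Whole part of the proportional share, and what is left over.
        quotas[cell], remainders[cell] = divmod(sample_size * count, total)

    # Hand out the interviews lost to rounding, largest remainder first.
    leftover = sample_size - sum(quotas.values())
    for cell in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[cell] += 1
    return quotas


# Hypothetical census-style counts by age band (illustrative numbers only).
census_counts = {"16-34": 3000, "35-54": 3500, "55+": 3500}
print(allocate_quotas(census_counts, 200))
```

Matching the sample to these quotas aligns it with the population only on the characteristics used to define the cells. Who is approached within each cell is still left to the interviewer, so the sketch does nothing to address the selection bias discussed in this section.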
Another problem with quota sampling can be undercoverage. For example, a quota sample collected on the High Street will not capture people who are at home or at work. Consequently, quota sampling alone is not sufficient for obtaining a representative sample. In fact, Rubin (1979) recommended using both sample matching and weighting in order to obtain more accurate estimates. Various estimation methods for non-probability sampling are discussed in Section 5.2.

5.1.2 Respondent-driven sampling (RDS)

This type of sampling is mainly aimed at sampling hidden (or hard-to-reach) populations. It is a type of chain referral sampling that uses link-tracing to obtain respondents from the target population. It is typically used when a sampling frame is not available.
RDS consists of two distinct sampling phases. In the first phase, a convenience sample (the seeds at Wave 0) is chosen from the target population. The rest of the sample (Wave 1 onwards) is selected by following the links from previous respondents. This method, developed by Heckathorn (1997), uses an innovative approach for recruiting participants after Wave 0: respondents are given a fixed number of coupons to hand out to other members of the target population. People who decide to participate in the survey simply take the coupon to the survey centre. Therefore, after Wave 0, each successive wave of the sample consists of population members who were given coupons by members of the previous wave and returned those coupons to the survey centre. This process is repeated several times (until the desired sample size is achieved), so that each time respondents from one wave drive the following wave (Gile and Handcock, 2010).

Using coupons in this way reduces confidentiality concerns in marginalised populations. Moreover, it enables the researcher to track social networks for use in estimation. Respondents usually receive additional compensation for each successful recruitment. Respondents handing out coupons are asked to report how many coupons they have distributed. This enables researchers to develop more accurate estimation methods (refer to Heckathorn, 2011 for a description of the various estimators available using RDS). Moreover, the numerous waves in the study reduce the dependence of the final sample on the original convenience sample (Gile and Handcock, 2010).

5.2 Overcoming challenges at the weighting and estimation stage

As described in Section 4.2, it is difficult to obtain unbiased estimates and calculate traditional quality measures when using non-probability sampling. In order to overcome these difficulties during the weighting and estimation stage, researchers have developed a number of methods for use with various sampling techniques.
This section focuses on a number of these methods. Its aim is not to provide instructions on how to apply the methods described, but rather to make the reader aware of the weighting and estimation methods that exist for non-probability sampling. This will enable readers to make better-informed decisions regarding the type of non-probability sampling method that would best suit their needs.

Section 4.2 outlined why non-probability sampling typically rules out the traditional design-based approach to estimation. The main alternative is model-based estimation, where selection probabilities are not accounted for and the estimator is based on an (explicit or implicit) model. The precise specification of this estimator will depend on the type of sample and the type and purpose of the estimate required. Some of the methods below propose the use of model-based estimation.

In general, it is important to note that a model-based estimator can produce inaccurate or misleading results if the underlying model is incorrect, which is often impossible to verify, and that there are typically a number of models or estimation methods to choose from, each of which may produce different estimates. In contrast, a design-based estimator will not (usually) produce biased estimates, is appropriate in most situations, and is (broadly) unique: there is only one Horvitz-Thompson estimator for a given design. For this reason, model-based estimators should be treated with caution, and users should be aware that the results could always be open to dispute.

5.2.1 Available estimators and methods for calculating quality measures when using RDS

i. Estimators

With reference to non-probability sampling, Salganik (2006, p.i98) stated: "For many years, researchers thought it was impossible to make unbiased estimates from this type of sample. However, it was recently shown that if certain conditions are met and if the appropriate procedures are used, then the prevalence estimates from respondent-driven sampling are asymptotically unbiased."

Heckathorn (2011) provides a description of RDS and the various estimators available when using this sampling approach, and specifies the strengths and limitations of these estimators. It is essential to bear in mind that these estimators require a number of assumptions to be made. Gile and Handcock (2010) caution users of RDS that biased estimates may be produced when these assumptions are not met.

In a separate research strand using link-tracing designs, Chow and Thompson (2003) proposed a Bayesian approach to estimation. They state that when prior information is available for the characteristics one wants to estimate, their Bayesian approach should provide better estimators than when no prior information is used. When this prior information is not available, Chow and Thompson (2003) suggest conducting a sensitivity analysis.

ii. Quality Measures

In terms of quality measures available when using RDS, Salganik (2006) proposes a bootstrap method to construct confidence intervals around estimates produced from RDS samples. Furthermore, following the calculation of design effects for his data, he provides advice regarding the sample sizes required for RDS studies.
Salganik recommends that when using RDS, researchers should use a sample size twice as large as that required under simple random sampling. For link-tracing designs using the Bayesian approach described above, Chow and Thompson (2003) describe how, once the estimators are obtained, not much more effort is required to obtain interval estimates with which to assess the accuracy of the estimators. Credibility intervals for use with the Bayesian approach are described in further detail in Section 5.2.3.

RDS is not suitable for all research and is generally used for researching hidden populations. When RDS is not a suitable non-probability sampling method, other estimators and quality measures are required. The use of Propensity Score Adjustments, described next, is one alternative.
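The sample-size advice for RDS given above (roughly twice the size required under simple random sampling) can be illustrated with a short calculation. The Python sketch below rests on assumptions that are not taken from the paper: it uses the standard large-population formula for estimating a proportion under simple random sampling, n = z^2 p(1 - p) / e^2, ignores any finite population correction, and treats the factor of two as a design effect. The function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def srs_sample_size_raw(p: float, margin: float, confidence: float = 0.95) -> float:
    """Unrounded simple-random-sampling size for estimating a proportion p
    to within +/- margin at the given confidence level (no finite
    population correction)."""
    if not (0.0 < p < 1.0 and margin > 0.0 and 0.0 < confidence < 1.0):
        raise ValueError("need 0 < p < 1, margin > 0 and 0 < confidence < 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    return z ** 2 * p * (1.0 - p) / margin ** 2


def srs_sample_size(p: float, margin: float, confidence: float = 0.95) -> int:
    """Simple-random-sampling size rounded up to a whole respondent."""
    return math.ceil(srs_sample_size_raw(p, margin, confidence))


def rds_sample_size(p: float, margin: float, design_effect: float = 2.0,
                    confidence: float = 0.95) -> int:
    """Inflate the simple-random-sampling size by an assumed design effect."""
    return math.ceil(design_effect * srs_sample_size_raw(p, margin, confidence))


# Illustrative inputs: a prevalence near 50%, a 5 percentage point margin.
print("SRS:", srs_sample_size(0.5, 0.05))
print("RDS with design effect 2:", rds_sample_size(0.5, 0.05))
```

The design effect is exposed as a parameter because the factor of two is a rule of thumb derived from particular data; a study with different network structure might reasonably assume a different value.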
5.2.2 Using weighting for estimation

We have already outlined why design-based estimation is not generally possible with non-probability sampling. However, survey researchers have proposed the use of Propensity Score Adjustments (PSA) to approximate a design-based approach. PSA has largely been used and tested on web panel surveys. There are various methods for using PSA. One of these, outlined by Valliant and Dever (2011), involves constructing pseudo design weights and using covariates from a reference (probability) survey to adjust these weights for nonresponse. These adjusted weights are then used to construct estimators. To construct the pseudo design weights in the first place, Valliant and Dever (2011) propose that if a subsample from a large panel is used, the pseudo design weights could be calculated as the inverses of the selection probabilities from the panel. For a comparison of the quality of estimators using PSA see Valliant and Dever (2011).

Lee (2006) calculated the bias of estimates when using PSA (by comparing PSA-weighted and unweighted estimates to the reference survey estimates) as well as the standard errors. Lee found that although PSA seems to reduce bias resulting from nonresponse, it seems to increase variance. This should be borne in mind when using PSA techniques. Lee (2006) also recommends that covariates that are highly related to the study outcomes should be used in the PSA. For further detail on using PSA for weighting and estimation see Lee and Valliant (2009) and Lee (2006).

5.2.3 Additional quality measures proposed for non-probability sampling

Some quality measures for estimators calculated from non-probability samples have been discussed in the previous sections. A number of other quality measures for use with non-probability sampling have been proposed, including credibility intervals and participation rates. These are briefly described below.

i. Credibility intervals seem to be gaining popularity when using online opt-in panels. They should be used when a Bayesian approach is adopted. A credibility interval is similar to a confidence interval in that it is used to provide an indication of the uncertainty of estimates. In practice it tends to be calculated in exactly the same way as a confidence interval. However, the interpretation of a credibility interval is different from that of a confidence interval (Gill, 2014). Unlike the confidence interval, the credibility interval is directly related to the actual data distribution. Consequently, the interval may or may not include the estimate (e.g. the mean), depending on whether the actual data distribution is skewed. Therefore, unlike the confidence interval, the credibility interval is not an interval around the mean (United States Environmental Protection Agency, 2003). Moreover, a Bayesian credible interval has a precise probabilistic meaning (United States Environmental Protection Agency, 2003, p33), so that, for example, a credibility interval of 90% would be interpreted as there being a 90% probability that the true value lies within the credibility interval. For further information on interpreting credibility intervals see AAPOR (2012) and United States Environmental Protection Agency (2003).
ii. Unweighted probability survey response rates are calculated as:

response rate = number of responding units / total number of eligible units

Since the total number of eligible units is generally not known in non-probability samples, it is not possible to calculate response rates. Consequently, some researchers have started using the term participation rates for non-probability samples. Participation rates (Baker et al., 2013) can be defined as:

participation rate = number of respondents who have provided a usable response / total number of initial personal invitations requesting participation

Baker et al. (2013) state that it is essential for researchers to report on the quality of their estimates in order for readers to be able to use their results appropriately. Unfortunately, there is currently no widely accepted framework for assessing the quality of estimates resulting from non-probability samples, as there is for estimates produced from probability samples. Therefore, Baker et al. (2013) encourage the development of new quality measures for use with non-probability sampling. They also note the importance of using different terminology for quality measures associated with non-probability sampling in order to differentiate these from the quality measures associated with probability sampling.

6. When is use of non-probability sampling justified?

In designing a study one must consider fitness for purpose. This is a well-known concept in probability sampling too, as there is always some degree of compromise to be reached between cost and precision (or minimisation of error). Groves (2004, p10) emphasises the difference between what he refers to as modellers and describers. Modellers are researchers from psychometric or econometric backgrounds who are mainly interested in relationships between variables. Describers, on the other hand, are researchers who are mainly concerned with describing the target population, e.g. in terms of means and totals. These include producers of official statistics.
Groves (2004) highlights the fact that because of their differing research aims, modellers and describers are interested in different types of errors. As a result, modellers and describers tend to use different sampling techniques. This is important in terms of fitness for purpose because there is no single correct survey method or survey sampling technique. Moreover, there is no single correct level of accuracy that should be achieved when carrying out a survey study. These considerations should be made within the context of the study being carried out, bearing in mind its aims. For example, modellers tend to be interested in a narrower range of variables than describers (who tend to conduct large surveys with hundreds of variables); therefore certain techniques such as PSA tend to lend themselves better to modellers' needs. In the case of PSA, this is because it was found that unless the covariates included in the analysis are highly related to the study outcomes (i.e. the variables which will be used to produce estimates), the resulting estimates have similar bias and increased variance compared with estimates produced from the reference survey (Lee, 2006).

It follows that it is not possible to make sweeping statements regarding the utility of non-probability sampling techniques in official statistics. Rather, decisions as to whether to use probability or non-probability sampling depend on what the researcher is hoping to achieve from the survey. Since in official statistics we are mainly (although by no means exclusively) describers, we need to consider the implications of using non-probability sampling where a large number of variables are collected and used to estimate a fairly wide variety of characteristics of finite populations. As discussed in previous sections, problems such as selection bias and the necessity of model-based estimation make non-probability sampling much less desirable than probability sampling in this context. In addition, the fact that a wide variety of estimates may be produced makes it extremely challenging to ensure that all estimates are fit for purpose.

However, sometimes the researcher has no other feasible option, for example when attempting to describe characteristics of a hidden population for which there is no sampling frame. In such a case it is essential to consider the outcomes one hopes to achieve in order to design a study that will achieve the best possible outcomes despite the challenges associated with it. In particular, we advise limiting the uses to which the estimates can be put in order to ensure that all estimates are fit for purpose, although this is challenging in an official statistics context where statistics must be publicly available.
At the very least, it is crucial to ensure that any quality issues stemming from the choice of sampling method are communicated clearly to users.

One aim of this paper was to make researchers aware of the various non-probability sampling and estimation techniques available. These should be carefully considered if a decision to use a non-probability sample is made. For example, with hidden populations, the use of RDS may be a suitable option, as much more effort has been made to develop unbiased estimators and good quality measures for RDS than for other sampling types such as convenience sampling. Whether using probability or non-probability sampling, it is the responsibility of researchers to consider carefully the most suitable option, state clearly the reasoning behind their choice of sampling and estimation techniques, and make every effort to describe clearly the quality of their resulting estimates, thus ensuring compliance as far as possible with the UK Statistics Authority (2009) Code of Practice for Official Statistics.

7. Recommendations

Following a review of the literature on non-probability sampling, there are three main recommendations regarding the use of non-probability sampling in official statistics. These are:
i. Fitness for purpose should be used to drive survey design.
ii. Non-probability sampling does not necessarily equate to lack of quality, and the various methods available should be carefully considered in order to obtain the best quality estimates for the study at hand.
iii. It is essential to be transparent regarding the choice of sampling and estimation techniques, describing the quality of resulting estimates as well as their limitations.
References

AAPOR, 2012. Understanding a credibility interval and how it differs from the margin of sampling error in a public opinion poll. [online] Available at: atementoncredibilityintervals.pdf [Accessed 7th May 2015].
Baker, R., Brick, M.J., Bates, N.A., Battaglia, M., Couper, M.P., Dever, J.A., Gile, K.J. and Tourangeau, R., 2013. Report of the AAPOR Task Force on Non-Probability Sampling. [online] Available at: Final_7_revised_FNL_6_22_13.pdf [Accessed 7th May 2015].
Bethlehem, J., 2014. Solving the nonresponse problem with sample matching? Statistics Netherlands Discussion Paper, [online] Available at: [Accessed 7th May 2015].
Chow, M. and Thompson, S.K., 2003. Estimation with link-tracing sampling designs: a Bayesian approach. Survey Methodology, 29(2).
Coleman, J.S., 1958. Relational Analysis: The Study of Social Organizations with Survey Methods. Human Organization, 17(4).
Delgado-Rodriguez, M. and Llorca, J., 2004. Bias. Journal of Epidemiology & Community Health, 58.
Frankfort-Nachmias, C. and Nachmias, D., 1996. Research Methods in the Social Sciences. New York: Worth Publishers.
Gile, K.J. and Handcock, M.S., 2010. Respondent-Driven Sampling: An Assessment of Current Methodology. Sociological Methodology, 40(1).
Gill, J., 2014. Bayesian Methods: A Social and Behavioral Sciences Approach. CRC Press.
Goodman, L.A., 1961. Snowball Sampling. Annals of Mathematical Statistics, 32.
Groves, R.M., 2004. Survey Errors and Survey Costs. New Jersey: Wiley & Sons.
Heckathorn, D.D., 1997. Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems, 44(2).
Heckathorn, D.D., 2011. Snowball versus Respondent-Driven Sampling. Sociological Methodology, 41(1).
Johnston, L.G. and Sabin, K. Sampling hard-to-reach populations with respondent driven sampling. Methodological Innovations Online, 5(2).
Lee, S., 2006. Propensity Score Adjustment as a Weighting Scheme for Volunteer Panel Web Surveys. Journal of Official Statistics, 22(2).
Mosteller, F., Hyman, H., McCarthy, P., Marks, E. and Truman, D., 1949. The Pre-Election Polls of 1948: Report to the Committee on Analysis of Pre-election Polls and Forecasts. New York: Social Science Research Council.
Rivers, D., 2007. Sampling for Web Surveys. [online] Available at: [Accessed 7th May 2015].
Rubin, D.B., 1979. Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in Observational Studies. Journal of the American Statistical Association, 74(366). Available at: [Accessed 7th May 2015].
Salganik, M.J., 2006. Variance Estimation, Design Effects, and Sample Size Calculations for Respondent-Driven Sampling. Journal of Urban Health: Bulletin of the New York Academy of Medicine, 83(7), pp.i98-i112.
Salganik, M.J. and Heckathorn, D.D. Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling. Sociological Methodology, 34.
UK Statistics Authority, 2009. Code of Practice for Official Statistics. Edition 1.0. [online] Available at: [Accessed 7th May 2015].
United States Environmental Protection Agency, 2003. Occurrence Estimation Methodology and Occurrence Findings Report for the Six-Year Review of Existing National Primary Drinking Water Regulations. [online] Available at: df [Accessed 18th June 2015].
Valliant, R. and Dever, J.A., 2011. Estimating Propensity Adjustments for Volunteer Web Surveys. Sociological Methods & Research, 40(1).
Vogt, P.W., Gardner, D.C. and Haeffele, L.M., 2012. When to Use What Research Design. Guilford Press.
