Survey Data Analysis in Stata
|
|
|
- Loren Bennett
- 10 years ago
- Views:
Transcription
1 Survey Data Analysis in Stata Jeff Pitblado Associate Director, Statistical Software StataCorp LP Stata Conference DC 2009 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
2 Outline 1 Types of data 2 Survey data characteristics 3 Variance estimation 4 Estimation for subpopulations 5 Summary J. Pitblado (StataCorp) Survey Data Analysis DC / 44
3 Why survey data? Collecting data can be expensive and time consuming. Consider how you would collect the following data: Smoking habits of teenagers Birth weights for expectant mothers with high blood pressure Using stages of clustered sampling can help cut down on the expense and time. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
4 Types of data Simple random sample (SRS) data Observations are "independently" sampled from a data generating process. Typical assumption: independent and identically distributed (iid) Make inferences about the data generating process Sample variability is explained by the statistical model attributed to the data generating process Standard data We ll use this term to distinguish this data from survey data. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
5 Types of data Correlated data Individuals are assumed not independent. Cause: Observations are taken over time Random effects assumptions Cluster sampling Treatment: Time-series models Longitudinal/panel data models cluster() option J. Pitblado (StataCorp) Survey Data Analysis DC / 44
6 Types of data Survey data Individuals are sampled from a fixed population according to a survey design. Distinguishing characteristics: Complex nature under which individuals are sampled Make inferences about the fixed population Sample variability is attributed to the survey design J. Pitblado (StataCorp) Survey Data Analysis DC / 44
7 Types of data Standard data Estimation commands for standard data: proportion regress We ll refer to these as standard estimation commands. Survey data Survey estimation commands are governed by the svy prefix. svy: proportion svy: regress svy requires that the data is svyset. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
8 Survey data characteristics Single-stage syntax svyset [ psu ] [ weight ] [, strata(varname) fpc(varname) ] Primary sampling units (PSU) Sampling weights pweight Strata Finite population correction (FPC) J. Pitblado (StataCorp) Survey Data Analysis DC / 44
9 Survey data characteristics Sampling unit An individual or collection of individuals from the population that can be selected for observation. Sampling groups of individuals is synonymous with cluster sampling. Cluster sampling usually results in inflated variance estimates compared to SRS. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
10 Survey data characteristics Sampling weight The reciprocal of the probability for an individual to be sampled. Probabilities are derived from the survey design. Sampling units Strata Typically considered to be the number of individuals in the population that a sampled individual represents. Reduces bias induced by the sampling design. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
11 Survey data characteristics Strata In stratified designs, the population is partitioned into well-defined groups, called strata. Sampling units are independently sampled from within each stratum. Stratification usually results in smaller variance estimates compared to SRS. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
12 Survey data characteristics Finite population correction (FPC) An adjustment applied to the variance due to sampling without replacement. Sampling without replacement from a finite population reduces sampling variability. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
13 Survey data characteristics Example: svyset for single-stage designs J. Pitblado (StataCorp) Survey Data Analysis DC / 44
14 Survey data characteristics Population 1000 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
15 Survey data characteristics SRS sample 200 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
16 Survey data characteristics Cluster sample 20 (208 obs) J. Pitblado (StataCorp) Survey Data Analysis DC / 44
17 Survey data characteristics Stratified sample 198 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
18 Survey data characteristics Stratified-cluster sample 20 (215 obs) J. Pitblado (StataCorp) Survey Data Analysis DC / 44
19 Survey data characteristics Multistage syntax svyset psu [ weight ] [, strata(varname) fpc(varname) ] [ ssu [, strata(varname) fpc(varname) ] ] [ ssu [, strata(varname) fpc(varname) ] ]... Stages are delimited by SSU secondary/subsequent sampling units FPC is required at stage s for stage s + 1 to play a role in the linearized variance estimator J. Pitblado (StataCorp) Survey Data Analysis DC / 44
20 Survey data characteristics Poststratification A method for adjusting sampling weights, usually to account for underrepresented groups in the population. Adjusts weights to sum to the poststratum sizes in the population Reduces bias due to nonresponse and underrepresented groups Can result in smaller variance estimates Syntax svyset... poststrata(varname) postweight(varname) J. Pitblado (StataCorp) Survey Data Analysis DC / 44
21 Survey data characteristics Example: svyset for poststratification J. Pitblado (StataCorp) Survey Data Analysis DC / 44
22 Strata with a single sampling unit Big problem for variance estimation Consider a sample with only 1 observation svy reports missing standard error estimates by default Finding these lonely sampling units Use svydes: Describes the strata and sampling units Helps find strata with a single sampling unit J. Pitblado (StataCorp) Survey Data Analysis DC / 44
23 Strata with a single sampling unit Example: svydes J. Pitblado (StataCorp) Survey Data Analysis DC / 44
24 Strata with a single sampling unit Handling lonely sampling units 1 Drop them from the estimation sample. 2 svyset one of the ad-hoc adjustments in the singleunit() option. 3 Somehow combine them with other strata. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
25 Certainty units Sampling units that are guaranteed to be chosen by the design. Certainty units are handled by treating each one as its own stratum with an FPC of 1. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
26 Variance estimation Stata has three variance estimation methods for survey data: Linearization Balanced repeated replication The jackknife J. Pitblado (StataCorp) Survey Data Analysis DC / 44
27 Variance estimation Linearization A method for deriving a variance estimator using a first order Taylor approximation of the point estimator of interest. Syntax Foundation: Variance of the total estimator svyset... [ vce(linearized) ] Delta method Huber/White/robust/sandwich estimator J. Pitblado (StataCorp) Survey Data Analysis DC / 44
28 Variance estimation Total estimator Stratified two-stage design y hijk observed value from a sampled individual Strata: h = 1,..., L PSU: i = 1,..., n h SSU: j = 1,..., m hi Individual: k = 1,..., m hij Ŷ = w hijk y hijk V (Ŷ ) = n h (1 f h ) (y hi y n h 1 h ) 2 + h i m hi f h (1 f hi ) (y hij y m hi 1 hi ) 2 h i j J. Pitblado (StataCorp) Survey Data Analysis DC / 44
29 Variance estimation Total estimator Stratified two-stage design y hijk observed value from a sampled individual Strata: h = 1,..., L PSU: i = 1,..., n h SSU: j = 1,..., m hi Individual: k = 1,..., m hij Ŷ = w hijk y hijk V (Ŷ ) = n h (1 f h ) (y hi y n h 1 h ) 2 + h i m hi f h (1 f hi ) (y hij y m hi 1 hi ) 2 h i j J. Pitblado (StataCorp) Survey Data Analysis DC / 44
30 Variance estimation Total estimator Stratified two-stage design y hijk observed value from a sampled individual Strata: h = 1,..., L PSU: i = 1,..., n h SSU: j = 1,..., m hi Individual: k = 1,..., m hij Ŷ = w hijk y hijk V (Ŷ ) = n h (1 f h ) (y hi y n h 1 h ) 2 + h i m hi f h (1 f hi ) (y hij y m hi 1 hi ) 2 h i j J. Pitblado (StataCorp) Survey Data Analysis DC / 44
31 Variance estimation Example: svy: total J. Pitblado (StataCorp) Survey Data Analysis DC / 44
32 Variance estimation Linearized variance for regression models Model is fit using estimating equations. Ĝ() is a total estimator, use Taylor expansion to get V ( β). Ĝ(β) = j w j s j x j = 0 V ( β) = D V {Ĝ(β)} β= b β D J. Pitblado (StataCorp) Survey Data Analysis DC / 44
33 Variance estimation Linearized variance for regression models Model is fit using estimating equations. Ĝ() is a total estimator, use Taylor expansion to get V ( β). Ĝ(β) = j w j s j x j = 0 V ( β) = D V {Ĝ(β)} β= b β D J. Pitblado (StataCorp) Survey Data Analysis DC / 44
34 Variance estimation Example: svy: logit J. Pitblado (StataCorp) Survey Data Analysis DC / 44
35 Variance estimation Balanced repeated replication For designs with two PSUs in each of L strata. Compute replicates by dropping a PSU from each stratum. Find a balanced subset of the 2 L replicates. L r < L + 4 The replicates are used to estimate the variance. Syntax svyset... vce(brr) [ mse ] J. Pitblado (StataCorp) Survey Data Analysis DC / 44
36 Variance estimation BRR variance formulas θ point estimates θ (i) ith replicate of the point estimates θ (.) average of the replicates Default variance formula: V ( θ) = 1 r r { θ (i) θ (.) }{ θ (i) θ (.) } i=1 Mean squared error (MSE) formula: V ( θ) = 1 r { θ r (i) θ}{ θ (i) θ} i=1 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
37 Variance estimation Example: svy brr: logit J. Pitblado (StataCorp) Survey Data Analysis DC / 44
38 Variance estimation The jackknife A replication method for variance estimation. Not restricted to a specific survey design. Syntax Delete-1 jackknife: drop 1 PSU Delete-k jackknife: drop k PSUs within a stratum svyset... vce(jackknife) [ mse ] J. Pitblado (StataCorp) Survey Data Analysis DC / 44
39 Variance estimation Jackknife variance formulas θ (h,i) replicate of the point estimates from stratum h, PSU i θ h average of the replicates from stratum h m h = (n h 1)/n h delete-1 multiplier for stratum h Default variance formula: L n h V ( θ) = (1 f h ) m h { θ (h,i) θ h }{ θ (h,i) θ h } h=1 Mean squared error (MSE) formula: L n h V ( θ) = (1 f h ) m h { θ (h,i) θ}{ θ (h,i) θ} h=1 i=1 i=1 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
40 Variance estimation Example: svy jackknife: logit J. Pitblado (StataCorp) Survey Data Analysis DC / 44
41 Variance estimation Replicate weight variable A variable in the dataset that contains sampling weight values that were adjusted for resampling the data using BRR or the jackknife. Typically used to protect the privacy of the survey participants. Eliminate the need to svyset the strata and PSU variables. Syntax svyset... brrweight(varlist) svyset... jkrweight(varlist [,... multiplier(#) ] ) J. Pitblado (StataCorp) Survey Data Analysis DC / 44
42 Estimation for subpopulations Focus on a subset of the population Subpopulation variance estimation: Assumes the same survey design for subsequent data collection. The subpop() option. Restricted-sample variance estimation: Assumes the identified subset for subsequent data collection. Ignores the fact that the sample size is a random quantity. The if and in restrictions. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
43 Estimation for subpopulations Total from SRS data Data is y 1,..., y n and S is the subset of observations. { 1, if j S δ j (S) = 0, otherwise Subpopulation (or restricted-sample) total: n Ŷ S = δ j (S)w j y j j=1 Sampling weight and subpopulation size: w j = N n n, N S = δ j (S)w j = N n n S j=1 J. Pitblado (StataCorp) Survey Data Analysis DC / 44
44 Estimation for subpopulations Variance of a subpopulation total Sample n without replacement from a population comprised of the N S subpopulation values with N N S additional zeroes. V (ŶS) = ( 1 n ) N n n 1 n j=1 Variance of a restricted-sample total { δ j (S)y j 1 } 2 n ŶS Sample n S without replacement from the subpopulation of N S values. Ṽ (ŶS) = ( 1 n ) S ˆN S ns n S 1 n j=1 { δ j (S) y j 1 } 2 Ŷ n S S J. Pitblado (StataCorp) Survey Data Analysis DC / 44
45 Estimation for subpopulations Example: svy, subpop() J. Pitblado (StataCorp) Survey Data Analysis DC / 44
46 Summary 1 Use svyset to specify the survey design for your data. 2 Use svydes to find strata with a single PSU. 3 Choose your variance estimation method; you can svyset it. 4 Use the svy prefix with estimation commands. 5 Use subpop() instead of if and in. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
47 References Levy, P. and S. Lemeshow Sampling of Populations. 3rd ed. New York: Wiley. StataCorp Survey Data Reference Manual: Release 11. College Station, TX: StataCorp LP. J. Pitblado (StataCorp) Survey Data Analysis DC / 44
Survey Data Analysis in Stata
Survey Data Analysis in Stata Jeff Pitblado Associate Director, Statistical Software StataCorp LP 2009 Canadian Stata Users Group Meeting Outline 1 Types of data 2 2 Survey data characteristics 4 2.1 Single
The SURVEYFREQ Procedure in SAS 9.2: Avoiding FREQuent Mistakes When Analyzing Survey Data ABSTRACT INTRODUCTION SURVEY DESIGN 101 WHY STRATIFY?
The SURVEYFREQ Procedure in SAS 9.2: Avoiding FREQuent Mistakes When Analyzing Survey Data Kathryn Martin, Maternal, Child and Adolescent Health Program, California Department of Public Health, ABSTRACT
Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.
Analyzing Complex Survey Data: Some key issues to be aware of Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 24, 2015 Rather than repeat material that is
Youth Risk Behavior Survey (YRBS) Software for Analysis of YRBS Data
Youth Risk Behavior Survey (YRBS) Software for Analysis of YRBS Data CONTENTS Overview 1 Background 1 1. SUDAAN 2 1.1. Analysis capabilities 2 1.2. Data requirements 2 1.3. Variance estimation 2 1.4. Survey
Chapter 11 Introduction to Survey Sampling and Analysis Procedures
Chapter 11 Introduction to Survey Sampling and Analysis Procedures Chapter Table of Contents OVERVIEW...149 SurveySampling...150 SurveyDataAnalysis...151 DESIGN INFORMATION FOR SURVEY PROCEDURES...152
Complex Survey Design Using Stata
Complex Survey Design Using Stata 2010 This document provides a basic overview of how to handle complex survey design using Stata. Probability weighting and compensating for clustered and stratified samples
Software for Analysis of YRBS Data
Youth Risk Behavior Surveillance System (YRBSS) Software for Analysis of YRBS Data June 2014 Where can I get more information? Visit www.cdc.gov/yrbss or call 800 CDC INFO (800 232 4636). CONTENTS Overview
Sampling Error Estimation in Design-Based Analysis of the PSID Data
Technical Series Paper #11-05 Sampling Error Estimation in Design-Based Analysis of the PSID Data Steven G. Heeringa, Patricia A. Berglund, Azam Khan Survey Research Center, Institute for Social Research
INTRODUCTION TO SURVEY DATA ANALYSIS THROUGH STATISTICAL PACKAGES
INTRODUCTION TO SURVEY DATA ANALYSIS THROUGH STATISTICAL PACKAGES Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-110012 1. INTRODUCTION A sample survey is a process for collecting
New SAS Procedures for Analysis of Sample Survey Data
New SAS Procedures for Analysis of Sample Survey Data Anthony An and Donna Watts, SAS Institute Inc, Cary, NC Abstract Researchers use sample surveys to obtain information on a wide variety of issues Many
Survey Analysis: Options for Missing Data
Survey Analysis: Options for Missing Data Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD Abstract A common situation researchers working with survey data face is the analysis of missing
National Longitudinal Study of Adolescent Health. Strategies to Perform a Design-Based Analysis Using the Add Health Data
National Longitudinal Study of Adolescent Health Strategies to Perform a Design-Based Analysis Using the Add Health Data Kim Chantala Joyce Tabor Carolina Population Center University of North Carolina
Chapter XXI Sampling error estimation for survey data* Donna Brogan Emory University Atlanta, Georgia United States of America.
Chapter XXI Sampling error estimation for survey data* Donna Brogan Emory University Atlanta, Georgia United States of America Abstract Complex sample survey designs deviate from simple random sampling,
STATA SURVEY DATA REFERENCE MANUAL
STATA SURVEY DATA REFERENCE MANUAL RELEASE 13 A Stata Press Publication StataCorp LP College Station, Texas Copyright c 1985 2013 StataCorp LP All rights reserved Version 13 Published by Stata Press, 4905
Chapter 19 Statistical analysis of survey data. Abstract
Chapter 9 Statistical analysis of survey data James R. Chromy Research Triangle Institute Research Triangle Park, North Carolina, USA Savitri Abeyasekera The University of Reading Reading, UK Abstract
Variance Estimation Guidance, NHIS 2006-2014 (Adapted from the 2006-2014 NHIS Survey Description Documents) Introduction
June 9, 2015 Variance Estimation Guidance, NHIS 2006-2014 (Adapted from the 2006-2014 NHIS Survey Description Documents) Introduction The data collected in the NHIS are obtained through a complex, multistage
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses G. Gordon Brown, Celia R. Eicheldinger, and James R. Chromy RTI International, Research Triangle Park, NC 27709 Abstract
Analysis of complex survey samples.
Analysis of complex survey samples. Thomas Lumley Department of Biostatistics University of Washington April 15, 2004 Abstract I present software for analysing complex survey samples in R. The sampling
The Research Data Centres Information and Technical Bulletin
Catalogue no. 12-002 X No. 2014001 ISSN 1710-2197 The Research Data Centres Information and Technical Bulletin Winter 2014, vol. 6 no. 1 How to obtain more information For information about this product
ESTIMATION OF THE EFFECTIVE DEGREES OF FREEDOM IN T-TYPE TESTS FOR COMPLEX DATA
m ESTIMATION OF THE EFFECTIVE DEGREES OF FREEDOM IN T-TYPE TESTS FOR COMPLEX DATA Jiahe Qian, Educational Testing Service Rosedale Road, MS 02-T, Princeton, NJ 08541 Key Words" Complex sampling, NAEP data,
Workshop on Using the National Survey of Children s s Health Dataset: Practical Applications
Workshop on Using the National Survey of Children s s Health Dataset: Practical Applications Julian Luke Stephen Blumberg Centers for Disease Control and Prevention National Center for Health Statistics
A Guide to Survey Analysis in GenStat. by Steve Langton. Defra Environmental Observatory, 1-2 Peasholme Green, York YO1 7PX, UK.
Survey Analysis A Guide to Survey Analysis in GenStat by Steve Langton Defra Environmental Observatory, 1-2 Peasholme Green, York YO1 7PX, UK. GenStat is developed by VSN International Ltd, in collaboration
An Introduction to Secondary Data Analysis
research methodology series An Introduction to Secondary Data Analysis Natalie Koziol, MA CYFS Statistics and Measurement Consultant Ann Arthur, MS CYFS Statistics and Measurement Consultant Outline Overview
6 Regression With Survey Data From Complex Samples
6 Regression With Survey Data From Complex Samples Secondary analysis of data from large national surveys figures prominently in social science and public health research, and these surveys use complex
Statistics 522: Sampling and Survey Techniques. Topic 5. Consider sampling children in an elementary school.
Topic Overview Statistics 522: Sampling and Survey Techniques This topic will cover One-stage Cluster Sampling Two-stage Cluster Sampling Systematic Sampling Topic 5 Cluster Sampling: Basics Consider sampling
Elementary Statistics
Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,
Introduction to Sampling. Dr. Safaa R. Amer. Overview. for Non-Statisticians. Part II. Part I. Sample Size. Introduction.
Introduction to Sampling for Non-Statisticians Dr. Safaa R. Amer Overview Part I Part II Introduction Census or Sample Sampling Frame Probability or non-probability sample Sampling with or without replacement
Comparison of Variance Estimates in a National Health Survey
Comparison of Variance Estimates in a National Health Survey Karen E. Davis 1 and Van L. Parsons 2 1 Agency for Healthcare Research and Quality, 540 Gaither Road, Rockville, MD 20850 2 National Center
Guidelines for Analyzing Add Health Data. Ping Chen Kim Chantala
Guidelines for Analyzing Add Health Data Ping Chen Kim Chantala Carolina Population Center University of North Carolina at Chapel Hill Last Update: March 2014 Table of Contents Overview... 3 Understanding
Accounting for complex survey design in modeling usual intake. Kevin W. Dodd, PhD National Cancer Institute
Accounting for complex survey design in modeling usual intake Kevin W. Dodd, PhD National Cancer Institute Slide 1 Hello and welcome to today s webinar, the fourth in the Measurement Error Webinar Series.
IPUMS User Note: Issues Concerning the Calculation of Standard Errors (i.e., variance estimation)using IPUMS Data Products
IPUMS User Note: Issues Concerning the Calculation of Standard Errors (i.e., variance estimation)using IPUMS Data Products by Michael Davern and Jeremy Strief Producing accurate standard errors is essential
1. Examples using data from the Survey of industrial and service firms
Examples of data use: Stata platform 1 To obtain the results of calculations more rapidly, you should limit the number of variables included in the datasets used. Please remember that permanent datasets
6. CONDUCTING SURVEY DATA ANALYSIS
49 incorporating adjustments for the nonresponse and poststratification. But such weights usually are not included in many survey data sets, nor is there the appropriate information for creating such replicate
From the help desk: hurdle models
The Stata Journal (2003) 3, Number 2, pp. 178 184 From the help desk: hurdle models Allen McDowell Stata Corporation Abstract. This article demonstrates that, although there is no command in Stata for
Robert L. Santos, Temple University
n ONE APPROACH TO OVERSAMPLING BLACKS AND HISPANICS: THE NATIONAL ALCOHOL SURVEY Robert L. Santos, Temple University I. Introduction The 1984 National Alcohol Survey (NAS) is the first national household
From the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
Sample design for educational survey research
Quantitative research methods in educational planning Series editor: Kenneth N.Ross Module Kenneth N. Ross 3 Sample design for educational survey research UNESCO International Institute for Educational
Life Table Analysis using Weighted Survey Data
Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Statistical and Methodological Issues in the Analysis of Complex Sample Survey Data: Practical Guidance for Trauma Researchers
Journal of Traumatic Stress, Vol. 2, No., October 2008, pp. 440 447 ( C 2008) Statistical and Methodological Issues in the Analysis of Complex Sample Survey Data: Practical Guidance for Trauma Researchers
Measuring change with the Belgian Survey on Income and Living Conditions (SILC): taking account of the sampling variance
Measuring change with the Belgian Survey on Income and Living Conditions (SILC): taking account of the sampling variance Report Financed by FPS Social Security Belgium (Ref.: 2013/BOEP/UA-EU2020) December
Models for Longitudinal and Clustered Data
Models for Longitudinal and Clustered Data Germán Rodríguez December 9, 2008, revised December 6, 2012 1 Introduction The most important assumption we have made in this course is that the observations
Multilevel Modeling of Complex Survey Data
Multilevel Modeling of Complex Survey Data Sophia Rabe-Hesketh, University of California, Berkeley and Institute of Education, University of London Joint work with Anders Skrondal, London School of Economics
Optimization of sampling strata with the SamplingStrata package
Optimization of sampling strata with the SamplingStrata package Package version 1.1 Giulio Barcaroli January 12, 2016 Abstract In stratified random sampling the problem of determining the optimal size
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
Stat 9100.3: Analysis of Complex Survey Data
Stat 9100.3: Analysis of Complex Survey Data 1 Logistics Instructor: Stas Kolenikov, [email protected] Class period: MWF 1-1:50pm Office hours: Middlebush 307A, Mon 1-2pm, Tue 1-2 pm, Thu 9-10am.
CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,
Automated Analysis of Survey Data using STATA. Vinay A. Patel Deputy Manager National Dairy Development Board
Automated Analysis of Survey Data using STATA Vinay A. Patel Deputy Manager National Dairy Development Board 1. Background 2. Prior to using STATA 3. Survey Data Characteristics 4. Estimation of Important
Multivariate Analysis of Health Survey Data
10 Multivariate Analysis of Health Survey Data The most basic description of health sector inequality is given by the bivariate relationship between a health variable and some indicator of socioeconomic
Descriptive Methods Ch. 6 and 7
Descriptive Methods Ch. 6 and 7 Purpose of Descriptive Research Purely descriptive research describes the characteristics or behaviors of a given population in a systematic and accurate fashion. Correlational
NEPS Working Papers. Replication Weights for the Cohort Samples of Students in Grade 5 and 9 in the National Educational Panel Study
NEPS Working Papers Sabine Zinn Replication Weights for the Cohort Samples of Students in Grade 5 and 9 in the National Educational Panel Study NEPS Working Paper No. 27 Bamberg, July 2013 Working Papers
Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software
STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used
Approaches for Analyzing Survey Data: a Discussion
Approaches for Analyzing Survey Data: a Discussion David Binder 1, Georgia Roberts 1 Statistics Canada 1 Abstract In recent years, an increasing number of researchers have been able to access survey microdata
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
CLUSTER SAMPLE SIZE CALCULATOR USER MANUAL
1 4 9 5 CLUSTER SAMPLE SIZE CALCULATOR USER MANUAL Health Services Research Unit University of Aberdeen Polwarth Building Foresterhill ABERDEEN AB25 2ZD UK Tel: +44 (0)1224 663123 extn 53909 May 1999 1
A General Approach to Variance Estimation under Imputation for Missing Survey Data
A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey
Comparing Alternate Designs For A Multi-Domain Cluster Sample
Comparing Alternate Designs For A Multi-Domain Cluster Sample Pedro J. Saavedra, Mareena McKinley Wright and Joseph P. Riley Mareena McKinley Wright, ORC Macro, 11785 Beltsville Dr., Calverton, MD 20705
Problem of Missing Data
VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;
Confidence Intervals and Loss Functions
A.C.E. Revision II PP-42 December 31, 2002 Confidence Intervals and Loss Functions Mary H. Mulry and Randal S. ZuWallack Statistical Research Division and Decennial Statistical Studies Division i CONTENTS
Missing Data & How to Deal: An overview of missing data. Melissa Humphries Population Research Center
Missing Data & How to Deal: An overview of missing data Melissa Humphries Population Research Center Goals Discuss ways to evaluate and understand missing data Discuss common missing data methods Know
Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: SIMPLIS Syntax Files
Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng LISREL for Windows: SIMPLIS Files Table of contents SIMPLIS SYNTAX FILES... 1 The structure of the SIMPLIS syntax file... 1 $CLUSTER command... 4
MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
Statistics 104: Section 6!
Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: [email protected] Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC
Confidence Intervals in Public Health
Confidence Intervals in Public Health When public health practitioners use health statistics, sometimes they are interested in the actual number of health events, but more often they use the statistics
Sampling Probability and Inference
PART II Sampling Probability and Inference The second part of the book looks into the probabilistic foundation of statistical analysis, which originates in probabilistic sampling, and introduces the reader
SSDA.Analysis - A Class Library for Analysis of Sample Survey Data
Vol.2, Issue.1, Jan-Feb 2012 pp-242-246 ISSN: 2249-6645 SSDA.Analysis - A Class Library for Analysis of Sample Survey Data Anu Sharma 1 and S. B. Lal 1 1 (Indian Agricultural Statistics Research Institute)
Why Sample? Why not study everyone? Debate about Census vs. sampling
Sampling Why Sample? Why not study everyone? Debate about Census vs. sampling Problems in Sampling? What problems do you know about? What issues are you aware of? What questions do you have? Key Sampling
National Endowment for the Arts. A Technical Research Manual
2012 SPPA PUBLIC-USE DATA FILE USER S GUIDE A Technical Research Manual Prepared by Timothy Triplett Statistical Methods Group Urban Institute September 2013 Table of Contents Introduction... 3 Section
Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer
Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer Patricia A. Berglund, Institute for Social Research - University of Michigan Wisconsin and Illinois SAS User s Group June 25, 2014 1 Overview
Clustering in the Linear Model
Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple
Methods for Analyzing Longitudinal Health Survey Data. Theory and Applications
Methods for Analyzing Longitudinal Health Survey Data Theory and Applications Mary E. Thompson University of Waterloo Acknowledgments Acknowledgments The support of the following agencies is acknowledged:
Comparison of Imputation Methods in the Survey of Income and Program Participation
Comparison of Imputation Methods in the Survey of Income and Program Participation Sarah McMillan U.S. Census Bureau, 4600 Silver Hill Rd, Washington, DC 20233 Any views expressed are those of the author
STATISTICAL DATA ANALYSIS
STATISTICAL DATA ANALYSIS INTRODUCTION Fethullah Karabiber YTU, Fall of 2012 The role of statistical analysis in science This course discusses some statistical methods, which involve applying statistical
Northumberland Knowledge
Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about
Linear Models in STATA and ANOVA
Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples
Forecasting in STATA: Tools and Tricks
Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be
Analysis of Regression Based on Sampling Weights in Complex Sample Survey: Data from the Korea Youth Risk Behavior Web- Based Survey
, pp.65-74 http://dx.doi.org/10.14257/ijunesst.2015.8.10.07 Analysis of Regression Based on Sampling Weights in Complex Sample Survey: Data from the Korea Youth Risk Behavior Web- Based Survey Haewon Byeon
From the help desk: Swamy s random-coefficients model
The Stata Journal (2003) 3, Number 3, pp. 302 308 From the help desk: Swamy s random-coefficients model Brian P. Poi Stata Corporation Abstract. This article discusses the Swamy (1970) random-coefficients
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
The construction and use of sample design variables in EU SILC. A user s perspective
The construction and use of sample design variables in EU SILC. A user s perspective Report prepared for Eurostat Abstract: Currently, EU SILC is the single most important data source on income and living
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
Machine Learning Methods for Causal Effects. Susan Athey, Stanford University Guido Imbens, Stanford University
Machine Learning Methods for Causal Effects Susan Athey, Stanford University Guido Imbens, Stanford University Introduction Supervised Machine Learning v. Econometrics/Statistics Lit. on Causality Supervised
University of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination February 14 th, 2014.
University of Ljubljana Doctoral Programme in Statistics ethodology of Statistical Research Written examination February 14 th, 2014 Name and surname: ID number: Instructions Read carefully the wording
