An Introduction to Secondary Data Analysis
|
|
|
- Emerald Welch
- 10 years ago
- Views:
Transcription
1 research methodology series An Introduction to Secondary Data Analysis Natalie Koziol, MA CYFS Statistics and Measurement Consultant Ann Arthur, MS CYFS Statistics and Measurement Consultant
2 Outline Overview of Secondary Data Analysis Understanding & Preparing Secondary Data Brief Overview of Sampling Design Analyzing Secondary Data Illustration of Secondary Data Analysis Other Logistical Considerations
3 Overview of Secondary Data Analysis
4 What is Secondary Data Analysis? In the broadest sense, analysis of data collected by someone else (p. ix; Boslaugh, 2007) Analysis of secondary data, where secondary data can include any data that are examined to answer a research question other than the question(s) for which the data were initially collected (p. 3; Vartanian, 2010) In contrast to primary data analysis in which the same individual/team of researchers designs, collects, and analyzes the data
5 Local Examples of Research Involving Secondary Data Analysis Starting Off Right: Effects of Rurality on Parent s Involvement in Children s Early Learning (Sue Sheridan, PPO) Data from the Early Childhood Longitudinal Study Birth Cohort (ECLS-B) were used to examine the influence of setting on parental involvement in preschool and the effects of involvement on Kindergarten school readiness. Testing Thresholds of Quality Care on Child Outcomes Globally & in Subgroups: Secondary Analysis of QUINCE and Early Head Start Data (Helen Raikes, PPO) Data from two secondary datasets were used to examine the potentially non-linear relationship between quality of child care and children s development
6 What are Secondary Data? Come from many sources Large government-funded datasets (the focus of this presentation) University/college records Statewide or district-level K-12 school records Journal supplements Authors websites Etc.! Available for a seemingly unlimited number of subject areas Quantitative (the focus of this presentation) and qualitative Restricted and public-use Direct (e.g., biomarker data) and indirect observation (e.g., self-report)
7 Where Can I Find Secondary Data? Searching for secondary datasets: Inter-University Consortium for Political and Social Research Data.gov National Center for Education Statistics U.S. Census Bureau Simple Online Data Archive for Population Studies (SodaPop)
8 Examples of Large Secondary Datasets for Education & Social Sciences Research Common Core of Data (CCD) Current Population Survey (CPS) Early Childhood Longitudinal Study (ECLS): Birth (ECLS-B) and Kindergarten (ECLS-K) Cohort General Social Survey (GSS) Head Start Family and Child Experiences Survey (FACES) Monitoring the Future (MTF) National Assessment of Educational Progress (NAEP) National Education Longitudinal Study (NELS) National Household Education Surveys (NHES) National Longitudinal Study of Adolescent Health (Add Health) National Longitudinal Survey of Youth (NLSY) National Survey of American Families (NSAF) National Survey of Child and Adolescent Well-Being (NSCAW) National Survey of Families and Households (NSFH) NICHD Study of Early Child Care and Youth Development (SECCYD) Programme for International Student Assessment (PISA) Progress in International Reading Literacy Study (PIRLS) Trends in International Mathematics and Science Study (TIMSS) U.S. Panel Study of Income Dynamics (PSID): Child Development Supplement (CDS)
9 Advantages of Secondary Data Analysis Study design and data collection already completed Saves time and money Access to international and cross-historical data that would otherwise take several years and millions of dollars to collect Ideal for use in classroom examples, semester projects, masters theses, dissertations, supplemental studies Data may be of higher quality Studies funded by the government generally involve larger samples that are more representative of the target population (greater external validity!) Oversampling of low prevalence groups/behaviors allows for increased statistical precision Datasets often contain considerable breadth (thousands of variables)
10 Disadvantages of Secondary Data Analysis Study design and data collection already completed Data may not facilitate particular research question Information regarding study design and data collection procedures may be scarce Data may potentially lack depth (the greater the breadth the harder it is to measure any one construct in depth) Constructs may be operationally defined by a single survey item or a subset of test items which can lead to reliability and validity concerns Post hoc attempts to construct measurement models may be unsuccessful (survey items may not hang together) Certain fields or departments (e.g., experimental programs) may place less value on secondary data analysis May require knowledge of survey statistics/methods which is not generally provided by basic graduate statistics courses
11 Understanding & Preparing Secondary Data
12 Understanding Secondary Data Familiarize yourself with the original study and data! Read all User s/technical manuals To whom are the results generalizable? E.g., ECLS-B analyses involving data from kindergarten wave can be used to make inferences about children born in the U.S. in 2001 as they enter kindergarten (not to make inferences about U.S. kindergarteners) How are missing data handled? What are the appropriate analysis weights? What is the appropriate method (and what variables are necessary) for computing adjusted standard errors? What composite variables are available and how are they constructed?
13 Understanding Secondary Data Familiarize yourself with the original study and data! Examine questionnaires and interview protocols when available Identify skip patterns to determine coding of missing data; example from the ECLS-B preschool parent interview:
14 Understanding Secondary Data Familiarize yourself with the original study and data! Examine questionnaires and interview protocols when available For examining trends or growth, determine whether the same construct is being measured across time Interview questions may be modified across time Example from an Opinion Research Business (ORB) survey on conflict deaths in Iraq (Spagat & Dougherty, 2010): Yes/No: There has been a murder of a member of my family/relative (February 2007) Yes/No: There has been a death as a result of conflict/violence of a household member (August 2007) Respondents (e.g., parent/guardian) may change over time Different scales may be used across time (e.g., different cognitive measures are used for infants and kindergarteners)
15 Understanding Secondary Data Familiarize yourself with the original study and data! Check study website frequently for errors and/or updates Example from Ongoing panel (i.e. longitudinal) studies generally provide new datasets after each wave of data collection Always use the most up-to-date file! Scores developed using item response theory may be recalibrated at each wave to permit investigation of growth
16 Preparing Secondary Data Document everything! Save all syntax Create an abridged codebook describing the original and recoded variables of interest Step 1: Transfer all potential data of interest to a new file in preferred base program Electronic codebooks (ECBs) greatly facilitate this process Never alter the original datafile! Step 2: Address missing data Identify/label missing values in software program When possible, use knowledge of skip patterns to recode missing data as meaningful values Select method for handling missing data (e.g., multiple imputation, full-information maximum likelihood [FIML])
17 Preparing Secondary Data Step 3: Recode variables Reverse code negatively worded items if creating scale scores Dummy code dichotomous variables into values of 0, 1 (original dataset may use values of 1, 2) Recode other categorical variables (e.g., dummy or effect coding) Combine separate but like variables E.g., ECLS-B contained 2 kindergarten waves (only 75% of children were in kindergarten in 2006); to analyze kindergarteners, need to combine variables from waves 4 and 5 using if-else commands Recode variables so that all responses are based on the same units Example from ECLS-B Preschool Center Director Questionnaire:
18 Preparing Secondary Data Step 4: Create new variables May need to recreate composite variables if disagree with original conceptualization E.g., An SES variable in the original datafile may be constructed from income and parent education variables; secondary researcher may want to construct new SES variable Psychometric work Create scores from individual items using factor analysis or item response theory Unfortunately, individual survey items do not always hang together To avoid potentially biased variance estimates, (a) incorporate measurement models directly into analysis, or (b) output plausible values (e.g., Mislevy et al., 1992)
19 A Brief Overview of Sampling Design
20 Sampling Design Ideally, we want a sample that is perfectly representative of our target population (we want to use sample results to make inferences, or generalizations, about a larger population) Types of probability sampling Simple random sampling Randomly sample individuals Stratified sampling Divide population into strata (groups); within each stratum, randomly sample individuals Cluster sampling Population contains naturally occurring groups (e.g., classrooms); randomly sample groups
21 Sampling Design Simple Random Sampling Cluster Sampling Class 1 Class 2 Class 3 Class 4 Class 5 Class 6 Class 7 Class 8 Class 9 Stratified Sampling Grade 1 Grade 2 Grade 3
22 Sampling Design Simple random sampling Assumed when performing conventional statistical analyses No guarantee of a representative sample May not be feasible (e.g., costly, impractical) Stratified sampling More control over representativeness Allows for intentional oversampling which permits greater statistical precision (i.e., decreases standard errors) Cluster sampling May be necessary (e.g., educational interventions may only be possible at the classroom level) Decreases statistical precision (individuals within groups tend to be more similar so we have less unique information)
23 Sampling Design Statistical analyses should reflect sampling design Point estimates (e.g., means) should be adjusted to take into account unequal sampling probabilities Standard errors should be adjusted to ensure correct level of confidence in point estimates Different statistical approaches exist for handling complex sampling designs Multilevel modeling Application of weights and alternative methods of variance estimation Common approach when analyzing large secondary datasets due to complexity of sampling design Combination of approaches
24 Sampling Design Sampling weights The reciprocal of the inclusion probability the number of population units represented by unit i (p. 39; Lohr, 2010) w i = 1 π i where π i is the probability that unit i is in the sample Necessary for obtaining accurate/generalizable point estimates Construction of sampling weights is complex (based on multiple stages of sampling, non-response, post-stratification, etc.) Thankfully, large secondary datasets generally have preconstructed weights However, multiple weights may exist for any one dataset Appropriate selection and application of weights is the responsibility of the secondary data analyst!
25 Sampling Design Variance estimation Alternative estimation necessary for computing correct standard errors which influence tests of statistical significance Does not influence point estimates Multiple approaches Taylor series linearization method Involves specifying cluster and stratum variables Replication methods Balanced repeated replication (BRR) Jackknife replication (JK1, JK2, JKn) Choice of method depends on sampling design Involves specifying series of replicate weights Other methods (e.g., use of generalized variance functions) Use the approach recommended in the User s manual
26 Analyzing Secondary Data
27 Analyzing Secondary Data Based on research question, identify appropriate statistical analysis Select software package that will implement analysis and account for complex sampling Examine unweighted descriptive statistics to identify coding errors and determine adequacy of sample size Identify weights Make sure missing weights are set to 0 Identify variance estimation method (and corresponding variables) Conduct diagnostic analyses (identify outliers, non-normality, etc.) Conduct primary analysis and interpret results!
28 Analyzing Secondary Data Other considerations Inclusion of covariates E.g., Age at time of child assessment (not always possible to collect data at target age) Analysis of subpopulations (see Lohr, 2010 for more information) Additional specifications necessary (e.g., domain, subpop ) Don t delete cases Protecting confidentiality Some restricted-use datasets require unweighted sample sizes to be rounded and/or estimates based on small sample sizes to be suppressed Analysis involving multiple imputation or plausible values (Asparouhov & Muthén, 2010; Enders, 2010)
29 Analyzing Secondary Data Software Software specifically developed for analyzing complex survey data Generally free Generally user-friendly but may lack flexibility (limited to certain datasets, limited statistical analyses) Useful for initial data exploration (particularly restricted data) Examples NCES tools for computing descriptive statistics, regressions PowerStats: Data Analysis System (DAS): AM Statistical Software Descriptives, regression, some latent variable estimation Relatively easy to incorporate plausible values
30 Analyzing Secondary Data Software General-purpose software that can account for complex sampling Can be expensive (R is free) Generally syntax-based rather than drop-down menu More flexible Examples: SAS (certain analyses require SUDAAN add-on) Stata HTML/default/viewer.htm#statug_introsamp_sect001.htm
31 Analyzing Secondary Data Software General-purpose software that can account for complex sampling Examples: SPSS (requires Complex Samples add-on) R Mplus dex.jsp?topic=%2fcom.ibm.spss.statistics.help%2fsyn_csgl m_.htm Users%20Guide%20v6.pdf (pp , 521) Other software options
32 Example SAS Syntax *Taylor series linearization method; PROC SURVEYMEANS data=yourdata varmethod=taylor; strata stratavar; cluster clustervar; var varofinterest; weight wtvar; run; PROC SURVEYREG data=yourdata varmethod=taylor; strata stratavar; cluster clustervar; model outcomevar=predictorvar; weight wtvar; run; *Jackknife method; PROC SURVEYMEANS data=yourdata varmethod=jk; repweights repwt1-repwtn; var varofinterest; weight wtvar; run; PROC SURVEYREG data=yourdata varmethod=jk; model outcomevar=predictorvar; repweights repwt1-repwtn; weight wtvar; run; *Jackknife syntax varies by type (JK 1, 2, or n)
33 Example Stata Syntax /*Taylor series linearization method*/ svyset [pweight = wtvar], psu(clustervar) strata(stratavar) vce(linearized) svy: mean varofinterest svy: regress outcomevar predictorvar /*Jackknife method*/ svyset [pweight = wtvar], jkrw(repwt1 - repwtn) vce(jack) mse svy: mean varofinterest svy: regress outcomevar predictorvar *Jackknife syntax varies by type (JK 1, 2, or n)
34 Example R Syntax #Taylor series linearization method library(survey) design1 <- svydesign(id=~clustervar, strata=~stratavar, weights=~wtvar, data=yourdatafile) svymean(~varofinterest, design.1) regmodel <- svyglm(outcome~predictor, design=design.1) #Jackknife method library(survey) design.2 <- svrepdesign(repweights=yourdatafile[,repwt1:repwtn], type="jk1", weights=yourdatafile$wtvar) svymean(~varofinterest, design.2) regmodel <- svyglm(outcome~predictor, design=design.2) *Jackknife syntax varies by type (JK 1, 2, or n)
35 Example Mplus Syntax!Taylor series linearization method; TITLE: Example complex sample syntax; DATA: FILE=yourfile; VARIABLE: NAMES=outcomevar predictorvar stratavar clustervar wtvar; USEVARIABLES=outcomevar predictorvar stratavar clustervar wtvar; STRATIFICATION=stratavar; CLUSTER=clustervar; WEIGHT=wtvar; ANALYSIS: TYPE=COMPLEX; MODEL: outcomevar ON predictorvar; OUTPUT: SAMPSTAT;!Jackknife method; TITLE: Example complex sample syntax; DATA: FILE=yourfile; VARIABLE: NAMES=outcomevar predictorvar wtvar repwt1-repwtn; USEVARIABLES=outcomevar predictorvar wtvar repwt1-repwtn; WEIGHT=wtvar; REPWEIGHTS=repwt1-repwtn; ANALYSIS: TYPE=COMPLEX; REPSE=JACKKNIFE1; MODEL: outcomevar ON predictorvar; OUTPUT: SAMPSTAT; *Jackknife syntax varies by type (JK 1, 2, or n)
36 An Illustration of Secondary Data Analysis
37 Illustration Early Childhood Longitudinal Study Kindergarten Class of (ECLS-K) Two research questions: Does the number of children s books in the home predict a child s Tell Stories score as measured in the fall of kindergarten? What is the average trajectory of math achievement as measured in kindergarten through 8 th grade?
38 Illustration: Download Data & Import into Base Program *Download data from Change directory to match location of datafile
39 Illustration: Identify Weights *Descriptions from ECLS-K User s Manual
40 Illustration: Determine Variance Estimation Method *Description from ECLS-K User s Manual
41 LIBNAME spsslib SPSS "C:\Users\nkoziol\Documents\CYFS\ECLSK_SAS_File.por"; Illustration: DATA work.eclsk; Research Question 1 RUN; SET spsslib.spssfile; Conduct simple linear regression in SAS PROC SURVEYMEANS data=eclsk varmethod=jk; LIBNAME repweights spsslib SPSS c1pw1-c1pw90 "C:\Users\nkoziol\Documents\CYFS\ECLSK_SAS_File.por"; / jkcoefs = ; DATA work.eclsk; var c1scsto p1chlboo; weight SET spsslib.spssfile; c1pw0; run; RUN; Necessary for PROC PROC SURVEYREG SURVEYMEANS data=eclsk data=eclsk varmethod=jk varmethod=jk; ; JK2 method model repweights c1scsto=p1chlboo; c1pw1-c1pw90 / jkcoefs = ; repweights var c1scsto c1pw1-c1pw90 p1chlboo; / jkcoefs = ; weight weight c1pw0; c1pw0; run; run; PROC Child SURVEYREG Tell data=eclsk Number of varmethod=jk ; Stories model variable c1scsto=p1chlboo; children s books repweights c1pw1-c1pw90 / jkcoefs = ; weight c1pw0; run;
42 Illustration: Research Question 1 Results with weighting only (no variance adjustment) Results with no weighting and no variance adjustment *Results are for illustration purposes only; please do not cite or distribute.
43 Illustration: Research Question 2 Conduct 2 nd order (quadratic) latent growth model in Mplus
44 Illustration: Research Question 2 *Results are for illustration purposes only; please do not cite or distribute.
45 Illustration: Research Question 2
46 Other Considerations
47 Training Opportunities 2-5 day government- or other institution-sponsored workshops AERA Institute on Statistical Analysis & AERA Faculty Institute y.html cfly.html IES sponsored workshops AIR ICPSR 1 week offerings 1 day pre-annual meeting workshops E.g., Early Childhood Surveys at NCES: The ECLS and NHES Data Users Workshop (2011 SRCD biennial meeting)
48 Funding Opportunities AERA Dissertation and Research Grants NIH Grants E.g., R40 Maternal & Child Health Research Secondary Data Analysis Studies Grants R21 grant mechanism (exploration) IES Grants (exploration; Goal 1) NAEP Secondary Analysis Grants RFP s that encourage use of secondary data, for example, the Social and Behavioral Context for Academic Learning RFP AIR Grants
49 References AIR, NCES, & NSF (2011). National Summer Data Policy Institute training materials. Washington, D.C. Asparouhov, T., & Muthén, B. (2010). Plausible values for latent variables using Mplus. Technical Report. Boslaugh, S. (2007). Secondary data sources for public health: A practical guide. New York, NY: Cambridge. Enders, C. K. (2010). Applied missing data analysis. New York, NY: Guilford. Lohr, S. L. (2010). Sampling: Design and analysis (2 nd Ed.). Boston, MA: Brooks/Cole. Kiecolt, K. J., & Nathan, L. E. (1985). Secondary analysis of survey data. Newbury Park, CA: SAGE. Kish, L. (1965). Survey sampling. New York: Wiley. McCall, R. B., & Appelbaum, M. I. (1991). Some issues of conducting secondary analyses. Developmental Psychology, 27, Mislevy, R. J., Beaton, A. E., Kaplan, B., & Sheehan, K. M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, NCES (2011). ECLS-B database training seminar materials. Washington, D.C.
50 References Spagat, M., & Dougherty, J. (2010). Conflict deaths in Iraq: A methodological critique of the ORB survey estimate. Survey Research Methods, 4, Thomas, S. L., & Heck, R. H. (2001). Analysis of large-scale secondary data in higher education research: Potential perils associated with complex sampling designs. Research in Higher Education, 42, Tourangeau, K., Nord, C., Lê, T., Sorongon, A. G., & Najarian, M. (2009). Early Childhood Longitudinal Study, Kindergarten Class of (ECLS-K), Combined User s Manual for the ECLS-K Eighth-Grade and K-8 Full Sample Data Files and Electronic Codebooks (NCES ). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Trzesniewski, K. H., Donnellan, M. B., & Lucas, R. E. (Eds) (2011). Secondary data analysis: An introduction for psychologists. Washington, D.C.: APA. Vartanian, T. P. (2011). Secondary data analysis. New York, NY: Oxford.
51 Thanks! For more information, please contact: Natalie Koziol,
The SURVEYFREQ Procedure in SAS 9.2: Avoiding FREQuent Mistakes When Analyzing Survey Data ABSTRACT INTRODUCTION SURVEY DESIGN 101 WHY STRATIFY?
The SURVEYFREQ Procedure in SAS 9.2: Avoiding FREQuent Mistakes When Analyzing Survey Data Kathryn Martin, Maternal, Child and Adolescent Health Program, California Department of Public Health, ABSTRACT
Sampling Error Estimation in Design-Based Analysis of the PSID Data
Technical Series Paper #11-05 Sampling Error Estimation in Design-Based Analysis of the PSID Data Steven G. Heeringa, Patricia A. Berglund, Azam Khan Survey Research Center, Institute for Social Research
Survey Data Analysis in Stata
Survey Data Analysis in Stata Jeff Pitblado Associate Director, Statistical Software StataCorp LP Stata Conference DC 2009 J. Pitblado (StataCorp) Survey Data Analysis DC 2009 1 / 44 Outline 1 Types of
Chapter 11 Introduction to Survey Sampling and Analysis Procedures
Chapter 11 Introduction to Survey Sampling and Analysis Procedures Chapter Table of Contents OVERVIEW...149 SurveySampling...150 SurveyDataAnalysis...151 DESIGN INFORMATION FOR SURVEY PROCEDURES...152
Power Calculation Using the Online Variance Almanac (Web VA): A User s Guide
Power Calculation Using the Online Variance Almanac (Web VA): A User s Guide Larry V. Hedges & E.C. Hedberg This research was supported by the National Science Foundation under Award Nos. 0129365 and 0815295.
Statistical Techniques Utilized in Analyzing TIMSS Databases in Science Education from 1996 to 2012: A Methodological Review
Statistical Techniques Utilized in Analyzing TIMSS Databases in Science Education from 1996 to 2012: A Methodological Review Pey-Yan Liou, Ph.D. Yi-Chen Hung National Central University Please address
TIMSS 2011 User Guide for the International Database
TIMSS 2011 User Guide for the International Database Edited by: Pierre Foy, Alka Arora, and Gabrielle M. Stanco TIMSS 2011 User Guide for the International Database Edited by: Pierre Foy, Alka Arora,
Survey Analysis: Options for Missing Data
Survey Analysis: Options for Missing Data Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD Abstract A common situation researchers working with survey data face is the analysis of missing
Youth Risk Behavior Survey (YRBS) Software for Analysis of YRBS Data
Youth Risk Behavior Survey (YRBS) Software for Analysis of YRBS Data CONTENTS Overview 1 Background 1 1. SUDAAN 2 1.1. Analysis capabilities 2 1.2. Data requirements 2 1.3. Variance estimation 2 1.4. Survey
Workshop on Using the National Survey of Children s s Health Dataset: Practical Applications
Workshop on Using the National Survey of Children s s Health Dataset: Practical Applications Julian Luke Stephen Blumberg Centers for Disease Control and Prevention National Center for Health Statistics
Survey Data Analysis in Stata
Survey Data Analysis in Stata Jeff Pitblado Associate Director, Statistical Software StataCorp LP 2009 Canadian Stata Users Group Meeting Outline 1 Types of data 2 2 Survey data characteristics 4 2.1 Single
Missing Data & How to Deal: An overview of missing data. Melissa Humphries Population Research Center
Missing Data & How to Deal: An overview of missing data Melissa Humphries Population Research Center Goals Discuss ways to evaluate and understand missing data Discuss common missing data methods Know
National Longitudinal Study of Adolescent Health. Strategies to Perform a Design-Based Analysis Using the Add Health Data
National Longitudinal Study of Adolescent Health Strategies to Perform a Design-Based Analysis Using the Add Health Data Kim Chantala Joyce Tabor Carolina Population Center University of North Carolina
Software for Analysis of YRBS Data
Youth Risk Behavior Surveillance System (YRBSS) Software for Analysis of YRBS Data June 2014 Where can I get more information? Visit www.cdc.gov/yrbss or call 800 CDC INFO (800 232 4636). CONTENTS Overview
Variance Estimation Guidance, NHIS 2006-2014 (Adapted from the 2006-2014 NHIS Survey Description Documents) Introduction
June 9, 2015 Variance Estimation Guidance, NHIS 2006-2014 (Adapted from the 2006-2014 NHIS Survey Description Documents) Introduction The data collected in the NHIS are obtained through a complex, multistage
Best Practices in Using Large, Complex Samples: The Importance of Using Appropriate Weights and Design Effect Compensation
A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to
Chapter XXI Sampling error estimation for survey data* Donna Brogan Emory University Atlanta, Georgia United States of America.
Chapter XXI Sampling error estimation for survey data* Donna Brogan Emory University Atlanta, Georgia United States of America Abstract Complex sample survey designs deviate from simple random sampling,
Guidelines for Analyzing Add Health Data. Ping Chen Kim Chantala
Guidelines for Analyzing Add Health Data Ping Chen Kim Chantala Carolina Population Center University of North Carolina at Chapel Hill Last Update: March 2014 Table of Contents Overview... 3 Understanding
Statistical and Methodological Issues in the Analysis of Complex Sample Survey Data: Practical Guidance for Trauma Researchers
Journal of Traumatic Stress, Vol. 2, No., October 2008, pp. 440 447 ( C 2008) Statistical and Methodological Issues in the Analysis of Complex Sample Survey Data: Practical Guidance for Trauma Researchers
ECLS-K BASE YEAR DATA FILES AND ELECTRONIC CODEBOOK
ECLS-K BASE YEAR DATA FILES AND ELECTRONIC CODEBOOK Prepared by Westat Rockville, Maryland University of Michigan Department of Education Ann Arbor, Michigan Educational Testing Service Princeton, New
An Assessment of the Effect of Misreporting of Phone Line Information on Key Weighted RDD Estimates
An Assessment of the Effect of Misreporting of Phone Line Information on Key Weighted RDD Estimates Ashley Bowers 1, Jeffrey Gonzalez 2 1 University of Michigan 2 Bureau of Labor Statistics Abstract Random-digit
New SAS Procedures for Analysis of Sample Survey Data
New SAS Procedures for Analysis of Sample Survey Data Anthony An and Donna Watts, SAS Institute Inc, Cary, NC Abstract Researchers use sample surveys to obtain information on a wide variety of issues Many
Running head: SCHOOL COMPUTER USE AND ACADEMIC PERFORMANCE. Using the U.S. PISA results to investigate the relationship between
Computer Use and Academic Performance- PISA 1 Running head: SCHOOL COMPUTER USE AND ACADEMIC PERFORMANCE Using the U.S. PISA results to investigate the relationship between school computer use and student
Psychology 209. Longitudinal Data Analysis and Bayesian Extensions Fall 2012
Instructor: Psychology 209 Longitudinal Data Analysis and Bayesian Extensions Fall 2012 Sarah Depaoli ([email protected]) Office Location: SSM 312A Office Phone: (209) 228-4549 (although email will
Overview of Factor Analysis
Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,
Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.
Analyzing Complex Survey Data: Some key issues to be aware of Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 24, 2015 Rather than repeat material that is
Complex Survey Design Using Stata
Complex Survey Design Using Stata 2010 This document provides a basic overview of how to handle complex survey design using Stata. Probability weighting and compensating for clustered and stratified samples
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses G. Gordon Brown, Celia R. Eicheldinger, and James R. Chromy RTI International, Research Triangle Park, NC 27709 Abstract
National Endowment for the Arts. A Technical Research Manual
2012 SPPA PUBLIC-USE DATA FILE USER S GUIDE A Technical Research Manual Prepared by Timothy Triplett Statistical Methods Group Urban Institute September 2013 Table of Contents Introduction... 3 Section
C:\Users\<your_user_name>\AppData\Roaming\IEA\IDBAnalyzerV3
Installing the IDB Analyzer (Version 3.1) Installing the IDB Analyzer (Version 3.1) A current version of the IDB Analyzer is available free of charge from the IEA website (http://www.iea.nl/data.html,
Life Table Analysis using Weighted Survey Data
Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using
The Research Data Centres Information and Technical Bulletin
Catalogue no. 12-002 X No. 2014001 ISSN 1710-2197 The Research Data Centres Information and Technical Bulletin Winter 2014, vol. 6 no. 1 How to obtain more information For information about this product
Disparities in Early Learning and Development: Lessons from the Early Childhood Longitudinal Study Birth Cohort (ECLS-B) EXECUTIVE SUMMARY
Disparities in Early Learning and Development: Lessons from the Early Childhood Longitudinal Study Birth Cohort (ECLS-B) EXECUTIVE SUMMARY Tamara Halle, Nicole Forry, Elizabeth Hair, Kate Perper, Laura
Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program
Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program Department of Mathematics and Statistics Degree Level Expectations, Learning Outcomes, Indicators of
CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA
Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or
INTRODUCTION TO SURVEY DATA ANALYSIS THROUGH STATISTICAL PACKAGES
INTRODUCTION TO SURVEY DATA ANALYSIS THROUGH STATISTICAL PACKAGES Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-110012 1. INTRODUCTION A sample survey is a process for collecting
Archival Data for I-O Research
Using Archival Data for I-O Research: Advantages, Pitfalls, Sources, and Examples 1 Kenneth S. Shultz California State University, San Bernardino Calvin C. Hoffman Alliant University, Los Angeles Roni
SPSS and AM statistical software example.
A detailed example of statistical analysis using the NELS:88 data file and ECB, to perform a longitudinal analysis of 1988 8 th graders in the year 2000: SPSS and AM statistical software example. Overall
Psychology Courses (PSYCH)
Psychology Courses (PSYCH) PSYCH 545 Abnormal Psychology 3 u An introductory survey of abnormal psychology covering the clinical syndromes included in the diagnostic classification system of the American
APPLIED MISSING DATA ANALYSIS
APPLIED MISSING DATA ANALYSIS Craig K. Enders Series Editor's Note by Todd D. little THE GUILFORD PRESS New York London Contents 1 An Introduction to Missing Data 1 1.1 Introduction 1 1.2 Chapter Overview
Data Mining Introduction
Data Mining Introduction Bob Stine Dept of Statistics, School University of Pennsylvania www-stat.wharton.upenn.edu/~stine What is data mining? An insult? Predictive modeling Large, wide data sets, often
How To Teach Social Science To A Class
Date submitted: 18/06/2010 Using Web-based Software to Promote Data Literacy in a Large Enrollment Undergraduate Course Harrison Dekker UC Berkeley Libraries Berkeley, California, USA Meeting: 86. Social
Courses Descriptions. Courses Generally Taken in Program Year One
Courses Descriptions Courses Generally Taken in Program Year One PSY 602 (3 credits): Native Ways of Knowing Covers the appropriate and valid ways of describing and explaining human behavior by using the
CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA
Examples: Mixture Modeling With Longitudinal Data CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA Mixture modeling refers to modeling with categorical latent variables that represent subpopulations
[This document contains corrections to a few typos that were found on the version available through the journal s web page]
Online supplement to Hayes, A. F., & Preacher, K. J. (2014). Statistical mediation analysis with a multicategorical independent variable. British Journal of Mathematical and Statistical Psychology, 67,
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Courses in the College of Letters and Sciences PSYCHOLOGY COURSES (840)
Courses in the College of Letters and Sciences PSYCHOLOGY COURSES (840) 840-545 Abnormal Psychology -- 3 cr An introductory survey of abnormal psychology covering the clinical syndromes included in the
Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents
Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén
EILENE A. EDEJER EDUCATION
EILENE A. EDEJER EDUCATION Loyola University Chicago, Ph.D. Research Methodology May 2003 Minor: Urban and Community Policy Research Loyola University Chicago, M.Ed. School/Community Counseling May 1993
Data The estimates presented in the tables originate from the 2013 SCS to the NCVS. The SCS collects information about student and school
This document reports data from the 2013 School Crime Supplement (SCS) of the National Crime Victimization Survey (NCVS). 1 The Web Tables show the extent to which with different personal characteristics
Using Computer Programs for Quantitative Data Analysis
1 Using Computer Programs for Quantitative Data Analysis Brian V. Carolan The Graduate School's Graduate Development Conference, 2 Slides available at: http://www.briancarolan.org/site/courses.html 3 Overview
Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer
Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer Patricia A. Berglund, Institute for Social Research - University of Michigan Wisconsin and Illinois SAS User s Group June 25, 2014 1 Overview
POL 204b: Research and Methodology
POL 204b: Research and Methodology Winter 2010 T 9:00-12:00 SSB104 & 139 Professor Scott Desposato Office: 325 Social Sciences Building Office Hours: W 1:00-3:00 phone: 858-534-0198 email: [email protected]
SOCIOLOGY 311 RESEARCH METHODS
SOCIOLOGY 311 RESEARCH METHODS Marc Schneiberg TTh: 10:30-11:50 Eliot 409, ext. 7495 ETC 205 Office hours TBA Spring 2009 [email protected] Course description: This is a rigorous, workshop-based
Adrianne Frech ACADEMIC POSITIONS. Postdoctoral Fellow, Department of Sociology, 2009-2011 EDUCATION
Adrianne Frech Department of Sociology, MS28 4100 Greenbriar Street #240 Rice University Houston, TX 77098 6100 Main Street (614) 397-2204 Houston, TX Phone: Email: [email protected] ACADEMIC POSITIONS Rice
Inter-Rater Agreement in Analysis of Open-Ended Responses: Lessons from a Mixed Methods Study of Principals 1
Inter-Rater Agreement in Analysis of Open-Ended Responses: Lessons from a Mixed Methods Study of Principals 1 Will J. Jordan and Stephanie R. Miller Temple University Objectives The use of open-ended survey
DAVOOD TOFIGHI POSITION. 2011 present Assistant Professor, School of Psychology, Georgia Institute of Technology EDUCATION
DAVOOD TOFIGHI ASSISTANT PROFESSOR SCHOOL OF PSYCHOLOGY GEORGIA INSTITUTE OF TECHNOLOGY 654 CHERRY STREET ATLANTA, GA 30332 USA TEL: +1 (404) 894-7558 HTTP://WWW.AMP.GATECH.EDU/PEOPLE DAVOOD TOFIGHI POSITION
School of Public Health and Health Services Department of Prevention and Community Health
School of Public Health and Health Services Department of Prevention and Community Health Master of Public Health and Graduate Certificate Health Promotion 2013 2014 Note: All curriculum revisions will
SPPH 501 Analysis of Longitudinal & Correlated Data September, 2012
SPPH 501 Analysis of Longitudinal & Correlated Data September, 2012 TIME & PLACE: Term 1, Tuesday, 1:30-4:30 P.M. LOCATION: INSTRUCTOR: OFFICE: SPPH, Room B104 Dr. Ying MacNab SPPH, Room 134B TELEPHONE:
Ph.D. Biostatistics 2014-2015 Note: All curriculum revisions will be updated immediately on the website http://www.publichealth.gwu.
Columbian College of Arts and Sciences and Milken Institute School of Public Health Ph.D. Biostatistics 2014-2015 Note: All curriculum revisions will be updated immediately on the website http://www.publichealth.gwu.edu
Angela L. Vaughan, Ph. D.
Angela L. Vaughan, Ph. D. Director, First Year Curriculum and Instruction Academic Support & Advising University of Northern Colorado 501 20 th Street Greeley, CO 80639 (970)351 1175 [email protected]
Introduction to Data Analysis in Hierarchical Linear Models
Introduction to Data Analysis in Hierarchical Linear Models April 20, 2007 Noah Shamosh & Frank Farach Social Sciences StatLab Yale University Scope & Prerequisites Strong applied emphasis Focus on HLM
Instructions for Analyzing Data from CAHPS Surveys:
Instructions for Analyzing Data from CAHPS Surveys: Using the CAHPS Analysis Program Version 3.6 The CAHPS Analysis Program...1 Computing Requirements...1 Pre-Analysis Decisions...2 What Does the CAHPS
Chapter 5: Analysis of The National Education Longitudinal Study (NELS:88)
Chapter 5: Analysis of The National Education Longitudinal Study (NELS:88) Introduction The National Educational Longitudinal Survey (NELS:88) followed students from 8 th grade in 1988 to 10 th grade in
Handling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
Chenoa S. Woods, Ph.D.
Chenoa S. Woods, Ph.D. Center for Postsecondary Success Florida State University Tallahassee, FL [email protected] www.chenoawoods.com Education Ph.D. Educational Policy & Social Context, University of California,
Imputing Attendance Data in a Longitudinal Multilevel Panel Data Set
Imputing Attendance Data in a Longitudinal Multilevel Panel Data Set April 2015 SHORT REPORT Baby FACES 2009 This page is left blank for double-sided printing. Imputing Attendance Data in a Longitudinal
Introduction to Longitudinal Data Analysis
Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction
Data Cleaning and Missing Data Analysis
Data Cleaning and Missing Data Analysis Dan Merson [email protected] India McHale [email protected] April 13, 2010 Overview Introduction to SACS What do we mean by Data Cleaning and why do we do it? The SACS
2003 National Survey of College Graduates Nonresponse Bias Analysis 1
2003 National Survey of College Graduates Nonresponse Bias Analysis 1 Michael White U.S. Census Bureau, Washington, DC 20233 Abstract The National Survey of College Graduates (NSCG) is a longitudinal survey
Report of Results and Analysis of Parent Survey Data Collected in Southern West Virginia
Partners in Community Outreach Education Begins at Home Partners in Community Outreach In-Home Family Education Programs Report of Results and Analysis of Parent Survey Data Collected in Southern West
M.A. in School Counseling / 2015 2016
M.A. in School Counseling / 2015 2016 Course of Study for the Master of Arts in School Counseling Initial License (Pre K 8 or 5 12) Candidates for the degree of Master of Arts in School Counseling are
Secondary Data Analysis: A Method of which the Time Has Come
Qualitative and Quantitative Methods in Libraries (QQML) 3:619 626, 2014 Secondary Data Analysis: A Method of which the Time Has Come Melissa P. Johnston, PhD School of Library and Information Studies,
RMTD 404 Introduction to Linear Models
RMTD 404 Introduction to Linear Models Instructor: Ken A., Assistant Professor E-mail: [email protected] Phone: (312) 915-6852 Office: Lewis Towers, Room 1037 Office hour: By appointment Course Content
6 Regression With Survey Data From Complex Samples
6 Regression With Survey Data From Complex Samples Secondary analysis of data from large national surveys figures prominently in social science and public health research, and these surveys use complex
Apply an ecological framework to assess and promote population health.
School of Public and Services Department of Prevention and Community Master of Public and Graduate Certificate Public Communication and Marketing 2013-2014 Note: All curriculum revisions will be updated
Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM. Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών
Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών Το υλικό αυτό προέρχεται από workshop που οργανώθηκε σε θερινό σχολείο της Ευρωπαϊκής
STATISTICAL DATA ANALYSIS
STATISTICAL DATA ANALYSIS INTRODUCTION Fethullah Karabiber YTU, Fall of 2012 The role of statistical analysis in science This course discusses some statistical methods, which involve applying statistical
Visualization of Complex Survey Data: Regression Diagnostics
Visualization of Complex Survey Data: Regression Diagnostics Susan Hinkins 1, Edward Mulrow, Fritz Scheuren 3 1 NORC at the University of Chicago, 11 South 5th Ave, Bozeman MT 59715 NORC at the University
Marina Vasilyeva Boston College Campion Hall 239D 140 Commonwealth Avenue Chestnut Hill, MA 02467 (617) 552-1755 (office) vasilyev@bc.
1 Marina Vasilyeva Boston College Campion Hall 239D 140 Commonwealth Avenue Chestnut Hill, MA 02467 (617) 552-1755 (office) [email protected] Education 1995-2000 University of Chicago, Chicago, IL Ph. D.,
Descriptive Methods Ch. 6 and 7
Descriptive Methods Ch. 6 and 7 Purpose of Descriptive Research Purely descriptive research describes the characteristics or behaviors of a given population in a systematic and accurate fashion. Correlational
Curriculum Vitae Lindsay Bell Weixler 262 Warren St NE Washington, DC 20002 [email protected] 504-875-0203
EDUCATION Curriculum Vitae Lindsay Bell Weixler 262 Warren St NE Washington, DC 20002 [email protected] 504-875-0203 University of Michigan, Ann Arbor, MI Ph.D., Developmental Psychology;
A Guide to Understanding and Using Data for Effective Advocacy
A Guide to Understanding and Using Data for Effective Advocacy Voices For Virginia's Children Voices For V Child We encounter data constantly in our daily lives. From newspaper articles to political campaign
A C T R esearcli R e p o rt S eries 2 0 0 5. Using ACT Assessment Scores to Set Benchmarks for College Readiness. IJeff Allen.
A C T R esearcli R e p o rt S eries 2 0 0 5 Using ACT Assessment Scores to Set Benchmarks for College Readiness IJeff Allen Jim Sconing ACT August 2005 For additional copies write: ACT Research Report
ln(p/(1-p)) = α +β*age35plus, where p is the probability or odds of drinking
Dummy Coding for Dummies Kathryn Martin, Maternal, Child and Adolescent Health Program, California Department of Public Health ABSTRACT There are a number of ways to incorporate categorical variables into
Best Practices for Missing Data Management in Counseling Psychology
Journal of Counseling Psychology 2010 American Psychological Association 2010, Vol. 57, No. 1, 1 10 0022-0167/10/$12.00 DOI: 10.1037/a0018082 Best Practices for Missing Data Management in Counseling Psychology
