The Survey of Income and Program Participation (SIPP)
Useful Tools for Data Analysis Using the SIPP

H. Luke Shaefer
University of Michigan School of Social Work
National Poverty Center

This presentation is part of the NSF Census Research Network project of the Institute for Social Research at the University of Michigan. It is funded by National Science Foundation Grant No. SES 1131500.

Useful Tools

This presentation works through a number of examples of analyses that can be undertaken using the SIPP.
All are drawn from my own Stata syntax, so use them at your own risk (although I think it's all clean).
You may be a more efficient programmer than I am.
Either way, these tools may start to offer some ideas as you learn to really exploit the SIPP data.
Example: Who Are the Uninsured?

SIPP estimates of the uninsured are based on questions about insurance type, three variables in particular:

Variable    Description
ecdmth      Medicaid coverage (includes CHIP): 1 = yes, 2 = no
ecrmth      Medicare coverage: 1 = yes, 2 = no
ehimth      All other coverage: 1 = yes, 2 = no
emcocov     Type of public coverage

Who Are the Uninsured?

So, for a cross-sectional estimate, you might do something like:

    #delimit ;
    gen uninsured = 1;
    /* Thanks to imputation of public-use SIPP files, we don't have to
       worry about missing data above! Why would we otherwise? */
    replace uninsured = 0 if ecdmth == 1 | ehimth == 1 | ecrmth == 1;
    /* Might as well just keep the reporting month */
    keep if srefmon == 4;
    /* Assume we already survey set the data */
    svy: proportion uninsured;
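The slide assumes the data are already survey set. A minimal sketch of what that step might look like on a toy file is below. The design variables ghlfsam and gvarstr are the ones used later in this deck; wpfinwgt as the person-month weight is an assumption to check against the data dictionary, and all values are invented.

```stata
* Toy cross-section: 2 strata x 2 half-samples (values invented for illustration)
clear
input gvarstr ghlfsam wpfinwgt ecdmth ecrmth ehimth
1 1 1000 2 2 1
1 2 1500 2 2 2
2 1 1200 1 2 2
2 2  800 2 2 2
end

* Uninsured = no Medicaid, no Medicare, and no other coverage
gen uninsured = !(ecdmth == 1 | ecrmth == 1 | ehimth == 1)

* Assumption: wpfinwgt is the monthly person weight; confirm before use
svyset ghlfsam [pw = wpfinwgt], strata(gvarstr)

* Weighted share uninsured in this toy file is 2300/4500
svy: proportion uninsured
```

Writing the flag as a single logical expression gives the same 0/1 coding as the gen/replace pair on the slide.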
Who Are the Uninsured?

Results of the cross-sectional estimate:

Uninsured in a Given Month (2008, W1)    Estimate
All                                      16.7%
Children                                 12.7%
Young Adults                             31.3%
Working-age Adults                       19.6%
Seniors                                  1%

Strength of the SIPP: Leads and Lags

Measuring program entry/exit is a primary purpose of the SIPP.
So how do you identify someone who goes from insured to uninsured, or uninsured to insured?

    /* Let's order each respondent's data chronologically */
    sort sippid swave srefmon;
    /* Then use the person identifier and chronologically ordered SIPP
       data to generate a lag variable for a respondent's insurance
       status in the previous month. When will this be missing? */
    by sippid: gen uninsuredlag = uninsured[_n-1];
    svy: tab uninsuredlag uninsured, row col;

This will create an insurance transition matrix that looks like this:

Insurance Status       Insured month t    Uninsured month t
Insured month t-1      #, %               #, %
Uninsured month t-1    #, %               #, %
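To see when the lag is missing, here is a small sketch on an invented person-month file. The variable names follow the slides; the data are made up, and bysort with the sort keys in parentheses is just one way to combine the sort and by steps.

```stata
* Toy person-month file (values invented for illustration)
clear
input sippid swave srefmon uninsured
1 1 1 0
1 1 2 0
1 1 3 1
2 1 1 1
2 1 2 0
end

* Sort by person, then chronologically within person, and take the prior row
bysort sippid (swave srefmon): gen uninsuredlag = uninsured[_n-1]

* Each person's first observation has no previous month, so the lag is missing there
list, sepby(sippid)
count if missing(uninsuredlag)
```

The missing lags are why the transition table only covers months with an observed previous month.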
Hill & Shaefer (2011), Medical Care Research and Review 68(5), p. 528

Table 1. Changes in the Month-to-Month Stability of Children's Health Insurance by Starting Insurance Coverage: 1990-2005

                                  1990-1994      1996-1999      2001-2005        Percent Change From 1990-1994 to 2001-2005
Started with private insurance    .665 (.004)    .675 (.005)    .600 (.003)*+    -9.8%
  Lost coverage                   .038 (.001)    .036 (.001)    .053 (.001)*+    39.5%
  Changed source of coverage      .015 (.000)    .015 (.001)    .034 (.001)*+    126.7%
  No change, stably insured       .947 (.001)    .949 (.001)    .913 (.001)*+    -3.6%
Started with public insurance     .184 (.003)    .176 (.004)    .260 (.003)*+    41.3%
  Lost coverage                   .078 (.002)    .136 (.003)*   .125 (.002)*+    60.3%
  Changed source of coverage      .059 (.003)    .066 (.002)    .078 (.002)*+    32.2%
  No change, stably insured       .862 (.002)    .797 (.004)*   .797 (.003)*     -7.5%
Started uninsured                 .151 (.003)    .148 (.003)    .140 (.002)*+    -7.3%
  Gained private                  .161 (.003)    .165 (.004)    .198 (.004)*+    23%
  Gained public                   .085 (.002)    .123 (.004)*   .185 (.004)*+    117.6%
  No change, stably uninsured     .754 (.004)    .712 (.004)*   .617 (.006)*+    -18.2%

Source: Authors' calculations from a pooled sample of the Survey of Income and Program Participation.
* Difference with 1990-1994 is statistically significant at or above the .05 level.
+ Difference with 1996-1999 is statistically significant at or above the .05 level.

Excerpt from the same page:

[...] use these robust, adjusted standard errors. Furthermore, in assigning statistical significance in Table 1, we use a two-sided t test of the difference of means (in most cases proportions) for large and independent random samples, because we stack multiple panels of the SIPP.

For all observations, we compare the respondent's insurance state in time t (in the fourth reference month of a wave) to their insurance state in time t + 1 (in the first reference month of the next wave). We focus on health insurance state transitions from month to month because such an analysis faces fewer concerns about nonrandom attrition than ones based on longer periods. In alternative models, we measured transitions between t and t + 4 (two reporting months), and results were substantively similar. In our analysis of the full study period, we stack years into three periods, 1990 to 1994 (pre-SCHIP), 1996 to 1999 (SCHIP implementation), and 2001 to 2005 (post-SCHIP). 1990 to 1994 and 2001 to 2005 are particularly good for comparison: both included a mild recession and an economic recovery.

The second part of our analysis focuses on identifying child and family characteristics associated with transitions in insurance states (both losing and gaining coverage) in the most recent period, 2001 to 2005. We use demographic characteristics of families in time t and a set of dynamic characteristics (indicating changes in family characteristics) that are marked by a change between time t and time t + 1. For a change in employment, we code a change in the employment of the family head between the two interviews working (not working) full-time at time t to not working (working) [...]

Who Are the Uninsured?

How about over the course of a year, like 2009?
First load in the necessary waves, and keep 2009 observations.
Use the person identifier to track insurance status across the calendar year.
Estimates must use the calendar-year weights, so we survey set the data slightly differently below.

    keep if rhcalyr == 2009;
    sort sippid swave srefmon;
    by sippid: egen uninsuredallyear = min(uninsured);
    by sippid: egen uninsured1mnth = max(uninsured);
    /* Keep 1 observation per person, let's say for January.
       Respondents must be present in January of the year to get a
       calendar-year weight */
    keep if rhcalmn == 1;
    svyset ghlfsam [pw = lgtcy1wt], strata(gvarstr);
    svy: proportion uninsuredallyear uninsured1mnth;
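The min/max logic behind the two calendar-year flags can be checked on a toy file. This is a sketch with invented data covering three months instead of twelve; the variable names follow the slide.

```stata
* Toy person-month file for one calendar year (values invented; 3 months shown)
clear
input sippid rhcalmn uninsured
1 1 1
1 2 1
1 3 1
2 1 0
2 2 1
2 3 0
3 1 0
3 2 0
3 3 0
end

* min() is 1 only if the person is uninsured in every month
bysort sippid: egen uninsuredallyear = min(uninsured)
* max() is 1 if the person is uninsured in at least one month
bysort sippid: egen uninsured1mnth = max(uninsured)

* Keep one row per person, taken from January as on the slide
keep if rhcalmn == 1
list sippid uninsuredallyear uninsured1mnth, noobs
```

Person 1 is uninsured all year, person 2 only for part of it, and person 3 never, which is the distinction between the "All Year" and "Ever in Year?" columns.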
Who Are the Uninsured?

Uninsured in Calendar Year 2009 (Panel 2008, Wave 1)

                      All Year    Ever in Year?
All                   9.5%        26.1%
Children              4.0%        26.8%
Young Adults          18.2%       47.1%
Working-age Adults    12.5%       27.5%

More Leads and Lags

How do you identify someone who goes from working in month t-1 to unemployed in month t?

Useful Census recode for labor force status: RMESR
1. With a job entire month, worked all weeks
2. With a job entire month, absent from work without pay 1+ weeks, absence not due to layoff
3. With a job entire month, absent from work without pay 1+ weeks, absence due to layoff
4. With a job at least 1 but not all weeks, no time on layoff and no time looking for work
5. With a job at least 1 but not all weeks, remaining weeks on layoff or looking for work
6. No job all month, on layoff or looking for work all weeks
7. No job all month, at least one but not all weeks on layoff or looking for work
8. No job all month, no time on layoff and no time looking for work
Grouping the RMESR codes, as used in the code that follows:
Working: codes 1 to 3
Not Working: codes 5 to 7

Append, for example, waves 1 & 2 of the 2008 panel, and let's use reporting months. What does this do?

    keep if srefmon == 4;
    /* Order each respondent's observations chronologically */
    sort sippid swave srefmon;
    /* Construct a lag variable to see a respondent's work status in
       t-1, in this case the last reporting month */
    by sippid: gen rmesrlag = rmesr[_n-1];
    keep if tage > 22 & tage < 65 & swave == 2;
    gen worklag = 0 if rmesrlag < .;
    replace worklag = 1 if rmesrlag == 1 | rmesrlag == 2 | rmesrlag == 3;
    /* Don't have to worry about missing data in t because of
       imputation. Why do we worry about missing data in rmesrlag? */
    gen unemployed = 0;
    replace unemployed = 1 if rmesr == 5 | rmesr == 6 | rmesr == 7;
    gen worktounemp = 1 if worklag == 1 & unemployed == 1;
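A compact variant of the same transition logic, run on invented data, may help show why the missing lag matters. This is a sketch rather than the slide's exact code: it uses inrange() and inlist(), and it codes non-transitions as 0 instead of leaving them missing.

```stata
* Toy reporting-month file for waves 1 and 2 (values invented)
clear
input sippid swave srefmon rmesr
1 1 4 1
1 2 4 6
2 1 4 8
2 2 4 6
3 2 4 5
end

* Lag of labor force status from the previous reporting month
bysort sippid (swave srefmon): gen rmesrlag = rmesr[_n-1]
keep if swave == 2

* inrange() returns 0 for a missing argument, so guard it to keep missing lags missing
gen worklag = inrange(rmesrlag, 1, 3) if !missing(rmesrlag)
gen unemployed = inlist(rmesr, 5, 6, 7)

* 1 = worked last wave and unemployed now; missing when there is no wave 1 record
gen worktounemp = worklag == 1 & unemployed == 1 if !missing(worklag)
list, noobs
```

Person 3 has no wave 1 record, so the lag and the transition flag stay missing instead of being counted as "not working".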
Program Participation: Single Mothers Entering Unemployment (Shaefer & Wu, 2009)

[Figure from page 221, running head "Single Mothers and UI"]
Fig. 1. Program participation of single mothers entering into unemployment. UI = unemployment insurance; Cash Welfare = Aid to Families with Dependent Children and Temporary Assistance for Needy Families; FSP = Food Stamp Program (now Supplemental Nutrition Assistance Program); Any = percentage of respondents who reported participation in any of the three programs. Source: Authors' analyses using a pooled sample of the 1990-2004 panels from the Survey of Income and Program Participation.

Excerpt from the same page:

[...] $29.50 more than single, childless women in the prereform period to $93.95 less in the postreform period. While these point estimates are not significant, the difference between them is significant at the p < .10 level.

Changing Participation in Other Public Programs over Time

Two other forms of support that low-educated, single mothers may access when entering a spell of unemployment are cash welfare and the FSP. Cash welfare was the primary target of the 1996 welfare reform, so the major changes leading to declines in cash assistance (AFDC and TANF) caseloads affected low-educated, single mothers particularly. With a caseload of 25.7 million individuals in 2005, the FSP served more individuals than TANF (4.5 million), UI (7.9 million), and even the EITC (22.8 million; Scholz, Moffitt, and Cowan 2009).

Figure 1 plots the annual rates at which these programs' benefits are received by low-educated, single mothers entering a spell of unemployment. Between 1990 and 1993, the rates at which single mothers entering a spell of unemployment received UI benefits were nearly identical to those at which they received AFDC benefits. Rates of AFDC [...]

Income Comes in All Shapes and Sizes

Lots of different income variables: remember that the SIPP asks lots of detailed questions about income sources.
thtotinc/tftotinc/tstotinc/tptotinc: Census aggregates all income sources up into a total income measure for the unit of analysis.
thearn/tfearn/tsearn/tpearn: Reaggregated total earned income for the unit of analysis.
Other types of income measures: property, other, public benefits, retirement distributions.
Information on Jobs

The SIPP core collects data on up to two jobs per adult respondent, per wave.
If a person has two jobs, you can use the start and end dates for both jobs to see the extent to which they were worked concurrently.
The job variables offer extensive information on each job: typical weekly work hours, employer characteristics, union representation, salary or hourly pay, and detailed industry and occupation codes.
Information on employment separations is job-specific.

Egen Can Be Your Best Friend

Let's create a variable with the highest education level in a household in a given month:

    bysort ssuid shhadid swave srefmon: egen hhed = max(eeducate);

Or an individual's highest educational attainment during the panel:

    bysort sippid: egen personed = max(eeducate);

Other similar variables can be created for:
Ever in poverty, average income during panel
Ever uninsured, ever unemployed, ever a part-time worker
Presence of a worker in a household
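As one illustration of the "other similar variables" idea, the sketch below builds an ever-in-poverty flag and a panel-average income with egen. The data are invented, and inpov is a hypothetical 0/1 monthly poverty indicator that you would have to construct yourself.

```stata
* Toy person-month file (values invented; inpov is a hypothetical 0/1 flag)
clear
input sippid thtotinc inpov
1 1000 1
1 3000 0
2 2500 0
2 3500 0
end

* Ever in poverty during the panel: max of a 0/1 flag within person
bysort sippid: egen everpov = max(inpov)

* Average monthly income during the panel
bysort sippid: egen avginc = mean(thtotinc)

list, sepby(sippid)
```

The same max-of-a-flag pattern carries over to ever uninsured, ever unemployed, or presence of a worker in a household, with the by-group changed to match the unit.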
Generating a Poverty Rate

Let's say you want to calculate a poverty rate. The Census Bureau has got your back!

2001/2004 panels: Census gives a monthly poverty threshold for household/family units.
Take total household income / poverty threshold:

    gen inctoneeds = (thtotinc/rhpov)*100

This generates an income-to-needs ratio where 100 means the household is at the poverty line.
Negative values for thtotinc generally reflect high incomes; I don't include them as low income.

1996 (& prior) panels: the poverty threshold is annualized, so convert it to a monthly value first:

    replace rhpov = rhpov/12 if spanel == 1996

Important Notes

eorigin: Condensed in the 2004 panel to 1 = Hispanic and 2 = not Hispanic.
For the 2001 and earlier panels, it is far more detailed: Hispanic origins would be codes 20 to 28.
The 1996 & 2001 panels included detailed MSA codes.
The 2004 & 2008 panels only include metro status: metro == 1, not metro == 2.
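Turning the income-to-needs ratio into a poverty flag might look like the sketch below. The data are invented, the flag name poor is hypothetical, and treating negative incomes as not poor follows the note on the slide.

```stata
* Toy household-month file (values invented)
clear
input thtotinc rhpov
 900 1800
1800 1800
3600 1800
-500 1800
end

gen inctoneeds = (thtotinc/rhpov)*100

* Poor = below 100 percent of the threshold; negative incomes are not counted as poor
gen poor = inctoneeds < 100 & thtotinc >= 0 if !missing(inctoneeds)

list, noobs
```

With real SIPP data, a weighted rate would come from something like svy: mean poor after survey setting the data.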
Sometimes Things Are More Complicated than They Seem

Let's say you want to identify unmarried working-age mothers.
It's easy if you just want the family/sub-family heads. Identify the family reference person:

    keep if rfoklt18 > 0 & rfoklt18 < .

But rfoklt18 doesn't work if the mother isn't a reference person!

Identifying all single mothers: Part 1

Load in your wave file.

    keep if tage < 18 & srefmon == 4;
    /* This is the mom identifier; it's in the kid's record and points
       to the mother */
    drop if epnmom == 9999;
    gen kid = 1;
    /* Now we count up the number of kids who point to a given mom */
    bysort spanel ssuid epnmom: egen numkids = count(kid);
    keep ssuid epnmom numkids;
    /* The mom number is in a different form from epppnum, so convert */
    gen zero = 0;
    egen epppnum = concat(zero epnmom);
    drop epnmom;
    keep ssuid epppnum numkids;
    sort ssuid epppnum;
    duplicates drop;
    save mom.dta, replace;
    clear;
Identifying all single mothers: Part 2

Now reload your original wave file with all observations.

    keep if srefmon == 4;
    sort ssuid epppnum;
    merge 1:1 ssuid epppnum using mom.dta;
    /* If a woman didn't merge in from the working dataset, it's because
       she doesn't have kids who are pointing to her, so you can recode
       a missing value as 0 */
    replace numkids = 0 if numkids == . & esex == 2;
    gen singlemom = 0;
    replace singlemom = 1 if numkids > 0 & ems >= 3 & ems <= 6 & esex == 2;
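Parts 1 and 2 can be tried end to end on a toy household file. This is a sketch under stated assumptions: all values are invented, a tempfile stands in for mom.dta, and the extra !missing(numkids) guard is an addition because Stata treats missing as larger than any number.

```stata
* Part 1 on toy data: one row per child, epnmom points to the mother (values invented)
clear
input str4 ssuid epnmom
"A" 101
"A" 101
"B" 102
end
gen kid = 1
bysort ssuid epnmom: egen numkids = count(kid)
* Build the string person number the way the slide does: "0" followed by epnmom
gen zero = 0
egen epppnum = concat(zero epnmom)
keep ssuid epppnum numkids
duplicates drop
tempfile mom
save `mom'

* Part 2 on toy data: one row per person in the reporting month
clear
input str4 ssuid str4 epppnum esex ems
"A" "0101" 2 6
"A" "0102" 1 1
"B" "0102" 2 1
"C" "0101" 2 5
end
merge 1:1 ssuid epppnum using `mom'
replace numkids = 0 if missing(numkids) & esex == 2

* Unmarried (ems 3 to 6) women with at least one child pointing to them
gen singlemom = numkids > 0 & !missing(numkids) & inrange(ems, 3, 6) & esex == 2
list, noobs
```

Only the never-married mother in household A is flagged; the married mother in B and the childless separated woman in C are not.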