The Importance of College Choice: A Study of Community College Transfer Students in Virginia. Erin Dunlop. March 16, 2011.


Abstract

The transition from a community college to a four-year institution is an important step in a community college student's educational pipeline. This paper measures the effect of community college students' four-year college choices on their baccalaureate degree attainment and credits earned in college. Understanding the impact of college quality on student outcomes has broader import as well, and in this study I am able to make use of a unique data set covering all community college students in the state of Virginia. A clear difficulty in estimating the causal impact of college quality is that students are not randomly assigned to colleges. This paper uses an instrumental variables strategy that exploits variation in the quality of a student's local higher education market to address the endogeneity of college choice. Because community college students are likely to attend colleges close to their homes, the quality of the closest four-year college can serve as an instrument for actual four-year college quality. The results show that college quality has a significant effect on degree attainment for community college students; increasing any measure of quality by one standard deviation increases the likelihood of graduation by 7-13 percentage points.

PhD Candidate, University of Virginia, contact: erd2r@virginia.edu. I would like to thank Sarah Turner, Leora Friedberg, Steve Stern, and John Pepper, as well as the other faculty and graduate students at the University of Virginia for their invaluable advice. The research reported here was partially supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305B to the University of Virginia. The opinions expressed are those of the author and do not represent views of the U.S. Department of Education.

1 Introduction

The doubling of the college wage premium since the 1980s has led to a large increase in the proportion of high school graduates who enter college (Goldin and Katz, 2007).[1] But despite the large increase in college entrants, baccalaureate degree attainment in the United States has risen by only one percentage point over the past three decades: 23% of high school graduates completed a bachelor's degree in 1970, and 24% did in [...] (Turner 2004). Taken together, increasing enrollment and stagnant degree attainment indicate a decline in the college completion rate over this period. In turn, researchers and policy makers have increasingly focused on degree attainment, as opposed to college attendance, as an important margin in education policy (Hoxby 2004). As has been widely recognized, low rates of collegiate attainment have long-term implications for employment outcomes and economic growth.

In 2009, President Obama addressed the need to improve stagnating college degree attainment in the United States by creating the American Graduation Initiative. This initiative emphasized the critical role of community colleges in increasing degree attainment in the United States. Numerous foundations and researchers working to increase national degree attainment, such as the Bill and Melinda Gates Foundation, also cite community colleges as an important pathway to increasing degree attainment in the United States.

Two reasons motivate the policy concern over the collegiate attainment of students at community colleges. First, community college students are a substantial portion of the higher education market in the United States. A third of the 18.7 million students who attended college in the U.S. last year attended community college.[2] Second, community college students who transfer to four-year colleges (a small minority in themselves) are 40% less likely to earn a bachelor's degree than students who begin at a four-year school.[3]

[1] Between 1970 and 1999, the high school graduate college enrollment rate rose from 51% to 67% (Turner 2004).
[2] U.S. Census Bureau, Statistical Abstract of the United States: 2010, Table [...].
[3] National Center for Education Statistics report, Student Effort and Educational Progress, Table A-21-2 and Table [...].

The transition from community colleges to four-year institutions is a primary factor in baccalaureate degree attainment for community college students. However, there is little research evidence on how community college students choose among four-year institutions when they transfer, even though there is a substantial body of evidence on both whether community college students transfer (Bailey and Weininger 2002, Dougherty and Kienzl 2006, Doyle 2010) and how first-time students make choices between colleges (Bowen, Chingos, and McPherson 2009, Avery and Hoxby 2003, Long 2004). While available evidence shows a clear relationship between initial college choice and bachelor's degree completion (Light and Strayer 2000), little evidence indicates how the transfer institutions chosen by community college students affect degree completion.

This paper examines community college students' four-year college choices and the impact of these choices on collegiate outcomes. Because community college students are likely to attend colleges close to their homes, their range of four-year college options varies substantially across local markets. Some students will have the choice of a flagship university, a comprehensive public college, and a number of private institutions, while other students will have more limited collegiate options. Using this variation across local markets, I investigate how the four-year college that a community college student transfers into affects baccalaureate degree attainment.[4] To this end, I examine which four-year college characteristics maximize the probability that community college transfer students earn a baccalaureate degree. A clear challenge in investigating the impact of college quality on student outcomes, such as degree attainment, is that college choice is endogenous.
To address this issue, I take advantage of the local variation in community college students' higher education markets and instrument for college quality with the quality of the closest four-year college. This is a strong instrument for actual college quality because community college students, unlike traditional four-year students, commonly transfer to the four-year college closest to their home. In fact, in Virginia, nearly half of community college transfer students attend a four-year college that is less than 20 miles from their home. As a robustness check, I also test the validity of four additional instruments: average and maximum college quality within 30 and 45 miles of the student's home. There are several possible threats to the validity of these instruments, such as students' residential decisions, or selection into attending a community college, being related to local four-year college quality. Both of these concerns, and others, are explored in more detail in the Estimation section.

This analysis uses data on the universe of all community college students in the state of Virginia from [...]. This comprehensive administrative data set, collected annually by the State Council for Higher Education in Virginia, has not been used extensively in academic research and is an untapped resource for examining the transition from community college to four-year schools. Additionally, I merge in data from other sources, including county-level characteristics from the 2000 Census and college characteristics from the National Center for Education Statistics.

I find that, regardless of the measure of quality or the instrument used, four-year college quality has a large and statistically significant effect on the probability that community college transfer students earn a baccalaureate degree. If a community college student transfers to a four-year college that is one standard deviation higher in any measure of college quality, holding the other measures of quality constant, the student's probability of graduating increases by 7-13 percentage points. Additionally, transferring to a college that is one standard deviation higher in any measure of expenditures per student increases the number of four-year credits earned by 7-10 credits.

[4] Using local variation in students' higher education markets is a common estimation strategy in the returns-to-college literature. This source of identification has been used by Card (1995), Kane and Rouse (1995), Rouse (1995), and Currie and Moretti (2003), to name a few.
The outline of the rest of the paper is as follows. Section 2 reviews the current literature and highlights this paper's contribution. The estimation strategy is outlined in Section 3, and the data are described in Section 4. Section 5 reports all of the results and the paper concludes in Section 6.

2 Previous Literature

2.1 Research Context

This study contributes to research in higher education on community college transfers, four-year college choice, and the returns to college quality, and it makes connections across all three areas. Economists and other higher education researchers have actively examined the determinants of whether community college students transfer. This literature identifies the student characteristics that affect the likelihood of transfer (Bailey and Weininger 2002, Dougherty and Kienzl 2006, Doyle 2010, Dowd and Melguizo 2008), the factors that affect community college students' access to four-year institutions (Cheslock 2005, Dowd, Cheslock, and Melguizo 2008), and whether community college entry deters eventual bachelor's degree attainment (Rouse 1995, Leigh and Gill 2003, 2004, Alfonso 2006, Reynolds 2006). Little of the community college literature examines the effects of four-year college choice. Hilmer (2000) measures the returns to four-year college quality for community college transfer students, but he does not address the endogeneity of college choice. It is reasonable to assume that student ability and college quality are positively correlated, so it is difficult to assess with an OLS regression which factor drives positive student outcomes. Estimating the effect of college choice on community college transfer students, while controlling for the endogeneity of school choice, is my contribution to this literature.

The second literature related to this work examines the matching of students to colleges more generally, detailing how students choose which colleges to apply to and attend. The bulk of this recent work evaluates whether low-income or minority students are sometimes under-matched in their college choice, and researchers have found that to be the case (Dillon and Smith 2009, Bowen, Chingos, and McPherson 2009, Pallais and Turner 2006).
Other researchers have investigated how a specific factor affects college choice decisions. Avery and Hoxby (2003) investigate whether students respond rationally to their menu of financial aid options and find that they generally do, although there are some exceptions. Long (2004) investigates how students have changed their college decisions over time and finds that tuition price is becoming less important while quality is increasingly important. Howell (2004) investigates how affirmative action and race-sensitive admissions policies affect student enrollment. The majority of this literature uses conditional logit or probit models to estimate the effect of student characteristics on college choices. While this area of research is large, it has focused exclusively on how high school students choose four-year colleges. It is reasonable to assume that community college transfer students use different criteria when choosing four-year schools. I add to this literature by exploring how community college transfer students make their four-year college decisions.

Lastly, a large area of work analyzes the returns to college quality. These papers differ both in the methods used to address the endogeneity of college choice (which are summarized in more detail in the next section) and in the student outcome examined. The outcomes commonly explored in this research are post-secondary earnings and wages (Dale and Krueger 2002, Long 2008, Black and Smith 2004 and 2006), graduation rates (Light and Strayer 2000, Long 2008), and graduate school attendance (Brewer et al. 1999). This literature has generally found positive effects of college quality on student outcomes, although not all work has found significant results on earnings (Dale and Krueger 2002). While this research uses multiple estimation methods and examines a variety of student outcomes, it concentrates exclusively on traditional college students, who attend four-year colleges directly after high school. I extend the college quality literature by measuring the returns to college quality for a different segment of the higher education market, community college students.
2.2 Identification Context

To measure the causal impact of college quality on degree attainment, I instrument for actual college quality with the quality of the closest four-year college. Other studies deal with the non-random matching of students to colleges in a multitude of ways, including regression discontinuity (Hoekstra 2009), propensity score matching (Black and Smith 2004), matching students based on school admittance and rejection decisions (Dale and Krueger 2002), modeling the selection process (Brewer, Eide, and Ehrenberg 1999, Light and Strayer 2000), and full structural modeling (Howell 2004).

Long (2008) estimates the returns to college quality and compares some of the previous methods (Dale and Krueger, Black and Smith) and his own instrumental variable to OLS results. Long instruments for college quality with the average quality of schools within a 175-mile radius of the student's high school. He finds that his instrument is a highly significant predictor of actual college quality. When comparing the OLS results, which suffer from selection bias, to the results using the three different estimation strategies that correct for selection, Long concludes that the selection bias in the OLS estimates is minimal.

Several papers use distance to the nearest college as an instrument for whether a student attends college, and they have all found that distance is a good predictor of college attendance (Card 1995, Kane and Rouse 1995, Rouse 1995, Currie and Moretti 2003). Card (1995) and Currie and Moretti (2003) find that distance is the strongest predictor for students whose expected years of education are lowest. This finding supports my assertion that distance is a strong predictor of school choice for community college transfer students.

Overall, there is little work at the intersection of research on the outcomes of community college students, the returns to college quality, and four-year college choice. I fill this gap by analyzing the matching of community college students to four-year schools and its effect on attainment. Based on the strong support in this literature, I use an instrumental variables strategy, focused on variation in the quality of a student's nearest college options, to address the endogeneity of college choice.

3 Estimation

3.1 Motivation

In this paper, I estimate the causal relationship between college quality and degree attainment. College quality could affect degree attainment through two pathways: college resources and peer effects. College quality measures like the student/faculty ratio and expenditures per student capture the amount of resources available to the student. Smaller class sizes and more attention in class help students succeed academically. Advanced academic facilities and, perhaps, good social opportunities (good food in dining halls, athletics) augment a student's overall college experience. College quality measures like the acceptance rate, the freshman retention rate, and the average SAT of admitted students capture the quality of peers at a given school. Educational peer effects have been well documented, and it has been shown that attending classes with high-ability peers has a positive effect on student outcomes.[5] Also, students may be more likely to persist from year to year if their friends are also continuing their education.

To quantify the effect of college quality on students' graduation probabilities, I would like to estimate the following relationship:

$$G_i = X_i \beta_1 + Q_i \beta_2 + \varepsilon_i$$

where $G_i$ is an indicator that equals 1 if student $i$ graduates from a four-year college, $X_i$ are student characteristics, and $Q_i$ is the quality of the college that student $i$ attends. A clear difficulty in estimating this model is that college quality is endogenous; students choose where to apply and attend, and colleges choose whether to admit students. Since I worry that there are unobserved factors that are correlated with both college quality and whether a student graduates, I use a two-stage least squares (2SLS) model and instrument for college quality. The instrument I use is the quality of the closest four-year college.[6]

The theoretical motivation for this instrument is as follows. Students are randomly located across the country, conditional on observables, as an exogenous function of parental location decisions. Students who decide to attend community colleges and, subsequently, the subset of those students who decide to transfer to four-year colleges make these decisions in the context of their local or regional market options. Unlike traditional students, who often consider large geographic markets when choosing colleges, community college students often transfer to four-year colleges close to their homes. Assuming that most community college students attend local four-year colleges, students across the country are essentially facing different regional markets for higher education, leading to considerable variation in both the number and quality of their four-year college options. After controlling for student and county-level observables, this local variation in four-year college options is plausibly exogenous to the students' collegiate outcomes.

For example, in the state of Virginia, Montgomery County and Wise County, both in Southwest Virginia, have similar demographic statistics, including average income per capita, percent of the population below the poverty line, and racial diversity. Presumably, community college students in these two counties are similar. However, Montgomery County has several large public colleges nearby, including Virginia Tech, while Wise County has only one remotely close college, and it is not nearly as high in quality as Virginia Tech. In this example, otherwise similar transfer students receive a different treatment (college quality) due to plausibly exogenous differences in their hometowns.

[5] For a summary of peer effects in higher education, see Winston and Zimmerman (2004).
A possible threat to the validity of this instrument would arise if residential choice were endogenously related to college quality; that is, if some parents, perhaps those most concerned with their children's education, choose to live near high quality four-year colleges. There are several reasons why I find this concern minimal. First, proximity to a four-year college does not increase a student's chance of being admitted, so living near a high quality college does not increase the probability that a student will be able to attend that college, conditional on the student applying. It is plausible to assume that parents' residence decisions are related to their own employment and not to local college quality, given that the educational advantage of living near a high quality college seems minimal.

Second, the measure of a student's home address that I use is not the student's current mailing address; it is the home address used to determine whether the student should receive in-state tuition.[7] If a family, or an individual student, moves close to a high quality four-year college right before beginning school, my address variable does not capture this change. Section [...] of the Code of Virginia states that for a student to receive in-state status, he must have a domicile that was his continuous residence for at least one year prior to the date of alleged entitlement and that residence primarily for educational purposes shall not confer domiciliary status. For the addresses in my data, students must have shown that they did not move to the address just to attend college, and they must have lived there for at least one year before beginning college.

Third, to control for student differences across counties of varying local four-year college quality, all regressions control for a host of county-level characteristics. It is true that there could still be differences in county-level unobservables. To make the argument that this is not the case, all of the regressions are also estimated without county-level characteristics. Controlling for county-level characteristics has no effect on the results of any regression estimated in this paper. The fact that county-level observables do not mitigate any of the college quality effects is evidence that county-level unobservables also do not affect the relationship between college quality and student outcomes.

[6] The smallest geographic identifier I have in my data is county, so I measure the closest four-year college from the center of the student's home county.
[7] My actual estimation uses information on a student's home county, not actual address, but the county is determined from the home address as described here.

3.2 Equations

The first stage of the two-stage least squares estimation is as follows:

$$Q_i = X_i \alpha_1 + C_i \alpha_2 + \mu_i$$

where $Q_i$ is the quality of the college student $i$ attends, $X_i$ are student characteristics, and $C_i$ is the quality of the closest four-year college to student $i$'s home. The second stage is as follows:

$$G_i = X_i \beta_1 + \hat{Q}_i \beta_2 + \varepsilon_i$$

where $\hat{Q}_i$ are the fitted values of college quality from the first stage. Since $G_i$ is a binary variable, estimating a logit or probit model would be preferred, but because an endogenous regressor is being instrumented, a linear probability model is used instead (Hausman 1975).

The student covariates in $X_i$ are: gender, race, region of Virginia, age dummies, time fixed effects, community college GPA, family income, whether the student submitted a FAFSA form, Pell grants received, and what degree, if any, the student earned in community college.[8] Also included in $X_i$ are county-level demographics from the student's home county. The instrument, quality of the closest college, is plausibly exogenous, conditional on observed differences between counties. The county demographics I control for are: percent Black, percent Hispanic, percent who speak another language at home, percent high school graduates, percent college graduates, median home value, median family income, percent below the poverty line, and people per square mile.

In this analysis, the coefficient of interest, $\beta_2$, is the effect of college quality on graduation rates. Following Black and Smith (2006), I use several proxies for college quality because each measure probably contains significant measurement error. The measures of college quality that I use are: student/faculty ratio, percent admitted, freshman retention rate, average SAT, and graduation rate. Additionally, I have six measures of spending per student that also capture college quality: expenditures per FTE, expenditures per student, instructional expenditures per FTE, instructional expenditures per student, average subsidy per FTE, and average subsidy per student.[9] Finally, and also following Black and Smith, I use factor analysis to combine several measures into a single factor. I run factor analysis on all six expenditure variables and find one significant factor. Next, I run factor analysis on all of the other college quality measures and find one factor. Finally, I run factor analysis on all of the expenditure variables plus the college quality variables. Descriptive statistics on the 14 college quality measures used in this analysis are included in Table [...].

[8] Ideally I would control for a better measure of ability, like SAT score, but unfortunately, community college students are not required to take the SAT, so this information is not in my data.
[9] FTE stands for full-time equivalent students. Average subsidy is the total amount of education and related expenses that the institution covers (from either endowment or state funding). It is the difference between education and related expenses and net tuition revenue.

3.3 Closest College as an Instrument

For closest college quality to be a valid instrument for actual college quality, it must be highly correlated with actual college quality, and it must have no effect on students' graduation probabilities except through its effect on college quality.[10] Closest college quality meets both of these criteria.

While high school students choose colleges for a variety of reasons, community college transfers often choose the four-year school closest to their home. By choosing to start at a community college, these students have demonstrated a desire to stay close to home. This choice to attend a local school may be due to a desire to live at home or to a family obligation. Community college students often face additional costs associated with attending a four-year college that is far from their home. These costs could be pecuniary (such as increased transportation or housing costs) or non-pecuniary (such as the inability to help out at home). As such, closest college quality is highly correlated with actual college quality. Table 2 shows the number of students who transfer from the largest community colleges to the largest four-year schools in Virginia. The cells on the diagonal show the students who attend community college and four-year college in the same town. For most of the community colleges listed, the majority of transferring students attend a four-year college in the same town.

It is reasonable to assume that, conditional on student and county-level observables, the quality of the closest four-year college does not have a direct causal effect on a student's probability of graduating. Historical happenstance has placed colleges throughout the country, and these colleges have increased and decreased in quality over time. Below is a map of Virginia with each county shaded by percent poverty. Every public and private four-year college is shown on the map. There are two facts to note about the placement of colleges in Virginia. First, both the public and private schools are located throughout the state. Second, the best public schools are located in a variety of areas: Virginia Tech is in a low-income area, the University of Virginia is in a moderate-income area, and William and Mary is in a high-income area. The fact that colleges are located randomly with respect to county observables should make it easier to believe that they are also located randomly with respect to unobservable county characteristics.

[10] See Econometric Analysis of Cross Section and Panel Data by Wooldridge, pg. [...].
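The construction of the instrument can be illustrated with a short sketch. It is not taken from the paper: the record layout (`lat`, `lon`, `quality`), the function names, and the use of great-circle distance are all assumptions made for illustration. It measures distance from a county center to each college, takes the quality of the closest college, and also computes the average and maximum quality within 30 and 45 miles (the robustness instruments mentioned in the Introduction), falling back to the closest college when no college lies within the radius.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def build_instruments(county_center, colleges, radii=(30, 45)):
    """Return closest-college quality plus mean and max quality within each radius.

    county_center: (lat, lon) of the student's home county center.
    colleges: list of dicts with keys 'lat', 'lon', 'quality' (hypothetical schema).
    """
    lat0, lon0 = county_center
    # Pair each college's distance from the county center with its quality.
    dist_quality = [
        (haversine_miles(lat0, lon0, c["lat"], c["lon"]), c["quality"])
        for c in colleges
    ]
    closest_quality = min(dist_quality, key=lambda pair: pair[0])[1]
    instruments = {"closest": closest_quality}
    for r in radii:
        nearby = [q for d, q in dist_quality if d <= r]
        if nearby:
            instruments[f"avg_{r}"] = sum(nearby) / len(nearby)
            instruments[f"max_{r}"] = max(nearby)
        else:
            # No college within the radius: fall back to the closest college.
            instruments[f"avg_{r}"] = closest_quality
            instruments[f"max_{r}"] = closest_quality
    return instruments
```

Any single quality measure (for example, average SAT or expenditures per student) could be supplied as `quality`; the sketch would then be run once per measure.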

Community college students often attend the four-year college closest to their homes, and it is plausible to assume that closest college quality does not directly affect students' graduation probabilities. Thus the quality of the closest college is a good instrument for the quality of the college attended, and this instrument addresses the endogeneity of school choice.

3.4 Possible Shortcomings of the Instrument

A possible problem in using closest college quality as an instrument for actual college quality would occur if selection into attending a community college is not exogenous to four-year college locations. For example, it could be that students near high quality colleges are more likely to transfer to a four-year school, because the returns from attending a four-year college are larger. In that case, the students who transfer from these areas would have lower average ability, which would bias my estimated returns to quality downward. To investigate this possibility, I calculated the average community college GPA of students who transfer from each county. There is very little variation in this statistic. The average is 2.8, the standard deviation is 0.14, the minimum is 2.4, and the maximum is 3.1. I also regressed whether a student transfers to a four-year college on the standard student controls, as well as an indicator for whether the student's closest college is one of the three most selective public schools (University of Virginia, William and Mary, and Virginia Tech). Living near a highly selective school did have a significant effect on a student's transfer probability, but the effect was small in magnitude, only 2%. Overall, living near a competitive four-year college is only slightly related to a student's transfer probability, so the downward bias in the estimates of the returns to college quality is small.

A second possible shortcoming of using closest college quality as an instrument is that it may not be an effective instrument for low-ability students. Four-year colleges in Virginia do not have open admissions policies for community college students, so closest college quality may be uncorrelated with actual college quality if students are not admitted at their closest college. In practice, this is not a major concern. Only 6% of the sample are closest to University of Virginia, William and Mary, or Virginia Tech, which are the most difficult colleges to transfer into. The majority of the sample transfers to George Mason University, Old Dominion University, Virginia Commonwealth University, and Radford University, all of which are less selective universities.
When the first stage is estimated separately for students who have high and low community college GPAs, closest college quality is a significant predictor for both groups, and the magnitudes of the coefficients are similar.[11]

[11] When the sample is only students whose community college GPA is > 2.75, the first stage instrument coefficient is 14.3***. When the sample is students whose community college GPA is < 2.75, the first stage instrument coefficient is 9.0***. When the sample is students whose community college GPA is < 2.0, the first stage instrument coefficient is 10.3***. When the sample is students whose community college GPA is < 1.5, the first stage instrument coefficient is 10.4***.
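The two-stage procedure of Section 3.2 and the subgroup first-stage check described above can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the paper's estimation code: the data-generating process, every coefficient in `simulate`, and the function names are invented, and the only control is a standardized GPA. The sketch shows the mechanics (regress quality on the controls and the instrument, then regress graduation on the controls and fitted quality) and why OLS tends to overstate the effect when unobserved ability raises both college quality and graduation. Standard errors from a manually run second stage are not valid and are therefore not computed here.

```python
import numpy as np


def ols(y, X):
    """Return least-squares coefficients b for the model y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(G, Q, C, X):
    """Manual 2SLS with one endogenous regressor Q and one instrument C.

    X holds the exogenous controls and must include a constant column.
    Returns (alpha, beta); the last entry of each is the coefficient on
    the instrument (first stage) and on fitted quality (second stage).
    """
    Z = np.column_stack([X, C])                  # first-stage regressors
    alpha = ols(Q, Z)                            # Q_i = X_i a1 + C_i a2 + mu_i
    Q_hat = Z @ alpha                            # fitted college quality
    beta = ols(G, np.column_stack([X, Q_hat]))   # linear probability model
    return alpha, beta


def simulate(n=50_000, seed=0):
    """Synthetic data (all numbers invented): true effect of Q on G is 0.05."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)   # unobserved by the econometrician
    gpa = rng.normal(size=n)       # observed control (standardized)
    C = rng.normal(size=n)         # closest-college quality (instrument)
    Q = 0.6 * C + 0.5 * ability + 0.2 * gpa + rng.normal(size=n)
    p = np.clip(0.5 + 0.05 * Q + 0.08 * ability + 0.02 * gpa, 0.0, 1.0)
    G = (rng.uniform(size=n) < p).astype(float)  # graduation indicator
    X = np.column_stack([np.ones(n), gpa])
    return G, Q, C, X


if __name__ == "__main__":
    G, Q, C, X = simulate()
    alpha, beta = two_stage_least_squares(G, Q, C, X)
    naive = ols(G, np.column_stack([X, Q]))
    print("first-stage coefficient on C:", round(alpha[-1], 3))
    print("2SLS effect of quality:", round(beta[-1], 3))
    print("OLS effect of quality:", round(naive[-1], 3))
    # First stage re-estimated separately for high- and low-GPA students.
    for label, mask in (("high GPA", X[:, 1] > 0), ("low GPA", X[:, 1] <= 0)):
        a = ols(Q[mask], np.column_stack([X[mask], C[mask]]))
        print(label, "first-stage coefficient:", round(a[-1], 3))
```

In applied work a dedicated IV routine would normally replace the manual second stage so that the reported standard errors account for the first-stage estimation.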

16 3.5 Extensions As an extension to this model, there are several additional instruments that are also strong predictors of college quality. In addition to using closest college quality as an instrument, all of the analyses above are re-estimated using four additional instruments: average college quality within 30 and 45 miles and max college quality within 30 and 45 miles. If a student does not have a college within 30 or 45 miles, this instrument just takes the value of the closest four-year college quality. Estimating the model with dierent instruments serves as a robustness check and testing which instrument is the best predictor is informative as well. For instance, if closest college quality is a better predictor of actually quality than max quality in 30 miles, that is illustrative of how community college transfer students choose schools. 4 Data 4.1 Student Characteristics I use student-level data collected by the State Council for Higher Education in Virginia (SCHEV). Since the early 1980's, SCHEV has collected a census of every college student in the state of Virginia. Students are included in the data-set for as long as they are enrolled in any public or private, two- or four-year, college in Virginia. SCHEV collects data on student nancial aid, family nances, demographics, and college transcripts. The SCHEV data set I use is the population of all community college students in Virginia who began community college between 1994 and The data consist of about 430,500 unique student observations over the nine cohorts of data. Since this paper examines Virginia community college students and their transition to four-year colleges, I dropped all out-ofstate students. 12 I also focus exclusively on traditional community college students. There are ve types 12 This eliminated 9% of the sample. 16

of students in my data that I determined are not traditional students, so I eliminate them from my analysis. First, I eliminate students who only take community college classes over the summer. These are most likely four-year college students who take community college classes during the summer to fulfill prerequisite requirements. The second group of students I eliminate are under 18 years old in December of their first year in community college. These are most likely high school students who are taking a class at a community college because their school does not offer enough AP classes. The third group of students I eliminate never take a community college class for credit (this eliminates about 1% of the sample). Some students in my sample only audit community college classes or take them pass/fail. The fourth group I eliminate are students who begin community college when they are over 35 years old (this eliminates about 7% of the sample). Adults who attend college later in life are probably very different from traditional students who start higher education within a few years of graduating high school. Finally, I eliminate all students who do not take at least 12 credits in community college (this eliminates about 10% of the sample). Students who spend less than one whole semester in community college can hardly be classified as typical community college students.

Additionally, I do not use my final cohort of data, students who began community college in the school year. I do not want to count students as not earning a degree just because they took a long time to graduate. Ending my sample with the school year allows all students in my sample eight years to earn a baccalaureate degree. For students in earlier waves of data, I also truncate their degree attainment after eight years to be consistent. After eliminating my final cohort of data, as well as out-of-state and non-traditional students, my sample size is roughly 205,000 students.

The majority of my estimation focuses on students who transfer to a four-year school. I define a student as transferring if the student takes at least one class at a four-year college. Because classes at four-year colleges are much more expensive than classes at two-year colleges, it is reasonable to assume that the primary reason a community college student takes a class at a four-year college is that he or she hopes to eventually earn a bachelor's degree. In this analysis, I measure the effect that four-year colleges have on all community college students who want to earn bachelor's degrees. Therefore, my definition of transferring includes all community college students who ever take a four-year college class. By this definition, about a sixth of my sample transfers, so my sample of transfer students consists of 34,103 students.

Table 3 displays the descriptive statistics of my sample of community college transfer students. The sample is about 15 percent Black, 8 percent Asian, and 4 percent Hispanic. 46 percent of the sample is years old and 33 percent of the sample is years old. The family income measure is family income as reported on the student's FAFSA form (the Free Application for Federal Student Aid is the form families must submit to apply for federal grants or loans to help pay for college). Unfortunately, the SCHEV data only have family finance information for students who submitted a FAFSA form, and only 70 percent of the community college transfer students in the data did so. High-income families generally do not submit a FAFSA because they have a very small chance of being awarded financial aid, so family income information is missing non-randomly in the data. To address this issue, I code all students with missing income information to have an income equal to zero and I include an indicator variable for whether a student submitted a FAFSA form. The average family income of students who submitted a FAFSA is $40,690. About 40 percent of the sample earned an associate's degree before transferring. On average, the students transfer after taking 8 credits of remedial classes and 65 traditional community college credits. This average is higher than I expected: 60 credits is the number needed to earn an associate's degree, and many students transfer before earning their degree.
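The sample restrictions, the transfer definition, and the missing-income coding described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: every field name below is hypothetical (the SCHEV file layout is not described in the text), the toy records are made up, and the final-cohort restriction is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Student:
    # Every field name is hypothetical; the actual SCHEV variables are not given.
    in_state: bool
    summer_only: bool                # only ever enrolled in summer terms
    age_dec_first_year: int          # age in December of the first community college year
    age_at_start: int                # age when community college enrollment began
    for_credit_classes: int          # classes taken for credit (not audited or pass/fail)
    cc_credits: int                  # community college credits taken
    four_year_classes: int           # classes ever taken at a four-year college
    fafsa_income: Optional[float]    # None when no FAFSA was submitted


def is_traditional(s: Student) -> bool:
    """In-state rule plus the five exclusions listed in the text."""
    return (
        s.in_state
        and not s.summer_only
        and s.age_dec_first_year >= 18
        and s.for_credit_classes > 0
        and s.age_at_start <= 35
        and s.cc_credits >= 12
    )


def is_transfer(s: Student) -> bool:
    """A student transfers if he or she ever takes a four-year college class."""
    return s.four_year_classes >= 1


def code_income(s: Student) -> Tuple[float, int]:
    """Return (income, submitted_fafsa); missing income is coded as zero."""
    if s.fafsa_income is None:
        return 0.0, 0
    return float(s.fafsa_income), 1


# Made-up records, one per case discussed in the text.
students = [
    Student(True, False, 19, 19, 10, 45, 3, 40690.0),   # kept, transfers
    Student(True, False, 20, 20, 12, 60, 5, None),      # kept, transfers, no FAFSA
    Student(True, True, 20, 20, 2, 6, 20, 90000.0),     # dropped: summer only
    Student(True, False, 17, 17, 1, 3, 0, None),        # dropped: under 18
    Student(True, False, 40, 40, 8, 30, 1, 25000.0),    # dropped: started after age 35
    Student(False, False, 19, 19, 10, 45, 2, 50000.0),  # dropped: out of state
]

transfers = [s for s in students if is_traditional(s) and is_transfer(s)]
rows = [code_income(s) for s in transfers]
print(rows)
```

Pairing the zero-coded income with the FAFSA indicator lets a regression keep non-filers in the sample while the indicator absorbs the average difference between filers and non-filers.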
It is a puzzle why so many students take more than 60 credits at a community college before transferring. The students take an average of 71 credits at a four-year college, indicating that they attend a four-year college for roughly 2-3 years. Below are graphs of the distribution of remedial two-year credits, traditional two-year

credits, and four-year credits earned by community college transfer students in Virginia. The vast majority of students taking remedial classes only take one class. % of students take no remedial classes at all. 24% of students take 12 or more remedial credits. (The bars in the graph of remedial credits are 2 credits wide.) The graph of traditional two-year credit hours shows a mass of students who take hours of community college classes. (The bars on this graph are 5 credits wide.) Possibly students are taking more than 60 hours because they are switching majors or because some technical certifications require more than 60 hours.

In the graph of four-year credits earned, there is a mass of students who drop out after a semester, suggesting that some students learn quickly that a four-year college is not the right fit. (The bars on this graph are also 5 hours wide.) There is a second mass of students around 60 hours. These are most likely students who earn a bachelor's degree. 46% of the sample eventually earns a baccalaureate degree. It is necessary for my instrument that there is geographic variation in bachelor's degree attainment of community college

students. The map below shows the counties in Virginia shaded by the graduation rate of the community college students from that county. There is large variation in the average rate, from 0.20 to 0.81. The counties with the highest graduation rates are in the center of the state, near Virginia Tech and the University of Virginia. The lowest graduation rates are in the southwest and southeast.

The second panel of Table 3 includes descriptive statistics on the distance between a student's home and his or her four-year college. The SCHEV data do not include information on the student's home address, but they do include county identifiers. The distance measures used in this analysis measure the distance from the center of a student's home county to the four-year college he or she attends. Closest college quality appears to be a good instrument for actual college quality, because a quarter of the sample travels less than 10 miles to college and nearly half the sample travels less than 20 miles.
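To make the distance-based instruments concrete, the sketch below computes, for one county center, the closest college's quality and the average and maximum quality within a radius, falling back to the closest college when none lies inside the radius (the rule stated in Section 3.5). It assumes great-circle (haversine) distance; the paper does not say which distance formula is used. The coordinates and quality values are made up, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def quality_instruments(
    county_center: Tuple[float, float],
    colleges: List[Tuple[float, float, float]],  # (lat, lon, quality)
    radius_miles: float,
) -> Tuple[float, float, float]:
    """Return (closest quality, average quality in radius, max quality in radius)."""
    by_distance = sorted(
        (haversine_miles(county_center[0], county_center[1], lat, lon), quality)
        for lat, lon, quality in colleges
    )
    closest_quality = by_distance[0][1]
    nearby = [q for d, q in by_distance if d <= radius_miles]
    if not nearby:
        # No college inside the radius: use the closest college's quality.
        return closest_quality, closest_quality, closest_quality
    return closest_quality, sum(nearby) / len(nearby), max(nearby)


# Made-up county center and colleges roughly 7, 21, and 138 miles away.
county = (37.0, -79.0)
colleges = [(37.1, -79.0, 0.5), (37.3, -79.0, 0.9), (39.0, -79.0, 0.2)]

closest, avg30, max30 = quality_instruments(county, colleges, 30.0)
print(closest, avg30, max30)
```

Because home location is only known at the county level, every student in a county receives the same instrument values under this construction.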

The third panel of Table 3 shows descriptive statistics on the county-level variables. The county-level data are from the 2000 census. The community college students in my sample live in counties that are on average 19 percent Black, 31 percent college graduates, and 69 percent homeowners. The average county median home value is $140,000 and the average county median family income is $67,000.

Because the SCHEV data were collected for administrative purposes, very few observations are missing and very few values are implausibly small or large. The exception is the number of credits earned. The maximum numbers of two- and four-year credits earned in my sample are 322 and 416, respectively. I cap remedial credits at 40 and two- and four-year credits at 120 and 180, respectively.

4.2 College Characteristics

The college characteristics data are downloaded from the National Center for Education Statistics' College Navigator website. The data are from the school year. The four-year colleges in Virginia that are included in this analysis are all baccalaureate degree granting institutions that do not have open admission policies (Virginia Union University and Liberty University have open admissions policies and are therefore excluded). The 36 colleges in Virginia that meet these criteria are summarized in Tables 4 and 5. Like those of many states, Virginia's colleges vary greatly in location, student body, expenditures, and costs. The colleges in Virginia are generally small; the average student body is only 5,500 students. The 10 largest colleges are all public. Although the public schools are the largest, they are as competitive as the private colleges. Of the five most competitive schools, three are public (William and Mary, University of Virginia, and Virginia Tech) and the other two are private (Washington and Lee and University of Richmond). The proportions of female students at Virginia's colleges range from 0 percent (Hampden-Sydney) to 100 percent (Hollins).
The proportions of white students range from 2 percent (Saint Paul's, Virginia State) to 90 percent (Roanoke). Yearly tuition costs range from $4,916 (Old

Dominion) to $36,550 (University of Richmond). Table 5 describes six of the college quality measures used to evaluate colleges in this analysis. (Some additional measures of spending per student are also used, as are some variables created using factor analysis.) As with the other college characteristics, the college quality measures differ widely across Virginia. The average SAT scores of admitted students vary from 740 (Saint Paul's) to 1395 (Washington and Lee). Freshman retention rates vary from 40 percent (Saint Paul's) to 97 percent (University of Virginia). Expenditures per full-time student vary from $16,249 (Bluefield College) to $94,734 (University of Virginia).

5 Results

5.1 Effects of College Quality on Degree Attainment

Table 6 displays the coefficient on each instrument in its respective first stage and the associated F statistic of the instrument. Each instrument is used individually, but the coefficients are shown in a single column for brevity. In each case, actual college quality is instrumented with closest college quality, as measured with a different metric. Every measure of quality is a strong instrument; all the instruments are significant predictors of actual college quality, and in every case the F statistic is over 10.

Table 7 presents the basic IV estimates of the effect of college quality on degree attainment. What varies across the models is the measure of college quality. For every specification except the student/faculty ratio, quality has a statistically significant effect on graduation. What is more, the magnitudes of the effect of quality are consistent across all of the quality measures. Decreasing the student/faculty ratio or increasing the graduation rate by one standard deviation increases a student's graduation probability by 7 percentage points. Increasing the freshman retention rate or expenditures per full-time student similarly increases a student's graduation probability by 9 percentage points. A one standard deviation increase in the average SAT or the non-expenditure factor increases the probability of graduation by 8 percentage points. Increasing the total factor similarly increases a student's graduation probability by 13 percentage points. Taken together, if a student attends a four-year college that is one standard deviation higher in any one college quality measure, his or her probability of graduating increases by 7-13 percentage points.

The signs and magnitudes of the other controls in the regressions are generally as expected. Black students and females are 1-3 percentage points less likely to graduate, while Asian students are 7 percentage points more likely to graduate. Age has a large non-linear effect on students' graduation probabilities. Students who are years old are 5 percentage points less likely to graduate than year olds, while year olds are 15 percentage points less likely to graduate. Students' graduation probabilities are increasing in family income, financial aid, and community college GPA. Students who receive a certificate or an associate's degree for a technical certification in community college are less likely to complete a four-year degree, while students who receive an associate's degree for bachelor's degree credit are more likely to earn a four-year degree. Most of the home-county demographics do not have a significant effect on students' graduation probabilities, and to the extent that they do, for instance median home value, the effect is very small in magnitude; increasing median home value by $50,000 increases the probability that a student graduates by 0.1 percentage points.

Table 8 shows other specifications of the basic IV regression model with additional measures of college quality. In these specifications, only the quality coefficients are reported because the other coefficients vary little.
The effects of the percent of admitted students, expenditures per student, the subsidy measures, and the factor measure are similar to the previous effect sizes; a one standard deviation increase raises graduation probabilities by 5-13 percentage points. The effects of instructional expenditures per FTE and per student are higher, increasing graduation probabilities by percentage points when the measure

is increased by a standard deviation. The average subsidy measures do not have a statistically significant effect on graduation rates, which is consistent with the fact that these variables are reported with considerable measurement error.

Table 9 shows the OLS regression results for the seven main college quality variables. Each college quality measure was added individually to the degree attainment regressions, but all of the quality coefficients are shown in a single column for simplicity. Initially, one might predict that the OLS results should be larger than the IV results. The OLS results combine the selection effect with the true quality effect, and these should be positively correlated; the best community college students probably attend the highest quality colleges. In actuality, the bias is in the other direction; most of the IV returns to quality are larger than the OLS returns to quality. Card (1993) finds a similar result using a distance instrument to control for college attendance. Card suggests two possible explanations. First, there is measurement error in the endogenous regressor that is biasing the OLS results down. Card determines this is unlikely in his case, and I believe measurement error is an unlikely explanation here as well. The college a student attends is easy to measure accurately and all my data come from institutions (not self-reports), so extensive measurement error seems unlikely. The second explanation Card gives is that the individuals most affected by the instrument have the highest returns to college. I believe this explanation applies in my case as well. Students most likely to be swayed to attend a better school just because it is close are probably lower income or less ambitious students. These students probably benefit most from a high quality college.
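The contrast between the two estimators can be seen in a minimal sketch with one instrument and no controls, where the IV estimate reduces to cov(z, y) / cov(z, x). The numbers are made up and the function names are hypothetical; this is not the paper's specification, which includes student- and county-level covariates. The toy data build in positive selection and a constant true effect of 2, so OLS overstates the effect, which is the initial prediction discussed above. A constant-effect example cannot reproduce the pattern in Table 9, where IV exceeds OLS; that pattern requires effects that differ across students.

```python
from typing import Sequence


def mean(v: Sequence[float]) -> float:
    return sum(v) / len(v)


def cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope from regressing y on x (with an intercept)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z: Sequence[float], x: Sequence[float], y: Sequence[float]) -> float:
    """Just-identified IV slope: one instrument z for one regressor x."""
    return cov(z, y) / cov(z, x)


# Made-up data. z: instrument (e.g. closest college quality), u: unobserved ability.
z = [0.0, 0.0, 1.0, 1.0]
u = [1.0, -1.0, 1.0, -1.0]                   # uncorrelated with z by construction
x = [zi + 0.5 * ui for zi, ui in zip(z, u)]  # attended quality depends on z and on u
y = [2.0 * xi + ui for xi, ui in zip(x, u)]  # outcome: true effect of quality is 2

print(ols_slope(x, y), iv_slope(z, x, y))
```

In this example the instrument moves attended quality without moving ability, so the IV ratio isolates the effect of quality, while OLS also picks up the ability term.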
Card (2000) reviews the literature on the effects of college choice and finds that for nearly all instrumental variable papers, the estimated IV effect is bigger than the OLS effect. Currie and Moretti (2003) also instrument the decision to attend college with a distance instrument and find larger IV than OLS estimates. That said, the OLS results have some value in themselves. The IV results are identified off a local average treatment effect, and the population that is most treated by a

closer college is probably lower income or less ambitious students. The IV results do not have a selection problem because of the instrument, but they might be identified off a nonrepresentative portion of the population. The OLS results are identified off the entire sample, but likely have a selection issue. The OLS regressions do have a rich set of controls, including student- and county-level covariates, so the size of the selection bias is unclear. With the OLS results, the change in the probability of graduating due to increasing one of the college quality variables by one standard deviation is between 2 and 8 percentage points.

Table 10 explores whether there are non-linear effects of quality. Perhaps as long as students avoid very low quality schools, the additional returns to quality are minimal. Or the reverse could be true; high quality schools drastically increase a student's graduation probability, but there is no real difference between low and middle quality schools. Table 10 shows that for some measures of quality, the latter possibility appears more likely; the returns to college quality are greatest for the high quality schools.

5.2 Subgroup Analysis and Additional Instruments

To investigate whether a subgroup of the sample is driving these results, the basic IV analysis is estimated again on various sub-samples. The first column of Table 11 displays the results from the full sample. Again, each college quality measure was added individually to the degree attainment regressions, but all of the quality coefficients are shown in a single column to simplify the table. The second column of Table 11 shows the effects of college quality for students whose families make less than $35,000 a year.
Columns 3 and 4 differ in how high-income students are defined. In column 3, students are considered high income if their families submitted a FAFSA (so I have income information about them) and they make over $35,000, whereas in column 4, students are considered high income if their families make over $35,000 or they did not submit a FAFSA. For students taking classes at four-year institutions, one of the primary reasons that they do not submit a FAFSA is that their family income is so high that they will not qualify for financial aid. Student's t-tests reveal that for most measures of quality,