The Life Insurance Market: Adverse Selection Revisited




Daifeng He [1]
April 23, 2008 (draft)
Washington University in St. Louis

Abstract

This paper finds evidence for the presence of adverse selection in the life insurance market, a conclusion in contrast with the existing literature. In particular, we find a significant and positive correlation between the decision to purchase life insurance and subsequent mortality, conditional on the pricing factors. Individuals who died within a time window of 12 years from purchase were 18 percent more likely to take up life insurance than those who survived the time window. Moreover, we find that individuals are most likely to take up life insurance four to years before their expected death. Methodologically, this paper addresses a sample selection issue and an omitted variable issue that are overlooked in the previous literature.

Keywords: adverse selection, private information, life insurance, sample selection, differential mortality, new buyers, health status, pricing factors.

JEL Classification numbers: D82

[1] Economics Department, Washington University in St. Louis, Campus Box 1208, One Brookings Drive, St. Louis, MO 63130; dhe@artsci.wustl.edu. I am grateful to Sebastian Galiani, Edward Greenberg, Bruce Petersen, Tanika Chakrabuorty, Charles Courtemance and Jeremy Meiners for their comments and suggestions.

I. Introduction

Empirical testing of contract theory has been a burgeoning area of economic research in recent years (see Chiappori and Salanie 2003 for a review). An important topic in this literature is to test empirically whether a particular insurance market is subject to asymmetric information (due to either adverse selection or moral hazard) and, if it is, how to disentangle the effect of adverse selection from that of moral hazard. This literature has generally adopted the conditional correlation approach illustrated in Chiappori et al 2005. In this approach, the test for the presence of asymmetric information in an insurance market involves examining whether the risk outcome is positively correlated with insurance coverage, conditional on the pricing factors.

Although the existence of asymmetric information is the foundation of information economics and contract theory, empirical research so far has provided mixed evidence in various markets. [2] Cawley and Philipson (1999) examine cross sections of life insurance contracts using data from the Health and Retirement Study (HRS) and find a negative or neutral correlation between coverage and mortality risk. They suggest that the insured may not have an information advantage over the insurer, given risk classification. This result, together with evidence for bulk discounts also reported in their paper, has been cited widely as evidence that adverse selection is not prevalent in the life insurance market. [3]

[2] For example, see Chiappori and Salanie 2000, and Cohen 2005 for the auto insurance market; Finkelstein and Poterba 2004 for the annuity market; Cardon and Hendel 2001 for the employer-provided health insurance market; and Fang et al 2006 for the Medigap market.
[3] For example, see Chiappori and Salanie 2000, de Meza and Webb 2001, Hendel and Lizzeri 2003, Fang et al 2006, Chiappori and Salanie 2008, and Cutler et al 2008.

In contrast, we find evidence for the presence of adverse selection in the life insurance market. Also using the HRS dataset, we recover a significant positive correlation between mortality outcome and the decision to purchase life insurance, conditional on risk classification. In particular, individuals with a higher mortality risk (those who died within a time window of 12 years) were 18% more likely to buy life insurance than individuals with a lower mortality risk (those who survived beyond the window). In addition, when we break down the mortality outcome in this window into a set of dummy variables, each indicating death between two consecutive waves progressively further away from the time of purchase, we find that the coefficient estimates of those dummy variables decrease monotonically: the later the individual died within this
window, the less likely she was to take up insurance. This strongly suggests that private information is the source of the adverse selection in the market.

Several reasons help explain why these results differ so markedly from those found in the previous literature. First, we examine the group of potential new buyers of life insurance, instead of the entire cross-sectional sample. The potential new buyers are individuals who did not own life insurance coverage at the beginning of the sample period. We argue that the potential new buyers are a more relevant group to study because of possible differential mortality between individuals with coverage and those without. Suppose individuals have private information about their mortality risk. Those with private information about their unfavorable health status (who then decide to purchase life insurance based on this information) are more likely to die early and are thus less likely to be found in a cross-sectional sample. Higher-risk individuals with coverage are therefore under-represented. [4] This sample selection could bias the estimates of the conditional correlation between insurance coverage and mortality risk in a cross-sectional sample.

Consider the following thought experiment. Suppose there were four individuals with the same observable good health five years before a sample was collected. Individuals 1 and 2 chose not to buy coverage since they knew that they were in good health. Individuals 3 and 4 chose to buy insurance since they knew they were in poor health even though they appeared healthy. Five years later, individual 4 died while the others lived on and were randomly drawn into the sample. Assume further that the three surviving individuals lived for an additional five years. A researcher who examines the cross-sectional sample may conclude that adverse selection is not present in the market, since observed mortality does not differ between the two individuals without insurance and the one who purchased insurance: all three survive the five-year sample period. The real story, however, is that half of the covered died within 10 years while neither of the uncovered died.

[4] In survival analysis, left truncation describes the situation in which the existence of an individual is unknown to the researchers if she dies before the beginning of the observational period. In our case, the left truncation is not ignorable since the mortality risk of those who are observed in the sample may not be representative of the population of interest. See Kalbfleisch and Prentice 2002, pp. 13-14.

Second, we use the relevant controls to account for the pricing factors. Individual health status and medical history, and to a lesser extent family history, weigh heavily when life insurers
determine an applicant's insurability and subsequent premium. These important pricing factors, however, are not controlled for in the specifications relating mortality risk and coverage in the previous literature. Since individuals' health conditions are likely to be correlated with both mortality and the decision to purchase life insurance, failing to control for them could bias the estimate of the conditional correlation between mortality risk and life insurance coverage. A priori, the direction of the bias from omitting these variables is not clear. On the one hand, individuals with worse observable health status (and consequently higher mortality) might either be considered uninsurable by life insurers or face a higher premium that could affect their purchase decision. This could induce a spurious negative correlation between mortality risk and coverage, thereby biasing downward the estimate of the conditional correlation between these two variables. On the other hand, individuals in worse observable health may have a higher demand for coverage. This could produce a spurious positive correlation between coverage and mortality, and the estimate of the conditional correlation could be biased upward. We address this omitted variables problem by including a detailed set of controls for health status, medical records and family history that are commonly considered by life insurance underwriters.

Third, thanks to the long panel of data now available, we can directly and precisely observe the actual mortality outcome for all individuals in our sample within a time window of 12 years. The previous literature, however, relies on self-perceived mortality risk or estimated actual mortality risk, which likely contains substantial noise.

The rest of the paper is organized as follows. Section II provides background information on the life insurance market and a brief review of the literature. Section III describes the dataset used in this paper. Section IV discusses the empirical strategy and presents the results. Section V concludes.

II. Background/Literature Review

1. The life insurance market

The life insurance market is of particular interest for testing the existence of adverse selection. First, it is an important market due to its large size. In 2004, 77% of American households held life insurance, with total protection reaching $19.1 trillion in 2006. The overall assets of the industry totaled $4.5 trillion, and $4 trillion was invested in the economy, making
the industry one of the most important sources of investment capital in the U.S. (NAIC 2007, ACLI 2007). Second, moral hazard can be largely ignored in this market since possessing life insurance is unlikely to be an incentive for an individual to die sooner. [5] Life insurance is therefore a clean market to study in the sense that researchers can avoid the difficult task of disentangling the effect of adverse selection from that of moral hazard. Third, life insurance contracts are relatively explicit and simple. The only risk outcome is the death of the policyholder, an event which in principle is easy to verify and measure. [6]

There are two basic types of life insurance: term insurance and cash value insurance. Term life insurance pays out a pre-specified award to the beneficiaries upon the death of the insured within the term of the contract. A term policy can be either an individual policy or a group policy. In this paper, we focus on individual term life insurance. [7]

2. The underwriting practice of individual term life insurance

The life insurance industry has developed a fairly uniform underwriting procedure across states. The basic pricing factors include age, gender, personal habits (e.g. tobacco, alcohol and drug use), health status and medical history, family history, and vocation and hobbies (if hazardous). Other factors may include driving records, aviation activities, residence, and the frequency and destination of foreign travel. [8] Premiums are higher for the elderly, males, those with a history of smoking, drinking, or drug abuse, those with unfavorable health status and/or an unfavorable medical or family history, those in hazardous vocations, those with high-risk hobbies, and so on.

[5] Tseng (2004) finds that the suicide exclusion (i.e. suicide becomes a covered risk only after a certain period) in individual life insurance policies affects the timing and method of committing suicide. This, however, is hardly evidence of moral hazard. One explanation, as the author points out, is that ex ante some of the insureds have sustainable suicide intentions, and they choose to commit suicide after it becomes a covered risk.
[6] See Chiappori 2000 for complications in other insurance markets. For example, in the auto insurance market, accidents and claims can differ, and the decision to file a claim after an accident is likely to be endogenous.
[7] In 2006, individual coverage accounted for 53% of the face amount of all life insurance in force in the U.S., or 76% of the premiums, making individual policies the most widely used form of life insurance. Among new individual insurance purchased, 71% of the face amount was issued for term insurance. See ACLI (2007b).
[8] See McGill's Life Insurance 2000 and Cummins et al 1983; this description also draws on the author's phone conversations with the state insurance departments.

Typically, a life insurance agent interviews the applicant after receiving an application. The standard questions about health status and medical history are whether one has been diagnosed with high blood pressure, stroke, cancer, diabetes, high cholesterol, or a series of other conditions. The common question about family history is whether one or both of the parents have died
before 60 or 70 due to cardiovascular disease or cancer. [9] The insurer typically also requires a medical examination of the applicant and her permission to release medical records. During the medical examination, a paramedic usually collects blood and urine samples, measures blood pressure, height, and weight, and records the applicant's medical history. The higher the face value of the insurance policy, the more detailed the information the insurer generally requires. After gathering the applicant's information, the insurer deducts points for favorable information and adds points for unfavorable information to a base score that is common to all applicants. Based on the final score, the insurer classifies an applicant into one of several risk categories, such as preferred plus, preferred, standard plus and standard risk; there can be subcategories within each of these categories. [10] The premium is largely similar for people in the same risk category given the same age, gender and smoking status. Usually, individuals with five times the base score would be declined by the insurer as uninsurable (McGill's Life Insurance 2000).

Appendix A illustrates the general underwriting guidelines for individual term life insurance provided by QuickQuote.com, an online quoting system. It shows how applicants with different pricing characteristics would generally be grouped into different risk categories. The first table includes most of the pricing factors other than medical history, and the second table includes medical history. [11] In both tables, column (1) lists all the requirements that an applicant needs to satisfy in order to qualify for the best risk category, "Preferred Plus"; columns (2), (3), and (4) are for the categories "Preferred", "Standard Plus", and "Standard" respectively. For example, an applicant usually will not be considered for "Preferred Plus" if she has ever received high blood pressure treatments or her blood pressure readings have ever exceeded 140/85 (column 1, the row titled "blood pressure" in the first table). This individual, however, may still qualify for "Preferred" if her blood pressure is now under control and her readings have not exceeded 145/80 in the past two years (column 2, the row titled "blood pressure" in the first table).

[9] In recent years the weight on family history (except for cardiovascular-renal diseases) has been declining due to the difficulty involved in verifying this information. See McGill's Life Insurance (2000), pp. 520-521.
[10] Some companies have three categories: preferred, standard, and substandard. The specific names of the categories can vary.
[11] Age, gender, and factors like height, weight, and BMI are not listed here, since these factors are self-explanatory.

3. Related Literature

A general testable implication, derived from the standard economic models of asymmetric information pioneered by Akerlof (1970) and Rothschild and Stiglitz (1976), is a positive correlation between risk outcome and insurance coverage, conditional on the observables used in pricing (Chiappori et al 2005). The empirical evidence so far has been mixed across insurance markets. [12] For example, in the auto insurance market, Chiappori and Salanie (2000) find no evidence for the presence of asymmetric information, while Cohen (2005) finds that individuals who choose lower deductibles have more accidents. In the health insurance market, Cardon and Hendel (2001) dismiss the role of adverse selection as economically insignificant. In the annuity market, Finkelstein and Poterba (2004) uncover evidence of adverse selection in some dimensions but not in others.

In the life insurance market, Cawley and Philipson (1999) provide evidence against the existence of adverse selection using multiple data sources. In particular, using the HRS dataset, they show that both the self-reported mortality and the estimated actual mortality are negatively or neutrally correlated with coverage, after controlling for age, gender, smoking status, marital status, income and wealth, and bequest motives. In other words, their results imply that higher-risk individuals are less likely, or at least not more likely, to have life insurance coverage than lower-risk individuals. They suggest that a potential explanation for their findings is that individual policyholders may not have better information about their mortality risk than the life insurers after underwriting.

Hendel and Lizzeri (2003) study the properties of long-term contracts in the framework of a dynamic model with symmetric learning and one-sided commitment, and find empirical support for the predictions of the model using data from the life insurance market. In particular, they show that all types of life insurance contracts in their sample involve front-loading (i.e., prepaying the premiums), and that more front-loaded contracts are associated with lower lapsation and a lower present discounted value of premiums over the coverage period. They argue that asymmetric information, among other candidates, is not a plausible alternative explanation for their findings.

[12] Given the mixed evidence on adverse selection, a more recent literature has brought up the opposite possibility of advantageous selection. See de Meza and Webb 2001, Fang et al 2006, Wambach 1997.
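
To make the testable implication concrete, the following is a minimal Python sketch of a stratified comparison: within each risk class, it compares the coverage rate of those who died with that of those who survived, then averages the gaps with class-size weights. It is a hypothetical, non-parametric stand-in for the conditional correlation test, not the estimator used in this paper or in the cited studies; the function name, the record layout, and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def conditional_coverage_gap(records):
    """Size-weighted average, across risk classes, of
    P(covered | died) - P(covered | survived).

    Each record is a (risk_class, died, covered) tuple. Classes that contain
    no deaths or no survivors cannot identify a gap and are skipped.
    """
    # risk_class -> {died flag -> [number covered, number of people]}
    cells = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for risk_class, died, covered in records:
        cell = cells[risk_class][bool(died)]
        cell[0] += int(covered)
        cell[1] += 1

    weighted_gap, weight = 0.0, 0
    for by_outcome in cells.values():
        (c_dead, n_dead), (c_alive, n_alive) = by_outcome[True], by_outcome[False]
        if n_dead == 0 or n_alive == 0:
            continue
        n = n_dead + n_alive
        weighted_gap += n * (c_dead / n_dead - c_alive / n_alive)
        weight += n
    return weighted_gap / weight if weight else float("nan")


# Made-up records for illustration only (not HRS data).
toy = (
    [("preferred", True, True)] * 3 + [("preferred", True, False)] * 1
    + [("preferred", False, True)] * 4 + [("preferred", False, False)] * 12
    + [("standard", True, True)] * 4 + [("standard", True, False)] * 4
    + [("standard", False, True)] * 3 + [("standard", False, False)] * 9
)
gap = conditional_coverage_gap(toy)
print(gap)
```

A positive gap means that, within risk classes, those who later died were more likely to hold coverage, which is the sign that the asymmetric information models predict. A regression-based test with continuous pricing controls would replace the stratification used here.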

III. Data

The dataset that we use is the Health and Retirement Study (HRS). HRS is a nationally representative longitudinal survey of the elderly and the near-elderly in the U.S. It contains rich information on health status, insurance coverage, financial measures, demographics and family structure, as well as some information on individual expectations. Our analysis uses the HRS cohort, which consists of individuals born between 1931 and 1941. This cohort has been interviewed biennially since 1992.

We obtain data on life insurance coverage from waves 1992 and 1994. We select these early waves instead of more recent data for ease of comparison with the results in the previous literature. These two waves are also the only ones in which the survey explicitly asked whether a respondent had individual term life insurance. Moreover, using early waves for coverage data allows us to observe the actual mortality outcome over a sufficiently long time window.

Tracker 2004 provides the mortality data. HRS records a respondent's vital status in each wave in five categories: alive at current wave, presumed alive at current wave, death reported at current wave, death reported at a prior wave, and vital status unknown. [13] We code a respondent as alive in 2004 if she falls into categories 1 or 2 in wave 2004, and dead if she falls into categories 3 or 4. We treat those in category 5 as missing observations. [14] Therefore, we observe the actual mortality outcome in a time window of 12 years.

Each wave of HRS contains rich information on demographics, health status and medical history, and some limited information on family history. [15] This information is important for testing for the presence of adverse selection in the life insurance market, since life insurers use it to determine an applicant's insurability and subsequent premium. Whenever possible, we obtain the relevant variables from the RAND HRS.
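
The mortality coding rule just described can be written out as a short Python sketch. The category numbers 1 to 5 follow the order in which the five vital-status categories are listed in the text; the function name and the `unknown_as` parameter are hypothetical, and the actual codes in the Tracker file should be checked against its documentation before any real use.

```python
def code_mortality(vital_status, unknown_as=None):
    """Return 1 if dead by the 2004 wave, 0 if alive, None if missing.

    vital_status: 1 = alive, 2 = presumed alive, 3 = death reported at this
    wave, 4 = death reported at a prior wave, 5 = vital status unknown
    (numbering as described in the text, assumed here).
    unknown_as: value assigned to category 5. None treats it as missing;
    1 or 0 gives an upper-bound or lower-bound mortality variable.
    """
    if vital_status in (1, 2):
        return 0
    if vital_status in (3, 4):
        return 1
    if vital_status == 5:
        return unknown_as
    raise ValueError(f"unexpected vital status code: {vital_status}")


statuses = [1, 2, 3, 4, 5]
baseline = [code_mortality(s) for s in statuses]
upper_bound = [code_mortality(s, unknown_as=1) for s in statuses]
lower_bound = [code_mortality(s, unknown_as=0) for s in statuses]
print(baseline, upper_bound, lower_bound)
```

Keeping the treatment of category 5 as a parameter makes it easy to rerun an analysis under the missing, upper-bound, and lower-bound conventions.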
Respondents also reported a self-perceived probability of living to age 75. This information, however, most likely contains substantial noise.

[13] For the precise coding criteria, see HRS Tracker 2004, Version 2, January 2007.
[14] As a robustness check, we also code a mortality upper bound variable and a mortality lower bound variable as in Cawley and Philipson 1999, treating those in category 5 as dead and as alive, respectively. All of the results in Section IV remain qualitatively the same.
[15] HRS records whether the respondent's mother and father are alive and their age, and if not, the age at which they died. We code a parent having died before 60 as an indicator variable for unfavorable family history. We have two indicator variables for family history, one for whether the father had died before 60 and the other for whether the mother had died before 60. These are crude measures. The HRS, however, does not provide information about the cause of death of the parents.

HRS also solicits information on individuals'
risk aversion through a sequence of gambling questions that supposedly reveal the respondents' risk preferences (see Appendix B for the survey design). Kimball et al (2007) provide imputations of the risk aversion data for the HRS cohort to reduce the measurement errors associated with the raw data.

Table 1 provides summary statistics of the relevant variables of our sample, based mainly on information from wave 1992. Of the HRS cohort, 24% owned individual term life insurance in 1992 and 27% in 1994. Among the potential new buyers, 19% obtained individual term life insurance between 1992 and 1994. By 2004, about 15% of the cohort had died. The sample was largely balanced in gender, and nearly three quarters of the respondents were married. High blood pressure, arthritis and back pain were the most common diagnoses. About a tenth of the sample had had a hospital stay since age 45, and roughly the same portion of the sample had been diagnosed with heart disease. Less than 10% of the sample had been diagnosed with diabetes, cancer, lung disease, stroke or asthma. Nearly one third of the sample had a healthy weight, with 44% of the sample being overweight and 22% being obese.

IV. Empirical Strategy and Results

1. Conditional correlation between risk and coverage among the potential new buyers

An ideal sample for testing for the presence of adverse selection in the life insurance market would satisfy the following requirements. First, it should constitute a random sample of the population below a certain age threshold such that no individuals in the population younger than that age would consider purchasing life insurance. For example, age 20 might be such a threshold if no individuals younger than 20 in the population would have a demand for life insurance, since they are unlikely to have dependents at that age. Second, all individuals in such a sample would be followed until the last one dies. At the end of the sample period, a researcher could then observe the coverage status, mortality outcome and a complete set of pricing factors for all individuals in the sample who are potential customers of the life insurance market. Such a sample would not be subject to selection caused by differential mortality. A positive correlation between the decision to purchase insurance and a proper measure of mortality risk, conditional on the pricing factors, would provide evidence for the existence of adverse selection in the market.
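
The contrast between a fully followed cohort and a sample drawn from survivors can be illustrated by encoding the four-person thought experiment from the Introduction. The Python sketch below is only an illustration under that example's assumptions; the field names and helper function are hypothetical.

```python
SAMPLE_YEAR = 5   # the cross-section is drawn five years after the purchase decision
HORIZON = 10      # mortality is tracked for ten years after the purchase decision

# death_year is None for people still alive at the ten-year horizon.
cohort = [
    {"id": 1, "covered": False, "death_year": None},
    {"id": 2, "covered": False, "death_year": None},
    {"id": 3, "covered": True, "death_year": None},
    {"id": 4, "covered": True, "death_year": 5},  # dies before the sample is drawn
]


def death_rate(group, covered, horizon=HORIZON):
    """Share of people with the given coverage status who die by the horizon."""
    members = [p for p in group if p["covered"] == covered]
    deaths = [p for p in members
              if p["death_year"] is not None and p["death_year"] <= horizon]
    return len(deaths) / len(members)


# Only people still alive at the sampling date can enter the cross-section.
cross_section = [p for p in cohort
                 if p["death_year"] is None or p["death_year"] > SAMPLE_YEAR]

full_gap = death_rate(cohort, True) - death_rate(cohort, False)
observed_gap = death_rate(cross_section, True) - death_rate(cross_section, False)
print(full_gap, observed_gap)
```

In the full cohort the covered have a ten-year death rate 0.5 higher than the uncovered, while in the cross-section of survivors the gap is zero, which is the downward bias from differential mortality that the ideal sample avoids.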

Of course, such an ideal sample does not exist. The HRS sample used by both the previous literature and this paper differs from the ideal sample in that the HRS sample is likely to suffer from selection bias due to potential differential mortality between individuals with coverage and those without coverage. Respondents in the sample were between 51 and 61 at the time of the first interview, an age by which many individuals had owned life insurance coverage for many years. 16 This sample may disproportionately consist of individuals with a lower mortality risk since higher-risk individuals with coverage were more likely to have died before the survey and thus less likely to be found in the sample. The estimate of the conditional correlation between mortality risk and coverage could therefore be biased downward. This sample selection bias could be partially responsible for the negative or neutral conditional correlation found in the previous literature. One major contribution of this paper is to address the selection bias caused by differential mortality. We define the group of the potential new buyers as individuals who did not own individual term life insurance in the 1992 wave. By the time of the 1994 wave, some individuals from this group had purchased individual term coverage (i.e. new buyers) while the rest remained uncovered (i.e. non -owners ). Distinguishing between the group of the potential new buyers and the entire cross-sectional sample is key to this paper. The main advantage of studying the behavior of the potential new buyers is that this group does not suffer from the potential pitfall of differential mortality, since the data allows for a complete follow-up (at least in a time window permitted by the sample period) of the mort ality of all individuals who were potential customers at the beginning of the sample period. 
Another important contribution of this paper is that we address the issue of omitted variables in the previous literature by controlling for the relevant pricing factors as exhaustively as possible. In particular, we include a detailed set of control variables for individual health status, medical history and family history that are important pricing factors considered by life insurance underwriters, besides age, gender and smoking status. The previous literature fails to consider these factors in the estimation of the conditional correlation between mortality risk and coverage. Given that health status, medical history and family history are likely to be correlated with both mortality and the decision to purchase life insurance, omitting these variables could induce a spurious correlation between mortality risk and coverage. If worse health leads to higher premiums and thus less insurance demanded, leaving out the health status would bias downward the estimate of the conditional correlation. If worse health results in higher demand for insurance, the estimate could be biased upward.

16 Other aspects of the HRS sample that fall short of the ideal include: a) most of the respondents were still alive by 2004, the end of our sample period; b) we may not observe all the pricing factors considered by a typical life insurer. These two limitations, however, are not as serious. For a), we define a measure of mortality risk as whether an individual had died in the observation period. Later in the paper we show that individuals seem to have an information advantage over the insurer only in the near future. As for b), HRS is already one of the most comprehensive datasets available and we believe that we have controlled for the majority of the pricing factors.

For the purpose of testing for the presence of adverse selection, the relevant controls should only be the pricing factors considered by the life insurance underwriters. Less relevant in this regard are factors like income, wealth, marital status, bequest motives, and whether an individual has close substitutes such as group insurance, since insurers cannot price on these factors even though they may observe them. Controlling for these factors serves as a useful robustness check.

An important feature of the group of the potential new buyers is that for every individual in this group, we observe a detailed set of pricing factors at the time that the insurance contract was signed. This is important since health status and medical history could change substantially over time, and the HRS dataset does not allow one to retrieve the individual characteristics at the signing time for contracts that had been in force before the sample was collected. In an analysis based on the entire cross-sectional sample, using the current individual characteristics could introduce considerable measurement error into the pricing control variables. Table 5 shows the changes in the sample mean and standard deviation for the various pricing factors of the HRS cohort between 1992, 1998, and 2004. Almost all the health indicators changed considerably.
For example, only 5% of the sample had been diagnosed with cancer in 1992, while the corresponding number rose to 9% in 1998 and 15% in 2004, a two- to threefold increase. Note that the actual change in the average health status should be more dramatic than this, since individuals whose health deteriorated the most may have disproportionately died and dropped out in later waves.

We therefore estimate the following linear probability model:

$$\text{new\_buyer}_i = a_0 + a_1\,\text{mortality}_i + X_i B + e_i \tag{1}$$

where the binary dependent variable new_buyer is set equal to 1 if individual i reported having individual term life insurance in 1994 but not in 1992 (i.e. a new buyer), and 0 if individual i reported not having individual term life insurance in 1994 or 1992 (i.e. a non-owner). 17 The variable mortality is an indicator for whether individual i had died by wave 2004, a measure of the mortality risk. $a_1$ is the parameter of interest and measures the conditional correlation between coverage and mortality risk controlling for the pricing factors. A positive estimate of $a_1$ suggests the presence of adverse selection. X represents a detailed set of pricing controls measured in 1992. Specifically, X includes the respondent's age, gender, smoking and drinking status (whether individual i has ever smoked, whether she smokes now, and whether she drinks alcohol now), health status and medical history (whether she has been diagnosed with diabetes, high blood pressure, cancer, heart disease, arthritis, lung disease, stroke, asthma, kidney disease, ulcer, high cholesterol, or back pain; whether she has had a hospital stay in the previous 12 months; and whether her BMI indicates that she is of healthy weight, overweight or obese), 18 and family history (whether her father or mother had died before age 60). 19

Table 3 presents the results of our analysis. For the sake of comparison, we first report in columns (1) and (2) the estimation results based on the entire cross-sectional sample in 1992. The dependent variable in these two columns is life_term_i, a binary indicator for whether the respondent held individual term life insurance in wave 1992, rather than new_buyer. Column (1) does not include controls and column (2) adds the controls used in Cawley and Philipson (1999). 20 The point estimates of $a_1$ in both columns are negative and statistically insignificant, consistent with the results in the previous literature. The rest of the table presents the estimation results of model (1) based on the group of the potential new buyers. Column (3) does not include any controls, and we estimate a positive $a_1$ equal to 0.029, significant at the 10% level.
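As an illustration of what the no-controls specification in column (3) measures, the following minimal sketch uses made-up toy data (not HRS data) to show that, when the only regressor is a binary mortality indicator, the least-squares slope of a linear probability model equals the difference in take-up rates between those who died and those who survived. The function names and the data are assumptions introduced here for illustration only.

```python
def ols_slope(x, y):
    """Slope from regressing y on a constant and a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def take_up_gap(mortality, new_buyer):
    """Take-up rate among those who died minus the rate among survivors."""
    died = [b for m, b in zip(mortality, new_buyer) if m == 1]
    survived = [b for m, b in zip(mortality, new_buyer) if m == 0]
    return sum(died) / len(died) - sum(survived) / len(survived)


# Hypothetical toy sample: 4 individuals who died, 6 who survived.
mortality = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
new_buyer = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

slope = ols_slope(mortality, new_buyer)
gap = take_up_gap(mortality, new_buyer)
print(slope, gap)  # the two numbers coincide
```

Once the pricing controls X are added, the coefficient no longer reduces to a raw difference in means, which is why the estimates in the later columns can differ from the one in column (3).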
This point estimate translates into a 16 percent increase in the insurance take-up rate by individuals with a higher mortality risk (0.029 divided by the sample mean coverage rate of 0.24 among the potential new buyers gives 16%). The contrast between columns (1) and (3) suggests the importance of addressing the possibility of sample selection caused by potential differential mortality. In column (4) we include the same controls as in column (2) and estimate a positive $a_1$ equal to 0.034, an 18% increase from the sample mean, though this coefficient estimate is not significant at conventional levels. Column (5) adds the set of controls for health status, medical history and family history that was omitted in the previous literature. The estimate of $a_1$ jumps to 0.047 and is now significant at the 10% level. This implies that individuals with a higher mortality risk are associated with a 23% increase in the probability of taking up insurance, after controlling for pricing factors and bequest motives. Comparing column (4) and column (5) shows that omitting the health controls leads to a downward bias in the estimate of the conditional correlation between risk and coverage.

17 Starting from wave 1996, HRS explicitly asked whether a respondent had purchased a new life insurance policy within the past two years. We however do not observe whether this new policy is term or cash value, individual or group insurance; we therefore do not use this information on new purchases in the more recent data.

18 Healthy weight is defined as a BMI between 18.5 and 24.5 and overweight as a BMI between 24.5 and 30. A BMI above 30 is obese.

19 The data does not contain information on the causes of death of the parents.

20 They control for age, gender, marital status, smoking status, income, wealth and the following proxies for bequest motives: number of grandchildren, number of children, age of the youngest child, average age of children, number of siblings and age of spouse. In columns (1) and (2), we use dummies for age instead of a linear age term. The results remain almost identical with a linear age term.

As discussed earlier, for the purpose of testing whether the life insurance market is subject to adverse selection, the relevant controls should only be the pricing factors considered in underwriting. Therefore in column (6), we include only the pricing control set X as defined before. The estimate of $a_1$ from this specification indicates the magnitude of adverse selection in the individual term life insurance market. Column (6) reports the main result of the paper, a point estimate of the conditional correlation between mortality risk and life insurance coverage equal to 0.034, significant at the 5% level. This result suggests that those with a higher mortality risk are more likely to purchase life insurance than those with a lower mortality risk, after risk classification is properly taken into account. Specifically, those who died within a 12-year window were 18% more likely to take up individual term life insurance than those who survived beyond the 12-year window (0.034 divided by the sample mean coverage rate of 0.19 among the potential new buyers gives 18%), conditional on risk classification.
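The conversion from a linear probability coefficient to a relative increase in take-up, used in the paragraph above, can be written out as a short worked example. This is only a sketch of the arithmetic stated in the text; the function name is a hypothetical label introduced here.

```python
def relative_increase(coefficient, mean_take_up):
    """Percentage-point effect expressed relative to the mean take-up rate."""
    return coefficient / mean_take_up


# Values quoted in the text for the preferred specification, column (6).
main_effect = relative_increase(0.034, 0.19)
print(f"{main_effect:.1%}")  # 17.9%, which the text rounds to 18%
```

The same division applies to any other coefficient, provided it is paired with the mean take-up rate of the sample it was estimated on.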
This significant and positive estimate of $a_1$ remains qualitatively the same even after we add controls for marital status in column (7), income and wealth quartiles in column (8), and possession of group term insurance in column (9). 21 These estimates of the conditional correlation between mortality risk and insurance coverage provide evidence for the presence of adverse selection in the life insurance market. They suggest that even after stringent underwriting, policyholders still hold substantial residual private information about their mortality risk.

21 An individual's access to close substitutes such as group life insurance may be correlated with both the individual's demand for life insurance and mortality risk if better jobs are more likely to offer group insurance and individuals with better jobs live longer.

We also add control variables for risk aversion in column (10), since risk aversion has been emphasized in the literature as an important source of selection based on preferences (Chiappori 2008, Finkelstein 2008, de Meza and Webb 2001, Wambach 1997). Variables ra2, ra3, and ra4 are dummies for the different levels of risk aversion inferred from the gambling questions in the HRS survey, with the omitted category ra0 for the least risk averse and ra4 for the most risk averse (see Appendix B for the survey design). Interestingly, the estimate of the conditional correlation remains quantitatively the same as in column (6), suggesting that selection on risk preference is probably not a concern in the life insurance market. 22

To summarize, Table 3 provides evidence for the presence of adverse selection in the life insurance market. 23 In our preferred specification, the significant and positive estimate of the conditional correlation between mortality risk and insurance coverage suggests that individuals with a higher mortality risk are 18 percent more likely to take up insurance coverage than those with a lower mortality risk, after risk classification is properly accounted for. The result remains robust even after controlling for marital status, income and wealth quartiles, access to group insurance, and risk aversion.

Ideally, we would like to examine the conditional correlation between the amount of insurance purchased and subsequent mortality. HRS however did not ask specifically for the amount of individual term insurance that a respondent held, though it asked her to report the amount of term insurance owned. Since term insurance can be either individual or group and the group market has very different underwriting procedures than the individual market, we cannot investigate the conditional correlation on the intensive margin.

We also estimate model (2) with a set of dummy indicators for self-perceived mortality risk instead of the actual mortality outcome, as in Cawley and Philipson (1999).
Almost no coefficient estimates are significant and we cannot detect any sign patterns. Since the self-perceived mortality risk is a somewhat controversial measure of an individual's private information (Gan, Hurd and McFadden, 2005), we choose not to emphasize this information in our analysis (results not reported but available upon request). Figure 1 displays a histogram of the self-perceived mortality risk of the entire HRS sample and clearly indicates several focal points in individuals' reporting of their self-perceived mortality risk. In particular, nearly half of the respondents reported either 50% or 100% as the chance to live to age 75. These responses are unlikely to represent the true underlying subjective mortality risk since the latter should be continuous. They may suggest that individuals have difficulties in answering probability questions and thus tend to choose psychologically appealing numbers. The resulting variable therefore may contain considerable measurement error as a measure of the private information that a respondent holds. The estimates of the conditional correlation between insurance coverage and risk may suffer from attenuation bias due to the measurement error in the self-perceived mortality risk.

22 Coincidentally, Fang et al. (2006) also find that risk attitude does not seem to play a role in the Medigap market. We also use the imputed measure of risk aversion by Kimball et al. (2007) and the results are almost identical.

23 We also estimate Probit versions of model (1), and both the estimates and the standard errors of $a_1$ are almost identical to the ones reported in Table 3.

2. Potential bias induced by excluding part of the original sample

As argued earlier, we choose to examine the group of the potential new buyers to avoid a possible downward bias in the estimate of the conditional correlation between risk and coverage due to the sample selection issue caused by potential differential mortality. However, restricting the analysis to the group of the potential new buyers may also create another sample selection problem, since this strategy excludes from the analysis those individuals owning individual term life insurance in 1992. Conditional on mortality risk and pricing factors, individuals excluded could have experienced a positive shock in the past purchase decision (i.e. a larger disturbance term in a model describing the past purchase decision prior to the 1992 interview). Individuals included in our main analysis, the group of the potential new buyers, could instead have experienced a negative shock. If these shocks were correlated over time, excluding part of the sample based on the magnitude of the past shocks effectively results in a selection on the dependent variable.
As is well known, selection on the dependent variable could potentially introduce severe bias in the coefficient estimates of the explanatory variables. Using a Heckman two-step procedure, we provide evidence that this potential sample selection does not threaten our empirical strategy. The key to performing a Heckman test of sample selection rests in finding an exclusion restriction variable that predicts the outcome in the selection equation but does not enter the main equation of interest.

The HRS cohort reported detailed information on individual marital history, including the current marital status, the starting year of the current marriage for those who were currently married and the starting and ending years of the most recent marriage for those who were not currently married. In addition, every respondent in this cohort reported the starting and ending years of up to three earliest marriages. Since more than 99% of the respondents reported having four or fewer marriages, we are able to construct a complete marital history profile for almost the entire sample. In particular, we create a set of dummy variables indicating whether the respondent was married now (i.e. in wave 1992), five years ago, 10 years ago, and so on up to 30 years ago. Among these variables, we find that marital status five years ago is a significant predictor of individual term life insurance ownership in wave 1992. This variable, however, should not affect whether an individual chose to obtain coverage or remained uninsured between waves 1992 and 1994, controlling for her marital status in 1992. It therefore serves well as an exclusion restriction variable, and we utilize it to test our main results based on the group of the potential new buyers for bias from excluding part of the cross-sectional sample. 24

Formally, we implement the following Heckman two-step procedure:

$$\text{Prob}(\text{selection}_i = 1) = \Phi(\beta_0 + \beta_1\,\text{mortality}_i + X_i P + \beta_3 m + \beta_4\, m5 + v_i) \tag{2}$$

$$\text{new\_buyer}_i = a_0 + a_1\,\text{mortality}_i + X_i B + a_3 m + \gamma\lambda + e_i \tag{3}$$

where equation (2) is the selection equation and equation (3) is the main equation of interest. The variable selection is a binary indicator that takes the value of 1 if an individual did not own individual term insurance in 1992 and takes the value of 0 if she did. Those with selection equal to 1 are the potential new buyers and are included in our main analysis to recover the conditional correlation between coverage and risk. The variable mortality and the pricing control set X are defined as before. m represents a dummy variable indicating whether the individual was currently married in 1992 and m5 indicates whether she was married five years before the wave 1992 interview. λ is the inverse Mills ratio estimated from the selection equation.
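To make the role of λ concrete, the sketch below computes the inverse Mills ratio in its standard textbook form for observations that are selected into the sample, λ(z) = φ(z)/Φ(z), where z is the fitted index from a Probit selection equation. The text does not spell out the exact formula used, so this form, the function names, and the example index values are assumptions made here for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z):
    """lambda(z) = pdf(z) / cdf(z), for observations with selection = 1."""
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical fitted Probit indices for three selected individuals.
fitted_indices = [-1.0, 0.0, 1.0]
lambdas = [inverse_mills_ratio(z) for z in fitted_indices]
print(lambdas)  # decreasing in the index
```

In a two-step procedure of this kind, the computed values would be added as an extra regressor in the second-stage equation, and the t-statistic on its coefficient is what the selection test examines.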
Selection on the dependent variable can cause bias in the coefficient estimates in the main equation if the error term $v_i$ in the selection equation is correlated with the error term $e_i$ in the main equation. A simple t-test of the coefficient estimate of the inverse Mills ratio λ is a valid test for bias caused by sample selection. Table 4 presents the results of the Heckman test, estimating the selection equation as a Probit model and the main equation of interest as a linear probability model. Column (1) reports the result of the first stage and column (2) that of the second stage. As shown in column (1), whether an individual was married five years ago is a significant predictor of whether she is selected in the first stage. Column (2) presents a statistically insignificant estimate of the coefficient of the inverse Mills ratio (with a p-value of 0.637), thus relieving the concern about a potential bias in the estimate of the parameter of interest caused by excluding part of the entire sample. In addition, the estimate of the conditional correlation between risk and coverage is similar to what we obtain in the main analysis.

24 Ideally, we would like to have a continuous variable that can serve as the exclusion restriction variable. We however cannot find such a variable. One variable that we tried, but that had no predictive power in the selection equation, is earnings from past employment.

3. Timing of Purchase

The binary indicator for death in a given time window is a coarse measure of the mortality risk, since it treats an individual who died, for example, in 2004 (more than 10 years from the baseline year 1992) and another individual who died in 1994 (almost immediately after 1992) as having the same mortality risk. In this subsection, we break down this mortality risk measure into dummy indicators for mortality outcome between two consecutive waves and estimate the following model:

$$\text{new\_buyer}_i = a_0 + \sum_{t \in \{1994, 1996, \ldots, 2004\}} a_t\,\text{mort\_}t_i + X_i B + e_i \tag{4}$$

where new_buyer and the control set X are defined as in model (1). 25 mort_t is a dummy variable indicating whether a respondent died between waves t-2 and t, with t taking the values of 1994, 1996, 1998, 2000, 2002 and 2004, which correspond to the respective interview waves. An individual who died in a later year is considered to be of lower mortality risk. The $a_t$'s are the parameters of interest, measuring how likely individuals who died at different points of time would take up insurance relative to those who survived beyond the sample period, thus measuring how mortality risk affects the insurance take-up decision. If individuals have an information advantage over the insurers and they incorporate this private information in the decision to purchase life insurance, the $a_t$'s should be monotonically decreasing as t grows, since such individuals should try to make the purchase shortly before their death.
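The construction of the wave-specific mortality indicators can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function name is hypothetical, and the rule that a death in calendar year y is assigned to wave t when t-2 < y <= t is an assumption made here, since the text does not state how deaths are matched to exact interview dates.

```python
WAVES = (1994, 1996, 1998, 2000, 2002, 2004)


def mortality_dummies(death_year):
    """Return the mort_t indicators for one respondent.

    death_year is None for a respondent who survived beyond 2004.
    Assumption: a death in year y falls in wave t if t - 2 < y <= t.
    """
    return {
        f"mort_{t}": int(death_year is not None and t - 2 < death_year <= t)
        for t in WAVES
    }


# Hypothetical respondents: one who died in 1995, one who survived.
died_1995 = mortality_dummies(1995)
survivor = mortality_dummies(None)
print(died_1995)
print(survivor)
```

Under this rule each deceased respondent switches on exactly one indicator, and survivors form the omitted reference group against which the coefficients in model (4) are measured.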
Table 5 presents the estimation results of model (4). The columns here correspond to the ones in Table 3. Columns (1) and (2) report results based on the entire cross-sectional sample. None of the estimates of the parameters of interest is significant, nor is monotonicity observed. Columns (3) to (10) document the results from examining the group of the potential new buyers.

25 Again, estimating Probit versions of model (4) produces results very similar to those of the linear probability models.

With the preferred specification with only the pricing control set X included, column (6) shows the coefficient estimates of mort_1994, mort_1996 and mort_1998 to be 0.131, 0.086 and 0.067, corresponding to a 69%, 45% and 35% increase in the take-up rate for those individuals who died within two, four or six years respectively of the base time period. The first two estimates are significant at the 5% level and the last one is marginally significant with a p-value of 0.11. The coefficient estimates for mort_2000, mort_2002, and mort_2004 are not statistically different from zero. The rest of the columns similarly reveal a clear pattern of decreasing monotonicity in the coefficient estimates of the dummy indicators for mortality risk. This consistent pattern across different specifications strongly suggests that an individual has private information about her mortality risk and that she utilizes this information in the decision to purchase life insurance shortly before her death.

The above breakdown of the mortality risk measure also serves as an interesting diagnosis of the timing of purchase by individuals with private information. The results suggest that individuals are most likely to take up life insurance approximately four to six years before their expected death. 26 We therefore define an alternative measure of mortality risk as an indicator for whether a potential new buyer had died by wave 1998. 27 Table 7 presents the estimation results of model (1) from the main analysis using this more refined measure of mortality risk. The columns here correspond to the ones in Tables 3 and 6. Again, the estimates of the parameter of interest from the cross-sectional sample are still insignificant. And just as expected, we estimate a much larger conditional correlation between risk and coverage using this new measure with the group of potential new buyers.
In the preferred specification with only the pricing controls included, the point estimate 0.086 implies a 45% increase in the take-up rate among individuals who died within four to six years after purchase, significant at the 1% level. The estimate remains robust after controlling for marital status, income and wealth quartiles, access to group coverage and risk aversion.

26 This finding may actually explain the fact that 5-year level term insurance accounts for the highest market share of all individual term life insurance (34.8%) (LIMRA, 1997).

27 The length of the time window of 12 years in the main analysis is somewhat arbitrary in the sense that it is determined by the length of the sample period.

V. Conclusion

This paper finds evidence for the presence of adverse selection in the life insurance market, contrary to the conclusion reached in the previous literature. Individuals with a higher mortality risk were about 18% or 45% more likely to take up individual term life insurance than those with lower mortality risk, depending on the length of the time windows in which mortality risk is defined. We also present evidence suggesting that the residual private information held by the insured is more precise in the nearer future, and that this information dissipates rapidly as time passes. This study has the potential to revise the existing view of the informational nature of the life insurance market, which has been perceived as an important example of markets without adverse selection. The major methodological contributions of this paper are twofold. First, we address the possible sample selection issue caused by potential differential mortality between individuals with coverage and those without coverage, focusing the analysis on the group of the potential new buyers. Second, this paper addresses the omitted variables problem by including a detailed set of relevant controls for health status, medical history and family history that are important pricing factors considered in life insurance underwriting. One caveat of our strategy is that we may create another sample selection issue by basing the analysis on the group of the potential new buyers, thus excluding part of the original cross-sectional sample. The conclusion that this sample selection does not threaten our empirical strategy depends on the strong distributional assumption of normality of the error terms in the Heckman two-step procedure. We would also like to have a continuous exclusion restriction variable in the test rather than the dummy variable that we use, if we could have found one. We therefore do not claim our results as definitive. 
Some of the findings in the previous literature remain convincing, for example, the evidence for bulk discounts illustrated in Cawley and Philipson (1999). It remains to be understood, however, why multiple contracts are allowed in life/annuity insurance markets while not in most other insurance markets, since multiple contracting directly implies non-convex pricing. The lesson from this paper is that we need better data and empirical strategies, or even better theory, before reaching a definitive conclusion about the informational nature of the life insurance market.
