AN ECONOMETRIC ANALYSIS OF SMOKING BEHAVIOUR IN IRELAND
Aileen Murphy, Department of Economics, UCC, Ireland.
DEPARTMENT OF ECONOMICS WORKING PAPER SERIES
SECTION 1: INTRODUCTION

Smoking is a dangerous habit, and although information detailing its consequences is widely available, people continue to try tobacco products and become addicted. This addictive behaviour is one of the largest avoidable causes of death and disability in the developed world, with over 6,000 deaths annually in Ireland (VHI, 2005). While the number of smokers has decreased in recent times, a significant number of regular smokers remains. The profile of the smoker is, however, changing (Layte et al, 2002). Firstly, at an international level, the smoking epidemic is reaching its final stages in developed countries but is just taking off in developing countries (Edwards, 2004). Secondly, from an Irish perspective, the number of smokers has decreased, but this decrease has not been uniform across social classes (Layte et al, 2002).

This paper, similar to previous literature (Layte et al 2002, 2004), focuses on the economics of smoking from an individual perspective by identifying which socioeconomic factors influence smoking behaviour and its frequency in Ireland. While Layte et al (2002) used data from the 1998 Living in Ireland Survey and employed OLS, this paper uses data from the 2001 Living in Ireland Survey and employs count data models, as OLS may yield inconsistent, biased estimators. Layte et al (2002) identified a negative relationship between age and the probability of smoking, yet a positive relationship between age and the number of cigarettes smoked daily. They also identified that being female had a negative relationship with the number of cigarettes smoked daily.
Following on from this, Layte et al (2004) found that being currently unemployed did not influence the probability of being a smoker, while having been unemployed in the previous five years was a considerable positive indicator of smoking; a Spanish study by Manrique et al (2004) reports similar findings. Tobacco, once viewed as a normal good, has become an inferior good: preferences have changed and income is spent on alternative goods. Layte et al (2004) identified that people on lower incomes were 50% more likely to smoke than those in the top income brackets, demonstrating a transformation in the income/consumption relationship with respect to tobacco in Ireland. However, the OLS regression (Layte et al 2002) revealed a statistically insignificant relationship between the log of equivalent income and the number of cigarettes smoked daily. Other studies (Layte et al, 2004; Conniffe, 1995) demonstrate that income is no longer related to smoking to any significant degree in Ireland. In contrast, Manrique et al (2004) found that Spanish households with more income available were more likely to smoke.

Thus the literature reveals a positive relationship between age and the number of cigarettes smoked, and a negative relationship between being female and the number of cigarettes smoked. In addition, lower social class groupings and low education levels were found to have a positive relationship with the probability of smoking, whereas age and being female were found to have a negative relationship with the probability of smoking.

This paper will identify the socioeconomic factors that influence the number of cigarettes smoked daily and that are common among smokers in Ireland. The data source is discussed in Section 2. Section 3 considers the econometric techniques utilised in the study. Section 4 reports and discusses the main results. Section 5 concludes the paper.

SECTION 2: DATA

The Living in Ireland Survey is the Irish component of the European Community Household Panel (ECHP), consisting of cross sectional surveys conducted annually by Eurostat from 1994 onwards. The survey involves both a household and an individual questionnaire. For the purpose of this paper the individual questionnaire for the 8th wave (2001) is utilised.
This questionnaire was completed by each adult (persons aged 16 or over) in the household. The Register of Electors was used as the sampling frame for the study. As this is a sample of persons, not households, the sample is reweighted against the actual population so that it is representative of the population at the time.

Since 1998 Eurostat has included a number of questions relating to smoking behaviour; these form the basis of the dependent variable. The questionnaire asks respondents whether they smoke and how many cigarettes they smoke (or smoked) on average each day. Answers are integers, ranging from 0 to 95. The independent variables include income, gender, age, education, marital status, medical insurance, medical card eligibility and occupation. Table 1 contains a description and the summary statistics for the dependent and independent variables used.

Table 1: Description of Variables & Summary Statistics

Variable Name          Mean    Std. Dev    Min    Max

DEPENDENT VARIABLES
  f1: Number of cigarettes smoked daily
INDEPENDENT VARIABLES
  age
  gender
  log income
  Employment
  Medical card
  Medical insurance
  Education
    no formal exams
    junior cert
    leaving cert
    diploma/degree
    higher degree
  Social Class Variable
    unknown
    unskilled
    manual
    managerial
SECTION 3: METHODOLOGY

This section outlines the econometric models employed to determine the socioeconomic factors that influence the frequency of Irish smoking behaviour. The number of cigarettes smoked is reported as an integer count variable with a high proportion of zeros and low values. While Layte et al (2002) employed OLS, this paper employs count data models, as OLS could yield inefficient, inconsistent, biased estimates (Long, 1997). Secondly, while two-part or hurdle models are appropriate for categorical dependent variables, they are problematic (Jones, 2007). As highlighted by Santos Silva and Windmeijer (2001), the two-part model is not appropriate where it cannot be assumed that there is a single incidence of smoking for each period of observation in the data.

The most basic alternative to OLS is the Poisson model, in which the probability of a count, such as the number of tobacco products smoked daily, is determined by a Poisson distribution as follows (Nolan et al 2003):

$$\Pr(Y = y_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots \qquad (3.1)$$

Where $y_i$ are the observed frequencies of the event $Y$, which is the frequency of smoking cigarettes/cigars/pipes, and $\lambda_i$ is a function of the set of independent variables. The Poisson model's central characteristic is that the conditional mean of the outcome is equal to the conditional variance. This is often not the case in practice, where the conditional variance exceeds the conditional mean, resulting in overdispersion (Long, 1997), which would lead to consistent but inefficient parameter estimates.

To resolve this problem a negative binomial regression model is introduced. As an extension of the Poisson model with an additional parameter, it allows the conditional variance of $y$ to exceed the conditional mean and thus tolerates overdispersion. The probability distribution of the negative binomial is as follows:

$$\Pr(Y = y_j \mid u_j) = \frac{\exp(-\lambda_j e^{u_j})\,(\lambda_j e^{u_j})^{y_j}}{y_j!} \qquad (3.2)$$

Where $y_j$ is the observed frequency of smoking and $j$ equals cigarettes, cigars or pipes. $e^{u_j}$ has a gamma distribution with mean and variance equal respectively to 1 and $\alpha$.

There is often considerable unobservable heterogeneity between individuals who report zero levels of consumption. For example, with respect to smoking, there are non-smokers who will never smoke. In addition, there are potential smokers who may report zero consumption due to economic decisions, such as financial barriers, but who may smoke in the future. As it is not possible to differentiate between non-smokers and potential/actual smokers due to lack of information, standard methods cannot be employed. Consequently, models using a mixture of discrete distributions (Greene, 1994) and the zero altered models (Lambert, 1992) can be used instead (Grootendorst, 1995).

The zero-inflated model assumes there are two unobserved groups; that is to say, the zeros can be divided into two groups. The first is the Always-0 group, which has an outcome of 0 with probability 1 and includes the non-smokers. The second, the Not Always-0 group, might have a zero count, but there is a non-zero probability that its members have a positive count; this group includes potential and actual smokers (Long and Freese, 2001).

This proposed model consists of two behavioural processes (Grootendorst, 1995). Firstly, there is a splitting function, which estimates the conditional probability that an individual is either a non-smoker or a potential smoker. Secondly, there is a Poisson or negative binomial model for the potential/actual smoker. For non-smokers zero consumption is automatic, irrespective of price. However, potential and actual smokers do respond to prices and income in their consumption decisions.
The probability $q_i$ of being a non-smoker is made conditional on a vector of covariates $z_i$:

$$q_i = \Pr(\text{non-smoker}) = F(z_i'\delta) \qquad (3.3)$$

Where $F(z_i'\delta)$ is a cumulative distribution function.
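The two processes described above can be illustrated with a minimal sketch. It assumes a logistic form for the splitting function F and a Poisson density for the count process; the negative binomial could be substituted. The function names and the numeric values are hypothetical and are not estimates from this paper.

```python
import math

def logistic_cdf(t):
    """Logistic CDF, one possible choice for the splitting function F."""
    return 1.0 / (1.0 + math.exp(-t))

def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam, computed on the log scale."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def zip_pmf(y, q, lam):
    """Zero-inflated Poisson probability of observing count y.

    q   : probability of belonging to the Always-0 group (non-smokers)
    lam : Poisson mean for the Not Always-0 group (potential/actual smokers)
    """
    if y == 0:
        # A zero can come from either group.
        return q + (1.0 - q) * poisson_pmf(0, lam)
    # A positive count can only come from the Not Always-0 group.
    return (1.0 - q) * poisson_pmf(y, lam)

# Hypothetical inputs, for illustration only.
z_delta = 0.5                    # assumed value of the index z_i' * delta
q = logistic_cdf(z_delta)        # probability of being a non-smoker
lam = 10.0                       # assumed mean count for potential smokers
probs = [zip_pmf(y, q, lam) for y in range(96)]  # counts 0..95, as in the data
```

Under these assumptions the probability of a zero is always larger than q, because potential smokers also contribute some zeros; this is the excess-zero feature the model is designed to capture.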
Either the Poisson or the negative binomial could be used to model consumption for potential smokers (Grootendorst, 1995), where the mean of the distribution $\lambda_i$ is made conditional on another set of covariates, $x_i$, using the transformation $\lambda_i = \exp(x_i'\beta)$.

To assess the performance of each of the alternative count models in modelling the frequency of smoking behaviour, and to select one, the Vuong (1989) non-nested model selection criterion is first applied (Greene, 2000: 891 and Grootendorst, 1995). To test two competing probability models ($f_1$ and $f_2$) the statistic $V$ is computed:

$$V = \frac{\sqrt{N}\,\bar{m}}{s_m} \qquad (3.4)$$

Where $m_i = \log[f_1(y_i) / f_2(y_i)]$, and $\bar{m}$ and $s_m$ are the mean and standard deviation of $m_i$. This statistic tests the null hypothesis that $E[m_i] = 0$. An advantage of this test is its ability to discriminate between the different models: a large positive value, e.g. greater than 1.96, favours model 1 and a large negative value favours model 2 (Grootendorst, 1995). This statistic will be used to compare the Poisson model with the zero-inflated Poisson, and the negative binomial model with the zero-inflated negative binomial. A likelihood ratio test using a chi-squared statistic is used to compare the Poisson model with the negative binomial regression model, and the zero-inflated Poisson with the zero-inflated negative binomial.
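The two model selection tools above can be sketched as follows. This is an illustration rather than the estimation code behind the results: it assumes the per-observation log-likelihoods of the two fitted models are already available, uses the 1/N form of the standard deviation, and treats the likelihood ratio statistic as chi-squared with one degree of freedom (the single extra dispersion parameter). The function names and the toy inputs are hypothetical.

```python
import math

def vuong_statistic(loglik_1, loglik_2):
    """Vuong statistic V = sqrt(N) * mean(m) / sd(m), with m_i = log f1(y_i) - log f2(y_i).

    loglik_1, loglik_2: per-observation log-likelihoods under models 1 and 2.
    """
    n = len(loglik_1)
    m = [a - b for a, b in zip(loglik_1, loglik_2)]
    m_bar = sum(m) / n
    s_m = math.sqrt(sum((m_i - m_bar) ** 2 for m_i in m) / n)
    return math.sqrt(n) * m_bar / s_m

def lr_test_1df(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic and chi-squared(1) p-value.

    For one degree of freedom the chi-squared upper tail probability
    equals erfc(sqrt(x / 2)), so only the standard library is needed.
    """
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value

# Toy inputs, for illustration only (not values from the survey data).
ll_model_1 = [-1.0, -2.0, -1.5, -0.5]
ll_model_2 = [-1.2, -2.5, -1.4, -1.0]
v = vuong_statistic(ll_model_1, ll_model_2)

# Hypothetical total log-likelihoods for a restricted and an unrestricted model.
lr, p = lr_test_1df(-110.0, -100.0)
```

Because the dispersion parameter lies on the boundary of its parameter space under the null hypothesis, some software reports half of this p-value. The sketch ignores that adjustment.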
SECTION 4: RESULTS: Factors affecting Frequency of Smoking

In examining the factors that influence the frequency of smoking behaviour, count data models are employed, as the dependent variable is an integer. To determine the appropriate count model, likelihood ratio tests and Vuong statistics are employed; the results of the specification tests are presented in Table 2. With respect to the number of cigarettes smoked daily, a likelihood ratio test (χ² = 8.2e+04) between the Poisson and negative binomial models favours the negative binomial. This demonstrates the presence of overdispersion in the data. Using the Vuong test for non-nested hypotheses (V = 35.83), the negative binomial is rejected in favour of the zero-inflated negative binomial. A second likelihood ratio test rejects the zero-inflated Poisson model in favour of the zero-inflated negative binomial: it rejects the null hypothesis of no overdispersion, indicating that the zero-inflated negative binomial improves goodness of fit over the zero-inflated Poisson. Consequently, the zero-inflated negative binomial is used for the regression, and its results are presented in Table 2.

The zero-inflated negative binomial output consists of two groups of coefficients. The group labelled Binary Equation, contained in the Logit columns, gives the change in the odds of being in the Always-0 group compared to the Not Always-0 group. Being male decreases the odds of not having the opportunity to smoke cigarettes by 17%, holding all else constant. With regard to education, having a junior certificate or equivalent increases the odds of not having the opportunity to smoke cigarettes by 46% in comparison to having no formal exams. Having a diploma or degree increases these odds by 63%, and having a higher degree increases them by 139%, again in comparison to having no formal exams. This relationship is consistent with Mondon et al (2003), Dedobbeleer et al (2004) and Layte et al (2002). Being married decreases the odds of not having the opportunity to smoke cigarettes by 20% relative to not being married.
Table 2: Zero Inflated Negative Binomial: Frequency of Cigarette Smoking

Columns: Variable; Negative binomial (Count Eqn): Coefficient, Z Statistic, % Change; Logit (Binary Eqn): Coefficient, Z Statistic, % Change

  age *
  male * *
  manual
  unskilled
  unknown
  loginc *
  Junior cert * *
  Leaving cert
  Diploma/degree * *
  Higher degree *
  married * *
  employment *
  Medical card * *
  Medical ins *
  constant

LR chi2:
McFadden's R2:
McFadden's Adj R2:

Model Specification Tests:
  Likelihood Ratio test: H0 Poisson, H1 Negative binomial: χ² = 8.2e+04
  Vuong Non-nested test: H0 Poisson, H1 ZIP: Z =
  Vuong Non-nested test: H0 Negative binomial, H1 ZINB: Z =
  Likelihood Ratio test: H0 ZIP, H1 ZINB: χ² =

n = 6518
* indicates significance at 5% level

The second group, labelled Count Equation, contained in the Negative Binomial columns, consists of the coefficients for the percentage change in the expected count for those in the Not Always-0 group. This group contains individuals who have the opportunity to smoke cigarettes.
Among those who have the opportunity to smoke cigarettes, being male increases the mean number of cigarettes smoked by 27%, holding all other factors constant. This positive relationship is consistent with Layte et al (2002). A positive relationship between age and the expected number of cigarettes smoked daily is also revealed: for a one-unit increase in age, the expected number of cigarettes increases by 0.5%, holding all else constant. This relationship is consistent with Layte et al (2002), indicating that older smokers smoke more.

With respect to education, attaining the junior certificate decreases the expected number of cigarettes smoked by 6.3%, holding all other factors constant. In addition, having a diploma/degree decreases the expected number of cigarettes smoked by 10% in comparison to having no formal education. Having a higher degree also decreases the expected number of cigarettes smoked by 10%, relative to having no exams, holding all else constant.

Similar to Layte et al (2002), the log of income was found to be statistically insignificant at the 5% level of significance. However, having a medical card increases the mean number of cigarettes smoked by 6.5%, holding all else constant. As eligibility for a medical card is means tested, having a medical card is a proxy for lower income status. Thus, this positive relationship between holding a medical card and the number of cigarettes smoked daily is inconsistent with previous Irish literature.

Being married increases the mean number of cigarettes smoked by 14% relative to being unmarried, holding all else constant. Being employed for more than 15 hours per week, relative to not having employment, decreases the expected number of cigarettes smoked daily by 16%, holding all else constant. The direction of this relationship is consistent with Layte et al (2004).
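The percentage changes discussed in this section can be related to the underlying coefficients with a short sketch. It assumes the usual conversion for models with an exponential mean or a logit link, 100 × (exp(β) − 1), which is the convention described by Long and Freese (2001); whether the reported figures were produced exactly this way is an assumption. The coefficients computed below are implied by the percentages quoted in the text and are not values taken from Table 2.

```python
import math

def percent_change(coefficient):
    """Percentage change in the expected count (or in the odds) for a one-unit
    increase in a covariate, given its coefficient."""
    return 100.0 * (math.exp(coefficient) - 1.0)

def coefficient_from_percent(percent):
    """Inverse of percent_change: the coefficient implied by a percentage change."""
    return math.log(1.0 + percent / 100.0)

# Percentage changes quoted in the text for the count equation.
reported = {"male": 27.0, "age": 0.5, "medical card": 6.5, "employment": -16.0}

# Coefficients implied under the assumed conversion (illustrative only).
implied = {name: coefficient_from_percent(pct) for name, pct in reported.items()}

for name, beta in implied.items():
    print(f"{name}: implied coefficient {beta:.4f}, "
          f"percentage change {percent_change(beta):.1f}%")
```

The same conversion applies to the Binary Equation, where the percentage change refers to the odds of being in the Always-0 group, not to the expected count.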
SECTION 5: CONCLUSIONS

This study identifies the socioeconomic factors that influence the frequency of smoking, thereby identifying commonalities among smokers. While the results are in line with previous studies, in particular Layte et al (2002), a more comprehensive set of factors was controlled for using more recent data. Secondly, this study employs count data models, as the method employed in previous studies, OLS, is inappropriate due to the potential for inconsistent, biased estimators.

The model specification results supported the use of the zero-inflated negative binomial model. This count model revealed positive relationships between age, being male, medical card eligibility, employment and being married, and the expected number of cigarettes smoked daily, while higher educational attainment had a negative relationship with the expected number of cigarettes smoked daily.

In conclusion, the use of count data models illustrates the commonalities among those in society who are most at risk of becoming addicted to smoking and of smoking heavily. To protect current and future generations, particularly in prosperous times when financial barriers are a weak deterrent for young people, such commonalities should be acknowledged in the formulation of anti-smoking campaigns, so as to further reduce and ultimately eliminate smoking, in line with previous campaigns and initiatives.
REFERENCES:

Conniffe, D. (1995). Models of Irish Tobacco Consumption. The Economic and Social Review. 26 (4):

Dedobbeleer, et al. (2004). Gender and the Social Context of Smoking Behaviour. Social Science and Medicine. 58: 1-12.

Edwards, R. (2004). The Problem of Tobacco Smoking. British Medical Journal. 328:

Greene, W.H. (2000). Econometric Analysis. Fourth Edition. McGraw-Hill, New York.

Greene, W.H. (1994). Accounting for excess zeroes and sample selection in Poisson and negative binomial regression models. Department of Economics Working Paper EC, Stern School of Business, New York University.

Grootendorst, P.V. (1995). A Comparison of Alternative models of prescription drug utilisation. Health Economics. 4:

Jones, A.M. (2007). Panel data methods and applications to health economics. HEDG Working Paper 07/18.

Lambert, D. (1992). Zero inflated Poisson regression with an application to defects in manufacturing. Technometrics. 34: 1-14.

Layte, R., Russell, H., McCoy, S. (2002). The Economics and Marketing of Tobacco: An Overview of the Existing Published Evidence. Policy Research Series. Number 46. Economic and Social Research Institute, Dublin.

Layte, R., Whelan, C. (2004). Explaining Social Class Differentials in Smoking: The Role of Education. Working Paper No. 12. Economic and Social Research Institute, Dublin.

Long, J.S. (1997). Regression models for categorical and limited dependent variables. Advanced Quantitative Techniques in the Social Sciences No. 7. Sage Publications, Thousand Oaks, CA.

Long, J.S., Freese, J. (2001). Regression models for categorical dependent variables using Stata. Stata Press, College Station, Tex.

Manrique, J., Jensen, H.H. (2004). Consumption of Tobacco and Alcoholic beverages among Spanish Consumers. Southwestern Economic Review. 31 (1): 41-56.

Mondon, C.S.S. et al. (2003). Partner's and Own Education: Does Who You Live With Matter for Self-Assessed Health, Smoking and Excessive Alcohol Consumption? Social Science & Medicine. 57:

Nolan, A., Nolan, B. (2003). A Cross Sectional Analysis of the Utilisation of GP Services in Ireland: Working Paper No. 1. Economic and Social Research Institute, Dublin.

Pan, Z. (2004). Socioeconomic predictors of smoking and smoking frequency in urban China: evidence of smoking as a social function. Health Promotion International. 19 (3):

Santos Silva, J.M.C., Windmeijer, F. (2001). Two-part multiple spell models for health care. Journal of Econometrics. 104:

VHI (2001, 2005). Smoking: why you should quit. Available from Internet: URL: cited 13/1/06.

Vuong, Q. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica. 57: