How much do E-mail users feel annoying with spam mail? : measuring the inconvenience costs of spam

Yuri Park*¹, Yeonbae Kim¹, Jeong-Dong Lee¹, Jongsu Lee¹

¹ Ph.D. candidate, research professor, associate professor, and assistant professor, respectively, in the Techno-Economics and Policy Program, Seoul National University, Seoul, Korea
* Corresponding author (Techno-Economics and Policy Program, Seoul National University, Shillim-Dong, Kwanak-Ku, Seoul 151-742, South Korea; E-mail: koon1225@snu.ac.kr; Phone: +82-2-880-8890; Fax: +82-2-886-8220)

JEL classification: L86; C11; C25; C42; D18
Key Words: Spam mail, negative externality, inconvenience cost, conjoint analysis

I. INTRODUCTION

Some thirty billion e-mail messages per day travelled the Internet in 2003; by 2006 that number could rise to sixty billion, and more than half of those messages could be spam (OECD, 2003). Definitions of spam vary, but broadly speaking, we can say that spam comprises all unsolicited or unwanted commercial electronic messages sent to a large number of users without regard to the identity of the individual user (ITU, 2004). We believe most people are annoyed with the volume of obscene, commercial, or other types of spam they receive. Besides the psychological costs involved, those on the receiving end of spam incur various costs such as decreased labour productivity,

wasted time, the potential to lose useful e-mail mixed in with spam, and wasted bandwidth occupied by spam. Perhaps the most serious problem is that spam could eventually play a part in collapsing the foundation of trust supporting online society.

Part of the reason e-mail boxes are flooded with spam is that, in determining which mailing lists to target with their advertising, spammers do not take into consideration the costs consumers incur in processing the messages. Spammers therefore impose a negative externality on e-mail users; they send messages to users who may not want them but are forced to read them in order to find other valuable messages. Moreover, the marginal costs of sending spam mail are so low that spammers have an economic incentive to send bulk spam mailings as long as just a few receivers respond.

If the costs to society of a particular mailing, including sending and receiving costs, are greater than the expected benefits to spammers and e-mail users of the messages (for example, benefits accruing to firms and to the consumers of the goods that are sold by means of the mailing), then from a social planner's point of view more messages are being sent than is optimal. We call this situation excessive message sending (Shiman, 1996, p. 37). Shiman uses theoretical modelling to show how excessive message sending exists when firms use the e-mail system to advertise their goods.

Given that spam constitutes a negative externality, we need an intervention such as a spam-control measure to internalise the externality. Van Zandt (2004) and Loder et al. (2004) suggest various alternatives to protect against spam, for example, an attention bond mechanism¹; increasing communications costs; and other various

¹ The attention bond mechanism is an economic solution to spam that allocates e-mail receivers' attention. Each receiver sets the size of an attention bond, which can be adjusted to the receiver's opportunity costs, and e-mail is delivered after the receiver's permission.

tools exist for the purpose of controlling spam, but these have limitations, such as false-positive determinations and difficulty of enforcement. At this point no one can say what the best solution is.

In order to address the spam problem and evaluate the effectiveness of spam-control alternatives, we first need to measure the magnitude of spam costs. Researchers have attempted to estimate the social costs imposed by spam in various ways; as might be expected, governments as well as the business world have a keen interest in this project.² Studies show that the various costs of spam are considerable; however, in method these studies have heretofore merely added up possible costs rather than performing a true estimation. In addition, most studies calculate the costs of dealing with spam as opportunity costs for labour. They do not consider e-mail users' disutility, despite the certainty that receivers of spam are annoyed by having to filter, delete, or read spam messages even when they are not at work. To our knowledge, almost no one is looking at estimating the disutility of spam receivers³, perhaps because it is tough to quantify such intangible losses.

In this paper, we use conjoint analysis of stated-preference data to estimate the inconvenience costs incurred by e-mail users who receive spam. Using stated-preference data gathered from e-mail users, we can directly

² According to Ferris Research (2003), spam cost U.S. corporations more than $10 billion in 2003. The Korean Information Security Agency (2003) and Nara Research (2004) reported that in Korea spam cost $11 billion and $43 billion in 2003 and 2004, respectively.
³ Yoo et al. (2003) estimated consumers' willingness to pay for a spam-blocking program using the contingent valuation method.

elicit users' valuation of the negative effects of spam, taking into account such costs as time spent, loss of useful mail, intangible psychological distress, decreased labour productivity, and the inconvenience of having to avoid using e-mail. Such an estimate of the inconvenience costs of spam should prove valuable to further research into the spam phenomenon.

II. METHODOLOGY

Conjoint Analysis

Conjoint analysis enables us to use stated choices of survey respondents to measure their preferences in hypothetical situations. In such analysis, levels of attributes describing a good or service are combined to build descriptions of hypothetical bundles. Each bundle is described on an alternative card, and respondents are asked to state their preferences by ranking the cards, rating them, or choosing one of them (Alvarez-Farizo and Hanley, 2002).

In order to estimate the cost of spam in terms of inconvenience, we combined five service attributes (and their levels) to build a description of a hypothetical e-mail service package (Table 1). We defined spam for the respondents as unwanted, commercial, bulk e-mail. The first attribute identifies the volume of spam e-mail messages received each day. We set the volume range from 10 to 50 messages per day. We assume only two types of spam, which constitute a major portion of spam mailings (KISA, 2003): spam messages with commercial content and those with obscene content. Use (or non-use) of an antispam program is included as an attribute because

such programs are an important tool in protecting against spam. Useful e-mail may not be delivered when the amount of spam exceeds the capacity limit of e-mail storage; for that reason, e-mail storage capacity is an important factor and is included as an attribute. Service price is included as an attribute in order to evaluate the other attributes in monetary terms. The survey was designed to ask respondents to rank-order a set of conjoint cards (bundles of attributes) according to their preference.

Table 1. Attributes and Levels Measured by the Conjoint Analysis Cards

Attribute                  Level                                    Description
Number of spam messages    10 messages                              Number of spam e-mail messages delivered per day.
                           30 messages                              Each spam message consists of commercial and
                           50 messages                              obscene contents.
Antispam program           Self-conducted                           Respondent is using an antispam program provided by government or a company.
                           Offered by e-mail service provider       Respondent is using an antispam program run by an e-mail service provider.
                           None                                     Respondent uses no antispam program.
E-mail storage capacity    5 Mbyte                                  The storage capacity of the e-mail box.
                           20 Mbyte
                           50 Mbyte
E-mail service price       500 won*/month (US$0.42/month)           Monthly cost of using the e-mail service under
                           1,500 won*/month (US$1.26/month)         the conditions written on the conjoint cards.
                           2,500 won*/month (US$2.1/month)

* As of June 21, 2005, US$1 is equivalent to 1,013.4 Korean won.

The survey was administered to 1,000 residents of Seoul, Korea, in May 2004; the sample was drawn on the basis of the age and sex distribution in the population of Seoul. Responses were obtained face-to-face by well-trained interviewers. The size of the sample used for empirical analysis was 537; the screening question excluded 463 respondents who did not use an e-mail service.

Model Specification
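As background for the model specified in this subsection, the following is a minimal Python sketch of the rank-ordered (exploded) logit probability of one observed ranking, the building block of the ranking probability in equation (3). It is an illustration under stated assumptions, not the authors' estimation code: the function names, the list-based representation of the attribute vectors, and the example numbers are all hypothetical.

```python
import math


def ranking_log_probability(utilities_in_rank_order):
    """Log-probability of one observed ranking under a rank-ordered logit.

    `utilities_in_rank_order` holds the deterministic utilities (beta'x) of
    the alternatives, listed from most preferred to least preferred.
    """
    log_prob = 0.0
    n = len(utilities_in_rank_order)
    # The ranking is treated as a sequence of J-1 logit choices: at step j the
    # respondent picks alternative j from the alternatives not yet ranked.
    for j in range(n - 1):
        remaining = utilities_in_rank_order[j:]
        m = max(remaining)  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(v - m) for v in remaining))
        log_prob += utilities_in_rank_order[j] - log_denominator
    return log_prob


def ranking_probability(beta, ranked_attributes):
    """Probability of a ranking given coefficients and attribute rows.

    `ranked_attributes` is a list of attribute vectors (hypothetical layout),
    one per card, ordered from most preferred to least preferred.
    """
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in ranked_attributes]
    return math.exp(ranking_log_probability(utilities))


if __name__ == "__main__":
    # Hypothetical cards described by (spam per day, price in US$ per month).
    cards = [[10, 2.10], [30, 1.26], [50, 0.42]]
    print(ranking_probability([-0.05, -0.5], cards))
```

For a respondent with several choice sets, the probabilities of the individual rankings would be multiplied (or their logs summed) across the sets, which is the product over t in equation (3).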

In this survey, each respondent was asked to rank the alternative cards in order of preference. In contingent ranking conjoint analysis, a rank-ordered logit model is generally used for the estimation (Layton, 2000; Calfee et al., 2001). Although this model has an advantage in that the ranking (choice) probability has a simple closed form, it imposes restrictions on the ordering of preferences, such that the coefficients for each attribute are estimated to be the same across all consumers. Therefore we use the random coefficient model for the purpose of estimation.

The random coefficient discrete choice model captures preference variation by introducing stochastic terms into the coefficients, created by deviations from mean preferences, and allowing these terms to be correlated with each other. With this method, the stochastic component of the utility function is correlated across the choice alternatives through the model's attributes. That is, the model does not impose the independence of irrelevant alternatives property (Calfee et al., 2001).

Procedures for estimating random coefficient discrete choice models have been developed within both the classical and Bayesian frameworks. Classical methods of estimation are generally based on maximum likelihood estimation. Problems related to the computational burden of integrating multivariate (normal) density functions are overcome using the simulated maximum likelihood estimation method. Applications of the random coefficient discrete choice model using the classical approach are presented by Brownstone and Train (1999), Hensher (2001), Layton (2000), and Calfee et al. (2001). In particular, Layton (2000) and Calfee et al. (2001) used the classical approach to estimate the random coefficient discrete choice model in the framework of contingent ranking conjoint analysis.

Allenby and Rossi (1999), Chiang et al. (1999), Huber and Train (2001), and Train (2003) have developed Bayesian approaches to random coefficient discrete

choice modeling. These methods construct a Markov chain Gibbs sampler that can be used for drawing directly from the exact posterior distribution and for performing finite-sample likelihood inference to any degree of accuracy (McCulloch and Rossi, 1994). These procedures have certain advantages over the classical approach.

First, one can avoid direct evaluation of the nontrivial likelihood function and the associated problem of approximating the choice probabilities that arises in applying the classical method. In addition, the mathematical properties of the multinomial model do not guarantee convergence of the maximum likelihood estimation process to the global maximum, and the solution obtained by the nonlinear programming optimizer may depend critically on the starting point of the search. The Bayesian procedures, therefore, offer an advantage, since they do not involve maximization of the likelihood function.

Second, the results of Bayesian procedures can be interpreted simultaneously from both the Bayesian and classical perspectives, drawing on the insights afforded by each tradition. The Bernstein-von Mises theorem states that, under the conditions maintained in this study's methods, the mean of the Bayesian posterior is a classical estimator asymptotically equivalent to the maximum likelihood estimator. The theorem also establishes that the covariance of the posterior is the asymptotic covariance of this estimator (Train and Sonnier, 2003).

Third, desirable estimation properties, such as consistency and efficiency, can be attained under more relaxed conditions using Bayesian procedures than using the classical methods. Maximum simulated likelihood is consistent only if the number of drawings used in the simulation increases with the sample size. Efficiency is attained only if the number of drawings increases faster than the square root of the sample size. In contrast, Bayesian estimators are consistent for a fixed

number of drawings used in the simulation and are efficient if the number of drawings rises at any rate with the sample size (Train, 2003).

According to the random utility framework proposed by McFadden (1974), we assume that an individual i faces a choice among J alternatives in each of T choice sets in a survey, and is asked to rank the alternatives in order of preference. In the empirical setting, the alternatives are e-mail services in Korea. We can then represent the utility that individual i derives from alternative j in choice set t as follows:

$$U_{ijt} = \beta_i' x_{ijt} + \varepsilon_{ijt} \qquad (1)$$

where $x_{ijt}$ is the vector of attributes associated with alternative j, $\beta_i$ is the vector of coefficients of the attribute vector $x_{ijt}$, and $\varepsilon_{ijt}$ is a random disturbance. The random disturbance ($\varepsilon_{ijt}$) is assumed to have an independent and identical extreme value distribution. The coefficient vector, $\beta_i$, is assumed to be distributed normally across the population with mean vector b and variance-covariance matrix W; that is, $\beta_i$ follows an unbounded normal distribution.

In our setting of contingent ranking conjoint analysis, we can adopt the same procedure for Bayesian estimation as in Train (2003). The only difference is that the probability used in the Metropolis-Hastings (M-H) algorithm is the probability of the individual's sequence of rankings, instead of the probability based on the response of the most preferred choice as in Train (2003). The probability of individual i's observed sequence of rankings among alternatives is

$$L(r_{i1}, \ldots, r_{iT} \mid \beta_i) = \prod_{t=1}^{T} \prod_{j=1}^{J-1} \frac{e^{\beta_i' x_{ijt}}}{\sum_{k=j}^{J} e^{\beta_i' x_{ikt}}} \qquad (3)$$

where $r_{it} = \{r_{i1t}, r_{i2t}, \ldots, r_{iJt}\}$ is the vector of individual i's ranking responses in choice set t, in descending order of preference (so that j indexes the alternatives from most to least preferred).

The unbounded normal distribution for the price coefficient has some undesirable properties. For example, the normal distribution for a price coefficient implies that some share of the population actually prefers higher prices. The existence of price coefficients with the wrong sign renders the model unusable for calculating the WTP (willingness to pay) and other welfare measures. Also, if the distribution of price coefficients overlaps zero, then the WTP becomes infinitely large for some customers (Train and Sonnier, 2003). In this study, we assume that the price coefficient has a log-normal distribution. This distribution has better properties in that it restricts the price coefficient to have the same sign for all respondents, and the price coefficient cannot take the value of zero. This distribution can be obtained as a simple transformation of the normal distribution of $\beta$, $C = \exp(\beta)$.

The unbounded normal distribution is also inappropriate for the coefficient of a desirable attribute that is valued by all customers. For example, it is implausible that there would be users who dislike a higher quality of service. That is, users who choose a low-quality service level priced the same as a higher-quality service level are unlikely to exist in the real world. Accordingly, we also assume that the coefficients for the antispam program and e-mail storage capacity have log-normal distributions. When a transformation is used for bounded distributions of coefficients, the utility function is specified as follows:

$$U_{ijt} = C(\beta_i)' x_{ijt} + \varepsilon_{ijt} \qquad (4)$$

where C is a transformation function. There are minor changes in the estimation procedure using this transformation. The probability of the individual's sequence of rankings, which is used in the M-H algorithm, should be changed based on the transformed $\beta_i$ (Train and Sonnier, 2003).

III. ESTIMATION RESULTS

We generated 20,000 draws with the Gibbs sampler. The first ten thousand draws were discarded. Of the second ten thousand draws, every tenth draw was retained, and the resulting one thousand retained draws were used to draw inferences. The means of the one thousand draws of b and of the diagonal elements of W are shown in Table 2. From the Bayesian perspective, these are posterior means of b and the diagonal elements of W. From the classical perspective, they represent the estimated mean and variance of the $\beta_i$ in the population. Generally, every estimated parameter is statistically significant.

Table 2. Estimation results (before transformation)

Attribute                                                 Variable   Mean (b) of β          Variance (W) of β
Number of spam messages                                   SPAM       -0.0118 (0.0090) a     0.0255 (0.0022)
Antispam program (self-conducted)                         SELF       1.2361 (0.7786)        58.3292 (15.7222)
Antispam program (offered by e-mail service provider)     PROV       0.7010 (0.5224)        6.3966 (2.4187)
E-mail storage capacity                                   CAP        0.0355 (0.0071)        0.0194 (0.0015)
E-mail service price                                      PRICE b    4.3728 (0.3224)        11.3156 (1.8689)

a (Posterior) standard deviations in parentheses.
b The PRICE variable is entered as the negative of PRICE.

Since b and W are the mean and variance of the $\beta_i$ in the population from the classical perspective, the distribution of the coefficient of each variable is obtained through a simulation process based on the estimated values of b and W. Two thousand draws of $\beta_i$ were taken from a normal distribution with mean equal to the estimated value of b and variance equal to the estimated value of W. Each draw of $\beta_i$ was then transformed to obtain a draw of the coefficients shown in Table 3.

The negative sign of SPAM means that people do not like an increasing volume of spam mail. The mean coefficients for the antispam program suggest that people prefer a self-conducted program to one offered by the e-mail service provider, but the variance of SELF is very large compared with that of PROV. Based on the median estimates, people instead prefer the antispam program offered by the e-mail service provider.

Table 3. Transformed random coefficient estimates

Variable   Mean     Variance   Median
SPAM       0.1142   0.0181     0.1286
SELF       4.3826   17.4814    0.2781
PROV       1.4416   3.0271     0.3897
CAP        0.0367   0.0204     0.0332
PRICE      2.6092   27.7390    0.0130

We now calculate the welfare change associated with changes in the level of attributes. The compensating variation associated with a one-unit increase in an attribute is the ratio of the coefficient for that attribute to the coefficient for the price. We calculate the marginal willingness to pay

(MWTP) of individual i from each draw of parameters ($\beta_i$) in the same simulation described earlier, and we obtain distributions of MWTP for a change in each attribute⁴ (see Table 4).

Table 4. Marginal Willingness to Pay (MWTP) in US$/month

Attribute                          Median
SPAM                               0.0676
Antispam program (base: none)
  SELF                             4.2464
  PROV                             5.5727
CAP                                0.0095

The WTP for a decrease of one spam message per day is US$0.0676 per month. That is, at the median, e-mail users are willing to pay about US$0.00225 per message avoided (US$0.0676 spread over a 30-day month). This can be understood as the marginal externality cost of one spam mail received. We also find that people are willing to pay US$4.25 and US$5.57 per month for a self-conducted antispam program and an antispam program offered by the service provider, respectively. The preferences of e-mail users between the two types of antispam program do not differ greatly, but having an antispam program is an important factor, as it has a relatively high WTP. The WTP for a 1-megabyte increase in storage capacity is US$0.0095 per month, and that roughly matches the price in the real world. (A user of the popular Hotmail e-mail service pays about US$19.95 per year for 2 gigabytes of capacity.)

⁴ Since we use a utility function that does not include an income term, the compensating variation for an attribute change is equal to the MWTP.

IV. CONCLUSION

In this paper, users' overall inconvenience cost created by spam is estimated from stated-preference data using conjoint analysis. First, we estimate the marginal unit cost to the receiver of spam to be about US$0.00225 for each spam e-mail message. The social costs of receiving spam can be calculated based on research into current basic statistics regarding spam, that is, how many e-mail messages are delivered, how large a share of all e-mail messages spam occupies, how many e-mail accounts people have, how many people use an e-mail service, and so on.

We infer that e-commerce may atrophy if consumers get to the point where they are unlikely to buy online products or services because they receive too many fraudulent or deceitful advertising e-mail messages.⁵ That is, we can envision a scenario under which the trust that supports online society is undermined. If an extremely low marginal sending cost and the existence of an externality are enabling the flood of spam, the best solution is to internalise that externality. One way to accomplish that is to increase the sending cost to cover the social costs of spam, including the users' inconvenience cost. Hence we see the value of an estimate of the marginal externality cost of spam.

Governments and businesses in countries worldwide are seeking solutions to the problems caused by spam and are considering various technological or legislative measures. The results of this study should prove essential for comparing alternatives and for making judgements regarding mitigating spam and increasing social welfare.

⁵ According to the International Telecommunication Union (2004), some 36 percent of spam consists of an attempt to deceptively sell something or perpetrate a scam or fraud.
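The per-message figure and the aggregation described in the conclusion can be sketched in a few lines of Python. This is only an illustration: the US$0.0676 monthly value is the reported median MWTP for SPAM, the 30-day month is the assumption implied by the reported US$0.00225, and the spam volume and user count passed to the aggregate function are hypothetical placeholders, not statistics from the study. Scaling the marginal cost linearly to many messages is also an assumption made here for simplicity.

```python
MWTP_SPAM_USD_PER_MONTH = 0.0676  # reported median MWTP for one fewer spam message per day
DAYS_PER_MONTH = 30               # assumption implied by the reported per-message figure


def per_message_cost(mwtp_per_month, days_per_month=DAYS_PER_MONTH):
    """Convert a monthly MWTP for one fewer message per day into a cost per message."""
    return mwtp_per_month / days_per_month


def aggregate_monthly_cost(cost_per_message, spam_per_user_per_day, users,
                           days_per_month=DAYS_PER_MONTH):
    """Scale the per-message cost to a population, assuming constant marginal cost.

    All inputs other than the per-message cost are hypothetical placeholders
    that would have to come from current spam statistics.
    """
    return cost_per_message * spam_per_user_per_day * days_per_month * users


if __name__ == "__main__":
    unit_cost = per_message_cost(MWTP_SPAM_USD_PER_MONTH)
    print(f"cost per spam message: US${unit_cost:.5f}")
    # Hypothetical scenario: 30 spam messages per user per day, one million users.
    total = aggregate_monthly_cost(unit_cost, spam_per_user_per_day=30, users=1_000_000)
    print(f"aggregate inconvenience cost: US${total:,.0f} per month")
```

With real figures for message volume and the number of e-mail users, the same two steps would give a rough estimate of the aggregate inconvenience cost that a higher sending cost would need to cover.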

REFERENCES

Allenby, G. M. and Rossi, P. E. (1999) Marketing models of consumer heterogeneity. Journal of Econometrics, 89, 57-78.
Alvarez-Farizo, B. and Hanley, N. (2002) Using conjoint analysis to quantify public preferences over the environmental impacts of wind farms: An example from Spain. Energy Policy, 30, 107-16.
Brownstone, D. and Train, K. (1999) Forecasting new product penetration with flexible substitution patterns. Journal of Econometrics, 89, 109-129.
Calfee, J., Winston, C., and Stempski, R. (2001) Econometric issues in estimating consumer preferences from stated preference data: A case study of the value of automobile travel time. Review of Economics and Statistics, 83, 699-707.
Chapman, R. G. and Staelin, R. (1982) Exploiting rank ordered choice set data within the stochastic utility model. Journal of Marketing Research, 22, 288-301.
Chiang, J., Chib, S., and Narasimhan, C. (1999) Markov chain Monte Carlo and models of consideration set and parameter heterogeneity. Journal of Econometrics, 89, 223-248.
Ferris Research (2003) Spam control: Problems and opportunities. San Francisco, Ferris Research.
Huber, J. and Train, K. (2001) On the similarity of classical and Bayesian estimates of individual mean partworths. Marketing Letters, 12, 257-267.
International Telecommunication Union (ITU) (2004) Spam in the information society: Building frameworks for international cooperation. Geneva, International Telecommunication Union.
Korean Information Security Agency (KISA) (2003) Spam mail: Circulated volume and its damages. Seoul, Korean Information Security Agency.

Layton, D. F. (2000) Random coefficient models for stated preference surveys. Journal of Environmental Economics and Management, 40, 21-36.
Loder, T., Van Alstyne, M., and Wash, R. (2004) Information asymmetry and thwarting spam. Technical report, University of Michigan.
McCulloch, R. and Rossi, P. E. (1994) An exact likelihood analysis of the multinomial probit model. Journal of Econometrics, 64, 207-240.
McFadden, D. (1974) Conditional logit analysis of qualitative choice behavior, in P. Zarembka (Ed.), Frontiers in Econometrics. Academic Press, NY.
Nara Research (2004) A survey on the current situation of spam mail. Seoul, Nara Research.
Organisation for Economic Co-operation and Development (OECD) (2003) Background paper for the OECD workshop on SPAM. Paris, Organisation for Economic Co-operation and Development.
Shiman, D. R. (1996) When E-mail becomes junk mail: The welfare implications of the advancement of communications technology. Review of Industrial Organization, 11, 35-48.
Train, K. E. (2003) Discrete choice methods with simulation. Cambridge University Press, Cambridge.
Train, K. and Sonnier, G. (2003) Mixed logit with bounded distribution of partworths. Working paper, University of California, Berkeley and Los Angeles.
Van Zandt, T. (2004) Information overload in a network of targeted communication. Rand Journal of Economics, 35, 542-60.
Yoo, S. H., Kwak, S. J., and Shin, C. O. (2003) Measuring the inconvenience costs of spam mail in Korea using the contingent valuation method. Applied Economics Letters, under review.