Estimating risk preferences of bettors with different bet sizes

Eberhard Feess, Helge Müller, Christoph Schumacher

Abstract

This paper utilizes unique betting market data that includes actual bet sizes to estimate the preferences of representative bettors. Our descriptive statistics support the intuition that bet sizes are to a large degree decreasing in odds. This indicates that accounting for different bet sizes is crucial, as there is no reason per se to assume that bets on longshots are associated with higher risk. In fact, the model with actual bet sizes differs considerably from the one with identical bet sizes: most coefficients for probability weighting are insignificant when assuming identical bet sizes, but highly significant with the actual bet sizes. Moreover, allowing for different bet sizes reduces the standard errors for all coefficients substantially. The performance of models based on cumulative prospect theory increases sharply when we allow for loss aversion or for two different parameters for the probability weighting, both in the gain and in the loss domain, instead of just one. We find strong evidence for risk-neutral bettors in all specifications that perform well.

Keywords: betting markets, favorite-longshot bias, estimation of risk preferences, overweighting of small probabilities, behavioral finance

JEL-classification: D14, D81, G02, G11

Corresponding author. Frankfurt School of Finance & Management, D-60314 Frankfurt, Germany. email: e.feess@fs.de
Frankfurt School of Finance & Management, email: h.mueller@fs.de
School of Economics and Finance, Massey University, Auckland, New Zealand, email: C.Schumacher@massey.co.nz
1 Introduction

Sports betting data is useful for estimating risk preferences because the risks of different bets are uncorrelated and returns are realized soon after the bets are placed. Consequently, sports betting data has often been used to estimate which of the canonical models for choices under risk (in particular expected utility theory, EUT, and cumulative prospect theory, CPT) best explains a given data set. A relatively robust result is that different specifications of CPT perform better than EUT, a finding that coincides with most experimental results on behavior under risk (see e.g. the overview by Harrison and Rutström, 2008).

Most of the literature adopts the so-called representative bettor approach, which usually estimates the parameters of the bettor's utility functional under three assumptions: (i) odds are such that bettors are indifferent between betting on the different available choices for an event, (ii) all bettors are identical, and (iii) bet sizes are independent of odds. While assumption (i) is just an equilibrium condition, assumptions (ii) and (iii) are counterfactual and are adopted either for simplicity or due to data limitations. A few recent studies drop assumption (ii) and estimate individual preferences either by using data on individual betting behavior (Andrikogiannopoulou, 2010) or by using aggregated data (Chiappori et al., 2012, Gandhi and Serrano-Padial, 2012). Our work is complementary since we keep the assumption of a representative bettor, but drop assumption (iii) by allowing for different bet sizes. To the best of our knowledge, our paper is the first to use a large data set (about 800,000 observations) that includes the actual bet sizes to compare EUT and CPT within the representative bettor approach (see the literature review below).
The descriptive statistics already reveal that accounting for different bet sizes is important: bet sizes decrease in odds to such a degree that the correlation between odds and the variance of return, which may be seen as a rough proxy for risk, is significantly negative. This is not surprising, as few (non-professional) bettors would be willing to bet large amounts on New Zealand winning the soccer world cup. In fact, we find that models including the actual bet sizes perform far better than those with the counterfactual assumption of identical bet sizes. In line with most of the data analyzed in the literature (see the overview in Ottaviani and Sørensen, 2008),¹ we observe a pronounced favorite-longshot bias (FLB), meaning that returns are, on average, decreasing in odds. Our analysis strongly suggests that bettors are risk-neutral, and that the FLB can well be explained by a consistent overweighting of small probabilities, by different probability weighting in the gain and in the loss domain, and by a lower weighting for losses than for gains (i.e. the opposite of loss aversion).

¹ A few papers find a reversed FLB, that is, markets where average returns are higher when betting on longshots (see in particular Woodland and Woodland, 1994, and Sobel and Raines, 2003). Boulier et al. (2006) find no significant impact of odds on returns.

Of course, the usual caveat that bettors are unlikely to be representative of the
population as a whole also applies to our paper, i.e. there is a sample selection bias (see analogously Andrikogiannopoulou, 2010, and Chiappori et al., 2012). However, this is the case for all real-world data, including casino gambling, game shows or investment behavior in financial markets.

While the early literature estimated the parameters of the representative bettor's utility functional only for EUT (Weitzman, 1965, Ali, 1977), the research following the seminal paper by Jullien and Salanié (2000) has shown that models based on CPT outperform EUT (see the overviews in Jullien and Salanié, 2008, and Ottaviani and Sørensen, 2008). These findings are strongly reinforced in our estimations with the actual bet sizes, as risk-seeking behavior can be excluded as an explanation of the FLB.

In their seminal paper on the comparison of EUT and CPT, Jullien and Salanié (2000) estimate the parameters for the different specifications of the utility functions and compare how well they explain the success probabilities observed in the data. To demonstrate the impact of bet sizes, we use specifications that are rather close to theirs, albeit with some extensions. In particular, we allow for differences in the probability weighting in the gain and in the loss domain, which turns out to be important, and we estimate the success probabilities assumed by bettors in their indifference conditions from the true probabilities for odds in the whole data set.

Snowberg and Wolfers (2010) use exotic bets (for instance bets on the winner and the second best horse, called "exacta") for discriminating between explanations based on utility functions and on probability weighting. In their data set, expected returns are basically the same for odds between 5 and 10.
Therefore, under the probability weighting model, betting on an exacta with odds of 5 for the first and odds of 10 for the second horse should yield the same expected return as betting on an exacta in which the horse with odds of 10 wins and the other horse runs second. This should not be the case for EUT, where the FLB is explained by the curvature of the utility function. In our data set, the FLB is robust for all odds, i.e. expected returns are not "almost" the same for odds between 5 and 10 as in Snowberg and Wolfers (2010).

To our knowledge, Bradley (2003) and Kopriva (2009) are the only other papers accounting for the impact of different bet sizes within the representative bettor approach. Bradley (2003) develops a model where bettors must be indifferent among all bets, consisting of the (exogenously given) odds and the bet sizes that bettors can choose so as to maximize their utility for the respective odds. As he has no data on bet sizes, however, Bradley (2003) cannot separate the parameters of the utility and the weighting function. He then concludes that, with linear weighting functions, the representative bettor would be risk-seeking only for gains, but risk-averse for losses. We find an overweighting of small probabilities both in the gain and in the loss domain, but to different degrees.

Kopriva (2009) uses data from betfair.com where bettors can post limit orders stipulating at which odds they would be willing to trade. Similar to stock markets, it depends on the clearing price whether bets are put through. As we do, Kopriva (2009) confirms that bet sizes are largely decreasing in odds, and he demonstrates that results change considerably when controlling for bet sizes. However, he estimates the representative bettor's preferences only with EUT but not with CPT. Thus, our paper is the first that extends the comparison of EUT and CPT within the representative bettor approach to different bet sizes.

While our paper sticks to the representative bettor approach and extends it to different bet sizes, some recent contributions account for the heterogeneity of bettors. Gandhi and Serrano-Padial (2012) as well as Chiappori et al. (2012) do not have data on individual betting behavior, but estimate the heterogeneity of bettors from aggregated data. Gandhi and Serrano-Padial (2012) challenge the predominant view that CPT outperforms EUT by estimating a model where risk-loving (casual) bettors responsible for the FLB coexist with professional bettors entering the market to benefit from the overbetting on longshots. In their model, a few bettors prone to the FLB are sufficient to induce the overpricing of longshots, and in their data set, most of the overpricing can be attributed to longshots with extremely low success probabilities. This is not the case for our particularly large original data set with more than five million observations. Chiappori et al. (2012) estimate the preferences from aggregated data with a large data set from different horse races. They find considerable heterogeneity and, in contrast to Gandhi and Serrano-Padial (2012), that EUT performs rather poorly in explaining the data even when allowing for heterogeneity. Neither paper has data on bet sizes, and hence both need to assume that bet sizes are independent of odds.

While the two papers just mentioned estimate heterogeneous preferences from aggregated data, Andrikogiannopoulou (2010) is the only paper we are aware of which uses individual data for estimating individual risk preferences. Her data set contains the bets of 100 randomly selected bettors from a leading online betting company for around 11,000 soccer matches.
All bets have three possible outcomes: home team win, draw, or away team win. As Chiappori et al. (2012) do, she finds a large heterogeneity in preferences.

While many papers on betting markets use parimutuel data, where the total amount bet is divided among the bettors on the correct outcome relative to their shares, our data set is on fixed-odds betting, where odds are set by the bookmaker (see section 2).² Thus, payments depend on the odds at the time bets are placed, and not on the odds at the end of the betting period. An advantage of fixed-odds betting is that bettors know the odds at the time of betting, which is not the case for parimutuel betting where expectations on the behavior of other bettors need to be formed. A disadvantage of fixed odds, however, is that it is not clear whether the FLB is a result of the behavior of bettors or driven by profit-maximizing bookmakers (see Levitt, 2004, and Direr, 2013, for models on optimizing bookmakers). As our paper focuses on the indifference condition of bettors for the odds given, we do not have to be concerned about which of the bookmaker models fits best.

² Data from fixed-odds betting is also used by Andrikogiannopoulou (2010), Andrikogiannopoulou and Papakonstantinou (2011) and Kopriva (2009), for instance.

The remainder of our paper is organized as follows: Section 2 describes the data set.
Section 3 explains the methodology. Section 4 presents our results for the benchmark model, and discusses robustness checks. We conclude in section 5.

2 Data

Our data set was compiled in close cooperation with the New Zealand Racing Board (NZRB), which is the only licensed betting agency in New Zealand. The initial data contain all 5,136,660 fixed-odds bets placed at the agency between August 2006 and April 2009. For each event included in our analysis, we have information on all odds, the number of bets on each possible outcome, the outcome itself and the bet size for each bet. From outcomes, we calculate success probabilities for odds. To include bet sizes in the representative bettor approach, we calculate the average bet size for the respective odds.

Many events in our data set have more than two possible outcomes, as we do not only have head-to-head competitions such as in tennis, but also games where draws are possible (such as in soccer, for instance), and sailing or golf with many possible winners. Since the estimation of risk preferences requires that bettors are indifferent among all outcomes, the processor time is disproportionately increasing in the number of possible outcomes.³ Therefore, we restrict attention to events with a maximum of six outcomes.

When calculating the number of bets and the average bet size per outcome, we furthermore excluded all bets where either the average bet size or the average potential gain exceeded 500 NZ$.⁴ There are two reasons for this: first, we are interested in the risk preferences of the average casual bettor, and we do not want to mix up casual and professional bettors. Second, large single bets would otherwise dominate (and potentially distort) the results when using average bet sizes. We also ran estimations with many different thresholds and for a model including all bets, and results are qualitatively robust (results for several specifications are presented in the Appendix).
We exclude events with fewer than five bets and events with irregularities that could not be clarified with the bookmaker. The data set we work with contains 24,266 events and 791,906 bets, on average 32.63 bets per event and 10.12 bets per outcome. Table 1 shows descriptive statistics, disaggregated by sports, and reveals a pronounced FLB.

³ Each estimation took around thirty minutes, and the number of indifference equations with n outcomes is \( \binom{n}{2} \).

⁴ The exchange rate of the NZ$ to the US$ fluctuated over the observation period, but on average, 1 NZ$ was about 65 US cents.
Table 1: Descriptive statistics

Sport           Obs.      Odds   Bet      Return   Return on       Return on       Return on       Return on
                                                   favorites       longshots       favorites       longshots
                                                   (50%-Thresh.)   (50%-Thresh.)   (50%-Prob.)     (50%-Prob.)
All Bets        791,906   5.30    57.46   -0.14    -0.08           -0.21           -0.07           -0.17
Am. Football     13,390   4.03    79.53   -0.04    -0.06           -0.01           -0.04           -0.05
Baseball         22,386   2.90   109.04   -0.13    -0.12           -0.15           -0.08           -0.17
Basketball       39,207   3.23    88.62   -0.12    -0.10           -0.20           -0.06           -0.18
Cricket          89,486   4.07    37.54   -0.12    -0.07           -0.18           -0.08           -0.14
Football         58,997   5.78    46.93   -0.10    -0.08           -0.12           -0.03           -0.12
Greyhounds       25,542   7.54    46.43   -0.20    -0.05           -0.28            0.00           -0.23
Harness          12,289   8.67    55.07   -0.16    -0.13           -0.18           -0.06           -0.18
Netball          13,406   3.58    56.93   -0.17    -0.07           -0.31           -0.03           -0.24
Rugby League    123,276   5.30    48.07   -0.18    -0.07           -0.26           -0.07           -0.21
Rugby Union     253,010   5.22    58.97   -0.14    -0.09           -0.20           -0.10           -0.16
Tennis           42,324   2.72    98.25   -0.14    -0.07           -0.39           -0.09           -0.23
Thoroughbred     78,932   8.57    50.35   -0.17    -0.05           -0.21           -0.02           -0.18
Others           20,021   6.52    40.41   -0.12    -0.09           -0.16           -0.06           -0.15

To illustrate the impact of odds on the return of bets, we distinguish between favorites, defined as the fifty percent of bets placed at the lowest odds, and longshots, defined as the other fifty percent. As shown in table 1, the return on favorites is -8% compared to -21% for longshots. In the last two columns, we define favorites as bets with odds below two (i.e., neglecting the take-out rate, with a winning probability above 50%), and the results are qualitatively the same. Except for American Football, there is a FLB for all kinds of sports, but the average losses differ largely among them. As our paper is concerned with the impact of bet sizes, it is next instructive to consider the relation between odds and bet sizes as shown in figure 1.
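The favorite/longshot split behind the threshold columns of table 1 can be illustrated with a minimal sketch. It assumes that the return of a bet is its net payoff per unit staked and that bets are averaged with equal weight; the paper does not spell out either convention, and the function names and sample bets below are hypothetical.

```python
from statistics import median


def mean_return(bets):
    """Average net return per unit staked over (odds, won) pairs.

    Assumption: a winning bet returns odds - 1, a losing bet returns -1.
    """
    return sum((odds - 1.0) if won else -1.0 for odds, won in bets) / len(bets)


def split_at_median_odds(bets):
    """Favorites: odds at or below the median odds; longshots: odds above it."""
    cut = median(odds for odds, _ in bets)
    favorites = [b for b in bets if b[0] <= cut]
    longshots = [b for b in bets if b[0] > cut]
    return favorites, longshots


# Hypothetical sample: 8 bets at odds 1.8 (4 win) and 8 bets at odds 6.0 (1 wins).
sample = (
    [(1.8, True)] * 4 + [(1.8, False)] * 4
    + [(6.0, True)] + [(6.0, False)] * 7
)

favorites, longshots = split_at_median_odds(sample)
fav_return = mean_return(favorites)
long_return = mean_return(longshots)
print(round(fav_return, 2), round(long_return, 2))
```

In this made-up sample the longshots lose more per unit staked than the favorites, which is the pattern the paper calls the FLB.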
Figure 1: Odds and bet sizes

In figure 1, we have simply correlated bet sizes and odds. Figure 1 confirms the straightforward hypothesis that bet sizes are to a large extent decreasing in odds. When we adopt the variance of return as a rough indicator for the risk associated with a bet, we find that the correlation between odds and variance for our data set is significantly negative, with a correlation coefficient of -0.0665. Thus, our data set shows that there is no reason per se for assuming that bettors betting on longshots are more risk-loving than those betting on favorites. These simple observations reinforce our view that it is important to account for bet sizes when estimating preferences from betting data.

3 Methodology

Following the representative bettor approach, we estimate the risk preferences of bettors by assuming that, for given odds, they are indifferent among all outcomes for a specific event. As usual, we do not require that bettors be indifferent among the outcomes of different events. Thus, we assume that bettors first decide whether to bet and on which event, and then choose the outcome and the bet size. Also following the literature, we do not model why people invest at all in risky assets with negative expected value. As mentioned by Jullien and Salanié (2000), betting may involve some intrinsic benefit from gambling due to the (positive) tension during the event.

We build two kinds of models, both of them with the same set of preference functions, which we will introduce in section 4. In the first model, we assume that bet sizes are the same for all odds. Then, we turn to our new model where the representative bettor is indifferent between lotteries consisting of odds and bet sizes. For bet sizes, we use the average bet size in the data set after excluding bets where either the bet size or the return in case of success is above 500 NZ$ for the reasons discussed in section 2.
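As a rough illustration of the averaging step just described, the following sketch computes the average bet size per odds after applying the 500 NZ$ exclusion rule. It is not the authors' program (which is available from them on request); the function name and sample bets are hypothetical, and "return in case of success" is read here as bet size times odds.

```python
from collections import defaultdict


def average_bet_by_odds(bets, cap=500.0):
    """Return {odds: mean bet size} over (odds, stake) pairs.

    Assumption: a bet is dropped when its stake or its payout in case of
    success (stake * odds) exceeds the cap.
    """
    totals = defaultdict(lambda: [0.0, 0])  # odds -> [sum of stakes, count]
    for odds, stake in bets:
        if stake > cap or stake * odds > cap:
            continue
        acc = totals[odds]
        acc[0] += stake
        acc[1] += 1
    return {odds: total / count for odds, (total, count) in totals.items()}


# Hypothetical bets: (odds, stake in NZ$).
sample_bets = [(2.0, 100.0), (2.0, 50.0), (2.0, 400.0), (20.0, 10.0), (20.0, 30.0)]
avg_bet = average_bet_by_odds(sample_bets)
print(avg_bet)
```

Here the 400 NZ$ bet at odds of 2 and the 30 NZ$ bet at odds of 20 are dropped because their payouts would exceed the cap, so a few large bets cannot dominate the averages.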
To model the indifference condition of the representative bettor, two main decisions need to be made. The first decision refers to the preference functions themselves. In order to highlight the importance of accounting for different bet sizes and to facilitate comparability, we proceed closely along the lines of the seminal paper by Jullien and Salanié (2000). In contrast to them, however, we allow for different probability weighting for gains and losses because it will turn out that this improves the quality of the estimations considerably. As mentioned, we then estimate the parameters by minimizing the utility differences of the alternative bets for each event, using the average amounts invested on the respective outcomes in the model with different bet sizes. A detailed description of our programming is available on request.

The second decision to be made refers to the probability distribution over outcomes used in the indifference conditions. To see the point, consider the simplest model with identical bet sizes, and take expected utility as an example. Define \( W \) as the representative bettor's initial wealth, \( p_G \) and \( p_L = 1 - p_G \) as the probabilities of success and loss, respectively, when betting on outcome \( i \), and \( R_i \) as the return per dollar in case of success, which just equals the odds \( q_i \). Furthermore, denote the bet size as \( b_i \), \( \theta \) as a parameter vector that captures the preferences of bettors, and \( u(\cdot) \) as the utility of a specific outcome. Then, the value \( V_i(\cdot) \) of betting on outcome \( i \) for a representative bettor is

\[ V_i(W, p_G, R_i, b_i, \theta) = p_G \, u(W + b_i R_i, \theta) + p_L \, u(W - b_i, \theta). \]

For each model, the parameters of the utility functional are estimated so as to minimize the difference between the utilities from the different bets. An important question is then how \( p_G \) should be calculated. If odds perfectly reflected success probabilities, then the ratio of success probabilities for two bets would simply be inverse to the ratio of odds, that is, \( p_G^1 / p_G^2 = q_2 / q_1 \). However, as betting markets are not efficient and since our data exhibits a strong FLB, using these probabilities in the indifference conditions would not be convincing. For our estimations, we follow part of the literature, including Snowberg and Wolfers (2010), Gandhi and Serrano-Padial (2012), and Chiappori et al. (2012), by using the actual percentage of successful bets for the respective odds. These probabilities can be seen as the "correct" probabilities over all events in our data set. Note that we thereby stipulate implicitly that bettors are aware of the actual success probabilities, and that they assume that, for identical odds, the success probability is the same for all events.

Besides our approach of using the actual winning probabilities for the respective odds in the data set, some other concepts have been adopted in the literature. Jullien and Salanié (2000) estimate the probabilities from the indifference conditions themselves by using the adding-up constraint \( \sum_i p_i = 1 \).
We could have followed this procedure for the model with identical bet sizes, but not for the model with different bet sizes, as solving explicitly for the probabilities in the indifference conditions is only feasible when bet sizes cancel out. Both approaches have their benefits. On the one hand, estimating the probabilities from the indifference conditions allows running regressions to see how well the estimated probabilities fit the success probabilities in the data. By contrast, when comparing the quality of the different preference models, our approach needs to rely exclusively on standard errors and the likelihoods for the respective estimations. On the other hand, when estimating the preferences of rational bettors, it seems a reasonable starting point to assume that their estimation of the probabilities themselves is not systematically biased. This view implies that non-linear probability weighting is seen as a "preference" rather than as a "misperception".

A third approach is taken by Andrikogiannopoulou (2010), who uses the inverse of odds, corrected by the bookmaker's commission, to calculate the probabilities for the different outcomes in her estimation of individual risk preferences. While our framework implicitly assumes that bettors are fully aware that different odds are associated with different expected returns (and are in this sense distorted), her assumption implies that
bettors are unaware of the distortion, and that the probabilities bettors use for evaluating their chances are equal to those quoted by the bookmaker.

4 Preference functions and results

For each event, we define \( p_i \) as the percentage of successful bets with odds \( q_i \) in the whole data set. This implies that the bettors' expectations are unbiased which, of course, does not exclude non-linear probability weighting. Since we have no information about the wealth of bettors, we follow the literature by using the CARA utility function

\[ u_i = \frac{1 - e^{-\theta x_i}}{\theta}. \]

With \( b_i \) as bet size, we have \( x_i = b_i (q_i - 1) \) in case of success and \( x_i = -b_i \) in case of losses. \( \theta > 0 \) (\( \theta < 0 \)) expresses risk-averse (risk-loving) preferences. We denote the probability weighting for gains and losses as \( \pi_G \) and \( \pi_L \), respectively.

In our benchmark model, we exclude odds where the potential average gain exceeds 500 NZ$. For instance, odds of 20 would be excluded if the average bet size were above 25 NZ$. In our data, we have 5,952 observations for odds of 20 with an average bet size of 9.32 NZ$. The purpose of our threshold is to exclude uneven odds with relatively few bets where the average bet size is dominated by some large bettors who cannot be seen as representative. We have considered many different thresholds, and we report the results with exclusion thresholds of 100 and 300 as well as results with all observations in the Appendix.
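The CARA utility function and the benchmark exclusion rule can be written down in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the risk-neutral case is handled as the limit \( u(x) = x \) for \( \theta \to 0 \), and "potential average gain" is read as odds times average bet size, which matches the example of odds of 20 and an average bet of 25 NZ$.

```python
import math


def cara_utility(x, theta):
    """CARA utility u(x) = (1 - exp(-theta * x)) / theta.

    Assumption: for theta close to zero the risk-neutral limit u(x) = x is used.
    """
    if abs(theta) < 1e-12:
        return x
    return (1.0 - math.exp(-theta * x)) / theta


def exceeds_gain_threshold(odds, avg_bet, cap=500.0):
    """True if odds * average bet size is above the cap (here 500 NZ$)."""
    return odds * avg_bet > cap


# Odds of 20 with the observed average bet of 9.32 NZ$ stay in the sample.
print(exceeds_gain_threshold(20.0, 9.32))

# A fair coin flip over +1 / -1 compared with the sure outcome u(0) = 0.
for theta in (0.5, -0.5):
    gamble = 0.5 * cara_utility(1.0, theta) + 0.5 * cara_utility(-1.0, theta)
    print(theta, gamble < 0.0)
```

With \( \theta = 0.5 \) the fair gamble is worth less than the sure outcome (risk aversion), with \( \theta = -0.5 \) it is worth more (risk-loving), in line with the sign convention stated above.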
Table 2: Estimations for the benchmark model (500 NZ$ threshold)

EUT                      Equal bet sizes     Observed bet sizes    Significant difference?
θ                        -.005*** (0.001)    -.002*** (.0000)      ***
Observations             98527               98527
Log-Likelihood           36381.3             47921.3

Power functions
without loss aversion    Equal bet sizes     Observed bet sizes    Significant difference?
θ                        .0002 (.003)        .0000*** (.0000)      -
α_G                      1.032 (.037)        .904*** (.0006)       ***
α_L                      .512*** (.110)      1.305*** (.003)       ***
Observations             98527               98527
Log-Likelihood           36352.8             41421.6

Power functions
with loss aversion       Equal bet sizes     Observed bet sizes    Significant difference?
θ                        .0003 (.003)        .0000 (.0000)         -
α_G                      .994 (.040)         .993*** (.0008)       -
α_L                      .956 (.293)         .964*** (.002)        -
λ                        .781*** (.080)      .781*** (.001)        -
Observations             98527               98527
Log-Likelihood           36351.2             36360.9

Prelec (1998)            Equal bet sizes     Observed bet sizes    Significant difference?
θ                        .0006 (.005)        .0000 (.0000)         -
α_G1                     1.854 (.152)        .855*** (.001)        -
α_G2                     1.068 (.121)        1.066*** (.001)       -
α_L1                     1.125 (.381)        1.128*** (.001)       -
α_L2                     .923 (.229)         .922*** (.002)        -
Observations             98527               98527
Log-Likelihood           36351.2             36356.4

Lattimore et al. (1992)  Equal bet sizes     Observed bet sizes    Significant difference?
θ                        .0000 (.004)        .0000 (.0000)         -
α_G1                     1.316 (.534)        1.276*** (.006)       -
α_G2                     1.000 (.096)        .993*** (.002)        -
α_L1                     .810 (.236)         .809*** (.004)        -
α_L2                     1.005 (.354)        .981*** (.003)        -
Observations             98527               98527
Log-Likelihood           36351.2             36364.2

Camerer and Ho (1994)    Equal bet sizes     Observed bet sizes    Significant difference?
θ                        .001 (.002)         .0000 (.0000)         ***
α_G                      .954*** (.011)      .969*** (.0006)       *
α_L                      .805*** (.042)      .656*** (.0009)       ***
Observations             98527               98527
Log-Likelihood           36351.4             40661.9

Note: standard errors in parentheses; *, **, and *** denote significance at the 10%, 5% and 1% level, respectively.
For all the models subsequently discussed, we have 98,527 observations compared to 791,906 observations in table 1. The reason for the lower number is that each single bet is an observation in table 1, while each indifference condition is an observation in our estimations.⁵ Concerning the preference functions, we proceed closely along the lines of Jullien and Salanié (2000), the seminal paper for the representative bettor approach. This allows us to illustrate the importance of allowing for odds-dependent bet sizes in a straightforward way. However, they estimate just one single coefficient for gains and losses in the specifications following Prelec (1998) and Lattimore et al. (1992). Since our results show that the coefficients vary considerably between gains and losses, we estimate two different parameters. For all preference functions, the first column in table 2 presents results with equal bet sizes and the second column results for the estimations with the average bet sizes in our data set. The third column shows which coefficients differ significantly between the specifications with equal and with different bet sizes.

Model I. Expected utility

We start with EUT, where the value of betting on outcome \( i \) is simply

\[ V_i(\cdot) = p_G \, \frac{1 - e^{-\theta b_i r_i}}{\theta} + p_L \, \frac{1 - e^{\theta b_i}}{\theta} \]

since \( \pi_G = p_G \) and \( \pi_L = p_L \); here \( r_i = q_i - 1 \) is the net return per unit bet in case of success, consistent with \( x_i = b_i (q_i - 1) \) above. With identical bet sizes, the existence of a FLB necessarily leads to the result that bettors are risk-loving (see already Ali, 1977), and we estimate \( \theta = -0.005 \), which is significantly different from zero at the 1%-level. Jullien and Salanié (2000) find a higher degree of risk-loving (\( \theta = -0.055 \)), but this can also be attributed to the different sports compared to our data set. Accordingly, Kopriva (2009) estimates \( \theta = -0.036 \) for tennis, \( \theta = -0.015 \) for soccer and \( \theta = -0.003 \) for horse races.⁶ In any case, our results are in line with the literature.

With the average bet sizes from our data, the representative bettor must be indifferent between all combinations of odds and bet sizes observed.
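The claim that, with identical bet sizes, a FLB can only be rationalized within EUT by risk-loving preferences can be illustrated numerically. The sketch below is not the estimation procedure of the paper; the probabilities and odds are hypothetical numbers chosen so that the longshot has the lower expected return, and the function names are assumptions.

```python
import math


def cara(x, theta):
    """CARA utility; the risk-neutral limit u(x) = x is used for theta = 0."""
    if abs(theta) < 1e-12:
        return x
    return (1.0 - math.exp(-theta * x)) / theta


def eut_value(p_win, odds, bet, theta):
    """Model I value: p_G * u(b * (q - 1)) + p_L * u(-b)."""
    return p_win * cara(bet * (odds - 1.0), theta) + (1.0 - p_win) * cara(-bet, theta)


# Hypothetical event with a FLB: the longshot has the lower expected return.
favorite = {"p_win": 0.52, "odds": 1.8}   # expected net return -0.064 per unit
longshot = {"p_win": 0.10, "odds": 8.0}   # expected net return -0.200 per unit

values = {}
for theta in (0.0, -0.5):
    v_fav = eut_value(bet=1.0, theta=theta, **favorite)
    v_long = eut_value(bet=1.0, theta=theta, **longshot)
    values[theta] = (v_fav, v_long)
    print(theta, round(v_fav, 3), round(v_long, 3))
```

With equal bets of 1, the risk-neutral bettor (\( \theta = 0 \)) strictly prefers the favorite, whereas the risk-loving bettor (\( \theta = -0.5 \)) prefers the longshot; indifference therefore requires some negative \( \theta \) in between.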
Note that, in contrast to the case of identical bet sizes, it cannot be taken for granted that risk-loving individuals prefer bets with higher odds when bet sizes are decreasing in odds. To see this, suppose for the moment that bets are fair, i.e. that odds reflect winning probabilities, and suppose that a bettor invests 1$ on a longshot. Then, for all bets with lower odds, there is exactly one amount \( b_i \) such that the bettor is indifferent between this bet and betting 1$ on the longshot. Thus, without knowing the bet sizes, we can no longer say whether the FLB should be more or less pronounced when bettors become more risk-loving.⁷

⁵ For instance, in a soccer match with three outcomes and ten bets on each outcome, we have thirty observations in table 1, and three indifference conditions (observations) for our estimations.

⁶ We also disaggregated all of our estimations by sports. Results available on request.

⁷ As an example, consider a longshot with \( p_G^L = 1/10 \) and fair odds of 10, and a favorite with \( p_G^F = 1/2 \) and odds of 2. With \( \theta = 1.2 \), for instance, betting 1$ on the longshot gives expected utility of \( V_L(\cdot) = \frac{1}{10} \, \frac{1 - e^{-1.2 (10 - 1)}}{1.2} + \frac{9}{10} \, \frac{1 - e^{-1.2 (-1)}}{1.2} = -1.6568 \). Betting amount \( a_F \) on the favorite gives expected utility of \( V_F(\cdot) = \frac{1}{2} \, \frac{1 - e^{-1.2 (2 - 1) a_F}}{1.2} + \frac{1}{2} \, \frac{1 - e^{-1.2 (-a_F)}}{1.2} \), which yields the same utility of -1.6568 when betting 1.4655$. Note that this means that bettors with a higher degree of risk aversion than \( \theta = 1.2 \) prefer betting 1$ on the longshot compared to betting 0.9917$ on the favorite.

In our view, a very interesting point is that, even within EUT, the existence of a FLB requires risk-loving preferences with different bet sizes if and only if bets are fair, i.e. with a take-out rate of zero. Irrespective of bet sizes, all risk-neutral bettors would then prefer favorites, simply because favorites yield higher expected returns than longshots. However, the existence of a FLB is logically compatible with EUT and risk-neutrality when taking different bet sizes and positive take-out rates into account. Even a risk-neutral bettor can then be indifferent between favorites and longshots because, with unfair bets, there are countervailing effects: on the one hand, the existence of a FLB makes betting on favorites ceteris paribus superior to betting on longshots. On the other hand, the larger bet size for the favorite leads ceteris paribus (i.e. when neglecting the FLB) to higher losses, since the bookmaker's take-out rate refers to a larger amount. We will get back to this observation in our concluding section.

With the bet sizes from our data set, we estimate \( \theta = -0.002 \) (significant at the 1%-level), so that the degree of risk-loving is lower.⁸ The difference between our results with and without bet sizes is also significant at the 1%-level. Furthermore, the standard error for the specification with bet sizes is much lower than for the specification without bet sizes, which indicates that adding the bet sizes increases the efficiency of the model. Note that, in contrast to the standard errors, comparing the log-likelihoods of the two specifications does not provide useful information on the quality of the estimations, as they include different exogenous variables, and because the average values for the bet sizes are largely dispersed. Thus, the lower log-likelihood for the specification with bet sizes does not mean that the model performs worse.

⁸ This is in line with findings by Kopriva (2009), who reports lower degrees of risk-loving when taking different bet sizes into account; for his data set, the assumption of risk neutrality can only be rejected for horse races with \( \theta = -0.0005 \).

Model II. Power function without loss aversion

We now proceed to several preference functions that have been used for testing cumulative prospect theory (CPT) as introduced by Kahneman and Tversky (1979). The distinctive features of CPT are that gains and losses are defined with respect to a reference point,⁹ that individuals behave risk-averse for gains and risk-loving for losses, that there is an inversely S-shaped probability weighting expressing that low probabilities are overweighted compared to high probabilities, and that losses are more important than gains (loss aversion). We follow Jullien and Salanié (2000) by starting with the simplest power function without loss aversion, where the probability weighting functions for gains and losses are \( \pi_i = p_i^{\alpha_i} \), \( i = G, L \).¹⁰ For \( \alpha_i < 1 \), small probabilities are overweighted compared to large probabilities. Therefore, \( \alpha_i < 1 \), \( i = G, L \), makes betting on longshots more attractive.

⁹ We follow the betting literature by assuming that the reference point is zero income which, however, is neither an innocent nor a straightforward assumption. In particular, reference points may depend on odds, since a bettor who bets on a longshot with odds of 20 (and imputed fair probability of 5%) may see the loss as reference point, while a bettor betting on a favorite with odds of 1.02 may expect to win. For reference-dependent attitudes towards risk, see e.g. Köszegi and Rabin (2007).

In our benchmark model, where all odds with potential average gains of more than 500 NZ$ are excluded, the specification without bet sizes does not reject that bettors are risk-neutral, i.e. \( \theta \) is not significantly different from zero. The probability weighting for gains, \( \alpha_G \), is insignificant as well, but the probability weighting for losses is significantly different from one at the 1%-level (\( \alpha_L = 0.512 \)). Thus, this specification suggests that the FLB in our data is driven by a strong overweighting of small probabilities for losses. Qualitatively, this is still in line with Jullien and Salanié (2000), who find an even stronger effect with \( \alpha_L = 0.318 \), while the probability weighting for gains is also insignificant for their data set. Considering the log-likelihood indicates that the simplest power function model without loss aversion is no large improvement compared to the EUT model, and the standard error for \( \theta \) is even larger.

Integrating bet sizes, however, leads to a large reduction in the standard errors for all three coefficients. As a consequence, \( \theta \) is now significantly different from zero even though the estimated degree of risk-loving is economically meaningless (\( \theta = -0.00008 \)). \( \alpha_G \) is now significantly smaller than one while \( \alpha_L \) is significantly larger than one, so that the FLB would be triggered by the overweighting of small probabilities for gains instead of for losses.
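As a numerical aside, two quantitative statements from this part of the paper can be checked with a short script: the indifference example in footnote 7 and the statement that \( \alpha_i < 1 \) overweights small probabilities relative to large ones. This is only a sketch; the function names and the bisection routine are assumptions, and the value 0.904 is the estimate of \( \alpha_G \) with observed bet sizes reported in table 2.

```python
import math

THETA = 1.2  # degree of risk aversion used in footnote 7


def cara(x, theta=THETA):
    """CARA utility u(x) = (1 - exp(-theta * x)) / theta."""
    return (1.0 - math.exp(-theta * x)) / theta


def value(p_win, odds, bet):
    """Expected CARA utility of betting `bet` at fair odds `odds`."""
    return p_win * cara(bet * (odds - 1.0)) + (1.0 - p_win) * cara(-bet)


def indifferent_bet(target, p_win, odds, lo=0.0, hi=5.0, tol=1e-10):
    """Bisection for the bet size whose value equals `target`.

    Assumption: value() is decreasing in the bet size on [lo, hi], which
    holds for the risk-averse bettor and fair favorite considered here.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if value(p_win, odds, mid) > target:
            lo = mid  # bet still too small to be as unattractive as the target
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Footnote 7: 1$ on the longshot (p = 1/10, odds 10) versus a_F on the favorite.
v_longshot = value(p_win=0.1, odds=10.0, bet=1.0)
a_fav = indifferent_bet(v_longshot, p_win=0.5, odds=2.0)
print(round(v_longshot, 4), round(a_fav, 3))


def power_weight(p, alpha):
    """Power probability weighting pi = p ** alpha."""
    return p ** alpha


# Relative overweighting pi(p) / p for a small and a large probability.
small = power_weight(0.05, 0.904) / 0.05
large = power_weight(0.5, 0.904) / 0.5
print(small > large > 1.0)
```

The script reproduces the footnote's utility of about -1.6568 and an indifferent bet on the favorite of roughly 1.465$, and it shows that with \( \alpha = 0.904 \) a probability of 5% is inflated proportionally more than a probability of 50%.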
Thus, when relying on the power function model without loss aversion, integrating different bet sizes leads to a qualitatively different explanation of the FLB compared to the model with identical bet sizes. The log-likelihood now also improves markedly on the one for the respective EUT model, so that integrating different bet sizes is important. However, comparing the log-likelihood to the other specifications with bet sizes discussed below shows that the power function model without loss aversion performs rather poorly. In contrast to comparing log-likelihoods across models with and without bet sizes, comparing the values across specifications with bet sizes is sensible.

Footnote 10: We acknowledge that, from a theoretical point of view, the specifications used in this and in other papers are not free of problems (see Law and Peel, 2007). In a model with fixed-amounts gambling, Law and Peel (2009) demonstrate that, in CPT, the relationship between the skewness of return and expected return is far from clear-cut. By use of examples, they show that, depending on the probability levels, the trade-off between expected return and win probability may change sign.

Model III. Power function with loss aversion

We now apply the same weighting function as before, but we assume that gains are weighted with one and losses with λ, where λ > 1 expresses loss aversion. As long as we assume identical bet sizes, loss aversion makes favorites more attractive since losses are the same in both cases, and because the loss probability is higher for longshots. Without bet sizes, all parameters except λ are insignificant, i.e. we cannot reject that
bettors are risk-neutral and apply linear probability weighting for both gains and losses. Our coefficient λ = 0.781 is significant at the 1%-level and expresses that losses are underweighted compared to gains. Concerning the log-likelihood, the model with loss aversion but without bet sizes does not perform better than EUT or the power function model without loss aversion.

Jullien and Salanié (2000) do not estimate power functions with loss aversion, and the general literature on power functions with loss aversion that does not refer to betting markets yields rather mixed results. Our finding is in line with Harrison and Rutström (2008), who report λ = 0.732, and Harinck et al. (2007), who also report λ < 1, while several other studies find coefficients which do not significantly differ from one (Kermer et al., 2006; Erev et al., 2008; Ert and Erev, 2008; Nicolau, 2012; Yechiam and Telpaz, 2013) or coefficients supporting loss aversion (Gill and Prowse, 2012; see Camerer, 2000, for an overview).

Here, adding different bet sizes has no strong impact on the coefficients, but the model performs much better. Notably, the log-likelihood improves sharply compared to the power function without loss aversion, for which we found a large difference in the coefficients with and without bet sizes. All standard errors are now far lower, resulting in significant coefficients for the probability weighting both in the gain and in the loss domain. Both coefficients are significantly smaller than one, so that, according to this model, the FLB is explained by an overweighting of small probabilities for gains and losses. The model strongly suggests that the overbetting of longshots is not driven by risk seeking since θ is insignificant even though the standard error is extremely small. All in all, we find that when using power functions, allowing for a different weighting of gains and losses is very important, and that allowing for different bet sizes reduces all standard errors sharply.

Model IV.
Prelec (1998)-specification

In the following two models, the probability weighting for gains and losses is captured by two parameters each. These two models perform as well as the power function with loss aversion, and far better than the other specifications considered. In the Prelec (1998)-specification, the probability weighting functions for gains and losses are
\[ \pi_i = \exp\!\left(-\alpha_i^1 \left(-\ln p_i\right)^{\alpha_i^2}\right), \quad i = G, L. \]
In case \( \alpha_i^2 = 1 \), the Prelec specification reduces to the power function without loss aversion with exponent \( \alpha_i^1 \). The advantage of having two parameters both for gains and for losses is that it allows for different changes in probability weighting at different levels. Similar to power functions, the overweighting of small probabilities is larger when \( \alpha_i^j \), j = 1, 2, are small. Loss aversion is not modelled explicitly.

Both the quality and the results of the Prelec model are in many respects similar to those of the power function model with loss aversion. First, none of the coefficients differs significantly between the models with and without bet sizes. Second, θ is again not significantly different from zero even though the standard error for θ is extremely small
in the model with different bet sizes. Third, the model with different bet sizes again performs much better with respect to the standard errors. As a result, all coefficients are insignificant for the model without bet sizes, but significant at the 1%-level when bet sizes are included. Fourth, the log-likelihood for the Prelec model is basically the same as for the power function with loss aversion, and thus far better than for the power function without loss aversion.

The comparison to the results in Jullien and Salanié (2000), who also estimate a Prelec model, is necessarily rough as they do not differentiate between the coefficients for gains and losses (i.e. \( \alpha_G^1 = \alpha_L^1 \) and \( \alpha_G^2 = \alpha_L^2 \)). Given this, their probability weighting parameters are in line with our results, but they find risk-loving preferences, significant at the 1%-level. Our results, however, clearly show that allowing for different probability weighting is important since all parameters estimated for gains are significantly different from the parameters for losses. Thus, imposing identical probability weighting implies an important loss of information.

Model V. Lattimore, Baker and Witte (1992)-specification

In the Lattimore-Baker-Witte (1992)-specification, the probability weighting is
\[ \pi_i = \frac{\alpha_i^1\, p_i^{\alpha_i^2}}{\alpha_i^1\, p_i^{\alpha_i^2} + (1 - p_i)^{\alpha_i^2}}, \quad i = G, L, \]
where \( \alpha_i^j < 1 \), j = 1, 2, makes betting on longshots more attractive due to the overweighting of small probabilities for gains and losses. To a large degree, results are similar to the Prelec specification of the probability weighting function. The model finds no evidence at all for risk-loving preferences and explains the FLB by probability weighting. All coefficients are insignificant without bet sizes and significant at the 1%-level with bet sizes since all standard errors are far lower. None of the coefficients differs significantly between the models with and without bet sizes. The log-likelihood can be seen as identical to the last two specifications.
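The two two-parameter specifications can be written down compactly as follows. This is a minimal sketch that assumes the standard functional forms from Prelec (1998) and Lattimore, Baker and Witte (1992); the function names are hypothetical, and the parameter values in the demonstration are the gain-domain estimates reported in Table 3 for the 300 threshold with observed bet sizes, used purely for illustration.

```python
import math


def prelec_weight(p: float, a1: float, a2: float) -> float:
    """Two-parameter Prelec weighting exp(-a1 * (-ln p) ** a2), 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.exp(-a1 * (-math.log(p)) ** a2)


def lbw_weight(p: float, a1: float, a2: float) -> float:
    """Lattimore-Baker-Witte weighting a1 p^a2 / (a1 p^a2 + (1 - p)^a2)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    numerator = a1 * p ** a2
    return numerator / (numerator + (1.0 - p) ** a2)


if __name__ == "__main__":
    # Gain-domain estimates from Table 3 (threshold 300, observed bet sizes).
    prelec_gain = (0.854, 1.066)
    lbw_gain = (1.277, 0.993)
    for p in (0.05, 0.5, 0.95):
        print(
            f"p = {p:.2f}: "
            f"Prelec = {prelec_weight(p, *prelec_gain):.4f}, "
            f"LBW = {lbw_weight(p, *lbw_gain):.4f}"
        )
```

Two sanity checks follow directly from the formulas: with both parameters equal to one, each function returns the probability itself, and with the second Prelec parameter equal to one, the Prelec function collapses to the power function with exponent a1.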
The comparison to Jullien and Salanié (2000) is again rough as they do not allow for different probability weighting for gains and losses. This time, we find that \( \alpha_G^1 \) is significantly larger than \( \alpha_L^1 \), but there are no significant differences between \( \alpha_G^2 \) and \( \alpha_L^2 \). We cannot say whether the fact that Jullien and Salanié (2000) report significant evidence for risk-loving must be attributed to their data, to the fact that they assume identical bet sizes, or to the fact that they do not allow for differences in the probability weighting for gains and losses.

Model VI. Camerer and Ho (1994)-specification

Finally, we use the weighting function
\[ \pi_i = \frac{p_i^{\alpha_i}}{\left(p_i^{\alpha_i} + (1 - p_i)^{\alpha_i}\right)^{1/\alpha_i}}, \quad i = G, L, \]
which has been applied by many authors including Tversky and Kahneman (1992) and Camerer and Ho (1994). We follow Jullien and Salanié (2000) by referring to it as the Camerer-Ho model. In the specifications used so far, all probabilities are necessarily either over- or underweighted (albeit to a different degree, which allows capturing the overweighting of small probabilities relative to large probabilities). The Camerer-Ho specification allows small probabilities to be overweighted and high probabilities to be underweighted (for instance, for α = 0.5, all probabilities up to around 0.278 are overweighted). Where the underweighting kicks in depends quantitatively on \( \alpha_i \); the closer \( \alpha_i \) is to one, the weaker the overweighting of small probabilities. The smaller \( \alpha_i \), the lower the weight of high relative to small probabilities, and the more attractive is betting on longshots.

Compared to the last two models, the fit of the Camerer-Ho specification is rather poor, which we attribute to the fact that just one parameter is used for estimating the probability weighting in the gain and loss domain, respectively. With bet sizes, the degree of risk-loving is significant due to an extremely low standard error, but economically meaningless. Consequently, this model also supports explanations based on the overweighting of small probabilities, with coefficients of around 0.95 for gains and lower coefficients for losses. To see this, note that \( \alpha_G \approx 0.95 \) means that small winning probabilities are overweighted while large probabilities are underweighted, but only to a moderate degree. For losses, the effects are more pronounced.

Comparing the different models, it might at first glance appear conspicuous that the estimated coefficients with and without bet sizes differ largely for the models that perform poorly (EUT, power function without loss aversion), while they are very close together for the better-fitting models (power function with loss aversion, Prelec, and Lattimore). To see the reason, recall that, for these latter high-performing models, we find that bettors are risk-neutral, and that the coefficients for probability weighting are highly significant with bet sizes but insignificant without bet sizes.
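The crossover point quoted for the Camerer-Ho weighting function at α = 0.5 can be checked numerically. The following is a minimal sketch: the function names, the bracket [0.01, 0.99], and the tolerance are assumptions made for illustration, and the bisection presumes that the weighting function crosses the diagonal exactly once inside the bracket.

```python
def camerer_ho_weight(p: float, alpha: float) -> float:
    """Weighting p^a / (p^a + (1 - p)^a)^(1/a) for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    numerator = p ** alpha
    return numerator / (numerator + (1.0 - p) ** alpha) ** (1.0 / alpha)


def crossover_probability(alpha: float, lo: float = 0.01, hi: float = 0.99,
                          tol: float = 1e-10) -> float:
    """Bisection for the probability at which weight and probability coincide.

    Assumes overweighting at `lo`, underweighting at `hi`, and a single
    crossing in between.
    """
    def gap(p: float) -> float:
        return camerer_ho_weight(p, alpha) - p

    if gap(lo) <= 0.0 or gap(hi) >= 0.0:
        raise ValueError("no sign change on the bracket; adjust lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid  # still overweighted, crossing lies to the right
        else:
            hi = mid  # already underweighted, crossing lies to the left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"crossover for alpha = 0.5: {crossover_probability(0.5):.4f}")
```

For α = 0.5 the routine returns a value close to the 0.278 mentioned in the text; probabilities below that point receive a weight above their value, and probabilities above it receive a weight below their value.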
Now consider a simple example with two bets on longshots, the first one with odds of 3, a winning probability of 30 [...]

In the Appendix, we report our estimations when we use all bets (no threshold), and when we apply lower thresholds of 100NZ$ and 300NZ$, respectively, instead of 500NZ$. For the three specifications which give by far the highest log-likelihoods (power function with loss aversion, Prelec, and Lattimore, Baker and Witte), the coefficients are surprisingly robust both for the specifications with and without different bet sizes (footnote 11).

Footnote 11: For the power function without loss aversion, results vary considerably when assuming identical bet sizes but are relatively robust with bet sizes, and the results for the Camerer and Ho specification are partly robust.

These three models with the highest log-likelihoods share the following results. Allowing for different bet sizes reduces the standard errors for all coefficients to a large degree. As a consequence, most coefficients for probability weighting are insignificant when we counterfactually assume identical bet sizes, and highly significant with the bet sizes from our data set. Next, all estimations strongly suggest that the existence of a favorite-longshot bias has nothing to do with risk-loving preferences; θ is very close to zero and insignificant throughout, although the standard error is extremely small. Thus, our results strongly indicate that the representative bettor is risk-neutral. Consequently, the EUT model, where θ is inevitably negative, performs very poorly, which can be attributed
to the fact that risk-loving preferences are the only explanation for the FLB the model allows. Finally, comparing the five models based on CPT shows that modelling loss aversion explicitly (for the power function specification) or allowing for two parameters for probability weighting for both the gain and the loss domain is crucial for improving the model's explanatory power.

5 Conclusion

Due to a lack of data, the literature estimating the preferences of a representative bettor has assumed that bet sizes are independent of odds. Our data set confirms the intuition that, in reality, bet sizes are largely decreasing in odds, so that accounting for different bet sizes is important. We have estimated preferences both under the counterfactual assumption of equal bet sizes and with the average bet sizes in our data set. We have shown that allowing for different bet sizes reduces the standard errors for all coefficients to a large degree. Estimating the preferences for the specifications commonly used in the literature shows that those models based on cumulative prospect theory which include the possibility of loss aversion, or which allow for two different parameters for the probability weighting both in the gain and in the loss domain, outperform other specifications. Our analysis rejects the hypothesis that the favorite-longshot bias can be explained by risk-seeking bettors and strongly supports the hypothesis that the overbetting of longshots can well be explained by risk-neutral bettors with an overweighting of small probabilities both in the gain and in the loss domain.

In this paper, we have closely followed the literature, in particular the canonical paper by Jullien and Salanié (2000), in order to focus exclusively on the importance of integrating odds-dependent bet sizes into the representative bettor approach. There are at least two promising ways to exploit our data set for further research.
First, instead of extending the representative bettor approach by adding different bet sizes, we next aim at estimating individual preferences which can then be aggregated by different attributes such as age and gender. We are only aware of one such analysis (Andrikogiannopoulou, 2010), and due to the size of our data set, our analysis will allow for far more robust results.

Second, we have already mentioned that the existence of a FLB is consistent with risk-neutral bettors with linear probability weighting when we account for different bet sizes and the bookmaker's take-out rate: Betting on longshots yields lower losses per Dollar, but with larger bet sizes, the bookmaker's take-out rate matters more when betting on favorites. In our view, this simple insight is important as it means that the FLB does in fact require neither risk-loving preferences nor non-linear probability weighting but "only" that bettors have an intrinsic benefit from betting, which needs to be assumed anyway (see Jullien and Salanié, 2000, and Snowberg and Wolfers, 2010) in order to explain why people bet at all. Intuitively, one would assume that this
intrinsic benefit (tension) is increasing in the bettor's potential gain, which then requires higher bet sizes for favorites as found in all data sets including bet sizes (Kopriva, 2009; Andrikogiannopoulou, 2010). We hence aim at re-running all of our estimations by including an intrinsic benefit which is increasing at a decreasing rate in the potential gain of bets placed.
References

[1] Ali, M. (1977). Probability and Utility Estimates for Racetrack Bettors. Journal of Political Economy, 85(3), 803-815.
[2] Andrikogiannopoulou, A. (2010). Estimating Risk Preferences from a Large Panel of Real-World Betting Choices. Job market paper, Princeton University, working paper.
[3] Andrikogiannopoulou, A. & Papakonstantinou, F. (2011). Market Efficiency and Behavioral Biases in the Sports Betting Market. HEC Geneva, working paper.
[4] Boulier, B.L., Stekler, H.O. & Amundson, S. (2006). Testing the efficiency of the National Football League betting market. Applied Economics, 38, 279-284.
[5] Bradley, I. (2003). The representative bettor, bet size, and prospect theory. Economics Letters, 78(3), 409-413.
[6] Camerer, C.F. (2000). Prospect theory in the wild: Evidence from the field. In: Kahneman, D. & Tversky, A. (eds.), Choices, values, and frames (pp. 288-300). Cambridge University Press.
[7] Camerer, C.F. & Ho, T.-H. (1994). Violations of the Betweenness Axiom and Nonlinearity in Probability. Journal of Risk and Uncertainty, 8(2), 167-196.
[8] Chiappori, P.A., Gandhi, A., Salanié, B. & Salanié, F. (2012). From Aggregate Betting Data to Individual Risk Preferences. Columbia University, working paper.
[9] Direr, A. (2013). Are betting markets efficient? Evidence from European Football Championships. Applied Economics, 45, 343-356.
[10] Erev, I., Ert, E. & Yechiam, E. (2008). Loss aversion, diminishing sensitivity, and the effect of experience on repeated decisions. Journal of Behavioral Decision Making, 21, 575-597.
[11] Ert, E. & Erev, I. (2008). The rejection of attractive gambles, loss aversion, and the lemon avoidance heuristic. Journal of Economic Psychology, 29, 715-723.
[12] Gandhi, A. & Serrano-Padial, R. (2012). Does Belief Heterogeneity Explain Asset Prices: The Case of the Longshot Bias. University of Wisconsin-Madison, working paper.
[13] Gill, D. & Prowse, V. (2012). A structural analysis of disappointment aversion in a real effort competition. American Economic Review, 102(1), 469-503.
[14] Harinck, F., Van Dijk, E., Van Beest, I. & Mersmann, P. (2007). When gains loom larger than losses: Reversed loss aversion for small amounts of money. Psychological Science, 18, 1099-1105.
[15] Harrison, G.W. & Rutström, E.E. (2008). Risk Aversion in the Laboratory. In J.C. Cox & G.W. Harrison (eds.), Risk Aversion in Experiments. Bingley: Emerald, Research in Experimental Economics, 12.
[16] Jullien, B. & Salanié, B. (2000). Estimating Preferences under Risk: The Case of Racetrack Bettors. Journal of Political Economy, 108(3), 503-530.
[17] Jullien, B. & Salanié, B. (2008). Empirical Evidence on the Preferences of Racetrack Bettors. In D.B. Hausch & W.T. Ziemba (eds.), Handbook of sports and lottery markets, 27-49. Amsterdam: Elsevier.
[18] Kahneman, D. & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263-292.
[19] Kermer, D.A., Driver-Linn, E., Wilson, T.D. & Gilbert, D.T. (2006). Loss aversion is an affective forecasting error. Psychological Science, 17, 649-653.
[20] Kopriva, F. (2009). Constant Bet Size? Don't Bet on It! Testing Expected Utility Theory on Betfair Data. CERGE-EI Working Paper series.
[21] Köszegi, B. & Rabin, M. (2007). Reference-Dependent Risk Attitudes. American Economic Review, 97(4), 1047-1073.
[22] Lattimore, P., Baker, J.R. & Witte, A.D. (1992). The Influence of Probability on Risky Choice: A Parametric Examination. Journal of Economic Behavior and Organization, 17(2), 377-400.
[23] Law, D. & Peel, D.A. (2007). Gambling and nonexpected utility: the perils of the power function. Applied Economics Letters, 14, 79-82.
[24] Law, D. & Peel, D.A. (2009). Skewness as an explanation of gambling in cumulative prospect theory. Applied Economics, 41, 685-689.
[25] Levitt, S.D. (2004). Why are gambling markets organised so differently from financial markets? The Economic Journal, 114(495), 223-246.
[26] Nicolau, J.L. (2012). Battle Royal: Zero-price effect vs relative vs referent thinking. Marketing Letters, 23(3), 661-669.
[27] Ottaviani, M. & Soerensen, P.N. (2008). The Favorite-longshot Bias: An Overview of the Main Explanations. In D.B. Hausch & W.T. Ziemba (eds.), Handbook of sports and lottery markets, 83-102. Amsterdam: Elsevier.
[28] Prelec, D. (1998). The Probability Weighting Function. Econometrica, 66(3), 497-527.
[29] Snowberg, E. & Wolfers, J. (2010). Explaining the Favorite-Longshot Bias: Is It Risk-Love or Misperceptions? Journal of Political Economy, 118, 723-746.
[30] Sobel, R.S. & Raines, S.T. (2003). An Examination of the Empirical Derivatives of the Favourite-Longshot Bias in Racetrack Betting. Applied Economics, 35(4), 371-385.
[31] Tversky, A. & Kahneman, D. (1992). Advances in Prospect Theory: Cumulative Representation of Uncertainty. Journal of Risk and Uncertainty, 5, 297-323.
[32] Weitzman, M. (1965). Utility Analysis and Group Behavior: An Empirical Study. Journal of Political Economy, 73(1), 18-26.
[33] Woodland, L. & Woodland, B. (1994). Market efficiency and the favourite-longshot bias: The baseball betting market. Journal of Finance, 49, 269-279.
[34] Yechiam, E. & Telpaz, A. (2013). Losses induce consistency in risk taking even without loss aversion. Journal of Behavioral Decision Making, forthcoming.
Appendix

Table 3: Robustness checks

                Average bet size < 100              Average bet size < 300              All bets
                identical         observed          identical         observed          identical         observed

EUT
θ               .018*** (.003)    .004*** (.000)    .008*** (.001)    .002*** (.000)    .002*** (0.0004)  .002*** (.000)
Observations    51966             51966             89657             89657             108429            108429
Log-Likelihood  19182.1           21143.3           33102.7           41880.2           40046.5           54776.9

Power function without loss aversion
θ               .015 (.026)       .0004*** (.000)   .003 (.006)       .0002** (.000)    .0004 (0.001)     .0000*** (.000)
α_G             1.100 (.130)      .893*** (.002)    1.043 (.051)      .903*** (.0008)   1.018 (0.021)     .903*** (.0004)
α_L             .409*** (.196)    1.196*** (.005)   .490*** (.128)    1.266*** (.003)   .549*** (0.084)   1.360*** (.003)
Observations    51966             51966             89657             89657             108429            108429
Log-Likelihood  19173.3           20103.9           33080             36873.8           40006.5           47448.1

Power function with loss aversion
θ               .001 (.027)       .0000 (.0000)     .0004 (.006)      .0000 (.0000)     .0003 (0.001)     .0000*** (.000)
α_G             .989* (.149)      .992** (.003)     .993 (.055)       .993*** (.001)    .994 (0.024)      .993*** (.0003)
α_L             .971 (.574)       .966*** (.006)    .960 (.336)       .965*** (.002)    .956 (0.240)      .964*** (.0006)
λ               .785 (.148)       .783*** (.005)    .782* (.088)      .781*** (.002)    .781*** (0.075)   .781*** (.0005)
Observations    51966             51966             89657             89657             98527             98527
Log-Likelihood  19172.7           19173.2           33078.6           33083.2           36351.2           36360.9

Prelec (1998)
θ               .002 (.050)       .0000 (.0000)     .0008 (.010)      .0000 (.000)      .0002 (.001)      .0000* (.000)
α^1_G           .846 (.402)       .853*** (.006)    .851 (.190)       .854*** (.002)    .865 (.102)       .854*** (.0005)
α^2_G           1.078 (.459)      1.068*** (.006)   1.071 (.166)      1.066*** (.002)   1.059 (.066)      1.066*** (.0005)
α^1_L           1.130 (.584)      1.127*** (.006)   1.128 (.414)      1.128*** (.002)   1.112 (0.342)     1.128*** (.0004)
α^2_L           .937 (.650)       .925*** (.010)    .928 (.293)       .923*** (.003)    .907 (.143)       .924*** (.0008)
Observations    51966             51966             89657             89657             108429            108429
Log-Likelihood  19172.7           19172.9           33078.6           33080.8           40004.5           40057.9

Camerer and Ho (1994)
θ               .0009 (.016)      .001*** (.000)    .002 (.004)       .0001*** (.000)   .0007 (.001)      .0000*** (.000)
α_G             .955 (.045)       1.017*** (.002)   .953*** (.016)    .978*** (.0008)   .957*** (.007)    .958*** (.0005)
α_L             .808*** (.070)    .701*** (.004)    .807*** (.047)    .657*** (.001)    .798*** (.036)    .665*** (.0008)
Observations    51966             51966             89657             89657             108429            108429
Log-Likelihood  19172.7           20527.0           33078.8           36749.2           40004.8           45502.6

Lattimore et al. (1992)
θ               .0001 (.049)      .0000 (.0000)     .0000 (.008)      .0000 (.0000)     0.0000 (.001)     .0000 (.0000)
α^1_G           1.312 (1.805)     1.275*** (.025)   1.315 (.704)      1.277*** (.009)   1.316 (.324)      1.289*** (.002)
α^2_G           .999 (.462)       .992 (.007)       1.000 (.140)      .993*** (.002)    1.000 (.046)      .996*** (.0006)
α^1_L           .809 (.553)       .808*** (.015)    .810 (.263)       .809*** (.005)    .8105 (.219)      .816*** (.001)
α^2_L           1.003 (.974)      .981 (.012)       .005 (.441)       .981*** (.004)    1.006 (.239)      .988*** (.001)
Observations    51966             51966             89657             89657             108429            108429
Log-Likelihood  19172.7           19173.3           33078.6           33084.6           40004.5           40065.3

Note: standard errors in parentheses; *, **, and *** denote significance at the 10%, 5% and 1% level, respectively. For each threshold, the first column ("identical") assumes identical bet sizes and the second column ("observed") uses the observed bet sizes.
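As a reading aid for the table, the following sketch shows how an estimate and its standard error translate into a significance level. It assumes a two-sided Wald test with a normal approximation and assumes that the weighting parameters and λ are tested against one while θ is tested against zero; the paper does not spell out the test, so these choices and the function names are assumptions, and not every star in the table need be reproduced by this approximation.

```python
import math


def wald_p_value(estimate: float, std_error: float, null_value: float) -> float:
    """Two-sided p-value of a Wald z-test under a normal approximation."""
    if std_error <= 0.0:
        raise ValueError("std_error must be positive")
    z = (estimate - null_value) / std_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(p_value: float) -> str:
    """Map a p-value to the 10%, 5%, 1% star convention used in the table."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


if __name__ == "__main__":
    # Power function without loss aversion, first column of the table.
    # Assumption: both weighting parameters are tested against one.
    for name, est, se in (("alpha_L", 0.409, 0.196), ("alpha_G", 1.100, 0.130)):
        p = wald_p_value(est, se, null_value=1.0)
        print(f"{name}: p = {p:.4f} {stars(p)}")
```

Under these assumptions, the loss-domain estimate of .409 with a standard error of .196 differs from one at the 1% level, while the gain-domain estimate of 1.100 with a standard error of .130 does not differ significantly from one, which matches the stars shown for that column.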