Power law distribution of dividends in horse races

EUROPHYSICS LETTERS, 15 February 2001
Europhys. Lett., 53 (4), pp. 419-425 (2001)

K. Park and E. Domany
Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel

(received 8 August 2000; accepted 3 November 2000)

PACS. 02.50.-r Probability theory, stochastic processes, and statistics.
PACS. 05.40.-a Fluctuation phenomena, random processes, noise and Brownian motion.
PACS. 89.90.+n Other topics of general interest to physicists.

Abstract. We discovered that the distribution of dividends in Korean horse races follows a power law. A simple model of betting is proposed, which reproduces the observed distribution. The model provides a mechanism to arrive at the true underlying winning probabilities, which are initially unknown, in a self-organized collective fashion, through the dynamic process of betting. Numerical simulations yield excellent agreement with the empirical data.

During the past decade, power law distributions were discovered in a wide variety of phenomena. Quantitative analysis has been performed on distributions observed in diverse areas, such as fish school size [1], heart dynamics [2], frequency of jams in Internet traffic [3], size of war [4], scaling in currency exchange [5, 6] and stock market price changes [7, 8], just to cite from a random selection of disciplines. In the physical sciences, power law scaling is usually associated with critical behavior or with scale-free growth processes. A mechanism producing power law distributions has been found also in the study of stochastic processes involving multiplicative noise [9, 10]. Since the seminal work of Bak et al. [11], it has been tempting to try to connect such power laws with self-organized criticality (SOC). In existing models of SOC, such as sandpile, sliding-block, forest-fire, evolution, etc., a dynamic process plays a central role, yielding avalanches whose sizes follow a power law frequency distribution. Elucidating general mechanisms underlying power law behaviors in biological and social sciences is a challenging task.
We present here empirical evidence for a power law distribution in a new context: that of horse racing. We also propose a simple model to explain the phenomenon. We believe that the model is relevant beyond the specific context of horse racing, as it might explain how power law distributions emerge in social sciences, especially in a large assembly of interconnected human activities. Although the mechanism of producing power laws is quite different, the problem and the proposed model have several similarities with those of foreign exchange [5, 6], stock market [7, 8], and the minority game [12]: many independent agents participate, and each makes decisions according to his own strategy, based on previous results. The activity results in a change of an index, such as price, exchange rate, and dividend, which is known to all.

© EDP Sciences

We studied the distributions of the values of the winning dividends in horse races held from 1996 to 1999 in Korea, obtained from the database of the Korea Racing Association [13]. The dividend is determined in an interesting way in these races. Betting is open for a time interval before the race, during which people can bet on the horse of their choice. During the process of betting, $n_i$, the total amount of money that has been bet so far on each horse $i$, is displayed on a board, which is updated at time steps of 30 seconds. The dividend, $f_i$, is determined at every time step according to the simple equation

$$f_i = (1-r)\frac{Z}{n_i}, \qquad Z = \sum_i n_i, \tag{1}$$

and is also displayed. Here $r$ is a tax collected by the racing organization. Betting stops at a fixed time before the race. The final value reached by $f_i$ is the amount paid (per unit money invested) to a person who has picked the winning horse. This kind of gambling is unique in that the dividend on each horse is determined only after the entire process of betting has been completed. In general, a horse which is believed to have a high probability of winning, $p_i$, yields a lower dividend than horses with little chance. Therefore it can be expected that races yielding higher dividends occur less often than those yielding lower ones, provided the bettors have reliable information on the horses.

We collected and tabulated the values of the winning dividends; we defined bins limited by integer values of $f$ (the first bin for $1 < f < 2$, the second for $2 < f < 3$, etc.), and evaluated the frequency of occurrence of the winning dividends in each bin. We found that over a considerable range their distribution follows a power law,

$$P(f) \sim f^{-x}, \qquad x \simeq 1.70, \tag{2}$$

as shown in fig. 1. The value $x \simeq 1.70$ was obtained by a best fit to the data in the range $2 \le f \le 40$. We plotted here the relative frequency of occurrence for all values of $f$, for races with $N = 12$ and $N = 14$ horses, as well as for the sum of all races, irrespective of $N$.
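The pool rule of eq. (1) and the binning and fitting procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, each integer bin is represented by its midpoint, and the fit is an unweighted least-squares line in log-log coordinates, since the text does not say which fitting method was used.

```python
import math

def dividends(bets, r):
    """Eq. (1): f_i = (1 - r) * Z / n_i, where Z is the total amount bet."""
    Z = sum(bets)
    return [(1.0 - r) * Z / n for n in bets]

def binned_frequencies(winning_dividends):
    """Relative frequency of winning dividends in integer bins [k, k + 1)."""
    counts = {}
    for f in winning_dividends:
        k = int(math.floor(f))
        counts[k] = counts.get(k, 0) + 1
    total = len(winning_dividends)
    return {k: c / total for k, c in counts.items()}

def fit_exponent(freqs, f_min=2, f_max=40):
    """Estimate x in P(f) ~ f**(-x) from bins f_min..f_max.

    Assumption of this sketch: bin k is placed at its midpoint k + 0.5 and
    the slope comes from ordinary least squares on (log f, log P).
    """
    pts = [(math.log(k + 0.5), math.log(p))
           for k, p in freqs.items() if f_min <= k <= f_max and p > 0]
    m = len(pts)
    mean_u = sum(u for u, _ in pts) / m
    mean_v = sum(v for _, v in pts) / m
    cov = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    var = sum((u - mean_u) ** 2 for u, _ in pts)
    return -cov / var
```

Given a list of winning dividends, one per race, `fit_exponent(binned_frequencies(winning_dividends))` returns the estimated exponent over the fitting range quoted in the text.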
The database contains records of 1447 races with $N = 14$, 652 with $N = 12$; the number of races is considerably lower for other values of $N$, and hence the corresponding frequency plots are much noisier and not shown. For $8 \le N \le 14$ the distribution is effectively independent of the number of horses $N$.

Fig. 1. Number density distribution of the values of dividends in horse races in Korea from January 6, 1996 to December 19, 1999, for races with $N = 12$ and $N = 14$ horses, and for the sum of all races. A best fit for a power law, $P \sim f^{-1.70}$, is also shown. (Log-log plot of $P(f)$ versus $f$; curves: Total, $N = 14$, $N = 12$.)

We look for models that reproduce the two main features of the empirical data: a) that the races yielding the lowest dividends occur most frequently, and b) the distribution has the form of a power law over a significant range of $f$ values. We assume that the basic rule that governs people's betting strategies is to pick at each moment $t$ that horse $i$ for which the expectation value of the gain, $f_i p_i = (1-r) Z p_i / n_i$, is maximal.

The simplest model we can think of assumes that there exists a fixed set of probabilities $p_i$ (of horse $i$ winning the race). Assume that these $p_i$ are accessible to everyone in the stadium and that all bettors use the same information and betting strategy. These assumptions define a dynamical betting process, which has a stable attractor: the state of equal constant gain for each horse, i.e., $p_i f_i = C$. It is trivial to see that this situation is indeed stable against perturbations; e.g., increasing $n_i/Z$ decreases the dividend of horse $i$, so that $f_i p_i$ decreases and at the next time steps no bets will be placed on horse $i$. This simplest model therefore yields the final dividends $f_i = C/p_i$; horse $j$ will win the race with probability $p_j$ and the winning dividend will be $f_{\rm winner} = C/p_{\rm winner}$. If the winning probabilities are generated from any reasonable distribution, this simplest model fails to reproduce both aspects a) and b) mentioned above: the resulting dividend distribution has a peak at some intermediate $f \sim 1/p$ and we do not get the observed power law decay for large $f$. To obtain the power law distribution of dividends for larger $f$, we have to tune carefully the distribution from which the winning probabilities are selected for small $p$. Moreover, in order to have the peak at the lowest $f$, in a large fraction of the races there should be a most likely winner, whose chance of winning dominates all the other horses. However, nobody, especially no horse racing organization, would tolerate such a trivial race. Usually a rule of handicap is implemented to equalize the chance of winning for all the horses; the horses with better records carry heavier burdens than others in the race. The resulting difficulties of prediction are pointed out in ref. [14]. Although several kinds of information magazines are sold at the racing stadium which summarize the details of previous races, ground conditions, weather, handicap, harmony with jockey, etc., the expectations of the different experts are not consistent with each other for most races. Hence one cannot assume that there exists reliable information on the winning probabilities, which is shared and used by all bettors.

Nevertheless, our main conclusion, which will be substantiated later, is that there indeed exists a most-likely winner, in spite of the handicaps, even though the information is initially known to none. To explain this claim, we have to find out how a large fraction of the bettors identifies this likely winner in the absence of reliable information. And what kind of distribution of probability of winning for each race will yield the empirical power law distribution of dividends?
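The attractor of the simplest model (fixed, publicly known $p_i$) can be checked numerically. Combining $p_i f_i = C$ with eq. (1) gives $n_i/Z = (1-r)p_i/C$, and summing over all horses gives $C = 1-r$, so the bet shares should approach $n_i/Z = p_i$. The sketch below is illustrative only: the function name is hypothetical, bets are placed one unit at a time, and one unit is seeded on every horse so that all dividends are finite at the start (an assumption not taken from the text).

```python
def greedy_betting(p, r=0.2, n_bets=10000):
    """Sequential betting with known probabilities p_i.

    Each unit bet goes to the horse with the largest expected gain
    f_i * p_i, where f_i = (1 - r) * Z / n_i as in eq. (1).
    Returns the final bet shares n_i / Z and the final dividends f_i.
    """
    n = [1] * len(p)   # assumption: seed one unit per horse
    Z = len(p)
    for _ in range(n_bets):
        gains = [(1.0 - r) * Z / n_i * p_i for n_i, p_i in zip(n, p)]
        j = max(range(len(p)), key=gains.__getitem__)
        n[j] += 1
        Z += 1
    shares = [n_i / Z for n_i in n]
    final_dividends = [(1.0 - r) * Z / n_i for n_i in n]
    return shares, final_dividends
```

For example, with `p = [0.5, 0.3, 0.2]` the returned shares stay close to `p` and every product $f_i p_i$ stays close to $1 - r$.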
To address these questions, we introduce a simple model of betting, in which people rely both on some initial naive guess and on the information displayed on the board. There might be professional gamblers equipped with accurate information; they are, however, a small minority. Most bettors are amateurs who play the races just for their leisure and fun. For them the magazines are the only available prior information, and the dividends displayed on the board affect their decision considerably, since they reflect the information available to all other people, including the professionals. It is natural for this large majority of amateurs to regard the horse with the highest $n_i$ as the best. We therefore assume that initially people bet only on the basis of some naive expectations, based on the magazines, etc. We denote this initial naive guess of the probability that horse $i$ will win by $p_i^{(0)}$. For simplicity, betting is assumed to occur sequentially and with the same amount of money at each time step. Further, we assume that after every bet, the board is immediately updated. The following betting dynamics is assumed and investigated: At time $t$, the bettor views the information displayed on the board (the $f_i(t)$'s or, equivalently, the $n_i(t)$'s). The bettor uses this information to update his current estimates for the probabilities of winning,

$$P_i(t) = F[p_i^{(0)}, f_i(t)], \tag{3}$$

and places his bet on horse $j$, for which (at that time) the expected gain is maximal:

$$n_i(t+1) = n_i(t) + \delta_{i,j}, \tag{4}$$

where $j$ is the index $i$ that maximizes $f_i(t) P_i(t)$. The dividends are updated on the board and at the next time step people bet using the modified probabilities and dividends, and so on. Due to the tax, the maximal gain may correspond to a dividend which is lower than unity. In such cases we add a restriction that moves the bet to the horse with the next largest gain. The results with and without this restriction do not yield noticeable difference in numerical simulations [15].

This process converges to a set of winning probabilities $P_i$; the final assumption we take is that these are the true probabilities, according to which the winner is selected. It is important to state that in all but a single special case simulations of the dynamic process described above converged to sets of $P_i$ and $f_i$ such that $P_i f_i = {\rm const}$, irrespective of the models $F$ and distributions of the initial probabilities that were used. However, finding a model which leads to the observed distribution of winning dividends is not an easy task.

The dynamic process described above is completely deterministic. The only stochastic components are that of initialization, i.e. when the initial probabilities $p_i^{(0)}$ are drawn from some distribution $Q(p^{(0)})$, and the final step of selecting the winning horse according to the final probabilities $P_i$. The distribution of winning dividends is determined by these probabilities and is independent of the dynamic process that generated them. We are left with the freedom of choosing a) the distribution $Q$ of the initial $p_i^{(0)}$ and b) the model $F$, see eq. (3), used to update the probabilities. There are many plausible choices for $Q$. We do not know how the horse racing organization arranges the list of horses in each race from a pool of hundreds of horses; presumably the organizer himself does not have any idea of the probability distribution of winning. We hope to find a model which converges in a self-organized way (i.e. through the betting dynamics, regardless of the distribution of initial probabilities) to a set of $P_i$ which reproduce the observed power law distribution of winning dividends.

Numerical simulations. Our simulations were done for $N = 12$ and 14 horses in each race. The value $r = 0.2$ was used for the tax. For each race we first generated a set of initial winning probabilities in the following way. A set of $N$ random independent numbers $x_i$ were generated from a normal distribution. The initial probabilities were determined by $p_i^{(0)} = x_i / \sum_j x_j$. Hereafter we will refer to this as the primitive distribution.

Additive correction. We analyzed various forms of the function $F[p_i^{(0)}, n_i/Z]$. We first tried an additive correction of the initial probability,

$$P_i = \frac{(1-\lambda)\, p_i^{(0)} + \lambda\, g(n_i/Z)}{1 - \lambda + \lambda \sum_j g(n_j/Z)}, \tag{5}$$

where $\lambda$ is a parameter that controls the weight given to the correction due to the updated dividends; $g(x)$ is an increasing function of $x$. We tried the form

$$g(x) = x^{\alpha}, \tag{6}$$

and investigated three cases: $\alpha = 1$, $\alpha > 1$, and $\alpha < 1$. When $\alpha = 1$, the expected gain at time $t$ is given by

$$f_i P_i = (1-r)\left[(1-\lambda)\frac{Z\, p_i^{(0)}}{n_i} + \lambda\right]. \tag{7}$$

In the dynamics generated by this function the object to be maximized by the bettor depends only on $f_i p_i^{(0)}$, just as in the simplest model considered above, without corrections. Indeed we find convergence at long times to the same final state, with $p_i^{(0)}/n_i \simeq {\rm const}$. Simulations with $\alpha = 1$ and the primitive initial distribution for $N = 12, 14$ horses yield dividend distributions that differ from the empirical observation [15]; a pronounced maximum is observed at $f \simeq 5$. On this basis we can rule out the additive model with $\alpha = 1$.

For the case $\alpha > 1$ we found two regimes in our simulations. For $\lambda > 0.8$ we found that after some transient always one horse dominates the betting; as $n_i/Z$ increases, the increase of $P_i$ dominates the decrease of $f_i$ and hence its expected gain increases. All people bet on the same horse: this situation does not lead to the observed distribution of dividends. For smaller values of $\lambda$, increased betting on a horse decreases its expected gain, and the final $n_i/Z$ are again distributed so as to equalize the values $f_i P_i$. Our simulations with $0.0 \le \lambda \le 0.6$ yielded results in qualitative agreement with the previous case, of $\alpha = 1$. In the intermediate region of $\lambda$, the races fall into one of two classes: one won by the horse with the dominant bets, i.e., $f \simeq 1$, and the second won by other horses. The distribution of dividends for the second class becomes noisier and noisier as $\lambda$ grows and finally for $\lambda > 0.8$ it becomes discontinuous, so we are left with the first class.

For the last possibility, $\alpha < 1$, the gain $f_i P_i$ is a monotonically decreasing function of $n_i$. For increasing $\lambda$ the values of the $n_i$'s become more and more uniform, tending to $n_i/Z \simeq 1/N$, and the corresponding distribution of dividends narrows, until at $\lambda = 1$ it reaches a delta function with all dividends taking the value $(1-r)N$.
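The additive rule of eq. (5) with $g(x) = x^{\alpha}$ is easy to evaluate directly, which makes it possible to confirm that the $P_i$ are normalized and that the expected gain reduces to eq. (7) when $\alpha = 1$. The following is a minimal sketch; the function names are hypothetical, and the initial guesses `p0` are assumed to sum to one.

```python
def additive_probabilities(p0, n, lam, alpha):
    """Eq. (5) with g(x) = x**alpha (eq. (6)).

    p0  : initial guesses p_i^(0), assumed to sum to 1
    n   : amounts n_i bet so far on each horse
    lam : weight lambda of the board-based correction
    """
    Z = sum(n)
    g = [(n_i / Z) ** alpha for n_i in n]
    denom = (1.0 - lam) + lam * sum(g)
    return [((1.0 - lam) * p + lam * g_i) / denom for p, g_i in zip(p0, g)]

def expected_gains(p0, n, lam, alpha, r=0.2):
    """Expected gain f_i * P_i, with f_i = (1 - r) * Z / n_i from eq. (1)."""
    Z = sum(n)
    P = additive_probabilities(p0, n, lam, alpha)
    return [(1.0 - r) * Z / n_i * P_i for n_i, P_i in zip(n, P)]
```

For $\alpha = 1$ the normalizing sum equals one, so `expected_gains` agrees term by term with the bracket in eq. (7).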
We thus conclude that the additive rule (5) with the simple form (6) for $g$ does not produce the observed distribution of the winning dividends.

We now turn to a multiplicative rule, of the form

$$P_i = p_i^{(0)}\, g(n_i/Z) \Big/ \sum_j p_j^{(0)}\, g(n_j/Z), \tag{8}$$

again with $g(x) = x^{\alpha}$. It is easy to show that $\alpha$ should be less than unity; otherwise the horse with the initially maximal winning probability takes all the bets. Therefore we limited our simulations to $0 \le \alpha < 1$. We found that for $\alpha = 0$ the distribution of winning dividends does not peak in the lowest interval. The frequency of the lowest dividend increases with $\alpha$. The best fits to the observed data were obtained using $\alpha_c = 0.63$ and $0.65$ for $N = 12$ and $N = 14$ horses, respectively. The results of our simulations, performed for these values of $\alpha$, are presented in fig. 2(a) and (b), together with the real data. In fig. 2(a), the simulation data are averaged over 10 samples of 5000 races with $Z = 10000$ for each race. The error bars have been estimated by the standard deviation. For a fixed value of $Z$, dividends are found only at discrete values $f_n = (1-r)Z/n$, with integer $n$. This makes the distribution of dividends discontinuous. Note that we have omitted the points of zero occurrence in fig. 2. In real situations the fluctuations in $Z$ make the distribution continuous, and the tail is absorbed into the desired power law scaling in the entire range. We have verified that as $Z$ is increased, the tail decreases, as is shown in fig. 2(b); hence the long tail in the region $f > 100$ is a finite-size effect. The minimal value of $P(f)$ is determined by the number of total races in the simulation.

Fig. 2. Number density distributions of real data and the best-fitted simulations, with (a) $\alpha_c = 0.63$ for $N = 12$ and (b) $\alpha_c = 0.65$ for $N = 14$. The simulation data are averaged over 10 samples of 5000 races. The error bars have been estimated by the standard deviation. (Log-log plots of $P(f)$ versus $f$; panel (a): real data and simulation; panel (b): real data and simulations with $Z = 10000$ and $Z = 50000$.)

The final distribution of the $n_i$'s for each race is such that $P_i f_i \simeq {\rm const}$. Since $f_i \propto 1/n_i$ and, by eq. (8), $P_i \propto p_i^{(0)} n_i^{\alpha_c}$, this means $[p_i^{(0)}]^{1/(1-\alpha_c)}/n_i \simeq {\rm const}$. Thus from eq. (8) the self-organized probability of winning reduces to

$$P_i \propto \left[p_i^{(0)}\right]^{1/(1-\alpha_c)}. \tag{9}$$

This is an interesting result; it means that in the course of the betting dynamics, the probability of winning evolves in a way that enhances differences: the winning probabilities of initially better horses are much more increased and those of initially worse horses become even more suppressed. Thus, in the final states, there indeed exists a most likely winner whose probability of winning dominates those of all the other horses, giving rise to the observed maximum at the lowest bin of dividends.

To test the extent to which our results depend on the distribution of initial winning probabilities, we replaced the normal distribution for the $x_i$ (from which the $p_i^{(0)} = x_i/\sum_j x_j$ are constructed), by a uniform distribution $0 \le x_i \le 1$. Simulations with this distribution and the multiplicative rule again yielded the observed distribution of dividends. Detailed simulations with various initial distributions will be given elsewhere [15]. In general, the value of $\alpha_c$ increases as the variance among the initial probabilities decreases. For bounded distributions, such as the uniform one, the best fits were found near $\alpha_c = 0.9$.

We summarize and discuss the few assumptions that were made by our simple model of horse racing:

i) The board of dividends is updated on-line and all bets are for the same amount.
In fact betting is open for 30 minutes, during which the board is updated 60 times. The possible effects of having a finite number of updates, at discrete intervals, will be presented elsewhere [15].

ii) All bettors are rational and use the same optimal strategy. We did not allow any noise in the decision, such as that which may be introduced by random bettors.

iii) All the bettors use the same initial guess for the winning probabilities. This is the strongest assumption. We have reasons to believe that the amplifying mechanism described above, which is generated by the betting process, will reduce the effect of allowing noise into the initial guess. Nevertheless, an improved model should consider the effects of both kinds of noise.
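Under exactly these assumptions (immediate board updates, equal unit bets, identical rational bettors sharing one initial guess), the betting dynamics of eqs. (3), (4) with the multiplicative rule of eq. (8) can be sketched in a few lines and compared with the prediction of eq. (9). This is an illustration, not the simulation code used for fig. 2: the function names are hypothetical, one unit is seeded on each horse so that the dividends are finite at the start, and the restriction for dividends below unity and the random draw of the winner are both omitted.

```python
def multiplicative_probabilities(p0, n, alpha):
    """Eq. (8) with g(x) = x**alpha: P_i proportional to p_i^(0) * (n_i/Z)**alpha."""
    Z = sum(n)
    w = [p * (n_i / Z) ** alpha for p, n_i in zip(p0, n)]
    norm = sum(w)
    return [w_i / norm for w_i in w]

def multiplicative_betting(p0, alpha, r=0.2, n_bets=10000):
    """Sequential unit bets, each on the horse with maximal f_i * P_i.

    Returns the final probabilities P_i and dividends f_i.
    """
    n = [1] * len(p0)  # assumption: seed one unit per horse
    for _ in range(n_bets):
        Z = sum(n)
        P = multiplicative_probabilities(p0, n, alpha)
        gains = [(1.0 - r) * Z / n_i * P_i for n_i, P_i in zip(n, P)]
        j = max(range(len(p0)), key=gains.__getitem__)
        n[j] += 1      # eq. (4): only the chosen horse's total grows
    Z = sum(n)
    P = multiplicative_probabilities(p0, n, alpha)
    f = [(1.0 - r) * Z / n_i for n_i in n]
    return P, f
```

Running it with a small set of initial guesses and $\alpha = 0.63$ and comparing the returned `P` with the normalized values of $[p_i^{(0)}]^{1/(1-\alpha)}$ shows the amplification of initial differences described by eq. (9).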

There is an interesting analogy between this model and one of stock markets, in which traders are divided into the two classes, rational and noisy [16]. We may also classify people into professionals, who come with prior knowledge, and amateurs, who use only the information on the board. It is important to note that the role of the amateur is different from that of a noisy trader in stock market models; here the true probability of winning is determined through the decision of all the people, so that professionals alone cannot apply an optimal strategy.

We thank I. Kanter for discussions; KP thanks H. Cho for teaching him about horse racing.

REFERENCES
[1] Bonabeau E. and Dagorn L., Phys. Rev. E, 51 (1995) R5220.
[2] Peng C.-K., Mietus J., Hausdorff J., Havlin S., Stanley H. E. and Goldberger A. L., Phys. Rev. Lett., 70 (1993) 1343.
[3] Takayasu M., Takayasu H. and Sato T., Physica A, 233 (1996) 824.
[4] Roberts D. C. and Turcotte D. L., Fractals, 6 (1998) 351.
[5] Ghashghaie S., Breymann W., Peinke J., Talkner P. and Dodge Y., Nature, 381 (1996) 767.
[6] Galluccio S., Caldarelli G., Marsili M. and Zhang Y.-C., Physica A, 245 (1997) 423.
[7] Mantegna R. N. and Stanley H. E., Nature, 376 (1995) 46.
[8] Bak P., Chen K., Scheinkman J. A. and Woodford M., Ric. Economiche, 47 (1993) 3.
[9] Levy M. and Solomon S., Int. J. Mod. Phys. C, 7 (1996) 595.
[10] Amaral L. A. N., Buldyrev S. V., Havlin S., Salinger M. A. and Stanley H. E., Phys. Rev. Lett., 80 (1998) 1385.
[11] Bak P., Tang C. and Wiesenfeld K., Phys. Rev. Lett., 59 (1987) 381.
[12] Challet D. and Zhang Y.-C., Physica A, 246 (1997) 407; 256 (1998) 514.
[13] Data used here is available at http://www.kra.or.kr, where 4283 races from January 6, 1996 to December 19, 1999 are collected.
[14] de Waal D. J., S. Afr. Stat. J., 32 (1998) 83.
[15] Park K., in preparation.
[16] Bak P., Paczuski M. and Shubik M., Physica A, 246 (1997) 430.