Maximum Entropy, Parallel Computation and Lotteries




S.J. Cox
Department of Electronics and Computer Science, University of Southampton, UK.

G.J. Daniell
Department of Physics and Astronomy, University of Southampton, UK.

D.A. Nicole
Department of Electronics and Computer Science, University of Southampton, UK.

Abstract

By picking unpopular sets of numbers in a lottery, it is possible to increase one's expected winnings. We have used the Maximum Entropy method to estimate the probability of each of the 14 million tickets being chosen by players in the UK National Lottery. We discuss the parallel solution of the non-linear system of equations on a variety of platforms and give results which indicate the returns achieved by a syndicate buying a large number of tickets each week.

Keywords: Maximum Entropy, Lottery, Parallel Computation, Commodity Supercomputing.

1 Introduction

In many lotteries the prizes which players win depend on the number of other winners. In the example of the UK National Lottery, which we use throughout this paper, players pick 6 numbers from 1 to 49. A similar system operates in many lotteries across the world: the Florida state lottery also allows choice of 6 numbers from 49, whilst in the California State lottery (Superlotto) players pick 6 from 51.

For the UK National Lottery, 6 main numbers and a bonus number are drawn every Wednesday and Saturday. Players are awarded a fixed £10 prize if they match 3 of the main numbers. The prizes in the other categories depend on the numbers of winners and are typically £62 for a 4-match, £1500 for a 5-match, £100 000 for matching 5 of the main numbers and the bonus, while a typical jackpot winner receives around £2 000 000 [1, 2, 3]. The prize fund is made up of 45 pence from every pound ticket bought.

In a previous paper [4] we applied the Maximum Entropy method to elicit structure in players' choices of numbers starting from the published numbers of prize winners. We estimated the popularity of each of the 14 million tickets, from which we computed the popularity of individual numbers and pairs of numbers. A crude calculation showed that it is possible to double one's expected winnings when purchasing a single unpopular ticket.
In this paper we focus on the parallel solution of the system of non-linear equations which result from the application of the Maximum Entropy method and show the returns to a syndicate buying a large number of tickets. The layout of the paper is as follows. In section 2 we discuss the application of the Maximum Entropy method. We discuss the nature of the parallel solution of the resulting set of non-linear equations and give some of the first scientific results using Fortran with MPI on a commodity cluster of DEC Alpha workstations running Windows NT [5] in section 3. The new results we present in section 4 show that a syndicate which buys around 75000 tickets per week would benefit from choosing the unpopular numbers which we can identify. We draw our conclusions in section 5.

2 Application of Maximum Entropy

We wish to determine the probability of each of the possible 13 983 816 tickets being bought, subject to the constraints that the probabilities are consistent with the numbers of winners observed in the draws so far. The data is available from an independent internet source [6].

Jaynes' Maximum Entropy Principle says that if one is forced to assign probabilities, $p_i$, using limited information, one should do so by maximising the entropy of the distribution:

$$S = -\sum_i p_i \log p_i, \tag{1}$$

subject to the constraints of known expectation values [7]. This Maximum Entropy distribution is the most conservative assignment in the sense that it does not permit one to draw any conclusions not warranted by the data [8].

We use the following notation [4]. Each ticket is denoted by a single index $t$, which is an abbreviation for six numbers, chosen without repetition, from the set of integers {1, 2, ..., 49}. $P(t)$ is the probability that a player chooses the ticket labelled $t$. The winning set of numbers drawn in a particular week is denoted by $r$. Let

$$\delta_j(t, r) = \begin{cases} 1 & \text{if } t \text{ and } r \text{ have exactly } j \text{ numbers in common,} \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

The expected fraction of players matching exactly $j$ numbers, in a week when the winning numbers are indexed by $r$, is then given by $f_j(r)$ where:

$$f_j(r) = \sum_t \delta_j(t, r)\, P(t). \tag{3}$$

Suppose $W$ lottery draws have been made. Then the values of $f_3(r)$, $f_4(r)$, and $f_5(r)$ are known for $W$ different values of $r$: $r_1, r_2, \ldots, r_W$. Equation (3) then leads to a set of $3W$ constraints that apply to the distribution $P(t)$. We assume that $P(t)$ is independent of time and, since players make independent samples from $P(t)$, the number of players buying ticket $t$ follows a Poisson distribution with parameter $\mu(t) = P(t)\,N$, when $N$ tickets are sold.
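As an illustrative sketch (not the authors' code), the match indicator of equation (2) and the expected fractions of equation (3) can be evaluated by brute force for a small lottery. The toy lottery (3 numbers from 5), the uniform choice of P(t), and all function names below are assumptions made for illustration. For the full 6-from-49 game, a uniform P(t) lets the fractions be written with binomial coefficients instead of enumerating every ticket.

```python
from itertools import combinations
from math import comb

def delta(j, ticket, winning):
    """Equation (2): 1 if ticket and winning set share exactly j numbers, else 0."""
    return 1 if len(set(ticket) & set(winning)) == j else 0

def expected_fraction(j, winning, prob):
    """Equation (3): sum over all tickets of delta_j(t, r) * P(t)."""
    return sum(delta(j, t, winning) * p for t, p in prob.items())

# Toy lottery (assumed for illustration): choose 3 numbers from {1, ..., 5}.
toy_tickets = list(combinations(range(1, 6), 3))
uniform = {t: 1.0 / len(toy_tickets) for t in toy_tickets}
toy_f2 = expected_fraction(2, (1, 2, 3), uniform)
toy_f3 = expected_fraction(3, (1, 2, 3), uniform)

# Full game: 6 numbers from 49.
total_tickets = comb(49, 6)

def uniform_fraction(j, pick=6, pool=49):
    """f_j when every ticket is equally popular: count matching tickets, divide by total."""
    return comb(pick, j) * comb(pool - pick, pick - j) / comb(pool, pick)

print(total_tickets)
print(toy_f2, toy_f3)
print([uniform_fraction(j) for j in (3, 4, 5)])
```

Departures of the observed fractions from these uniform values are what carry information about which tickets are popular.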
We find the maximum entropy estimate of the popularity of each ticket is $\hat{P}(t)$:

$$\hat{P}(t) = \frac{1}{Z} \exp\Big( \sum_{j,n} \lambda_j^n\, \delta_j(t, r_n) \Big), \tag{4}$$

in which $Z$, the partition function, normalises the probability distribution:

$$Z = \sum_t \exp\Big( \sum_{j,n} \lambda_j^n\, \delta_j(t, r_n) \Big). \tag{5}$$

To find the unknown Lagrange multipliers, $\lambda_j^n$, we substitute $\hat{P}(t)$ from (4) into the constraint equations (3). For $W$ draws, the constraint equations define a set of $3W$ non-linear equations for the Lagrange multipliers, $\lambda_j^n$:

$$f_j(r_n) = \frac{1}{Z} \sum_t \delta_j(t, r_n) \exp\Big( \sum_{j',n'} \lambda_{j'}^{n'}\, \delta_{j'}(t, r_{n'}) \Big). \tag{6}$$

3 Parallel Computational Method

To solve the non-linear equations defined by (6), we use an iterative technique based on Newton's method [9] in which we supply the analytic Jacobian. The procedure converges in 5-8 iterations. The equations (6) may be written as:

$$G_j^n = \sum_t \big\{ \delta_j(t, r_n) - f_j(r_n) \big\} \exp\Big( \sum_{j',n'} \lambda_{j'}^{n'}\, \delta_{j'}(t, r_{n'}) \Big) = 0, \tag{7}$$

which yields the following explicit elements of the Jacobian:

$$J_{js}^{ni} = \frac{\partial G_j^n}{\partial \lambda_s^i} = \sum_t \delta_s(t, r_i) \big\{ \delta_j(t, r_n) - f_j(r_n) \big\} \exp\Big( \sum_{j',n'} \lambda_{j'}^{n'}\, \delta_{j'}(t, r_{n'}) \Big). \tag{8}$$

Each iteration updates the Lagrange multipliers using

$$\lambda_s^i(\text{new}) = \lambda_s^i(\text{old}) - \sum_{j,n} \big( J^{-1} \big)_{sj}^{in}\, G_j^n. \tag{9}$$

We have designed an efficient parallel algorithm to solve the system of non-linear equations (7) which consists of two parts:

1. The Jacobian for the system is filled in parallel, by dividing up the sum over $t$ (the 14 million tickets) in (8) between the processors.

2. Calculation of the Jacobian may be expressed as computation of the partition function (5). Using the aggregate memory on the multiple processors, it is possible to store a lookup table from which the partition function may be computed easily.
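The Newton iteration can be sketched on a toy problem. This is a serial illustration under stated assumptions, not the authors' Fortran/MPI code. It uses a lottery of 3 numbers from 5 with three assumed draws and an invented popularity rule (tickets containing the number 1 are twice as popular). It solves the normalised form of the constraints (model fractions minus observed fractions), which has the same roots as equation (7). It also adds a step-halving safeguard that the paper does not mention. All function and variable names are hypothetical.

```python
from itertools import combinations
from math import exp, log

def overlap(a, b):
    return len(set(a) & set(b))

def feature_rows(tickets, draws, matches):
    """One row per ticket holding delta_j(t, r_n), ordered by j and then by draw n."""
    return [[1.0 if overlap(t, r) == j else 0.0 for j in matches for r in draws]
            for t in tickets]

def probabilities(lam, rows):
    """Equation (4); also returns the partition function Z of equation (5)."""
    weights = [exp(sum(l * x for l, x in zip(lam, row))) for row in rows]
    z = sum(weights)
    return [w / z for w in weights], z

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for q in range(c, n + 1):
                m[r][q] -= factor * m[c][q]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][q] * x[q] for q in range(r + 1, n))) / m[r][r]
    return x

def objective(lam, rows, target):
    """log Z - lambda . f, whose gradient is the constraint residual."""
    _, z = probabilities(lam, rows)
    return log(z) - sum(l * f for l, f in zip(lam, target))

def newton(rows, target, max_iter=50, tol=1e-12):
    k = len(target)
    lam = [0.0] * k
    for _ in range(max_iter):
        p, _z = probabilities(lam, rows)
        mean = [sum(pi * row[a] for pi, row in zip(p, rows)) for a in range(k)]
        resid = [mean[a] - target[a] for a in range(k)]
        if max(abs(g) for g in resid) < tol:
            break
        # Analytic Jacobian of the residual: covariance of the indicators.
        jac = [[sum(pi * row[a] * row[b] for pi, row in zip(p, rows)) - mean[a] * mean[b]
                for b in range(k)] for a in range(k)]
        step = solve_linear(jac, resid)
        base, scale = objective(lam, rows, target), 1.0
        while True:  # halve the step until the objective no longer increases
            trial = [l - scale * s for l, s in zip(lam, step)]
            if objective(trial, rows, target) <= base + 1e-12 or scale < 1e-8:
                break
            scale *= 0.5
        lam = trial
    return lam

tickets = list(combinations(range(1, 6), 3))
draws = [(1, 2, 3), (1, 3, 4), (1, 4, 5)]          # assumed winning sets
rows = feature_rows(tickets, draws, matches=(2, 3))
raw = [2.0 if 1 in t else 1.0 for t in tickets]     # invented popularity rule
true_p = [w / sum(raw) for w in raw]
target = [sum(p * row[a] for p, row in zip(true_p, rows)) for a in range(len(rows[0]))]

lam = newton(rows, target)
fitted, _ = probabilities(lam, rows)
print([round(l, 4) for l in lam])
```

In the full problem the sums over tickets inside `probabilities` and the Jacobian are the expensive part, which is why the paper divides them between processors.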

The lookup table yields which tickets won prizes in which weeks. For each week of data this table has respectively 246 820, 13 545, and 258 entries from tickets winning 3, 4, and 5 match prizes. To store the lookup table for a few hundred weeks of data requires several hundred Mb of memory.

To illustrate the storage scheme for the partition function, we consider a simple lottery in which three numbers are chosen out of {1, 2, 3, 4, 5}. Prizes are awarded for those tickets matching 2 or 3 numbers. Let the winning numbers drawn be {1,2,3}, {1,3,4}, and {1,4,5}. In this case ticket {1,2,3}, for example, won a 3-match in week 1 and a 2-match in week 2. Its contribution to the partition function (5) is $\exp(\lambda_3^1 + \lambda_2^2)$. For convenience, each Lagrange multiplier is labelled by a single index: $\lambda_m = \lambda_j^n$, where $m = (j-2)W + n$, with $j = 2, 3$ and $n = 1, \ldots, W = 3$. The lookup table consists of a counted array of the prizes each ticket has won, labelled by $m$. The first element for each ticket is the number of prizes the ticket has won, followed by a list of the labels $m$. For our example ticket, the entries would thus be 2 (the number of prizes won), 4 (3-match in week 1), and 2 (2-match in week 2). To reduce the storage further, it is possible to combine the first two table entries for each ticket into a single number by shifting one of the numbers to the left and adding. This has the advantage that the size of the lookup table is reduced and it grows by a fixed amount as more weeks of data are used.

Each processor evaluates the partition function over its set of tickets, and the final result is obtained using a global reduction. The Jacobian matrix is filled in an analogous manner. In Figure 1 we show the processing time for 73 weeks of data. The efficiency on the 16 node SP system is just over 90%. Use of the lookup table is memory intensive: indeed the single node performance of the code is limited by the available memory bandwidth. The SP thin2 nodes have twice the memory bandwidth of the thin1 nodes (and a slightly larger cache) and perform nearly twice as fast.
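The storage scheme for the simple lottery can be sketched as follows. This is a serial illustration, not the authors' implementation. The table is kept as one short list per ticket rather than one flat array. The split of tickets into chunks stands in for the per-processor sums, and the final `sum` stands in for the global reduction. The multiplier values and all names are assumptions chosen for illustration.

```python
from itertools import combinations
from math import exp

draws = [(1, 2, 3), (1, 3, 4), (1, 4, 5)]   # winning sets from the example
W = len(draws)
tickets = list(combinations(range(1, 6), 3))

def table_entry(ticket):
    """Counted array for one ticket: number of prizes, then labels m = (j - 2) * W + n."""
    labels = []
    for n, winning in enumerate(draws, start=1):
        j = len(set(ticket) & set(winning))
        if j >= 2:                      # prizes only for 2 or 3 matches
            labels.append((j - 2) * W + n)
    return [len(labels)] + labels

table = [table_entry(t) for t in tickets]

def partial_sum(entries, lam):
    """Contribution to Z from one processor's share of the tickets."""
    total = 0.0
    for entry in entries:
        count = entry[0]
        labels = entry[1:1 + count]
        total += exp(sum(lam[m] for m in labels))
    return total

def partition_function(table, lam, processors):
    """Split the table into chunks, sum each chunk, then reduce the partial sums."""
    chunk = -(-len(table) // processors)            # ceiling division
    partials = [partial_sum(table[i:i + chunk], lam)
                for i in range(0, len(table), chunk)]
    return sum(partials)

# Arbitrary multipliers lambda_1 .. lambda_6 (index 0 is unused padding).
lam = [0.0, 0.1, -0.2, 0.3, 0.5, -0.4, 0.2]
z_serial = partition_function(table, lam, processors=1)
z_split = partition_function(table, lam, processors=4)
print(table[tickets.index((1, 2, 3))])
print(z_serial, z_split)
```

Each ticket's exponent is found by reading a handful of labels instead of comparing the ticket against every draw, which is the saving the lookup table is designed to give.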
The interconnection network between processors on our commodity cluster of DEC Alpha workstations is 100 Mbit switched ethernet. At the time of writing (Feb 1998), the 0.92 Beta release MPI implementation on NT 4.0 [10] which we are using limits the efficiency for running parallel jobs: it is intended for use on shared memory systems. It achieves a bandwidth of 56 kb s⁻¹, compared with a file transfer bandwidth of 4-7 Mb s⁻¹. We supplied our own Fortran bindings for this [5]. Whilst our results should not be interpreted as a benchmark of the performance of MPI on NT, we are encouraged by the speedup (15%) obtained on two nodes.

Figure 1: Total processing time (seconds) against number of processors (1 to 16) for 73 weeks of data on various machines (IBM SP Thin1, IBM SP Thin2, and a 500 MHz DEC Alpha cluster), with parallel construction of the Jacobian matrix and table lookup to calculate the partition function.

Whilst a speedup of 14.5 is derived from using a 16 node machine by a simple parallelisation of the algorithm, it is worth noting that implementation of the lookup table made the code run 17 times faster! Good distributed parallel algorithms should exploit not only the processing power, but also the aggregate memory available on several processors.

We used 6 thin2 SP nodes to compute the Lagrange multipliers for $W \geq 100$ and used task farming to our 8 node DEC cluster for $W < 100$. All other calculations, which did not require significant memory resources, were performed using the commodity cluster.

4 Results

In the UK, a number of organisations buy a large number of tickets each week and distribute them for advertising purposes. We have considered a syndicate which buys 75000 tickets each week. If such a syndicate bought tickets at random, then the expected prizes in the various categories would be the average £10 (fixed), £62, £1500, £100 000, and £2 000 000 for matching 3, 4, 5, 5 plus bonus, and 6 numbers respectively.

We have considered a syndicate which buys, in the next draw, the least popular 75000 tickets which our Maximum Entropy technique can identify using the previous W weeks of data. We recompute the prizes in the next draw as if our syndicate had bought these tickets, taking into account the effect of rollovers and super draws (where the jackpot is topped up) using the published prize structure [6]. In Table 1 we compare what such a syndicate would have won using the real lottery data with the average prizes won over the first 224 weeks of the lottery. In all cases where picking unpopular tickets can increase the prize won, the prizes won are increased by between 36% and 101%.

| | Jackpot | 5 + bonus | 5 match | 4 match | 3 match |
|---|---|---|---|---|---|
| Maximum Entropy Syndicate Average Prize (£) | 3 307 196 | 210 821 | 2690 | 87 | 10 |
| Observed Prize (£) | 1 946 366 | 104 881 | 1574 | 64 | 10 |
| % Increase | 70 | 101 | 71 | 36 | (Fixed) |

Table 1: Average prizes won by syndicate compared with average prizes observed from the lottery data.

In Figure 2 we show the syndicate's average return on their total investment as the draws progress. The peaks in the graph occur when the syndicate won 5 + bonus prizes (draws 73, 76, 105, 109, 133, 222) or a jackpot prize (draw 133). In total the syndicate spent £15.3 million and won back £10.3 million. This compares with the theoretical £6.9 million expected from buying tickets at random. Our syndicate would have won 50% more money using our Maximum Entropy technique. It is important to note that the chances of winning are unaffected: the additional winnings are only due to picking unpopular sets of winning numbers.

Figure 2: Average return per ticket (pence) for a syndicate buying 75000 tickets per week as a function of draw number (W), shown against the theoretical return from an average ticket.
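The percentages in Table 1 and the quoted returns follow from simple arithmetic on the published figures, as this short check shows. It uses only numbers stated in the text. The variable names are illustrative.

```python
# Average prizes in pounds, taken from Table 1.
syndicate = {"Jackpot": 3307196, "5 + bonus": 210821, "5 match": 2690, "4 match": 87}
observed = {"Jackpot": 1946366, "5 + bonus": 104881, "5 match": 1574, "4 match": 64}

# Percentage increase of the syndicate's average prize over the observed average.
increase = {k: round(100 * (syndicate[k] / observed[k] - 1)) for k in syndicate}

spent = 15.3e6           # pounds spent by the syndicate
won = 10.3e6             # pounds won back
prize_fraction = 0.45    # 45 pence of every pound goes to the prize fund

pence_per_pound = round(100 * won / spent)
expected_random = prize_fraction * spent
gain_percent = round(100 * (won / expected_random - 1))

print(increase)
print(pence_per_pound, round(expected_random / 1e6, 1), gain_percent)
```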

5 Conclusions

We have applied the Maximum Entropy method to estimate the probability of each ticket being bought in the UK National Lottery using the fraction of winners in the 3, 4, and 5 match categories. The resulting system of non-linear equations was solved using an efficient parallel algorithm on a distributed memory IBM SP and on a commodity cluster of DEC Alpha workstations. We consider a syndicate which buys, in the next draw, the least popular 75000 tickets that we can identify using data from the previous weeks. We find that the average prize in the 4, 5, 5 + bonus match and jackpot categories is increased by at least 36%. The overall return is increased by 50% from 45 pence in the pound (buying randomly) to 67 pence. In the future we intend to perform the same calculations for a number of other lotteries.

6 Acknowledgements

We would like to thank Angeli Thomas, Keith Lloyd and John Haigh for useful discussions. We appreciate the efforts of Richard Lloyd in carefully collating the data and placing it on the internet.

7 References

[1] HAIGH, J., 1995. Inferring Gamblers' Choice of Combinations in the National Lottery. IMA Bulletin, 31, pp. 132-136.

[2] HAIGH, J., 1997. The Statistics of the National Lottery. J. R. Statist. Soc. A, 160, Part 2, pp. 187-206.

[3] MOORE, P.G., 1997. The Development of the UK National Lottery: 1992-96. J. R. Statist. Soc. A, 160, Part 2, pp. 169-185.

[4] COX, S.J., DANIELL, G.J., and NICOLE, D.A., 1997. Using Maximum Entropy to Double One's Expected Winnings in the UK National Lottery. Submitted to J. R. Statist. Soc. D.

[5] COX, S.J., NICOLE, D.A., and TAKEDA, K.J., 1998. Commodity High Performance Computing at Commodity Prices. To appear in WoTUG-21, Proceedings of the 21st World occam and Transputer User Group Technical Meeting.

[6] Richard Lloyd. Currently: http://lottery.merseyworld.com/

[7] JAYNES, E.T., 1983. Papers on Probability, Statistics and Statistical Physics (ed. R.D. Rosenkrantz). Dordrecht: Reidel. ISBN 9027714487.

[8] JAYNES, E.T., 1959. Probability Theory in Engineering and Science, pp. 110-151. USA: Socony Mobil Oil Company.

[9] PRESS, W.H., TEUKOLSKY, S.A., VETTERLING, W.T. and FLANNERY, B.P., 1992. Numerical Recipes in FORTRAN 77, 2nd edition. Cambridge: Cambridge University Press. ISBN 052143064X.

[10] Anthony Skjellum, Boris Protopopov, Shane Hebert, Peter J. Brennan, and Walter Seefeld, 1997. MPI on Windows NT. We used the 0.92 Beta release. Currently available at: http://www.erc.msstate.edu/mpi/mpint.html