Statistical techniques for betting on football markets
Stuart Coles
Padova, 11 June 2015
Smartodds
Formed in 2003. Originally a betting company, now a company that provides predictions and support tools to gamblers and bookmakers. Originally restricted to football, now includes modelling of tennis, baseball, American football, ice hockey, basketball and cricket. Originally employed 3 people, now employs more than 100 people on full-time contracts, plus other staff on an occasional basis.
Departments include: quant team (statistics), IT, software development, client support, analysts, watcher operations, management.
Close links with clients (traders), who use the information and tools we provide to help them decide when to make bets.
Smartodds
Roles of the different teams:
Quant team: model development and maintenance; provision of probabilities of match outcomes of different types.
IT: maintenance of the hardware structure in the office (computer and satellite systems, etc.).
Software development: development of interfaces and tools that let clients use the information provided by the quant team and analysts as simply as possible, and that support the mechanism of finding and placing efficient bets.
Analysts: provide reviews and previews for every game.
Watcher operations: manage the process of assigning contract staff to watch all major games and provide relevant feedback.
Football betting markets
These days you can bet on pretty much anything (numbers of shots, corners, red cards, throw-ins, etc.), but there are two main markets for which there is usually large liquidity:
1. Asian handicap
2. Total goals
Asian Handicap
In an Asian Handicap bet, an advantage (the handicap) is given to one of the 2 teams, usually in such a way as to make a bet on either team closer to 50/50.
Total goals is also a handicap market: a threshold is fixed and you can bet on there being either more or fewer goals than the threshold (so-called "overs" and "unders" bets).
In either case, if the result is a draw once the handicap is taken into account, the bet is cancelled and the stake is returned.
Typically bets will be available at different handicaps, but there is usually greater liquidity on handicaps that make bets closer to evens (50/50).
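The settlement rule above can be sketched in a few lines of Python. This is only an illustration: the function name `settle_asian_handicap` and the example prices are hypothetical, and quarter lines such as 0.25 (which the slides only touch on in the closing example) are handled here by the usual market convention of splitting the stake equally between the two adjacent lines.

```python
from fractions import Fraction


def settle_asian_handicap(goals_for, goals_against, handicap, price, stake=1.0):
    """Profit on a bet on one team, with `handicap` goals added to that team.

    `price` is a decimal price (stake included). Hypothetical helper, not
    taken from the slides.
    """
    h = Fraction(str(handicap))  # exact arithmetic for lines like 0.25
    # Quarter lines: split the stake between the two neighbouring lines.
    if (h * 4) % 2 == 1:
        half = stake / 2
        lower = settle_asian_handicap(goals_for, goals_against, h - Fraction(1, 4), price, half)
        upper = settle_asian_handicap(goals_for, goals_against, h + Fraction(1, 4), price, half)
        return lower + upper
    margin = goals_for - goals_against + h
    if margin > 0:
        return stake * (price - 1)  # bet wins
    if margin == 0:
        return 0.0                  # draw after handicap: stake returned
    return -stake                   # bet loses


# Barcelona -0.5 at 1.57, final score Juventus 1 Barcelona 3: the bet wins.
profit = settle_asian_handicap(3, 1, -0.5, 1.57)
```

A total goals bet can be settled the same way by replacing the handicapped margin with the total number of goals minus the threshold.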
Market examples
Asian handicap price examples, Juventus v Barcelona, CL final 2015.
Handicap (applied to Juventus)   Juventus   Barcelona
-0.5                             7.20       1.10
 0                               4.00       1.23
 0.5                             2.35       1.57
 1                               1.80       2.10
 1.5                             1.45       2.77
 2                               1.20       4.40
Market examples
Total goals price examples, Juventus v Barcelona, CL final 2015.
Handicap   Over   Under
1.5        1.29   3.50
2          1.49   2.77
2.5        1.95   1.85
3          2.76   1.49
3.5        3.40   1.30
Betting regimes
Note also that there are two different betting regimes:
1. Before kick-off ("deadball")
2. During the match ("in running")
In-running betting is more challenging, both in terms of statistics and logistics. It needs a model that constantly adapts with time according to the current match state, as well as a method of monitoring market price changes throughout the game.
We'll focus in this talk on deadball betting.
Model requirements
1. The model must be able to provide probabilities not just of who will win, but of whether a team will win by 1 goal, 2 goals, etc., so that the probability of a win at different handicap thresholds can be calculated.
2. Similarly, the model must provide information on the complete score, not just the goal difference, so that probabilities of total goals can be calculated.
Betting strategy
Standard decision theory: make a bet if expected winnings are positive.
There are two cases, depending on whether a null result (a draw after the handicap is applied) is possible. This is easiest to see with an example.
Betting strategy: example
Suppose our model says P(Barca beat Juve) = 0.6, P(draw) = 0.2, P(Juve beat Barca) = 0.2.
If we bet on Barca with a handicap of -0.5: $E(\text{profit}) = 1.57 \times 0.6 - 1 = -0.058$
Alternative bet on Juve with a handicap of +0.5: $E(\text{profit}) = 2.35 \times 0.4 - 1 = -0.06$
So both bets have negative expected value. The model is too similar to the market prices.
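Betting strategy
Suppose our model says P(Barca beat Juve) = 0.6, P(draw) = 0.2, P(Juve beat Barca) = 0.2.
With a handicap of 0, a drawn match returns the stake, which contributes the term $1 \times 0.2$.
If we bet on Barca with a handicap of 0: $E(\text{profit}) = 1.23 \times 0.6 + 1 \times 0.2 - 1 = -0.062$
If we bet on Juve with a handicap of 0: $E(\text{profit}) = 4 \times 0.2 + 1 \times 0.2 - 1 = 0$
Still no value on either side.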
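Both cases of the worked example reduce to one formula: price × P(win) + P(null result) − 1 per unit stake. A minimal sketch, with `expected_profit` as a hypothetical helper name, reproduces the four numbers on the slides:

```python
def expected_profit(price, p_win, p_push=0.0):
    """Expected profit per unit stake at a decimal `price`.

    A win returns price - 1, a null result (push) returns 0, a loss costs 1.
    Hypothetical helper for illustration only.
    """
    p_lose = 1.0 - p_win - p_push
    return p_win * (price - 1.0) - p_lose


# Model probabilities from the example: Barca win 0.6, draw 0.2, Juve win 0.2.
barca_minus_half = expected_profit(1.57, 0.6)       # about -0.058
juve_plus_half = expected_profit(2.35, 0.4)         # about -0.06
barca_level = expected_profit(1.23, 0.6, 0.2)       # about -0.062
juve_level = expected_profit(4.00, 0.2, 0.2)        # about 0
```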
Betting strategy
Comments:
1. Bookmaker odds always contain an inbuilt edge: writing $q_E$ and $q_{E^c}$ for the prices offered on an event $E$ and on its complement $E^c$, we have $1/q_E + 1/q_{E^c} > 1$, which means that with a reasonable distribution of bets, the bookmaker will make a profit whatever the result of the match.
2. Since we know our model estimates are not perfect, but are subject to sampling error etc., it's usual to make a bet only if the expected profit exceeds some specified threshold, say 5% of unit stake.
3. There's an interesting theory about how to choose the proportion of your available funds to stake on any particular bet: see "Kelly criterion" on Wikipedia, for example.
4. Note the difference with gambling in a casino: all casino bets have negative expected value. You might win in a casino, but you have to be lucky. If our models are correct, we are bound to win in the long run!
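Comments 1 to 3 can be illustrated with a short sketch. The function names are hypothetical, the 5% threshold is the one quoted above, and the Kelly formula used is the standard one for a two-outcome bet with no null result (stake fraction = (p × price − 1) / (price − 1)); none of this is claimed to be the procedure actually used by the traders.

```python
def overround(price_a, price_b):
    """Sum of inverse prices on an event and its complement (> 1 means a bookmaker edge)."""
    return 1.0 / price_a + 1.0 / price_b


def should_bet(price, p_win, threshold=0.05):
    """Bet only if expected profit per unit stake exceeds the threshold."""
    return price * p_win - 1.0 > threshold


def kelly_fraction(price, p_win):
    """Kelly stake as a fraction of bankroll; 0 when there is no positive edge."""
    edge = price * p_win - 1.0
    return max(edge / (price - 1.0), 0.0)


# Juve +0.5 at 2.35 against Barca -0.5 at 1.57 from the price table.
book = overround(2.35, 1.57)          # roughly 1.06
bet = should_bet(1.57, 0.6)           # False: expected profit is negative
stake = kelly_fraction(2.0, 0.6)      # hypothetical price of 2.0 and p = 0.6
```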
Models for football
Fundamentally, models for football are generally based around a Poisson distribution for the number of goals scored by either team. This choice is motivated partly by theoretical argument, partly by pragmatism.
Models for football: theory
Assume that goals are independently occurring events in time, with a (possibly changing) rate throughout the game. This defines a Poisson process. Consequently, the number of events (goals) in a fixed amount of time (e.g. 90 minutes) follows a Poisson distribution.
Note: strictly, these assumptions are wrong. Generally, once a goal is scored, the scoring rates of both teams change. This violates the Poisson assumption.
Models for football: pragmatism
The Poisson distribution has the correct support for a random variable that represents goal counts, and is also an easy family for building regression models, through GLMs etc. Simple empirical analyses suggest the Poisson distribution to be a reasonable approximation.
Models for football
So, if team $i$ plays team $j$, we might first assume that the home and away goals satisfy
$$X^{(h)}_{i,j} \sim \text{Poisson}(\mu_h), \qquad X^{(a)}_{i,j} \sim \text{Poisson}(\mu_a),$$
with $X^{(h)}_{i,j}$ and $X^{(a)}_{i,j}$ independent.
As with our dice tournament model, the issue then is how to build structure into the model via the parametrization of $\mu_h$ and $\mu_a$. Different teams have different strengths. Mean goals are likely to depend on the strength in attack and defence of the respective teams. Means should be restricted to the positive real line.
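To see how the independent Poisson assumption delivers what the handicap and total goals markets need, the sketch below builds the full score probability matrix and sums it over the relevant scores. The means 1.2 and 1.6, the truncation at 15 goals and all function names are illustrative assumptions, not values from the talk.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k goals when the mean is mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def score_matrix(mu_h, mu_a, max_goals=15):
    """matrix[x][y] = P(home scores x, away scores y) under independence."""
    home = [poisson_pmf(x, mu_h) for x in range(max_goals + 1)]
    away = [poisson_pmf(y, mu_a) for y in range(max_goals + 1)]
    return [[ph * pa for pa in away] for ph in home]


def split_probs(matrix, value_of, line):
    """Return (P(value > line), P(value == line), P(value < line))."""
    above = level = below = 0.0
    for x, row in enumerate(matrix):
        for y, p in enumerate(row):
            v = value_of(x, y)
            if v > line:
                above += p
            elif v == line:
                level += p
            else:
                below += p
    return above, level, below


def home_handicap_probs(matrix, handicap):
    """(win, null, lose) for a bet on the home team with `handicap` goals added."""
    return split_probs(matrix, lambda x, y: x - y, -handicap)


def total_goals_probs(matrix, threshold):
    """(over, null, under) for a total goals bet at `threshold`."""
    return split_probs(matrix, lambda x, y: x + y, threshold)


m = score_matrix(1.2, 1.6)              # hypothetical home and away means
win, null, lose = home_handicap_probs(m, 0)
over, _, under = total_goals_probs(m, 2.5)
```

Half-goal lines never produce a null result, which is why the second component is zero for thresholds such as 2.5.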
Modelling the means
Assume parameters $(\alpha_i, \beta_i)$, $i = 1, \ldots, N$, that measure the strength in attack and defence respectively of each team $i$. Assume:
$$\log \mu_h = \nu + \gamma + \alpha_i + \beta_j$$
$$\log \mu_a = \nu + \alpha_j + \beta_i$$
which includes a global mean parameter $\nu$ and a home advantage parameter $\gamma$.
But again the model is over-parametrised, this time by 2 degrees of freedom. So set, for example, $\sum_i \alpha_i = \sum_i \beta_i = 0$.
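The log-linear structure and the over-parametrisation can be checked numerically. In the sketch below all parameter values are made up, the helper names are hypothetical, and a sum-to-zero constraint on the attack and defence parameters is assumed as the identifiability constraint. Shifting every attack (or every defence) parameter by a constant and subtracting the same constant from ν leaves both means unchanged, which accounts for the 2 redundant degrees of freedom.

```python
import math


def centre(values):
    """Impose a sum-to-zero constraint by subtracting the mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def expected_goals(nu, gamma, alpha, beta, i, j):
    """Means (mu_h, mu_a) when team i hosts team j under the log-linear model."""
    mu_h = math.exp(nu + gamma + alpha[i] + beta[j])
    mu_a = math.exp(nu + alpha[j] + beta[i])
    return mu_h, mu_a


# Made-up parameters for three teams.
alpha = centre([0.30, -0.10, 0.10])   # attack
beta = centre([-0.20, 0.05, 0.00])    # defence (larger means more goals conceded)
mu_h, mu_a = expected_goals(nu=0.1, gamma=0.3, alpha=alpha, beta=beta, i=0, j=1)
```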
Basic model failures
Apart from slight concerns that the theoretical justification for the Poisson model might not be completely reasonable, there are two other substantial limitations of the Poisson model:
1. Dependence: empirical studies suggest there is a slight dependence between the number of goals scored by the home and away teams. In particular, the scores 0-0 and 1-1 are more common than the independent Poisson model predicts.
2. Temporal variation: team performances do not remain static from season to season, or even within a season. Even the global mean and home advantage may change through time.
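Dependence
One possible model for handling dependence:
$$\Pr\left(X^{(h)}_{i,j} = x_h, X^{(a)}_{i,j} = x_a\right) = \tau(x_h, x_a)\, \frac{\exp(-\mu_h)\,\mu_h^{x_h}}{x_h!}\, \frac{\exp(-\mu_a)\,\mu_a^{x_a}}{x_a!}$$
where
$$\tau(x_h, x_a) = \begin{cases} 1 - \rho\mu_h\mu_a & \text{if } x_h = x_a = 0 \\ 1 + \rho\mu_h & \text{if } (x_h, x_a) = (0, 1) \\ 1 + \rho\mu_a & \text{if } (x_h, x_a) = (1, 0) \\ 1 - \rho & \text{if } (x_h, x_a) = (1, 1) \\ 1 & \text{otherwise} \end{cases}$$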
Dependence
If $\rho = 0$, this model is the standard independent Poisson model. For other choices of $\rho$, which inflate/deflate the scores (0, 0), (1, 0), (0, 1) and (1, 1), there is a correlation between the home and away scores.
Surprisingly, the marginal distributions of $X^{(h)}_{i,j}$ and $X^{(a)}_{i,j}$ remain Poisson with means $\mu_h$ and $\mu_a$ respectively. (Exercise: check this.)
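The exercise can be checked numerically before attempting it algebraically. The sketch below implements the τ adjustment with minus signs in the (0, 0) and (1, 1) cases (an assumption about the slide's notation, supported by the fact that the marginals then come out exactly Poisson) and sums the joint probabilities over the other team's score. The parameter values and function names are illustrative only.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k goals when the mean is mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def tau(x_h, x_a, mu_h, mu_a, rho):
    """Low-score adjustment factor; equals 1 everywhere when rho = 0."""
    if (x_h, x_a) == (0, 0):
        return 1.0 - rho * mu_h * mu_a
    if (x_h, x_a) == (0, 1):
        return 1.0 + rho * mu_h
    if (x_h, x_a) == (1, 0):
        return 1.0 + rho * mu_a
    if (x_h, x_a) == (1, 1):
        return 1.0 - rho
    return 1.0


def joint_pmf(x_h, x_a, mu_h, mu_a, rho):
    """Joint probability of the score (x_h, x_a) under the dependent model."""
    return tau(x_h, x_a, mu_h, mu_a, rho) * poisson_pmf(x_h, mu_h) * poisson_pmf(x_a, mu_a)


def home_marginal(x_h, mu_h, mu_a, rho, max_goals=40):
    """P(home scores x_h), summing the joint pmf over away scores (truncated)."""
    return sum(joint_pmf(x_h, y, mu_h, mu_a, rho) for y in range(max_goals + 1))


def away_marginal(x_a, mu_h, mu_a, rho, max_goals=40):
    """P(away scores x_a), summing the joint pmf over home scores (truncated)."""
    return sum(joint_pmf(x, x_a, mu_h, mu_a, rho) for x in range(max_goals + 1))


# Hypothetical values: a negative rho inflates 0-0 and 1-1.
mu_h, mu_a, rho = 1.4, 1.1, -0.1
gap = max(abs(home_marginal(k, mu_h, mu_a, rho) - poisson_pmf(k, mu_h)) for k in range(6))
```

The algebraic reason is that the adjustments in each row and column cancel, because $\mu_a \Pr(X^{(a)} = 0) = \Pr(X^{(a)} = 1)$ for a Poisson variable, and likewise for the home team.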
Time variation
Obviously, more recent games provide a more reliable guide to current team performance than older games. So we need an inference method which lets parameters be time dependent, and which gives greater inferential weight to the most recent games. Two possible solutions:
1. Best: a fully dynamic model.
2. Easiest: a local likelihood model.
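Time variation: dynamic model
Allow for a time-evolving parameter vector
$$\theta_t = (\nu, \gamma, \alpha_1, \ldots, \alpha_N, \beta_1, \ldots, \beta_N, \tau)_t,$$
such that
$$\left(X^{(h)}_{i,j}, X^{(a)}_{i,j}\right)_t \sim \text{Poisson}_{\text{dep}}(\mu_{h,t}, \mu_{a,t}, \tau_t)$$
as defined previously, and
$$\theta_t = \theta_{t-1} + \epsilon_t$$
where, say, $\epsilon_t \sim N(0, \Sigma)$.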
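The evolution equation is a Gaussian random walk on the parameter vector. The following sketch simulates such a path under the simplifying assumption that Σ is diagonal with a common standard deviation `sigma`; the function name, starting values and step size are all hypothetical, and fitting such a model to data (the hard part) is not attempted here.

```python
import random


def simulate_random_walk(theta0, sigma, n_steps, seed=0):
    """Simulate theta_t = theta_{t-1} + eps_t with independent N(0, sigma^2) steps.

    Assumes a diagonal covariance with equal variances, purely for illustration.
    """
    rng = random.Random(seed)  # seeded so the path is reproducible
    path = [list(theta0)]
    for _ in range(n_steps):
        previous = path[-1]
        path.append([value + rng.gauss(0.0, sigma) for value in previous])
    return path


# Made-up starting values for (nu, gamma, one attack, one defence parameter).
path = simulate_random_walk([0.1, 0.3, 0.2, -0.1], sigma=0.02, n_steps=50)
```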
Time variation: local likelihood
Fit the model at time $t$ via the weighted likelihood function
$$L(\theta_t) = \prod_{t=1}^{N} f(x_{h,t}, x_{a,t}; \theta_t)^{\phi(t)}$$
where $\phi(t)$ is a decreasing function with history, so that older matches carry less weight.
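On the log scale the weighted likelihood becomes a weighted sum of per-match log-probabilities. The sketch below assumes an exponentially decaying weight, φ = exp(−ξ × age), which is one common choice but is not specified on the slide, and uses independent Poisson scores for f to keep it short. The decay rate `xi`, the function names and the match data are all hypothetical.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson probability of k goals with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def exp_weight(age, xi):
    """Assumed weight function: decays exponentially with the age of the match."""
    return math.exp(-xi * age)


def weighted_loglik(matches, t_now, xi):
    """Weighted log-likelihood: sum of weight * log f over past matches.

    Each match is (t, x_h, x_a, mu_h, mu_a), with the means already computed
    from the candidate parameter vector.
    """
    total = 0.0
    for t, x_h, x_a, mu_h, mu_a in matches:
        log_f = poisson_logpmf(x_h, mu_h) + poisson_logpmf(x_a, mu_a)
        total += exp_weight(t_now - t, xi) * log_f
    return total


# Two made-up matches, played at day 0 and day 100, evaluated at day 100.
matches = [(0, 1, 0, 1.5, 1.0), (100, 2, 2, 1.5, 1.0)]
value = weighted_loglik(matches, t_now=100, xi=0.01)
```

In practice this quantity would be maximised over the parameters at each time of interest; setting `xi = 0` recovers the ordinary unweighted log-likelihood.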
Time variation: attack parameter estimates
[Figure: two time-series panels, "Attack for Juventus 2010-2015" and "Attack for AC Milan 2010-2015". x-axis: Date (2010 to 2015); y-axis: Attack.]
Time variation: defence parameter estimates
[Figure: two time-series panels, "Defence for Juventus 2010-2015" and "Defence for AC Milan 2010-2015". x-axis: Date (2010 to 2015); y-axis: Defence.]
Other issues
1. Population choice.
2. Other information from previous games: shots, corners, free kicks, etc.
3. Seasonality (especially in total goals).
4. Red cards.
5. Motivation.
6. Referees.
7. Weather.
8. Team information.
Finally...
On the basis of the information we provided, our clients chose not to bet on the CL final prior to the match.
But they placed a bet on Juventus at 75 minutes, with a handicap of +0.25, when the score was 1-2.
If the score had remained 1-2, they would have (half) won the bet. If Juventus had equalised, they would have fully won the bet.
In the 7th minute of injury time, Barcelona scored to make it 1-3 and the bet was lost.
Statistics can only help you to win: it can't undo the role of chance.
Contact stuart.coles@smartodds.co.uk