Example of the Glicko-2 system



Similar documents
Normal distribution. ) 2 /2σ. 2π σ

CS 261 Fall 2011 Solutions to Assignment #4

Financial Mathematics and Simulation MATH Spring 2011 Homework 2

Roots of equation fx are the values of x which satisfy the above expression. Also referred to as the zeros of an equation

I. Pointwise convergence

Operations and Supply Chain Management Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology Madras

Standard Deviation Estimator

5. Continuous Random Variables

16. THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

Department of Civil Engineering-I.I.T. Delhi CEL 899: Environmental Risk Assessment Statistics and Probability Example Part 1

Math 425 (Fall 08) Solutions Midterm 2 November 6, 2008

1. Then f has a relative maximum at x = c if f(c) f(x) for all values of x in some

Confidence Intervals for the Difference Between Two Means

Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan , Fall 2010

6.4 Normal Distribution

3.4 Statistical inference for 2 populations based on two samples

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

1 Sufficient statistics

The USCF Rating System

1 Error in Euler s Method

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

An Introduction to Basic Statistics and Probability

Chapter 5: Normal Probability Distributions - Solutions

THE KRUSKAL WALLLIS TEST

Intro to Simulation (using Excel)

Chapter 3 RANDOM VARIATE GENERATION

Descriptive Statistics

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Notes on Continuous Random Variables

Profit Forecast Model Using Monte Carlo Simulation in Excel

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Practice problems for Homework 11 - Point Estimation

The Normal Distribution. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University

SOME ASPECTS OF GAMBLING WITH THE KELLY CRITERION. School of Mathematical Sciences. Monash University, Clayton, Victoria, Australia 3168

The Normal distribution

CALCULATIONS & STATISTICS

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

5.1 Derivatives and Graphs

3.2 Measures of Spread

Continuous Random Variables

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

The Normal Distribution

x a x 2 (1 + x 2 ) n.

HYPOTHESIS TESTING: POWER OF THE TEST

The continuous and discrete Fourier transforms

AC : MATHEMATICAL MODELING AND SIMULATION US- ING LABVIEW AND LABVIEW MATHSCRIPT

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Traffic Behavior Analysis with Poisson Sampling on High-speed Network 1

TOPIC 4: DERIVATIVES

Decision & Risk Analysis Lecture 6. Risk and Utility

Chapter 4 Lecture Notes

Math Quizzes Winter 2009

9.07 Introduction to Statistical Methods Homework 4. Name:

2WB05 Simulation Lecture 8: Generating random variables

4. Continuous Random Variables, the Pareto and Normal Distributions

Monte Carlo-based statistical methods (MASM11/FMS091)

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

36 CHAPTER 1. LIMITS AND CONTINUITY. Figure 1.17: At which points is f not continuous?

LIMITS AND CONTINUITY

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Nonlinear Algebraic Equations. Lectures INF2320 p. 1/88

Two Correlated Proportions (McNemar Test)

One pile, two pile, three piles

STAT 200 QUIZ 2 Solutions Section 6380 Fall 2013

Dongfeng Li. Autumn 2010

LogNormal stock-price models in Exams MFE/3 and C/4

Confidence Intervals for Cp

Determining distribution parameters from quantiles

Chapter 4: Nominal and Effective Interest Rates

Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page

Stats on the TI 83 and TI 84 Calculator

2.6. Probability. In general the probability density of a random variable satisfies two conditions:

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method

Bayes Theorem. Bayes Theorem- Example. Evaluation of Medical Screening Procedure. Evaluation of Medical Screening Procedure

Measures of Central Tendency and Variability: Summarizing your Data for Others

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, cm

5.1 Identifying the Target Parameter

$ ( $1) = 40

פרויקט מסכם לתואר בוגר במדעים )B.Sc( במתמטיקה שימושית

Betting with the Kelly Criterion

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

Math 104A - Homework 2

Lecture 13: Martingales

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

PRICING OF GAS SWING OPTIONS. Andrea Pokorná. UNIVERZITA KARLOVA V PRAZE Fakulta sociálních věd Institut ekonomických studií

Chapter 13 The Black-Scholes-Merton Model

Statistical Process Control (SPC) Training Guide

Week 4: Standard Error and Confidence Intervals

Hypothesis Testing for Beginners

Comparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function

=

Math 151. Rumbos Spring Solutions to Assignment #22

Probability Distributions

3 Some Integer Functions

Transcription:

Example of the Glicko-2 system Professor Mark E. Glickman Boston University November 30, 203 Every player in the Glicko-2 system has a rating, r, a rating deviation, RD, and a rating volatility σ. The volatility measure indicates the degree of expected fluctuation in a player s rating. The volatility measure is high when a player has erratic performances (e.g., when the player has had exceptionally strong results after a period of stability), and the volatility measure is low when the player performs at a consistent level. As with the original Glicko system, it is usually informative to summarize a player s strength in the form of an interval (rather than merely report a rating). One way to do this is to report a 95% confidence interval. The lowest value in the interval is the player s rating minus twice the RD, and the highest value is the player s rating plus twice the RD. So, for example, if a player s rating is 850 and the RD is 50, the interval would go from 750 to 950. We would then say that we re 95% confident that the player s actual strength is between 750 and 950. When a player has a low RD, the interval would be narrow, so that we would be 95% confident about a player s strength being in a small interval of values. The volatility measure does not appear in the calculation of this interval. The formulas: To apply the rating algorithm, we treat a collection of games within a rating period to have occurred simultaneously. Players would have ratings, RD s, and volatilities at the beginning of the rating period, game outcomes would be observed, and then updated ratings, RD s and volatilities would be computed at the end of the rating period (which would then be used as the pre-period information for the subsequent rating period). The Glicko-2 system works best when the number of games in a rating period is moderate to large, say an average of at least 0-5 games per player in a rating period. The length of time for a rating period is at the discretion of the administrator. The rating scale for Glicko-2 is different from that of the original Glicko system. However, it is easy to go back and forth between the two scales. The following steps assume that ratings are on the original Glicko scale, but the formulas convert to the Glicko-2 scale, and then convert back at the end to Glicko. Step. Determine a rating and RD for each player at the onset of the rating period. The system constant, τ, which constrains the change in volatility over time, needs to be set prior to application of the system. Reasonable choices are between 0.3 and.2, though the system should be tested to decide which value results in greatest predictive accuracy. Smaller values of τ prevent the volatility measures from changing by large

amounts, which in turn prevent enormous changes in ratings based on very improbable results. If the application of Glicko-2 is expected to involve extremely improbable collections of game outcomes, then τ should be set to a small value, even as small as, say, τ = 0.2. (a) If the player is unrated, set the rating to 500 and the RD to 350. Set the player s volatility to 0.06 (this value depends on the particular application). (b) Otherwise, use the player s most recent rating, RD, and volatility σ. Step 2. For each player, convert the ratings and RD s onto the Glicko-2 scale: µ = (r 500)/73.778 φ = RD/73.778 The value of σ, the volatility, does not change. We now want to update the rating of a player with (Glicko-2) rating µ, rating deviation φ, and volatility σ. He plays against m opponents with ratings µ,..., µ m, rating deviations φ,..., φ m. Let s,..., s m be the scores against each opponent (0 for a loss, 0.5 for a draw, and for a win). The opponents volatilities are not relevant in the calculations. Step 3. Compute the quantity v. This is the estimated variance of the team s/player s rating based only on game outcomes. m v = g(φ j ) 2 E(µ, µ j, φ j ){ E(µ, µ j, φ j )} j= where g(φ) = E(µ, µ j, φ j ) = + 3φ 2 /π 2, + exp( g(φ j )(µ µ j )). Step 4. Compute the quantity, the estimated improvement in rating by comparing the pre-period rating to the performance rating based only on game outcomes. with g() and E() defined above. m = v g(φ j ){s j E(µ, µ j, φ j )} j= Step 5. Determine the new value, σ, of the volatility. This computation requires iteration. Note: This iterative procedure has been revised as of February 22, 202, and is now stable. 2

. Let a = ln(σ 2 ), and define f(x) = ex ( 2 φ 2 v e x ) (x a) 2(φ 2 + v + e x ) 2 τ 2 Also, define a convergence tolerance, ε. The value ε = 0.00000 is a sufficiently small choice. 2. Set the initial values of the iterative algorithm. Set A = a = ln(σ 2 ) If 2 > φ 2 + v, then set B = ln( 2 φ 2 v). If 2 φ 2 + v, then perform the following iteration: (i) Let k = (ii) If f(a kτ) < 0, then Set k k + Go to (ii). and set B = a kτ. The values A and B are chosen to bracket ln(σ 2 ), and the remainder of the algorithm iteratively narrows this bracket. 3. Let f A = f(a) and f B = f(b). 4. While B A > ε, carry out the following steps. (a) Let C = A + (A B)f A /(f B f A ), and let f C = f(c). (b) If f C f B < 0, then set A B and f A f B ; otherwise, just set f A f A /2. (c) Set B C and f B f C. (d) Stop if B A ε. Repeat the above three steps otherwise. 5. Once B A ε, set σ e A/2 Some comments: The original procedure, an application of the Newton-Raphson algorithm, turned out to be less stable than I realized. Even though the function to be optimized was reasonably well-behaved (one local maximum that was a global maximum, continuous, differntiable, etc.), the algorithm occasionally did not converge due to a poor starting value. Attempts to improve the choice of a starting value did not result in any systematic method to salvage the algorithm, so I decided to start afresh to produce a consistently stable numerical procedure. The new algorithm is based on the so-called Illinois algorithm, a variant of the regula falsi (false position) procedure. The Illinois algorithm is quite stable, reliable, and converges quickly. The algorithm takes advantage of the knowledge that the desired value of σ can be sandwiched at the start of the algorithm by the initial choices of A and B. Other algorithms also would work well, such as the BFGS algorithm (but this is a bit cumbersome to code), and bisection algorithms (though this can be slow to converge). A few technical comments about the above procedure: 3

When 2 φ 2 + v, a special provision needs to be made to bracket ln(σ 2 ) which requires searching values to the left of a = ln(σ 2 ) in multiples of kτ. Based on my tests, k is almost always, and very rarely 2 or more. The main iteration of the Illinois algorithm to narrow the bracket around ln(σ 2 ) is reasonably quick. From simulation analyses, the median number of iterations to narrow the bracket to a width of less than ε = 0.00000 was 5, with a mean of 5.6, and a maximum of 9 (in 0000 simulations). Step 6. Update the rating deviation to the new pre-rating period value, φ : φ = φ 2 + σ 2 Step 7. Update the rating and RD to the new values, µ and φ : φ = / φ + 2 v µ = m µ + φ 2 g(φ j ){s j E(µ, µ j, φ j )} Step 8. Convert ratings and RD s back to original scale: j= r = 73.778µ + 500 RD = 73.778φ Note that if a player does not compete during the rating period, then only Step 6 applies. In this case, the player s rating and volatility parameters remain the same, but the RD increases according to φ = φ = φ 2 + σ 2. Example calculation: Suppose a player rated 500 competes against players rated 400, 550 and 700, winning the first game and losing the next two. Assume the 500-rated player s rating deviation is 200, and his opponents are 30, 00 and 300, respectively. Assume the 500 player has volatility σ = 0.06, and the system constant τ is 0.5. Converting to the Glicko-2 scale, the player s rating and RD become 0 and.53. For the opponents: 4

We then compute And now j µ j φ j g(φ j ) E(µ, µ j, φ j ) s j 0.5756 0.727 0.9955 0.639 2 0.2878 0.5756 0.953 0.432 0 3.53.7269 0.7242 0.303 0 v = ( [(0.9955) 2 (0.639)( 0.639) +(0.953) 2 (0.432)( 0.432) + (0.7242) 2 (0.303)( 0.303)] ) =.7785 =.7785 (0.9955( 0.639) + 0.953(0 0.432) + 0.7242(0 0.303)) = 0.4834 The starting values in the iterative procedure are A = a = ln(0.06 2 ) = 5.62682, and (with k = ) B = a kτ = 5.62682 (.0)(0.5) = 6.2682, with corresponding function values computed as f A = 0.00053567 and f B =.999675. The following table summarizes the iterations (iteration 0 corresponds to the initial values above): Iteration A B f A f B 0 5.62682 6.2682 0.00053567.999675 5.62682 5.62696 0.00026784 0.00000005238 2 5.62696 5.62696 0.00000005238 0.00000005238 The iterative procedure therefore converges to A = 5.62696, so we set σ = e 5.62696/2 = 0.05999. Now update to the new value of φ : φ =.53 2 + 0.05999 2 =.52862. Next, update to the new values of φ and µ : φ = /.529 + 2.7785 = 0.8722 µ = 0 + (0.8722) 2 [0.9955( 0.639) +0.953(0 0.432) +0.7242(0 0.303)] = 0 + 0.7607( 0.272) = 0.2069 5

Finally, convert back to the Glicko scale: The new volatility σ = 0.05999. r = 0.2069(73.778) + 500 = 464.06 RD = 0.8722(73.778) = 5.52 Note that the resulting rating for this computation does not differ much from the original Glicko computation because the game outcomes do not provide any evidence of inconsistent performance. 6