USING MICROSOFT EXCEL TO MODEL A TENNIS MATCH

Similar documents
Predicting a tennis match in progress for sports multimedia

Combining player statistics to predict outcomes of tennis matches

DEVELOPING A MODEL THAT REFLECTS OUTCOMES OF TENNIS MATCHES

Tennis Winner Prediction based on Time-Series History with Neural Modeling

MATHEMATICAL MODELLING IN HIERARCHICAL GAMES WITH SPECIFIC REFERENCE TO TENNIS

Inferring Tennis Match Progress from In-Play Betting Odds : Project Report

Teaching Mathematics and Statistics Using Tennis

DOUBLES TENNIS LEAGUE

DOUBLES TENNIS LEAGUE *The following rules provided by Purdue Intramural Sports are not meant to be all encompassing.*

The Tennis Formula: How it can be used in Professional Tennis

SINGLES TENNIS RULES

TEAM TENNIS TOURNAMENT *The following rules provided by Purdue Intramural Sports are not meant to be all encompassing.*

Betting on Excel to enliven the teaching of probability

HOW TO BET ON TENNIS. Gambling can be addictive. Please play responsibly.

SECTION A OFFICIAL EVENTS

Instructions for Authors. 1. Scope

Spring Doubles Tennis Rules

Thursday, October 18, 2001 Page: 1 STAT 305. Solutions

Complement. If A is an event, then the complement of A, written A c, means all the possible outcomes that are not in A.

Current California Math Standards Balanced Equations

Inferring the score of a tennis match from in-play betting exchange markets

A SET-BY-SET ANALYSIS METHOD FOR PREDICTING THE OUTCOME OF PROFESSIONAL SINGLES TENNIS MATCHES

Percentage Play in Tennis

Melbourne Sports & Aquatic Centre SOCIAL COMPETITIONS - ALL SEASONS Rules and Regulations 2013

Pool Play and Playoff Formats: 14 Teams

BEE Calculations Content

OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS

Raising Interest in Statistics through Sporting Predictions on the Internet

WORKED EXAMPLES 1 TOTAL PROBABILITY AND BAYES THEOREM

PLAYING RULES & FORMAT

Intro to Simulation (using Excel)

Applying the Kelly criterion to lawsuits

Statistical estimation using confidence intervals

USTA TENNIS RULES CHALLENGE FOR TEACHING PROFESSIONALS AND COACHES *****************************************

Bellevue Club Cup Tennis Guidelines

3.2 Roulette and Markov Chains

SCORESHEET INSTRUCTIONS. RALLY POINT SCORING (RPS 2 out of 3 sets)

Formula One Race Strategy

Microeconomics Sept. 16, 2010 NOTES ON CALCULUS AND UTILITY FUNCTIONS

III. TEAM COMPOSITION

2016 NJSIAA/New Balance WOMENS TENNIS SINGLES/DOUBLES CHAMPIONSHIPS

BEACH VOLLEYBALL SHORT COURT RULES

73rd Annual Tennis Championships. Up to maximum $500 for Open Singles Winner!

Lesson 1. Key Financial Concepts INTRODUCTION

Math Quizzes Winter 2009

Inferring Tennis Match Progress from In-Play Betting Odds

USTA Middle States Sectional Tournaments Points Tables

Math 728 Lesson Plan

PREDICTING THE MATCH OUTCOME IN ONE DAY INTERNATIONAL CRICKET MATCHES, WHILE THE GAME IS IN PROGRESS

Using the Solver add-in in MS Excel 2007

Sample Questions with Explanations for LSAT India

A Little Set Theory (Never Hurt Anybody)

An Automated Test for Telepathy in Connection with s

USA Volleyball Scorer Test A

Parental Occupation Coding

Machine Learning for the Prediction of Professional Tennis Matches

Construction Administrators Work Smart with Excel Programming and Functions. OTEC 2014 Session 78 Robert Henry

The GMAT Guru. Prime Factorization: Theory and Practice

Q&As: Microsoft Excel 2013: Chapter 2

Fundamentals of Probability

1) Course Entry Requirement(s) To gain entry to the Diploma of Higher Education in Information Technology prospective students must have:

Referee Analytics: An Analysis of Penalty Rates by National Hockey League Officials

Who wins the rest of the match European handicap Home t.win or Away t. win after regular time/overtime/penalty shootout Which team will win

Fairfield Public Schools

Teachers should read through the following activity ideas and make their own risk assessment for them before proceeding with them in the classroom.

Paralympic Table Tennis

Effects of expertise on football betting

MULTIPLE REGRESSION WITH CATEGORICAL DATA

SECTION 8: PROMOTIONS FOR LIVE ODDS IN SPORTS COVERAGE...

Scientific Graphing in Excel 2010

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Practice Problems for Homework #8. Markov Chains. Read Sections Solve the practice problems below.

FINANCIAL ANALYSIS GUIDE

Measuring Evaluation Results with Microsoft Excel

ENGLISH SCHOOLS' TABLE TENNIS ASSOCIATION - INDIVIDUAL KNOCKOUT - 16 PLAYERS

Level. Two-handed striking

Jan Vecer Tomoyuki Ichiba Mladen Laudanovic. Harvard University, September 29, 2007

Part 1 will be selected response. Each selected response item will have 3 or 4 choices.

MANAGEMENT OPTIONS AND VALUE PER SHARE

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

SCORESHEET INSTRUCTIONS. RALLY POINT SCORING (RPS 2 out of 3 sets)

MAT Elements of Modern Mathematics Syllabus for Spring 2011 Section 100, TTh 9:30-10:50 AM; Section 200, TTh 8:00-9:20 AM

Graphing Parabolas With Microsoft Excel

VOLLEYBALL. Information SOMI-Specific Sport season: April-June

Mu2108 Set Theory: A Gentle Introduction Dr. Clark Ross

Journal of Student Success and Retention Vol. 2, No. 1, October 2015 THE EFFECTS OF CONDITIONAL RELEASE OF COURSE MATERIALS ON STUDENT PERFORMANCE

State of the Art Virtual Portfolio Management: Building Skills that Matter


OBSERVATIONS FROM 2010 INSPECTIONS OF DOMESTIC ANNUALLY INSPECTED FIRMS REGARDING DEFICIENCIES IN AUDITS OF INTERNAL CONTROL OVER FINANCIAL REPORTING

Lesson Plans for (9 th Grade Main Lesson) Possibility & Probability (including Permutations and Combinations)

SCRATCH Lesson Plan What is SCRATCH? Why SCRATCH?

Rating Systems for Fixed Odds Football Match Prediction

An Activity-Based Costing Assessment Task: Using an Excel Spreadsheet

Computational Geometry Lab: FEM BASIS FUNCTIONS FOR A TETRAHEDRON

MATH ADVISEMENT GUIDE

Self-directed learning: managing yourself and your working relationships

CHAPTER 18 Programming Your App to Make Decisions: Conditional Blocks

How to Calculate the Probabilities of Winning the Eight LUCKY MONEY Prize Levels:

AN ANALYSIS OF A WAR-LIKE CARD GAME. Introduction

GOLF COMPETITIONS AND HOW THEY ARE PLAYED Golf Australia Advice (Version 20 February 2013)

Transcription:

USING MICROSOFT EXCEL TO MODEL A TENNIS MATCH Tristan J. Barnett and Stephen R. Clarke School of Mathematical Sciences Swinburne University PO Box 218 Hawthorn Victoria 3122, Australia tbarnett@groupwise.swin.edu.au sclarke@groupwise.swin.edu.au Abstract This paper demonstrates the use of Microsoft Excel to generate the probability of winning and mean length of the remainder of the match conditional on the state of the match. Previous models treat games, sets and matches independently. We show how a series of interconnected sheets can be used to repeat these results. We also set up a sheet which can give the required statistics using the full match, set and game score as the state. The exercise could form an interesting and useful teaching example, and allow students to investigate the properties of tennis scoring systems. 1 Introduction Many papers have investigated the characteristics of various scoring systems, particularly in racquet sports. The game of Lawn Tennis has a history of changing formats. Although it uses a basic nested system of points, games, sets and matches, there is much variation within that basic structure. Matches can be one set, or best of three or five sets, sets can be first to six, eight or ten games, advantage or non advantage, and games can be the traditional four point advantage, or seven or ten point tiebreakers. We even have the recent innovation at the 2001 Australian Open mixed, where the third set becomes a single tiebreaker game. This innovation is extending into the Victorian Tennis Association pennant this year. The basic principles involved in modelling a tennis match are well known, and a Markov chain model with a constant probability of winning a point was set up by Schutz [4]. While such a model is acceptable within a game, a model which allows a player a different probability of winning depending on whether they are serving or receiving is essential for tennis. Statistics of interest are usually the chance of each player winning, and the expected length of the match. Croucher [1] looks at the conditional probabilities for either player winning a single game from any position. Pollard [3] uses a more analytic approach to calculate the probability for either player winning a game or set along with the expected number of points or games to be played and their corresponding variance. Most of the previous work uses analytical methods, and treats each scoring unit independently. This results in limited tables of statistics. Thus the chance of winning a game and the expected number of points remaining in the game is calculated at the various scores within a game. The chance of winning a set and the expected number of games remaining in the set is calculated only after a completed game and would not show for example the probability of a player s chance of winning from three games to two, 15-30. 1

2 T.J. Barnett and S.R. Clarke This paper discusses the use of Microsoft Excel to repeat these applications using a set of interrelated spreadsheets. This allows any probabilities to be entered and the resultant statistics automatically calculated or tabulated. In addition, more complicated workbooks can be set up which allow the calculation of the chance of winning a match and the expected number of points in the remainder of the match at any stage of the match given by the match, set and game score. These allow the dynamic updating of player s chances as a match progresses. 2 Simple Model We explain the method by first looking at the simple model where we have two player s, A and B, and player A has a constant probability p of winning a point. We set up a Markov chain model of a game where the state of the game is the current game score in points (thus 40-30 is 3-2). With probability p the state changes from a, b to a + 1, b and with probability 1 p it changes from a, b to a, b + 1. Thus if P (a, b) is the probability that player A wins when the score is (a,b), we have: P (a, b) = pp (a + 1, b) + (1 p)p (a, b + 1) The boundary values are P (a, b) = 1 if a = 4, b 2, P (a, b) = 0 if b = 4, a 2. The boundary values and formula can be entered on a simple spreadsheet. The problem of deuce can be handled in two ways. Since deuce is logically equivalent to 30-30, a formula for this can be entered in the deuce cell. This creates a circular cell reference, but the iterative function of Excel can be turned on, and Excel will iterate to a solution. Alternatively, it is easily shown that the chance of winning from deuce is: p 2 p 2 + (1 p) 2 Entering this formula removes the need for iteration. Table 1 shows the results obtained, using a value of p = 0.54. It indicates that a player with a 54% chance of winning a point has a 60% chance of winning the game. Note that since advantage server is logically equivalent to 40-30, and advantage receiver is logically equivalent to 30-40, the required statistics can be found from these cells. The mean number of points in the remainder of the game is handled similarly as shown in Table 2. All boundary points become 0, the recurrence relation becomes: M(a, b) = 1 + pm(a + 1, b) + (1 p)m(a, b + 1) The formula for deuce is: 2 p 2 + (1 p) 2 0 15 30 40 game 0 0.60 0.74 0.87 0.96 1 15 0.44 0.59 0.76 0.91 1 B score 30 0.25 0.39 0.58 0.81 1 40 0.09 0.17 0.31 0.58 game 0 0 0 Table 1: The conditional probabilities of A winning the game from various scorelines A similar spreadsheet can be set up for a set. The constant probability of winning a game can be obtained from the (0,0) cell of the previous sheet. Similarly for a match. Thus we obtain six connected

Using Microsoft Excel to model a tennis match 3 0 15 30 40 game 0 6.7 5.6 4.0 2.1 0 15 5.9 5.2 4.1 2.3 0 B score 30 4.5 4.4 4.0 2.8 0 40 2.5 2.7 3.1 4.0 game 0 0 0 Table 2: The expected number of points remaining in a game from various scorelines sheets - entering the chance of winning a point results in the chance of winning a game and expected number of points in the remainder of the game conditional on the game score, the chance of winning a set and expected number of games in the remainder of the set conditional on the set score, and the chance of winning a match and expected number of sets in the remainder of the match conditional on the match score. In this case, the sheet suggests a player with a 54% chance of winning a point, has a 59.9% chance of winning a game, a 76.3% chance of winning a tiebreak set, a 85.9% chance of winning a 3 set match and a 91% chance of winning a five-set match. Excel has excellent facilities for building up one or two way tables. Table 3 shows some of the match statistics for chances and expected lengths of winning games, sets and matches with the probability of winning a point ranging from 0.5 to 1. We will use the following notation: p(game) = probability of winning a game M(Game) = expected number of points in a game p(set) = probability of winning a tiebreak set M(Set) = expected number of games in a tiebreak set p(match) = probability of winning a 5 set match M(Match) = expected number of sets in a 5 set match Magnus and Klassen [2] tested some often-heard hypotheses relating to the service in tennis. They collected data on 481 matches played in the men s singles and women s singles championships at Wimbledon from 1992 to 1995. We will compare some of their statistics with our model. In men s singles with two seeded player s, on service they won 67% of the points and 86% of the games. In women s singles with two seeded player s, on service they won 57% of the points and 67% of the games. Both these results agree with our model as highlighted in Table 3. 3 Two parameter model For more realistic models the process is essentially the same, just a little more complicated. We now have 2 parameters where: p A = probability of A winning a point if A is serving p B = probability of B winning a point if B is serving Since the chance of a player winning now depends on who is serving, we need to introduce this into the state description. The state of a game becomes the score and whether the player is serving or receiving. Thus we need two sheets for a game, one for player A serving and one for player B serving. Similarly we need two sheets for the set score, and one for the match score. The formula and boundary conditions are similar. The main difference is that adjacent cells in terms of the Markov chain are not necessarily adjacent on the spreadsheet, since players alternate service games. We will also use the following notation: p A p B = probability of A winning a game if A is serving = probability of B winning a game if B is serving

4 T.J. Barnett and S.R. Clarke p p(game) M(Game) p(set) M(Set) p(m atch) M(M atch) 0.5 0.50 6.8 0.50 9.7 0.50 4.1 0.52 0.55 6.7 0.56 9.6 0.75 4.0 0.54 0.60 6.7 0.62 9.3 0.91 3.7 0.56 0.65 6.7 0.68 9.0 0.98 3.4 0.57 0.67 6.6 0.73 8.8 1 3.3 0.58 0.69 6.6 0.74 8.6 1 3.2 0.6 0.74 6.5 0.79 8.1 1 3.1 0.62 0.78 6.4 0.83 7.7 1 3 0.64 0.81 6.3 0.87 7.4 1 3 0.66 0.85 6.1 0.9 7.1 1 3 0.67 0.86 6.1 0.92 7 1 3 0.68 0.88 6.0 0.93 6.9 1 3 0.7 0.90 5.8 0.95 6.7 1 3 0.8 0.98 5.1 1 6.1 1 3 0.9 1 4.5 1 6 1 3 1 1 4 1 6 1 3 Table 3: Match statistics depending on the probability of winning a point The following probabilities are applied: p A = 0.6, p B = 0.58 The conditional probabilities for an advantage set are given in Tables 4 and 5 for each player serving. The probability of A winning an advantage set from 6 games all - A or B serving is: p A p B 1 p A p B + (1 p A )(1 p B ) By substituting p A = 0.6 in our simple model, p A = 0.74. Similarly by letting p B = 0.58, p B = 0.69. Substituting these figures into the equation above gives 0.55. 0 1 2 3 4 5 6 0 0.57 0.75 0.82 0.93 0.97 1 1 1 0.49 0.57 0.77 0.84 0.96 0.99 1 2 0.29 0.48 0.56 0.79 0.87 0.98 1 B score 3 0.20 0.25 0.46 0.58 0.82 0.92 1 4 0.06 0.15 0.20 0.44 0.55 0.88 1 5 0.02 0.03 0.09 0.13 0.41 0.55 0.88 6 0 0 0 0 0 0.41 0.55 Table 4: The conditional probabilities of A winning an advantage set from various games scores if A is serving As expected 6-6 = 5-5 = 4-4 regardless of who is serving. Also worth noting is that P (winning set a score with an even number of games with A serving) = P (winning set a score with an even number of games with B serving). 4 Six parameter model Instead of building up independent sheets linked through the probability of winning a point, game or set, we can build a model in which the state of the game is match score, set score, game score and whether

Using Microsoft Excel to model a tennis match 5 0 1 2 3 4 5 6 0 0.57 0.64 0.82 0.88 0.97 0.99 1 1 0.37 0.57 0.65 0.84 0.91 0.99 1 2 0.29 0.35 0.56 0.65 0.87 0.94 1 B score 3 0.12 0.25 0.31 0.56 0.67 0.92 1 4 0.06 0.08 0.20 0.26 0.55 0.69 1 5 0.01 0.03 0.04 0.13 0.17 0.55 0.69 6 0 0 0 0 0 0.17 0.55 Table 5: The conditional probabilities of A winning an advantage set from various game scores if B is serving the player is serving or receiving. Also included is first and second serves. With this model, a typical question could be: What is the probability of player A winning a five set match with the current score at 2 sets to 1, 3 games to 4, 30-15 and fault one? We now have 6 parameters: 1st serve % for each player, winning % on first serve for each player and winning % on second serve for each player. The same techniques of deriving a formula for the last point of a game or set and allowing recurrence relations to work out the remaining points are applied. We will take the statistics from the 2002 Australian Open men s singles final, where Thomas Johannson defeated Marat Safin 3-6 6-4 6-4 7-6 (7-4) Let Johannson = player A, Safin = player B 1st serve % for player A = 56% 1st serve % for player B = 63% Winning % on 1st Serve for player A = 86% Winning % on 1st Serve for player B = 67% Winning % on 2nd Serve for player A = 53% Winning % on 2nd Serve for player B = 54% Table 6 represents the conditional probabilities of player A winning a 5 set match at a set all, 4-4 in the 3rd set with player A serving. It highlights the importance of a 1st serve on certain points. At 30-40 a 1st serve in would give a 77% chance of winning the match, whereas 30-40 fault drops to 74%. Particularly in tiebreaks, this margin has greater emphasis to the outcome of a match. 0 0/2nd 15 15/2nd 30 30/2nd 40 40/2nd game 0 0.84 0.83 0.84 0.84 0.85 0.85 0.85 0.85 1 15 0.81 0.8 0.83 0.82 0.84 0.84 0.85 0.85 1 B score 30 0.77 0.76 0.8 0.78 0.82 0.81 0.84 0.84 1 40 0.71 0.7 0.74 0.71 0.77 0.74 0.82 0.81 game 0 0 0 0 0 0 0 Table 6: Conditional probabilities of winning a game from various scorelines 5 Conclusions Through the use of Excel we have shown how to produce the conditional probabilities of either player winning the match and the expected number of points, games and sets remaining from any position

6 T.J. Barnett and S.R. Clarke within the match. The use of if statements often used in Excel would allow us to enter what type of match it is (tiebreaker of advantage, 3 or 5 sets) and the current position in the match, to give us the corresponding probabilities and the expected length remaining. This idea would make accessible the relevant statistics for a match in progress. Microsoft Excel now incorporates Visual Basic for Applications (VBA), enabling the modelling of a tennis match to be powerful and still relatively easy to implement. References [1] J.S. Croucher, The conditional probability of winning games of tennis, Res. Quart. for Exercise and Sport, 57(1) (1986), 23 26. [2] J.R. Magnus and F.J.G.M Klaassen, On the advantage of serving first in a tennis set: four years at Wimbledon. The Statist., 48(2) (1999), 247 256. [3] G.H. Pollard, An analysis of classical and tie-breaker tennis. Austral. J. Statist., 25 (1983), 496 505. [4] R.W. Schutz, A mathematical model for evaluating scoring systems with specific reference to tennis. Res. Quart. for Exercise and Sport, 41 (1970), 552 561.