arxiv: v2 [physics.pop-ph] 17 Jun 2016

What statistics can tell us about strategy in tennis

arXiv:1511.06163v2 [physics.pop-ph] 17 Jun 2016

I. Y. Kawashima
Escola Paulista de Medicina - UNIFESP, 04023-062, São Paulo, SP, Brazil

O. Helene
Instituto de Física da Universidade de São Paulo, C.P. 66318, CEP 05315-970, São Paulo, Brazil

M. T. Yamashita
Instituto de Física Teórica, UNESP - Univ Estadual Paulista, C.P. 70532-2, CEP 01156-970, São Paulo, SP, Brazil

R. S. Marques de Carvalho
Departamento de Informática em Saúde - Escola Paulista de Medicina - UNIFESP, 04023-062, São Paulo, SP, Brazil

E-mail: marques.carvalho@unifesp.br

Abstract. In this paper we analyse tiebreak results of several tennis players in order to investigate whether we can identify a non-aleatory (non-random) distribution of the points in this crucial moment of the game. We compare the observed results with a binomial distribution in which the probabilities of winning and losing a point are equal. Using a $\chi^2$ test we find that, with the exception of some players, most of the results agree with our hypothesis that the points in tiebreaks are merely aleatory.

Keywords: Sports, $\chi^2$ Test, Binomial Distribution

1. Introduction

A recurrent question in signal analysis is whether one is looking at a true signal or just noise [1, 2]. This question arises, for example, when analysing a tomography or X-ray image [3] or when searching for a new particle such as the Higgs boson [4]. In these cases different statistical tests are performed, and the discussion usually centres on how many standard deviations are needed to accept or reject a given hypothesis. Statistical analysis of experimental data is taught from the first years of physics and engineering courses [5]. Connecting classroom problems with everyday problems [6, 7] may be more stimulating than, for example, rolling many dice hundreds of times to see a binomial distribution in practice. The aim of this paper is to investigate whether the points in a tiebreak originate from a statistical fluctuation and are randomly decided.

A tennis match is divided into sets and games. To win a set, a player must win six games with a margin of at least two games over the other player (6x0, 6x1, ..., 6x4). If one player has six games and the other five, one more game is played and two things may happen: if the player with six games wins it, the set ends 7x5; if that player loses it, the set is tied and a tiebreak is played. In the tiebreak, the first player to win seven points with a margin of at least two points over the other player wins the game and the set. If necessary, the tiebreak continues until the minimum difference of two points is achieved.

In this paper we collected tiebreak results of several players. We then plotted a histogram of the difference of points, where positive values mean victories and negative values losses. These histograms are compared with a theoretical binomial distribution, constrained by the tennis rules described in the previous paragraph, but assuming equal probabilities of winning or losing a point.
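The theoretical distribution described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that every point is won independently with the same probability p (0.5 in this paper), enumerates the final scores 7-0 to 7-5, and treats every tiebreak that reaches 6-6 as ending with a two-point margin. The function name `tiebreak_margin_distribution` is hypothetical.

```python
from fractions import Fraction
from math import comb


def tiebreak_margin_distribution(p=Fraction(1, 2)):
    """Return {point difference: probability} for one tiebreak.

    Assumption: each point is won independently with probability p.
    Positive differences are victories, negative ones are losses.
    """
    q = 1 - p
    dist = {}
    # Scores 7-k with k = 0..5: the winner takes the last point, and the
    # loser's k points can fall anywhere among the first 6 + k points.
    for k in range(6):
        margin = 7 - k
        dist[margin] = comb(6 + k, k) * p**7 * q**k
        dist[-margin] = comb(6 + k, k) * q**7 * p**k
    # Probability of reaching 6-6.
    reach_66 = comb(12, 6) * p**6 * q**6
    # From 6-6 the tiebreak ends when one player wins two points in a row
    # starting from a level score, so the final margin is always 2.
    win_from_66 = p**2 / (p**2 + q**2)
    dist[2] += reach_66 * win_from_66
    dist[-2] += reach_66 * (1 - win_from_66)
    return dist


if __name__ == "__main__":
    for margin, prob in sorted(tiebreak_margin_distribution().items()):
        print(f"{margin:+d}: {float(prob):.4f}")
```

With p = 1/2 this gives a probability of 231/1024 (about 0.226) for each of the margins +2 and -2, and 1/128 for each of +7 and -7. The margins -1, 0 and 1 cannot occur, which leaves twelve possible outcomes; this would be consistent with the eleven degrees of freedom quoted in section 2.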
We performed a $\chi^2$ test to have an objective criterion for deciding whether the observed and calculated results are statistically different. The paper is organized as follows. In section 2 we explain the criterion used to select the players and the $\chi^2$ test. In section 3 we compare the observed results with our theoretical prediction. Finally, in section 4 we summarize and give our conclusions.

2. Methods

Analysing the conditions that may lead to a tiebreak, we can consider two main reasons. (Note that we are not considering the last set of Grand Slam events or the Davis Cup, where the games can continue indefinitely.) The first is that both players have a very similar game. Then, considering that the serve is a real advantage, we have a very favorable condition for a tiebreak (in every service game, the player who is serving wins the game). The second reason usually occurs when one of the players has an amazing serve. Normally, this comes essentially from the player being very tall (exceptions may be found). Then, owing to the height, the agility of the player is seriously
compromised, so that, on the one hand, the tall player has a low probability of winning a game when his opponent is serving but, on the other hand, the opponent can rarely obtain a good return of the big serve. A good example of this second reason is Ivo Karlovic (211 cm) from Croatia. However, regardless of the reason that caused the tiebreak, the fact is that when a set goes to a tiebreak, the match was very balanced at that moment. Thus, it is not strange to think that each point could go randomly to either player.

In order to investigate tiebreak results, we selected the top ten players according to the ATP (Association of Tennis Professionals) website in the last week of October 2015. For the analysis we also included Ivo Karlovic, as his matches usually go to a tiebreak. The tiebreak results were mainly extracted from gambling sites, where detailed results can be found. In Fig. 1 we plot the binomial distribution of a tennis tiebreak assuming that the probability of winning, or losing, a point in the tiebreak is 0.5.

[Figure 1: plot of Probability (vertical axis, 0.00 to 0.25) versus Points (horizontal axis, -7 to 7).]
Figure 1. Theoretical result for the probability of obtaining a result from -7 to 7 in a tiebreak, assuming that the probability of winning (losing) a point is 0.5. The horizontal axis is the difference of points between a given player and his opponent. Positive values mean victories and negative values losses.

We calculated the $\chi^2$ quantity in order to compare the expected results with the observed ones. The $\chi^2$ variable is defined as

\[
\chi^2 = \sum_{i=-7}^{7} \frac{\left(y_i^{(O)} - y_i^{(E)}\right)^2}{y_i^{(E)}}, \tag{1}
\]

where $y_i^{(O)}$ and $y_i^{(E)}$ are, respectively, the observed and expected numbers of events. The sum runs over all values of $i$. The expected number of events is simply given by $N p_i$, where $N$ is the total number of events and $p_i$ ($i = -7, \ldots, 7$) is the probability

for result $i$ to occur (see Fig. 1). Defining $F = F(\chi^2)$ as the probability density function for eleven degrees of freedom, we may write the probabilities of finding a value smaller than a given $\chi^2_<$ and greater than a given $\chi^2_>$, respectively, as

\[
P_< = \int_0^{\chi^2_<} F(\chi^2)\, d\chi^2, \qquad P_> = \int_{\chi^2_>}^{\infty} F(\chi^2)\, d\chi^2. \tag{2}
\]

Here, we consider that the observed values agree with our theoretical result if the calculated $\chi^2$ lies inside the interval $\chi^2_< < \chi^2 < \chi^2_>$, that is, $4.575 < \chi^2 < 19.675$. The lower bound corresponds to $P_< = 0.05$ and the upper bound to a cumulative probability of 0.95 (equivalently, $P_> = 0.05$ with the definition in Eq. (2)).

3. Results

In this section we compare our theoretical result with observed data. Fig. 2 shows the expected results, given by $N p_i$ (for Karlovic $N = 274$) and represented by open circles, compared with the observed ones. Not only is the structure of the two results very similar, but so are the values in each channel. The calculated $\chi^2$ is 10.6, which is inside the interval mentioned in the last section, indicating that both results are statistically equivalent. This means that, despite Karlovic's big serve, his tiebreak results are close to a completely random situation.

[Figure 2: histogram titled "Ivo Karlovic"; Occurrences (vertical axis, 0 to 60) versus Points (horizontal axis, -7 to 7).]
Figure 2. Histogram of Ivo Karlovic's results. The bars represent a total of 274 collected data and the open circles are our theoretical results. The agreement between both results is very good, giving $\chi^2 = 10.6$. This result means that, despite Karlovic's great serve, the probability of winning (or losing) a point is close to 50%.

Figure 3 shows the results for Roger Federer. We can immediately note the large difference between the theoretical and observed results. This is clearly a non-aleatory result, with a $\chi^2$ far exceeding the upper limit of 19.675. Roger Federer is one of the greatest

tennis players of all time and this figure may demonstrate it. Tennis is a very mental game with moments of extreme pressure (tiebreaks, for example). A crucial moment occurs when a point can decide the game, the set, or the match. The top tennis players are able to raise their concentration and level of play considerably in these moments. The courage to hit a drop shot or a ball down the line at a delicate moment, preventing the opponent from winning the point, is a quality not shared by all players.

[Figure 3: histogram titled "Roger Federer"; Occurrences (vertical axis, 0 to 60) versus Points (horizontal axis, -7 to 7).]
Figure 3. Histogram of Roger Federer's results. Same as figure 2, for 219 data and $\chi^2 = 39.3$. This is a typical figure for non-aleatory results.

Table 1 shows the calculated $\chi^2$ for the top ten players at the time of writing (note that the ranking changes every week) and for Ivo Karlovic, who has the largest number of aces in history and a large number of tiebreaks. In order to agree with the theoretical prediction (50% probability of winning or losing a point in a tiebreak), $\chi^2$ should lie inside the interval [4.575, 19.675]. As we can see, Djokovic, Murray, Wawrinka, Ferrer, Tsonga and Karlovic agree with an aleatory result. Note that half of the top ten players have practically random tiebreak results. If we consider lower rankings, the number of players who agree with our hypothesis increases considerably.

4. Conclusion

We could see from our calculations that half of the top ten players, and the player with the greatest number of aces in history (Karlovic), display tiebreak results in agreement with our hypothesis of aleatory points. Definitely, this is not a statistical accident. The agreement with our prediction simply tells us that the strategy used by these players in tiebreaks returns the same result as the coin tossed at the beginning

of the match to decide which player serves first. Although it is not an easy task, the coaches of these players could at least adopt a strategy that leads to a result different from 50%. Considering lower rankings, the number of players who agree with our hypothesis increases considerably.

Table 1. Calculated $\chi^2$ values for several players. The last column is the number of collected data. In order to be statistically equivalent to a random result, $\chi^2$ should lie inside $4.575 < \chi^2 < 19.675$. In this table, the values larger than 19.675 correspond to more victories than predicted by our model.

Player               $\chi^2$   Data
Novak Djokovic       19.0       181
Roger Federer        39.3       219
Andy Murray          14.6       168
Stan Wawrinka        11.4       197
Tomas Berdych        21.6       199
Rafael Nadal         29.3       163
Kei Nishikori        23.7       115
David Ferrer          9.1       151
Jo-Wilfried Tsonga   11.0       220
Milos Raonic         31.9       260
Ivo Karlovic         10.6       274

As written in the introduction, statistical analysis is a topic explored in the first years of undergraduate physics and engineering courses. Contact with a real problem, where it is possible to analyse the data of your favorite team or player, is far more exciting than spending an hour (or more) rolling many dice to see a binomial distribution in practice. Variations of the problem treated here may easily be extended to other sports, e.g. football.

Acknowledgments

The authors thank PET (Programa de Educação Tutorial - MEC) for support. MTY, a very good amateur tennis player, thanks FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) for partial support.

References

[1] Helene, O. Upper limit of peak area. Nucl. Instrum. Methods Phys. Res. 212 319-22 (1983)
[2] Razak, M. M. A. Detection and extraction of weak signals buried in noise. Am. J. Phys. 77 1061-65 (2009)
[3] Mylott, E.; Klepetka, R.; Dunlap, J. C. and Widenhorn, R. An easily assembled laboratory exercise in computed tomography. Eur. J. Phys. 32 1227-35 (2011)
[4] ATLAS Collaboration. Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys. Lett. B 716 1-29 (2012); CMS Collaboration. Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Phys. Lett. B 716 30 (2012)
[5] Peterlin, P. Data analysis and graphing in an introductory physics laboratory: spreadsheet versus statistics suite. Eur. J. Phys. 31 919-31 (2010)
[6] Helene, O. and Yamashita, M. T. The force, power and energy of the 100-meter sprint. Am. J. Phys. 78 307-9 (2010); Helene, O. and Yamashita, M. T. A unified model for the long and high jump. Am. J. Phys. 73 906-8 (2005)
[7] Helene, O. and Yamashita, M. T. Understanding the tsunami with a simple model. Eur. J. Phys. 27 855-63 (2006)
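The acceptance interval used in section 2 and the classification of Table 1 can be checked numerically. The sketch below is an illustration, not the authors' procedure: it uses only the Python standard library, integrates the $\chi^2$ density for eleven degrees of freedom with Simpson's rule (a choice made here to avoid external dependencies), and skips bins whose expected count is zero when evaluating Eq. (1). All function names are hypothetical; the $\chi^2$ values are copied from Table 1.

```python
from math import exp, gamma


def chi_square(observed, expected):
    """Eq. (1): sum of (O - E)^2 / E over bins.

    Assumption: bins with zero expected count are skipped.
    """
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def chi2_pdf(x, dof):
    """Chi-square probability density (used here only for dof > 2)."""
    if x <= 0:
        return 0.0
    k = dof / 2
    return x ** (k - 1) * exp(-x / 2) / (2 ** k * gamma(k))


def chi2_cdf(x, dof, steps=2000):
    """P(chi2 < x) by composite Simpson's rule; steps must be even."""
    h = x / steps
    total = chi2_pdf(0.0, dof) + chi2_pdf(x, dof)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * chi2_pdf(j * h, dof)
    return total * h / 3


# Interval bounds and degrees of freedom as quoted in section 2.
LOWER, UPPER, DOF = 4.575, 19.675, 11

# Chi-square values copied from Table 1.
TABLE_1 = {
    "Novak Djokovic": 19.0, "Roger Federer": 39.3, "Andy Murray": 14.6,
    "Stan Wawrinka": 11.4, "Tomas Berdych": 21.6, "Rafael Nadal": 29.3,
    "Kei Nishikori": 23.7, "David Ferrer": 9.1, "Jo-Wilfried Tsonga": 11.0,
    "Milos Raonic": 31.9, "Ivo Karlovic": 10.6,
}


def is_consistent_with_random(chi2_value):
    """True if the value lies strictly inside the acceptance interval."""
    return LOWER < chi2_value < UPPER


if __name__ == "__main__":
    print(f"P(chi2 < {LOWER}) = {chi2_cdf(LOWER, DOF):.3f}")
    print(f"P(chi2 < {UPPER}) = {chi2_cdf(UPPER, DOF):.3f}")
    for name, value in TABLE_1.items():
        verdict = "random" if is_consistent_with_random(value) else "non-random"
        print(f"{name:20s} {value:5.1f}  {verdict}")
```

Run as a script, this should report cumulative probabilities close to 0.05 and 0.95 at the two bounds and place six players (Djokovic, Murray, Wawrinka, Ferrer, Tsonga and Karlovic) inside the interval, in line with the text. The observed histograms themselves are not reproduced in the paper, so `chi_square` is included only to show how Eq. (1) would be evaluated once such counts are available.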