Game ON! Predicting English Premier League Match Outcomes
Aditya Srinivas Timmaraju, Aditya Palnitkar, Vikesh Khanna

Abstract

Among the club-based soccer leagues in the world, the English Premier League (EPL), broadcast in 212 territories to 643 million homes, is the most followed. In this report, we attempt to predict match outcomes in the EPL. One of the key challenges of this problem is the high incidence of drawn games in the EPL. We identify desirable characteristics for features that are relevant to this problem. We draw parallels between our choice of features and those in state-of-the-art video search and retrieval algorithms. We demonstrate that our methods offer superior performance compared to existing methods, soccer pundits and the betting markets. We also share a few insights we gained from this exploration.

1 Introduction

The most popular soccer league in the world is the English Premier League (EPL), which is based in England. It was watched by an estimated 4.7 billion people in the season. So, it doesn't come as a surprise that the TV rights are valued at 1 billion per season. In the EPL, 20 teams contest for first place. The bottom three teams are relegated and replaced by better-performing teams from the lower leagues. Each team plays every other team twice, once at home and once away, so there are a total of $380 = 2 \times \binom{20}{2}$ games per season. A season runs from August to May of the following year.

1.1 Challenges

One of the things that makes predicting outcomes tricky is the significant incidence of draws (neither team wins, because both score the same number of goals) compared to other sports. Most popular games, viz. tennis, cricket, baseball and American football, either have no draws or very few. Consider this: out of the 380 games in the season, 166 were won by the home team, 106 by the away team, and there were 108 draws!
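The outcome counts above are enough to reproduce the season arithmetic. The following is a minimal Python sketch that uses only the numbers quoted in the text; the variable names are illustrative, and the base-3 entropy it computes is the quantity introduced in the next paragraph. With these counts the entropy comes out at roughly 0.98, close to the maximum of 1 for three equally likely outcomes.

```python
import math

# Outcome counts quoted in the text for one 380-game season.
counts = {"home_win": 166, "away_win": 106, "draw": 108}

# 20 teams, every pair meets twice (home and away): 2 * C(20, 2) games.
total_games = 2 * math.comb(20, 2)
assert total_games == sum(counts.values())

# Fraction of each outcome (p_1, p_2, p_3 in the text).
fractions = {name: n / total_games for name, n in counts.items()}

# Base-3 entropy: 1.0 would mean three equally likely outcomes.
entropy = -sum(p * math.log(p, 3) for p in fractions.values())

# Accuracy of the naive "always predict a home win" rule on this season.
always_home_accuracy = fractions["home_win"]

print(f"games per season: {total_games}")
print(f"entropy (base 3): {entropy:.3f}")
print(f"always-home accuracy: {always_home_accuracy:.1%}")
```

An entropy this close to 1 means the three outcomes are nearly as hard to tell apart as a fair three-way draw, which is why the always-home rule gains so little over random guessing.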
To put these numbers in perspective, we calculate the season's entropy. We regard home wins (1), away wins (2) and drawn games (3) as three separate classes and compute the fraction of each of these outcomes:

$p_1 = 166/380, \quad p_2 = 106/380, \quad p_3 = 108/380$

Since we have a 3-way classification, we use the base-3 logarithm:

$\text{Entropy} = -\left(p_1 \log_3 p_1 + p_2 \log_3 p_2 + p_3 \log_3 p_3\right)$

An entropy of 1 would correspond to a perfectly random setting ($p_1 = p_2 = p_3 = 1/3$). So a random prediction would give an expected accuracy of 33.33%. Another naive, but marginally better, approach would be to always predict a home win, which would deliver 44-45% accuracy (about the average fraction of home wins in the EPL). These are, however, not very good numbers.

1.2 Prior Work

In [1], a Bayesian dynamic model is built, but there are many parameters and the authors do not justify the choice of their values, which may not work well with other test sets. Also, of the 48 EPL games they bet on, they won only 15 [1]. Their primary focus is on computing pseudo-likelihood values for betting on match outcomes. In [2], the authors focus on predicting match outcomes for a single team, Tottenham Hotspur. Their model depends on the presence of particular players and is thus, by their own admission, not scalable.

There is no standard way of reporting results for this problem. Most authors do not mention prediction accuracy directly; some specify precision/recall and others specify the geometric mean of the predicted probabilities of the correct classes, called the Pseudo-likelihood Statistic (PLS). We report prediction accuracy, precision/recall and PLS, compare the performance of our algorithms on all these fronts, and show a marked improvement.

The remainder of the report is organized as follows. Section 2 discusses the choice of features, which we feel is a key step. Section 3 describes the approaches we have taken to the problem. Section 4 describes our results. In the remainder of the report, we use the words match and game interchangeably. By convention, in A vs. B, A is the home team and B is the away team.

2 Selection of Features

We first identify the most relevant performance metrics: Goals, Corners and Shots on Target. Goals are an obvious choice as they determine which team wins. A higher number of corners indicates that a team is playing well and thus forcing the opponent to concede corner kicks. Shots on Target counts the shots a team took that were on target (including those stopped by the keeper).
We deliberately did not pick possession percentage, because some teams simply have a more possession-based style of play than others and, moreover, a higher possession percentage does not necessarily imply better performance. So, our performance metric vector for a game is a 3-element vector of the form (Goals scored, Corners, Shots on Target).

We recognize four important characteristics for a feature. It should:

- Incorporate a notion of the competing nature (between teams) of the problem
- Be reflective of the recent form of a particular team
- Manifest the home advantage factor
- Capture the improving/declining trend in the relative performance of competing teams (i.e., has team A been on the rise more than team B? Has team B slumped in form more than team A?)

Keeping the above characteristics in mind, we propose the following choice of features (we arrived at the above list progressively, so the first choice below does not capture all of these properties).

2.1 KPP (k-past Performances)

In KPP, we use the performance of a team in the last k games to determine our prediction. In the game A vs. B, we first take an average of each performance metric of team A over the last k games to arrive at $P_A$. If $G = [g_1, g_2, g_3, \dots, g_k]$ is a vector containing the number of goals scored in each of the last k games, we have $g_{avg} = (g_1 + g_2 + \dots + g_k)/k$. Similarly, we arrive at $c_{avg}$ and $st_{avg}$ for corners and shots on target respectively. Thus, we have $P_A = [g_{avg}; c_{avg}; st_{avg}]$.
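The averaging step for one team can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the function name `kpp_vector`, the tuple layout and the sample match records are all hypothetical.

```python
from statistics import mean

# Hypothetical per-game records for one team, oldest first:
# (goals scored, corners, shots on target).
team_a_history = [(1, 4, 3), (2, 6, 5), (0, 3, 2), (3, 7, 8), (1, 5, 4)]

def kpp_vector(history, k):
    """Average each performance metric over the last k games."""
    if k < 1 or len(history) < k:
        raise ValueError("need at least k past games")
    recent = history[-k:]
    # zip(*recent) regroups the records into one tuple per metric.
    return [mean(metric) for metric in zip(*recent)]

# P_A = [g_avg, c_avg, st_avg] over the last 4 games.
p_a = kpp_vector(team_a_history, k=4)
print(p_a)
```

Because the window always takes the most recent k games, a team needs at least k completed matches before it can appear in a prediction, which is why larger k leaves fewer games to test on.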
Similarly, we arrive at $P_B$. We then take the ordered difference $P_A - P_B$, a 3-element vector, as our feature. The ordered difference is chosen to inherently include information about which team is playing at home and which away.

2.2 TGKPP (Temporal Gradient k-past Performances)

In TGKPP, as in KPP, we first arrive at the 3-element vector $P_A - P_B$. Now, consider the vector of a performance metric (say, goals scored) for team A in the past k games. Denote it as $G = [g_1, g_2, g_3, \dots, g_k]$. We apply a temporal differencing operator to this vector, which is essentially a convolution with a filter of the form $\text{diff} = [1; -1]$. We now have

$g_{dA} = \text{convolution}(G, \text{diff}) = (g_2 - g_1, g_3 - g_2, \dots, g_k - g_{k-1})$

We do this for the other performance metrics and compute $c_{dA}$ and $st_{dA}$. We repeat this process for team B and arrive at $g_{dB}$, $c_{dB}$ and $st_{dB}$. Then, we compute

$P_{diff} = [\mu(g_{dA}) - \mu(g_{dB}); \mu(c_{dA}) - \mu(c_{dB}); \mu(st_{dA}) - \mu(st_{dB})]$

where $\mu(\cdot)$ indicates the mean of the elements in a vector. Thus, our final feature vector is a 6-element vector of the form $[P_A - P_B; P_{diff}]$. This feature is similar to the video fingerprint in [7], where spatial and temporal gradients are applied to blocks in a frame of video. We have used the website for obtaining our data.

3 Approaches taken

We first toyed with the idea of using Naive Bayes, but found that independence is a very strong assumption in this problem. Also, we did not try Gaussian Discriminant Analysis because our data was nowhere close to being normally distributed. Owing to space constraints, we have omitted the figures that showed this. We used multiple variations on Multinomial Logistic Regression and Support Vector Machines.

3.1 Approach 1

The first approach we took was using Multinomial Logistic Regression (since there were more than 2 possible outcomes).
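Before going further into the approaches, the TGKPP feature of Section 2.2 can be made concrete. The sketch below is a minimal pure-Python reading of the definitions there, not the authors' implementation; the function names and the sample histories are hypothetical.

```python
from statistics import mean

def temporal_diff(series):
    """Differences between consecutive games: (g2 - g1, ..., gk - g(k-1))."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def tgkpp_feature(home_history, away_history, k):
    """Build [P_A - P_B ; P_diff] from (goals, corners, shots on target) records."""
    if k < 2 or min(len(home_history), len(away_history)) < k:
        raise ValueError("need k >= 2 and at least k past games per team")
    home_metrics = list(zip(*home_history[-k:]))  # one tuple per metric
    away_metrics = list(zip(*away_history[-k:]))
    # KPP part: ordered difference of k-game averages (home minus away).
    level = [mean(h) - mean(a) for h, a in zip(home_metrics, away_metrics)]
    # Gradient part: difference of the mean game-to-game changes.
    trend = [mean(temporal_diff(h)) - mean(temporal_diff(a))
             for h, a in zip(home_metrics, away_metrics)]
    return level + trend

# Hypothetical histories, oldest game first.
home = [(1, 4, 3), (2, 6, 5), (3, 8, 7)]
away = [(2, 5, 6), (1, 5, 4), (0, 2, 2)]
feature = tgkpp_feature(home, away, k=3)
print(feature)
```

One property worth noting: the mean of consecutive differences telescopes to $(g_k - g_1)/(k-1)$, so each gradient entry depends only on the first and last game in the window. The sketch keeps the explicit differencing to mirror the text.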
In this approach, during the training phase, we only considered the performance metrics derived from the current match, rather than taking the average over the last k matches. During testing, when required to predict the outcome of team A vs. team B, we arrived at the feature vector using KPP.

3.2 Approach 2

In the second approach, we trained in the same way we would later test. More precisely, in the training phase too, instead of using the performance metric vector of the current match as the feature vector, we used KPP. This meant that the trained parameters now inform our beliefs about the result of a match based on the performance in the last k matches. In this approach, we also evaluated TGKPP in place of KPP.

3.3 Approach 3: Using teamwise models

So far, all approaches we tried would find a global set of parameters, independent of the competing teams. So, given the past k performances of the team playing home and the team playing away, our model was agnostic to the identity of the actual teams playing. However, we felt we may be missing out on some team-specific trends using this method. So, in this approach, we trained different models for different teams. However, this approach placed a limitation on the data we could use to test/train our model. We could no longer combine data from two different seasons, because the form of the teams varies between seasons and major players are traded between teams. So, due to the limited data and the increased noise induced by increasing the granularity of the model, we ended up with a lower accuracy (an average of 47%).

Table 1: MLR with Approach 1 (columns: Season, k value, Accuracy)

Table 2: Prediction Accuracy: KPP & TGKPP (columns: Season, k, MLR (KPP), RBF-SVM (KPP), MLR (TGKPP), RBF-SVM (TGKPP), 2 Class (RBF-SVM))

4 Results and Discussion

We train all our models on the season and test on the (current) season. Table 1 contains the prediction accuracy (in %) obtained using MLR with Approach 1. Figure 1 contains the learning curves for the two algorithms, MLR and RBF-SVM. Our definition of k imposes a restriction on the starting point for testing. For k = 4, 5, 6, 7 we test on 81, 71, 61, 51 games respectively. Table 2 contains the prediction accuracy (in %) obtained using Approach 2 with MLR and RBF-SVM. The last column of Table 2 contains the accuracy obtained using only the Home win and Away win classes, disregarding the draws. If C is the confusion matrix, we truncate the 3rd row and 3rd column to obtain $C_{trunc}$. We then compute

$\text{2-class accuracy} = \operatorname{trace}(C_{trunc}) / \sum_{i,j} C_{trunc,ij}$

We computed this because we felt it would offer a fair metric for comparing against accuracy in other sports which do not have the concept of drawn games.

As our problem is a 3-way classification, there is no positive and negative class demarcation. So, while computing precision and recall for the Home wins class, we regard both the other classes as negatives. Table 3 lists the precision and recall values. While the precision for drawn games is fair, it is evident that our method has poor recall for draws. Under-estimating draws is a problem with the existing methods too.

Figure 1: Accuracy vs training set size (for k=7): (a) MLR, (b) RBF-SVM
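The 2-class accuracy and the one-vs-rest precision and recall described above can both be read off a 3x3 confusion matrix. The sketch below shows one way to do so in plain Python; the matrix entries are made-up illustrative counts, not results from this report, and the function names are hypothetical.

```python
# Hypothetical 3x3 confusion matrix (rows: actual, columns: predicted),
# class order: home win, away win, draw. The counts are illustrative only.
confusion = [
    [20, 3, 2],
    [4, 10, 1],
    [6, 4, 1],
]

def two_class_accuracy(matrix):
    """Drop the draw row and column, then take trace / sum of what remains."""
    truncated = [row[:2] for row in matrix[:2]]
    correct = truncated[0][0] + truncated[1][1]
    return correct / sum(sum(row) for row in truncated)

def precision_recall(matrix, cls):
    """One-vs-rest precision and recall for class index cls."""
    true_pos = matrix[cls][cls]
    predicted = sum(row[cls] for row in matrix)  # column sum
    actual = sum(matrix[cls])                    # row sum
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall

print(two_class_accuracy(confusion))
for index, name in enumerate(["home win", "away win", "draw"]):
    print(name, precision_recall(confusion, index))
```

In the made-up matrix, the draw class shows the pattern the text describes: few draws are predicted, so draw recall is low even when the handful of predicted draws are partly correct.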
Table 3: Precision and Recall with RBF-SVM (TGKPP and k=7) (rows: Precision, Recall; columns: Home win, Away win, Drawn game)

Table 4: Confusion Matrix (TGKPP and k=7) (rows: Actual Home wins, Actual Home losses, Actual draws; columns: Predicted Home wins, Predicted Home losses, Predicted draws)

In [1], the authors propose a metric computed as the geometric mean of the predicted probabilities of the actual outcomes, which they call PLS (Pseudo-likelihood Statistic). They obtain a PLS of for EPL. In [3], the best value of PLS obtained is . In comparison, our average PLS value is , with a maximum of with k=4 using MLR and TGKPP. In [4], the authors report their results in the form of a confusion matrix, but they do not predict drawn games at all. Our method also offers superior performance to [4]. We also considerably outdo the accuracy of the probabilistic model in [5], which reaches an accuracy of 52.1% for and 48.15% . In comparison with our best accuracy of 66.67%, the accuracy of expert pundits like Mark Lawrenson of the BBC is 52.6% [6]. We also outperformed the betting markets, which averaged 55.3% last season [6], suggesting possible avenues for making money!

The results we obtained, while significant in their own right, also offered interesting insights into the match data we analysed. Though our results using Approach 3 were not encouraging, we found a striking trend. Since we have two models, we are faced with a choice: if A and B are the teams competing, should we use $\theta_A$ or $\theta_B$ for prediction? Or should we always use the home team's parameters? We observed that using $\theta_{away}$ for prediction always does better than using $\theta_{home}$. In hindsight, we may be tempted to conclude that $\theta_{away}$ does better because there is more variability in an away team's performance, which gives us more information. But a priori, we might just as well have thought that home performances are more consistent, so a better predictor would use $\theta_{home}$.
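The PLS metric used in the comparison with [1] and [3] is a geometric mean, which is simplest to compute in log space. The following is a minimal sketch of that definition; the function name and the sample probabilities are hypothetical and do not come from any of the cited models.

```python
import math

def pseudo_likelihood(probabilities, outcomes):
    """Geometric mean of the probability assigned to each actual outcome."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need one probability row per outcome")
    # Sum logs rather than multiplying, to avoid underflow over many games.
    log_sum = sum(math.log(row[actual])
                  for row, actual in zip(probabilities, outcomes))
    return math.exp(log_sum / len(outcomes))

# Hypothetical predicted probabilities (home win, away win, draw) per game.
predicted = [
    [0.5, 0.25, 0.25],
    [0.2, 0.5, 0.3],
    [0.4, 0.35, 0.25],
]
actual = [0, 1, 2]  # index of the outcome that occurred in each game

pls = pseudo_likelihood(predicted, actual)
print(round(pls, 4))
```

A predictor that always outputs (1/3, 1/3, 1/3) scores exactly 1/3 under this definition, which gives a natural floor for judging reported PLS values.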
5 References

[1] Rue, Havard, and Oyvind Salvesen. Prediction and retrospective analysis of soccer matches in a league. Journal of the Royal Statistical Society: Series D (The Statistician) 49.3 (2000).
[2] Joseph, A., Norman E. Fenton, and Martin Neil. Predicting football results using Bayesian nets and other machine learning techniques. Knowledge-Based Systems 19.7 (2006).
[3] Goddard, John. Regression models for forecasting goals and match results in association football. International Journal of Forecasting 21.2 (2005).
[4] Crowder, M., Dixon, M., Ledford, A. and Robinson, M. Dynamic modelling and prediction of English Football League matches for betting. Journal of the Royal Statistical Society: Series D (The Statistician) 51 (2002).
[5] Constantinou, Anthony C., Norman E. Fenton, and Martin Neil. pi-football: A Bayesian network model for forecasting Association Football match outcomes. Knowledge-Based Systems (2012).
[6]
[7] Oostveen, Job, Ton Kalker, and Jaap Haitsma. Feature extraction and a database strategy for video fingerprinting. Recent Advances in Visual Information Systems (2002).
Predicting Soccer Match Results in the English Premier League
Predicting Soccer Match Results in the English Premier League Ben Ulmer School of Computer Science Stanford University Email: [email protected] Matthew Fernandez School of Computer Science Stanford University
Football Match Winner Prediction
Football Match Winner Prediction Kushal Gevaria 1, Harshal Sanghavi 2, Saurabh Vaidya 3, Prof. Khushali Deulkar 4 Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai,
Beating the MLB Moneyline
Beating the MLB Moneyline Leland Chen [email protected] Andrew He [email protected] 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series
Soccer Analytics. Predicting the outcome of soccer matches. Research Paper Business Analytics. December 2012. Nivard van Wijk
Soccer Analytics Predicting the outcome of soccer matches Research Paper Business Analytics December 2012 Nivard van Wijk Supervisor: prof. dr. R.D. van der Mei VU University Amsterdam Faculty of Sciences
Predicting sports events from past results
Predicting sports events from past results Towards effective betting on football Douwe Buursma University of Twente P.O. Box 217, 7500AE Enschede The Netherlands [email protected] ABSTRACT
Predicting outcome of soccer matches using machine learning
Saint-Petersburg State University Mathematics and Mechanics Faculty Albina Yezus Predicting outcome of soccer matches using machine learning Term paper Scientific adviser: Alexander Igoshkin, Yandex Mobile
Creating a NL Texas Hold em Bot
Creating a NL Texas Hold em Bot Introduction Poker is an easy game to learn by very tough to master. One of the things that is hard to do is controlling emotions. Due to frustration, many have made the
Is it possible to beat the lottery system?
Is it possible to beat the lottery system? Michael Lydeamore The University of Adelaide Postgraduate Seminar, 2014 The story One day, while sitting at home (working hard)... The story Michael Lydeamore
Probability, statistics and football Franka Miriam Bru ckler Paris, 2015.
Probability, statistics and football Franka Miriam Bru ckler Paris, 2015 Please read this before starting! Although each activity can be performed by one person only, it is suggested that you work in groups
2. Scoring rules used in previous football forecast studies
Solving the problem of inadequate scoring rules for assessing probabilistic football forecast models Anthony Constantinou * and Norman E. Fenton Risk Assessment and Decision Analysis Research Group (RADAR),
Rating Systems for Fixed Odds Football Match Prediction
Football-Data 2003 1 Rating Systems for Fixed Odds Football Match Prediction What is a Rating System? A rating system provides a quantitative measure of the superiority of one football team over their
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
Does pay inequality within a team affect performance? Tomas Dvorak*
Does pay inequality within a team affect performance? Tomas Dvorak* The title should concisely express what the paper is about. It can also be used to capture the reader's attention. The Introduction should
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system
1. Systems of linear equations We are interested in the solutions to systems of linear equations. A linear equation is of the form 3x 5y + 2z + w = 3. The key thing is that we don t multiply the variables
WORKED EXAMPLES 1 TOTAL PROBABILITY AND BAYES THEOREM
WORKED EXAMPLES 1 TOTAL PROBABILITY AND BAYES THEOREM EXAMPLE 1. A biased coin (with probability of obtaining a Head equal to p > 0) is tossed repeatedly and independently until the first head is observed.
Sports betting odds: A source for empirical Bayes
Sports betting odds: A source for empirical Bayes Mateo Graciano Londoño Andrés Ramírez Hassan, EAFIT University Medellín, Colombia November 25, 2015 Abstract This paper proposes a new betting strategy
Numerical Algorithms for Predicting Sports Results
Numerical Algorithms for Predicting Sports Results by Jack David Blundell, 1 School of Computing, Faculty of Engineering ABSTRACT Numerical models can help predict the outcome of sporting events. The features
Maximizing Precision of Hit Predictions in Baseball
Maximizing Precision of Hit Predictions in Baseball Jason Clavelli [email protected] Joel Gottsegen [email protected] December 13, 2013 Introduction In recent years, there has been increasing interest
Statistical techniques for betting on football markets.
Statistical techniques for betting on football markets. Stuart Coles Padova, 11 June, 2015 Smartodds Formed in 2003. Originally a betting company, now a company that provides predictions and support tools
CS 688 Pattern Recognition Lecture 4. Linear Models for Classification
CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(
Data mining and football results prediction
Saint Petersburg State University Mathematics and Mechanics Faculty Department of Analytical Information Systems Malysh Konstantin Data mining and football results prediction Term paper Scientific Adviser:
Beating the NCAA Football Point Spread
Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over
Predicting Market Value of Soccer Players Using Linear Modeling Techniques
1 Predicting Market Value of Soccer Players Using Linear Modeling Techniques by Yuan He Advisor: David Aldous Index Introduction ----------------------------------------------------------------------------
1.2 Solving a System of Linear Equations
1.. SOLVING A SYSTEM OF LINEAR EQUATIONS 1. Solving a System of Linear Equations 1..1 Simple Systems - Basic De nitions As noticed above, the general form of a linear system of m equations in n variables
Relational Learning for Football-Related Predictions
Relational Learning for Football-Related Predictions Jan Van Haaren and Guy Van den Broeck [email protected], [email protected] Department of Computer Science Katholieke Universiteit
Additional information >>> HERE <<<
Additional information >>> HERE http://urlzz.org/winningtip/pdx/palo1436/ Tags: how to best price adidas football boots
Linear Programming. March 14, 2014
Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1
How To Bet On An Nfl Football Game With A Machine Learning Program
Beating the NFL Football Point Spread Kevin Gimpel [email protected] 1 Introduction Sports betting features a unique market structure that, while rather different from financial markets, still boasts
Predicting Flight Delays
Predicting Flight Delays Dieterich Lawson [email protected] William Castillo [email protected] Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing
Basics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu [email protected] Modern machine learning is rooted in statistics. You will find many familiar
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
RULES AND REGULATIONS OF FIXED ODDS BETTING GAMES
RULES AND REGULATIONS OF FIXED ODDS BETTING GAMES Project: Royalhighgate Public Company Ltd. 04.04.2014 Table of contents SECTION I: GENERAL RULES... 6 ARTICLE 1 GENERAL REGULATIONS...6 ARTICLE 2 THE HOLDING
Coaching Tips Tee Ball
Coaching Tips Tee Ball Tee Ball Overview The great thing about tee ball is that there are very few rules to learn and that the game is all about involving lots of kids. It s about making sure all players
Collaborative Filtering. Radek Pelánek
Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains
Sports betting: A source for empirical Bayes
Research practice 1: Progress report Mateo Graciano-Londoño Mathematical Engineering Student Andrés Ramírez-Hassan Department of Economics Tutor EAFIT University, Medelĺın Colombia October 2 nd, 2015 Previous
The 7 Premier League trends the bookies don t want YOU to know...
If you knew something happened 7, 8 or even 9 times out of 10 you d bet on it happening again wouldn t you? The 7 Premier League trends the bookies don t want YOU to know... Inside this special report
Classification Problems
Classification Read Chapter 4 in the text by Bishop, except omit Sections 4.1.6, 4.1.7, 4.2.4, 4.3.3, 4.3.5, 4.3.6, 4.4, and 4.5. Also, review sections 1.5.1, 1.5.2, 1.5.3, and 1.5.4. Classification Problems
Football Match Result Predictor Website
Electronics and Computer Science Faculty of Physical and Applied Sciences University of Southampton Matthew Tucker Tuesday 13th December 2011 Football Match Result Predictor Website Project supervisor:
Solutions to Math 51 First Exam January 29, 2015
Solutions to Math 5 First Exam January 29, 25. ( points) (a) Complete the following sentence: A set of vectors {v,..., v k } is defined to be linearly dependent if (2 points) there exist c,... c k R, not
Hedge-funds: How big is big?
Hedge-funds: How big is big? Marco Avellaneda Courant Institute, New York University Finance Concepts S.A.R.L. and Paul Besson ADI Gestion S.A., Paris A popular radio show by the American humorist Garrison
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
S-Parameters and Related Quantities Sam Wetterlin 10/20/09
S-Parameters and Related Quantities Sam Wetterlin 10/20/09 Basic Concept of S-Parameters S-Parameters are a type of network parameter, based on the concept of scattering. The more familiar network parameters
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
v w is orthogonal to both v and w. the three vectors v, w and v w form a right-handed set of vectors.
3. Cross product Definition 3.1. Let v and w be two vectors in R 3. The cross product of v and w, denoted v w, is the vector defined as follows: the length of v w is the area of the parallelogram with
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Pick Me a Winner An Examination of the Accuracy of the Point-Spread in Predicting the Winner of an NFL Game
Pick Me a Winner An Examination of the Accuracy of the Point-Spread in Predicting the Winner of an NFL Game Richard McGowan Boston College John Mahon University of Maine Abstract Every week in the fall,
Analyzing Information Efficiency in the Betting Market for Association Football League Winners
Analyzing Information Efficiency in the Betting Market for Association Football League Winners Lars Magnus Hvattum Department of Industrial Economics and Technology Management, Norwegian University of
Predicting the World Cup. Dr Christopher Watts Centre for Research in Social Simulation University of Surrey
Predicting the World Cup Dr Christopher Watts Centre for Research in Social Simulation University of Surrey Possible Techniques Tactics / Formation (4-4-2, 3-5-1 etc.) Space, movement and constraints Data
CHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
1 Solving LPs: The Simplex Algorithm of George Dantzig
Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.
Framing Business Problems as Data Mining Problems
Framing Business Problems as Data Mining Problems Asoka Diggs Data Scientist, Intel IT January 21, 2016 Legal Notices This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS
A simple analysis of the TV game WHO WANTS TO BE A MILLIONAIRE? R
A simple analysis of the TV game WHO WANTS TO BE A MILLIONAIRE? R Federico Perea Justo Puerto MaMaEuSch Management Mathematics for European Schools 94342 - CP - 1-2001 - DE - COMENIUS - C21 University
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
Data Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University [email protected] Chetan Naik Stony Brook University [email protected] ABSTRACT The majority
The Physics and Math of Ping-pong and How It Affects Game Play. By: Connor Thompson & Andrew Johnson
The Physics and Math of Ping-pong and How It Affects Game Play 1 The Physics and Math of Ping-pong and How It Affects Game Play By: Connor Thompson & Andrew Johnson The Practical Applications of Advanced
6.4 Normal Distribution
Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under
Objective. Materials. TI-73 Calculator
0. Objective To explore subtraction of integers using a number line. Activity 2 To develop strategies for subtracting integers. Materials TI-73 Calculator Integer Subtraction What s the Difference? Teacher
Evidence from the English Premier League and the German Bundesliga. September 21, 2013. NESSIS 2013, Harvard University
Evidence from the English Premier League and the German Bundesliga NESSIS 2013, Harvard University September 21, 2013 Abstract soccer plays a significant role in scoring, about 15% of all goals scored
The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia
The Impact of Big Data on Classic Machine Learning Algorithms Thomas Jensen, Senior Business Analyst @ Expedia Who am I? Senior Business Analyst @ Expedia Working within the competitive intelligence unit
COACHING GOALS FOR U7 TO U10 PLAYERS
COACHING GOALS FOR U7 TO U10 PLAYERS The players in these age groups are fundamental to the growth and success of Mandeville Soccer Club. More importantly, what these players are taught during these years
You will see a giant is emerging
You will see a giant is emerging Let s Talk Mainstream Sports More than 290 million Americans watch sports (90% of the population) Billion dollar company with less - 72% (18-29 years old), 64% (20-49 years
INTERNATIONAL FRAMEWORK FOR ASSURANCE ENGAGEMENTS CONTENTS
INTERNATIONAL FOR ASSURANCE ENGAGEMENTS (Effective for assurance reports issued on or after January 1, 2005) CONTENTS Paragraph Introduction... 1 6 Definition and Objective of an Assurance Engagement...
arxiv:1506.04135v1 [cs.ir] 12 Jun 2015
Reducing offline evaluation bias of collaborative filtering algorithms Arnaud de Myttenaere 1,2, Boris Golden 1, Bénédicte Le Grand 3 & Fabrice Rossi 2 arxiv:1506.04135v1 [cs.ir] 12 Jun 2015 1 - Viadeo
Monotonicity Hints. Abstract
Monotonicity Hints Joseph Sill Computation and Neural Systems program California Institute of Technology email: [email protected] Yaser S. Abu-Mostafa EE and CS Deptartments California Institute of Technology
A random point process model for the score in sport matches
IMA Journal of Management Mathematics (2009) 20, 121 131 doi:10.1093/imaman/dpn027 Advance Access publication on October 30, 2008 A random point process model for the score in sport matches PETR VOLF Institute
Linear Programming Problems
Linear Programming Problems Linear programming problems come up in many applications. In a linear programming problem, we have a function, called the objective function, which depends linearly on a number
How To Assess Soccer Players Without Skill Tests. Tom Turner, OYSAN Director of Coaching and Player Development
How To Assess Soccer Players Without Skill Tests. Tom Turner, OYSAN Director of Coaching and Player Development This article was originally created for presentation at the 1999 USYSA Workshop in Chicago.
Math 202-0 Quizzes Winter 2009
Quiz : Basic Probability Ten Scrabble tiles are placed in a bag. Four of the tiles have the letter printed on them, and there are two tiles each with the letters B, C and D on them. (a) Suppose one tile
Network Anomaly Detection: A Machine Learning Perspective. Dhruba Kumar Bhattacharyya, Jugal Kumar Kalita. CRC Press, Taylor & Francis Group
Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar Kalita CRC Press Taylor & Francis Group Boca Raton London New York CRC Press is an imprint of the Taylor
SUNSET PARK LITTLE LEAGUE'S GUIDE TO SCOREKEEPING. By Frank Elsasser (with materials compiled from various internet sites)
SUNSET PARK LITTLE LEAGUE'S GUIDE TO SCOREKEEPING By Frank Elsasser (with materials compiled from various internet sites) Scorekeeping, especially for a Little League baseball game, is both fun and simple.
STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables
Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random
Dynamic Bayesian forecasting models of football match outcomes with estimation of the evolution variance parameter
Loughborough University Institutional Repository Dynamic Bayesian forecasting models of football match outcomes with estimation of the evolution variance parameter This item was submitted to Loughborough
FORM LAB BLACK ADVANCED USER GUIDE
FORM LAB BLACK ADVANCED USER GUIDE This advanced user guide is written by head trader Will Wilde primarily for subscribers who have a Form Lab Black equivalent subscription, but all users will benefit
DEALING WITH THE DATA An important assumption underlying statistical quality control is that their interpretation is based on normal distribution of t
APPLICATION OF UNIVARIATE AND MULTIVARIATE PROCESS CONTROL PROCEDURES IN INDUSTRY Mali Abdollahian * H. Abachi + and S. Nahavandi ++ * Department of Statistics and Operations Research RMIT University,
Forex Success Formula
Forex Success Formula WWW.ForexSuccessFormula.COM Complimentary Report!! Copyright Protected www.forexsuccessformula.com - 1 - Limits of liability/disclaimer of Warranty The author and publishers of this
Video Poker in South Carolina: A Mathematical Study
Video Poker in South Carolina: A Mathematical Study by Joel V. Brawley and Todd D. Mateer Since its debut in South Carolina in 1986, video poker has become a game of great popularity as well as a game
Who's Winning? How knowing the score can keep your team moving in the right direction. Who's Winning?
Who's Winning? How knowing the score can keep your team moving in the right direction. Imagine sitting through an entire football game without knowing the score. The big day has arrived. It's Michigan
Conn Valuation Services Ltd.
CAPITALIZED EARNINGS VS. DISCOUNTED CASH FLOW: Which is the more accurate business valuation tool? By Richard R. Conn CMA, MBA, CPA, ABV, ERP Is the capitalized earnings 1 method or discounted cash flow
The Market for English Premier League (EPL) Odds
The Market for English Premier League (EPL) Odds Guanhao Feng, Nicholas G. Polson, Jianeng Xu arXiv:1604.03614v1 [stat.AP] 12 Apr 2016 Booth School of Business, University of Chicago April 14, 2016 Abstract
What really drives customer satisfaction during the insurance claims process?
Research report: What really drives customer satisfaction during the insurance claims process? TeleTech research quantifies the importance of acting in customers' best interests when customers file a property
1 o Semestre 2007/2008
Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction
Grade 5: Module 3A: Unit 2: Lesson 13 Developing an Opinion Based on the Textual Evidence:
Grade 5: Module 3A: Unit 2: Lesson 13 Developing an Opinion Based on the Textual Evidence: Jackie Robinson's Role in the Civil Rights Movement This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike
Recognizing Informed Option Trading
Recognizing Informed Option Trading Alex Bain, Prabal Tiwaree, Kari Okamoto 1 Abstract While equity (stock) markets are generally efficient in discounting public information into stock prices, we believe
Referee Analytics: An Analysis of Penalty Rates by National Hockey League Officials
Referee Analytics: An Analysis of Penalty Rates by National Hockey League Officials Michael Schuckers ab, Lauren Brozowski a a St. Lawrence University and b Statistical Sports Consulting, LLC Canton, NY,
Linear Programming. April 12, 2005
Linear Programming April 12, 2005 Parts of this were adapted from Chapter 29 of Introduction to Algorithms (Second Edition) by Cormen, Leiserson, Rivest and Stein. 1 What is linear programming? The first
Assignment #1: Spreadsheets and Basic Data Visualization Sample Solution
Assignment #1: Spreadsheets and Basic Data Visualization Sample Solution Part 1: Spreadsheet Data Analysis Problem 1. Football data: Find the average difference between game predictions and actual outcomes,
Applying Machine Learning to Stock Market Trading Bryce Taylor
Applying Machine Learning to Stock Market Trading Bryce Taylor Abstract: In an effort to emulate human investors who read publicly available materials in order to make decisions about their investments,
Applying Data Science to Sales Pipelines for Fun and Profit
Applying Data Science to Sales Pipelines for Fun and Profit Andy Twigg, CTO, C9 @lambdatwigg Abstract Machine learning is now routinely applied to many areas of industry. At C9, we apply machine learning
THE FORM LAB 2.5 GOAL STRATEGY
Welcome to the Form Lab 2.5 Goals Strategy Guide. THE FORM LAB 2.5 GOAL STRATEGY This guide outlines a simple way to quickly find profitable bets using the Game Notes available in all levels of Football
Probability. a number between 0 and 1 that indicates how likely it is that a specific event or set of events will occur.
Probability Probability Simple experiment Sample space Sample point, or elementary event Event, or event class Mutually exclusive outcomes Independent events a number between 0 and 1 that indicates how
Time Series Forecasting Techniques
3 Time Series Forecasting Techniques Back in the 1970s, we were working with a company in the major home appliance industry. In an interview, the person
Using ELO ratings for match result prediction in association football
International Journal of Forecasting 26 (2010) 460-470 www.elsevier.com/locate/ijforecast Using ELO ratings for match result prediction in association football Lars Magnus Hvattum a, Halvard Arntzen b
INVESTOR PRESENTATION
INVESTOR PRESENTATION IMPORTANT DISCLOSURE This presentation contains estimates and forward-looking statements made pursuant to the safe harbour provisions of the Private Securities Litigation Reform Act
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Unit 9 Describing Relationships in Scatter Plots and Line Graphs
Unit 9 Describing Relationships in Scatter Plots and Line Graphs Objectives: To construct and interpret a scatter plot or line graph for two quantitative variables To recognize linear relationships, non-linear
Programming Exercise 3: Multi-class Classification and Neural Networks
Programming Exercise 3: Multi-class Classification and Neural Networks Machine Learning November 4, 2011 Introduction In this exercise, you will implement one-vs-all logistic regression and neural networks
The Tennis Formula: How it can be used in Professional Tennis
The Tennis Formula: How it can be used in Professional Tennis A. James O Malley Harvard Medical School Email: [email protected] September 29, 2007 1 Overview Tennis scoring The tennis formula
Least-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
