Predicting Optimal Game Day Fantasy Football Teams
|
|
|
- Evangeline Hunter
- 9 years ago
- Views:
Transcription
1 Predicting Optimal Game Day Fantasy Football Teams Glenn Sugar and Travis Swenson Department of Aeronautics and Astronautics Stanford University 1 Introduction The popularity of fantasy sports has exploded in recent years. Websites such as Draft Kings and FanDuel offer weekly competitions where people pay to build a fantasy team under a salary constraint for the chance to win money. Each team must have one quarterback (QB), two running backs (RB), three wide receivers (WR), one tight end (TE), one kicker (K), and one team defense (D). While machine learning has been applied to fantasy sports prediction, very little has been published as these algorithms are usually proprietary [1]. The goal of this project is to use machine learning techniques to obtain positive expected returns when playing these games. In order to do this we break our project into two parts. The first uses machine learning algorithms to predict the number of points any given player will score in a given game, while the second uses convex optimization to assemble teams with maximum expected return and minimum risk. Dataset and Feature Selection In order to accurately predict the performance of every player, we naturally considered their individual histories. However, it is also important to consider the history of their opponents, as a player is more likely to have a good game against a poor defense than against a good defense. Injuries also play an important role in predicting a player s performance. If a star defensive player is injured, the opposing team s offense is likely to perform better. If a WR is injured, other WRs on the team might see more targets and their projected points might increase while the QB s projected points might decrease. We quantify the impact of an injured player by creating features that contain the total game averaged stats for all injured players. For example, if two WRs are injured and they have averaged 10 and 5 targets per game respectively, all of their teammates would have 15 for the injured targets feature. We used the urllib and BeautifulSoup Python libraries to access and parse the data on Pro-Football- Reference and Rotoworld for every game that active players have played since 010 and recorded their performance, the injured players, and the opposing team s average performance [,3]. This resulted in over 0 statistics for every active players performance in every game played since 010. The statistics were grouped into 3 basic feature types: career features, current features, and recent history. Career features consist of running averages of player performance metrics over their entire career (e.g. average rush yards per game). Current features give information about the game we are trying to predict (e.g. home vs away). Recent history consists of performance metrics of the player and their opposition in the n most recent games (e.g. rushing touchdowns in each of the most recent 5 games). This data could either be averaged or included as n individual features per statistic. We then process the data so that each feature has zero mean and unit variance. To select the optimal subset of features we used forward search selection to obtain the best feature vector for each position. We also experimented with n, the number of games to consider in a player s recent history (n = 5 gave the best results). We considered several machine learning algorithms, 1
2 see Section 3, but found the optimal features were relatively insensitive to the machine learning algorithm used. An example set of features selected using our forward selection algorithm is given in Table 1. Table 1: Features selected for RB using forward search. Colored boxes indicate the features used for each feature type. FD Points Targets Rush TDs Receptions Reception TDs Injured Rush Attempts Injured Pass Attempts Injured Targets Defensive Turnovers Game Location Defensive Points Allowed Defensive Rush Yards Allowed Opposition Offensive Points Recent History Current Game Career 3 Model Selection Initially we tried to train a hypothesis for each player individually, in the hopes of achieving the most accurate predictions possible. Unfortunately this resulted in high variance, and we were forced to abandon this approach. Instead we grouped the players by position, thus generating six hypotheses for the six positions of the FanDuel team. Training and testing sets were created by randomly dividing every game played by every player into 70% training examples and 30% testing examples, where a train/test example is a single game. We experimented with many machine learning algorithms using the Scikit-Learn Python package and found Ridge Regression, Bayesian Ridge Regression, and Elastic Net gave the lowest root mean squared error (RMSE) []. 3.1 Ridge Regression (RR) minimize Xw y + α w (1) w The objective here is the same as linear regression, except for the addition of an L penalty on the feature weights, w, weighted by α. We found α = 0.1 works well. 3. Bayesian Ridge Regression (BR) p(w λ) = N (w 0, λ 1 I p ) () Bayesian Ridge Regression is similar to ridge regression, however the weighting on the L norm is chosen based on the data supplied, thus it is more robust to changes in the feature vector. The normalization coefficient is chosen using the assumed prior on w, given in equation. 3.3 Elastic Net (EN) minimize w 1 m Xw y + αρ w 1 + α(1 ρ) w (3) EN is similar to Ridge Regression, but includes an additional L 1 penatly. The parameters α, and ρ set the relative weights of the penalties. We found α = 0.5 and ρ = 0.1 works well.
3 3. Bias vs Variance Figure 1: Test and training root mean squared error (RMSE) for the regularized linear model using holdout cross validation for RB. 30% of the data was used in the test set. In order to diagnose the role of bias and variance in our generalization error we ran our algorithms on training sets of increasing size and recorded the resulting training and testing error. As can be seen in Figure 1, both errors converge to roughly the same value when using a large number of training examples, indicating variance is playing a low role in our generalization error. Instead, bias error is limiting the performance of our algorithms. This indicates we must further improve our model in order to reduce error. 3.5 Final Model Selection After running forward selection and using hold out cross validation, we selected the best models and features for each position as given in Table. For each position, we used the regressor and the feature set (either all possible features or the features chosen by forward search) that gave the lowest RMSE for the test set used during cross validation. Table : The optimal model used for each position. Note that limited features correspond to the feature set selected in the forward search, while all corresponds to using all possible features. QB WR RB TE K D Regressor BR BR EN BR EN BR Features All Limited All Limited All All Results.1 Error Comparison Figure : RMSE comparison between our projections and Yahoo s. We evaluated the performance of our algorithms by comparing the RMSE of our predictions with the publicly available Yahoo projections. Yahoo is a multi-billion dollar company that is active in the fantasy football market, and therefore provides a good baseline for the state of the art in fantasy 3
4 sports prediction. As can be seen in Figure, our RMSE values are only slightly worse for QBs, WRs, and RBs (.7%, 3.5 %, and.3% respectively), while we beat Yahoo for TE and K (.0 % and 0.7% respectively). Note that team defense is omitted due to difficulties in obtaining Yahoo defense projections. Because our RMSE values are so close to Yahoo s (and in some cases even better) we consider our point predictor a success.. Team Optimizer After computing the predicted points for every player, we can now construct the optimal team. FanDuel gives users a $60,0000 salary to use when building a team of 1 QB, RBs, 3 WRs, 1 TE, 1 K, and a team s defense. We can formulate the team selector as a binary linear program to pick the team that will produce the maximum number of predicted points: maximize x f T x subject to x i {0, 1}, i = 1,..., m. Ax = b G T x h () where f i is the predicted points for the i th player, x i indicates whether player i is chosen, b contains the position requirements, A ji indicates if player i plays position j, h is the maximum salary of the team, and G i is the i th player s salary given by FanDuel. Many of the FanDuel tournaments are structured such that the top 50% of entrants win a fixed prize, and thus there is no incentive to do better than the winning threshold. In this case it makes more sense to minimize risk, subject to the constraint that the winning threshold is met. Fortunately, FanDuel published the average number of points required to win these leagues for the 01 season (111.1 points) [5]. We use this threshold to solve the Markowitz Portfolio Optimization Problem to construct a team with minimal variance [6]. The problem now becomes: minimize x v T x subject to x i {0, 1}, i = 1,..., m. Ax = b G T x h f T x P (5) where v i is the variance of player i and P is the minimum number of total points for the team. Figure 3 shows the results from using both optimization methods on weeks 3-9 of the 015 season. The team picked using Equation has a 71.% winning percentage, while the minimum variance team picked using P = 10 had a 57.1% winning percentage. Even though the sample size is small, our point projections coupled with the team optimizer shows promise of providing positive expected returns when playing FanDuel games. Fanduel Points Optimized Team Performance Max Points Min Variance Week Number Figure 3: The performance of teams picked using both optimization methods (maximum points and minimum variance). The minimum variance team was chosen to have at least 10 total predicted points. The horizontal red line shows the average required points needed to win a FanDuel league.
5 .3 Prediction Improvement: Clustering Players Even with the encouraging results so far, it could be possible to improve our predictions by recognizing that there are subsets of players within a position. For example, some RBs are often utilized as WRs, while others will rarely be targets for passes. By identifying and training on subsets of a position, we should be able to lower our overall error and potentially beat Yahoo s prediction across all positions. While we have not performed all of this analysis, we have investigated potential subsets in the WR and RB positions. Note that this is a very preliminary analysis and there is much work to be done. Because some players that are classified as WRs by FanDuel are often utilized as RBs (and vice versa), we looked for clusters within the WR/RB set of players. First, we normalized the data to have zero mean and unit variance for seven features per player. We then used principal component analysis (PCA) to extract the first principal eigenvectors and projected the 7 dimensional RB and WR scaled data onto the reduced dimensional PCA space. The left plot of Figure.3 shows the player classification used for our current predictions, and the right plot shows a new classification using K-Means with K=5. PCA Dimension WR and RB Before Clustering WRs RBs PCA Dimension WR and RB K-Means Clustring (K = 5) Average RBs Star WRs Star RBs Backups Average WRs PCA Dimension PCA Dimension 1 Figure : Initial WR and RB classification (left) and classification after K-Means clustering (right). The dimensionality was reduced to via PCA for plotting purposes. These plots have many interesting features. The distribution of WRs and RBs suggests that the x- axis roughly corresponds to a player s receiving ability, while the y-axis corresponds to their rushing ability. The player with the largest x coordinate is Odell Beckham Jr., and player with the largest y coordinate is Adrian Peterson. These are arguably the best players at their respective positions. The points around (-,0) correspond to both WRs and RBs that are not very active and have few receptions and rushing attempts. These are mostly backup players, and their production is likely to be affected by injuries much differently than a normal starter. This is because if a starter gets injured, a backup player has a chance to start and have a drastic increase in points, whereas a starter would not notice much of a difference if his teammate misses a game due to injury. Because of these differences, we can generate more effective subsets using clustering algorithms. Furthermore, we can do this without the risk of overfitting the models. Figure 1 shows that we can reduce the training sets by a factor of 3 without significantly increasing variance. 5 Conclusion and Future Work We have demonstrated a machine learning approach to predict fantasy football player performance that is on par with state of the art methods. With our point predictions, we used a binary linear program solver to pick the optimum team given salary and position constraints. Using these methods, we obtained positive returns playing FanDuel games in weeks 3-9 of the 015 season. We also demonstrated a possible path to improve our point predictor by building models that use player performance rather than their prescribed position labels. We used K-Means clustering to obtain subsets of RBs and WRs that could produce better models than both Yahoo s and our current predictor. More work needs to be done to explore the performance of new models that are trained on the smaller player subsets, as well as how these subsets are generated via different clustering methods. 5
6 References [1] Dunnington, Nathan. Fantasy Football Projection Analysis. Honors Thesis. University of Oregon Web. < Thesis 015.pdf> [] Pro-Football-Reference. Sports Reference LLC. < [3] Rotoworld. NBC Sports Digital. < [] Pedregosa et al. Scikit-learn: Machine Learning in Python. JMLR 1, pp , 011. [5] Gonos, David. FanDuel NFL Benchmarks: Points to Target in Each Contest Type. 7 Aug Web. < [6] Markowitz, Harry. Portfolio Selection. The Journal of Finance 7.1 (195):
Beating the NCAA Football Point Spread
Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over
A Predictive Model for NFL Rookie Quarterback Fantasy Football Points
A Predictive Model for NFL Rookie Quarterback Fantasy Football Points Steve Bronder and Alex Polinsky Duquesne University Economics Department Abstract This analysis designs a model that predicts NFL rookie
How To Bet On An Nfl Football Game With A Machine Learning Program
Beating the NFL Football Point Spread Kevin Gimpel [email protected] 1 Introduction Sports betting features a unique market structure that, while rather different from financial markets, still boasts
Predicting outcome of soccer matches using machine learning
Saint-Petersburg State University Mathematics and Mechanics Faculty Albina Yezus Predicting outcome of soccer matches using machine learning Term paper Scientific adviser: Alexander Igoshkin, Yandex Mobile
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
DOES SPORTSBOOK.COM SET POINTSPREADS TO MAXIMIZE PROFITS? TESTS OF THE LEVITT MODEL OF SPORTSBOOK BEHAVIOR
The Journal of Prediction Markets (2007) 1 3, 209 218 DOES SPORTSBOOK.COM SET POINTSPREADS TO MAXIMIZE PROFITS? TESTS OF THE LEVITT MODEL OF SPORTSBOOK BEHAVIOR Rodney J. Paul * and Andrew P. Weinbach
Fantasy Football University
Chapter 1 Fantasy Football University What is Fantasy Football? Fantasy Football puts you in charge and gives you the opportunity to become the coach, owner, and general manager of your own personal football
Beating the MLB Moneyline
Beating the MLB Moneyline Leland Chen [email protected] Andrew He [email protected] 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
Server Load Prediction
Server Load Prediction Suthee Chaidaroon ([email protected]) Joon Yeong Kim ([email protected]) Jonghan Seo ([email protected]) Abstract Estimating server load average is one of the methods that
Maximizing Precision of Hit Predictions in Baseball
Maximizing Precision of Hit Predictions in Baseball Jason Clavelli [email protected] Joel Gottsegen [email protected] December 13, 2013 Introduction In recent years, there has been increasing interest
Team Success and Personnel Allocation under the National Football League Salary Cap John Haugen
Team Success and Personnel Allocation under the National Football League Salary Cap 56 Introduction T especially interesting market in which to study labor economics. The salary cap rule of the NFL that
Numerical Algorithms for Predicting Sports Results
Numerical Algorithms for Predicting Sports Results by Jack David Blundell, 1 School of Computing, Faculty of Engineering ABSTRACT Numerical models can help predict the outcome of sporting events. The features
Predicting Market Value of Soccer Players Using Linear Modeling Techniques
1 Predicting Market Value of Soccer Players Using Linear Modeling Techniques by Yuan He Advisor: David Aldous Index Introduction ----------------------------------------------------------------------------
Football Learning Guide for Parents and Educators. Overview
Overview Did you know that when Victor Cruz catches a game winning touchdown, the prolate spheroid he s holding helped the quarterback to throw a perfect spiral? Wait, what? Well, the shape of a football
Referee Analytics: An Analysis of Penalty Rates by National Hockey League Officials
Referee Analytics: An Analysis of Penalty Rates by National Hockey League Officials Michael Schuckers ab, Lauren Brozowski a a St. Lawrence University and b Statistical Sports Consulting, LLC Canton, NY,
Predicting Flight Delays
Predicting Flight Delays Dieterich Lawson [email protected] William Castillo [email protected] Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing
Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j
Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet
Darrelle Revis Workout
Darrelle Revis Workout Darrelle Revis workout training routine provides him with incredible speed, agility and a muscle body that sports a ripped six pack. Revis has a tremendous work ethic and is a shutdown
Music Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
NAIRABET AMERICAN FOOTBALL
NAIRABET AMERICAN FOOTBALL American football which is also called the NHL is played on a 100-yard long field, with goal lines on each end. The field is marked in increments of one yard. The main objective
Predicting the Betting Line in NBA Games
Predicting the Betting Line in NBA Games Bryan Cheng Computer Science Kevin Dade Electrical Engineering Michael Lipman Symbolic Systems Cody Mills Chemical Engineering Abstract This report seeks to expand
Cross-Validation. Synonyms Rotation estimation
Comp. by: BVijayalakshmiGalleys0000875816 Date:6/11/08 Time:19:52:53 Stage:First Proof C PAYAM REFAEILZADEH, LEI TANG, HUAN LIU Arizona State University Synonyms Rotation estimation Definition is a statistical
Car Insurance. Prvák, Tomi, Havri
Car Insurance Prvák, Tomi, Havri Sumo report - expectations Sumo report - reality Bc. Jan Tomášek Deeper look into data set Column approach Reminder What the hell is this competition about??? Attributes
Pick Me a Winner An Examination of the Accuracy of the Point-Spread in Predicting the Winner of an NFL Game
Pick Me a Winner An Examination of the Accuracy of the Point-Spread in Predicting the Winner of an NFL Game Richard McGowan Boston College John Mahon University of Maine Abstract Every week in the fall,
International Statistical Institute, 56th Session, 2007: Phil Everson
Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: [email protected] 1. Introduction
Click Here ->> Football Trading Portfolio: Winning Made Easy Football Trading Portfolio
Football bets tips forum, top 3 bets football, nfl football picks week 1 against the spread. Click Here ->> Football Trading Portfolio: Winning Made Easy Football Trading Portfolio Download From Official
Creating a NL Texas Hold em Bot
Creating a NL Texas Hold em Bot Introduction Poker is an easy game to learn by very tough to master. One of the things that is hard to do is controlling emotions. Due to frustration, many have made the
Machine Learning Big Data using Map Reduce
Machine Learning Big Data using Map Reduce By Michael Bowles, PhD Where Does Big Data Come From? -Web data (web logs, click histories) -e-commerce applications (purchase histories) -Retail purchase histories
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
Does NFL Spread Betting Obey the E cient Market Hypothesis?
Does NFL Spread Betting Obey the E cient Market Hypothesis? Johannes Harkins May 2013 Abstract In this paper I examine the possibility that NFL spread betting does not obey the strong form of the E cient
An Exploration into the Relationship of MLB Player Salary and Performance
1 Nicholas Dorsey Applied Econometrics 4/30/2015 An Exploration into the Relationship of MLB Player Salary and Performance Statement of the Problem: When considering professionals sports leagues, the common
Principal components analysis
CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some k-dimension subspace, where k
We employed reinforcement learning, with a goal of maximizing the expected value. Our bot learns to play better by repeated training against itself.
Date: 12/14/07 Project Members: Elizabeth Lingg Alec Go Bharadwaj Srinivasan Title: Machine Learning Applied to Texas Hold 'Em Poker Introduction Part I For the first part of our project, we created a
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
Football Trading Portfolio: Winning Made Easy Football Trading Portfolio - Review - > Click Here GET IT NOW
Ncaa football predictions week 3 2013, football betting comparison sites. Football Trading Portfolio: Winning Made Easy Football Trading Portfolio - Review - > Click Here GET IT NOW Enter Here ->>> Football
A Statistical Analysis of Popular Lottery Winning Strategies
CS-BIGS 4(1): 66-72 2010 CS-BIGS http://www.bentley.edu/csbigs/vol4-1/chen.pdf A Statistical Analysis of Popular Lottery Winning Strategies Albert C. Chen Torrey Pines High School, USA Y. Helio Yang San
Mathematics on the Soccer Field
Mathematics on the Soccer Field Katie Purdy Abstract: This paper takes the everyday activity of soccer and uncovers the mathematics that can be used to help optimize goal scoring. The four situations that
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
Introduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
Collaborative Filtering. Radek Pelánek
Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains
The Need for Training in Big Data: Experiences and Case Studies
The Need for Training in Big Data: Experiences and Case Studies Guy Lebanon Amazon Background and Disclaimer All opinions are mine; other perspectives are legitimate. Based on my experience as a professor
Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important
Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important Floyd Ray Martin, FSA, MAAA Thomas A. McInteer, FSA, MAAA Jonathan P. Polon, FSA Dental Insurance Fraud Detection
How I won the Chess Ratings: Elo vs the rest of the world Competition
How I won the Chess Ratings: Elo vs the rest of the world Competition Yannis Sismanis November 2010 Abstract This article discusses in detail the rating system that won the kaggle competition Chess Ratings:
Why do statisticians "hate" us?
Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data
JR. NFL YOUTH FLAG FOOTBALL LEAGUE OFFICIAL RULE BOOK
JR. NFL YOUTH FLAG FOOTBALL LEAGUE OFFICIAL RULE BOOK Boys & Girls Clubs of Conejo & Las Virgenes 5137 Clareton Dr. #210 Agoura Hills, CA 91301 www.bgcconejo.org For further information or comments please
Applying Machine Learning to Stock Market Trading Bryce Taylor
Applying Machine Learning to Stock Market Trading Bryce Taylor Abstract: In an effort to emulate human investors who read publicly available materials in order to make decisions about their investments,
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
The Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
The offensive team takes possession of the ball at its 5 yard line and has four
RULES Scoring Touchdown: 6 Points PAT: 1 point from 3 yard line; 2 points from 10 yard line Team scoring a TD must declare 1 or 2 point decision immediately. The first declaration to the official will
Football Match Winner Prediction
Football Match Winner Prediction Kushal Gevaria 1, Harshal Sanghavi 2, Saurabh Vaidya 3, Prof. Khushali Deulkar 4 Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai,
Big Data Analytics CSCI 4030
High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising
Football Trading Portfolio: Winning Made Easy Football Trading Portfolio - Real User Experience ->> Enter Here > Visit Now <
Betting board for football, free football betting tips for wednesday, nfl football picks week 1 point spread, betting tips for today's football games. Football Trading Portfolio: Winning Made Easy Football
Delaware Sports League Flag Football Rules
1 Field Delaware Sports League Flag Football Rules A. The field will be 80 yards long from goal line to goal line with the end zones being 10 yards each. B. The field will be divided into four (4) zones
Relational Learning for Football-Related Predictions
Relational Learning for Football-Related Predictions Jan Van Haaren and Guy Van den Broeck [email protected], [email protected] Department of Computer Science Katholieke Universiteit
Common factor analysis
Common factor analysis This is what people generally mean when they say "factor analysis" This family of techniques uses an estimate of common variance among the original variables to generate the factor
Question 2 Naïve Bayes (16 points)
Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the
On the Interaction and Competition among Internet Service Providers
On the Interaction and Competition among Internet Service Providers Sam C.M. Lee John C.S. Lui + Abstract The current Internet architecture comprises of different privately owned Internet service providers
Golf League Formats and How to Run Them
Golf League Formats and How to Run Them Contents Running a Golf League... 3 Starting your Golf League... 3 The Players... 3 League Format... 3 Points... 4 Match Play... 4 Points Per hole... 4 Points per
Review Jeopardy. Blue vs. Orange. Review Jeopardy
Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?
EPOKA LEAGUE International Football Championship for high schools
1 2 EPOKA LEAGUE International Football Championship for high schools About Epoka League: International Football Championship for high schools is a mini football tournament between high schools in Albania,
MACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
How To Predict The Outcome Of The Ncaa Basketball Tournament
A Machine Learning Approach to March Madness Jared Forsyth, Andrew Wilde CS 478, Winter 2014 Department of Computer Science Brigham Young University Abstract The aim of this experiment was to learn which
Support Vector Machines Explained
March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
Seed Distributions for the NCAA Men s Basketball Tournament: Why it May Not Matter Who Plays Whom*
Seed Distributions for the NCAA Men s Basketball Tournament: Why it May Not Matter Who Plays Whom* Sheldon H. Jacobson Department of Computer Science University of Illinois at Urbana-Champaign [email protected]
JetBlue Airways Stock Price Analysis and Prediction
JetBlue Airways Stock Price Analysis and Prediction Team Member: Lulu Liu, Jiaojiao Liu DSO530 Final Project JETBLUE AIRWAYS STOCK PRICE ANALYSIS AND PREDICTION 1 Motivation Started in February 2000, JetBlue
The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy
BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.
THE DETERMINANTS OF SCORING IN NFL GAMES AND BEATING THE SPREAD
THE DETERMINANTS OF SCORING IN NFL GAMES AND BEATING THE SPREAD C. Barry Pfitzner, Department of Economics/Business, Randolph-Macon College, Ashland, VA 23005, [email protected], 804-752-7307 Steven D.
Socci Sport Alternative Games
- 1 - Socci Sport Alternative Games Table of Contents 1 Roller Socci 2 2 Pass and Shoot Socci 2 3 Punt & Catch Socci 2 4 Long Pass Socci 3 5 Pass, Dribble, and Shoot Socci 3 6 Scooter Socci Basketball
Azure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
The Determinants of Scoring in NFL Games and Beating the Over/Under Line. C. Barry Pfitzner*, Steven D. Lang*, and Tracy D.
FALL 2009 The Determinants of Scoring in NFL Games and Beating the Over/Under Line C. Barry Pfitzner*, Steven D. Lang*, and Tracy D. Rishel** Abstract In this paper we attempt to predict the total points
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
Cross Validation. Dr. Thomas Jensen Expedia.com
Cross Validation Dr. Thomas Jensen Expedia.com About Me PhD from ETH Used to be a statistician at Link, now Senior Business Analyst at Expedia Manage a database with 720,000 Hotels that are not on contract
Hedge Effectiveness Testing
Hedge Effectiveness Testing Using Regression Analysis Ira G. Kawaller, Ph.D. Kawaller & Company, LLC Reva B. Steinberg BDO Seidman LLP When companies use derivative instruments to hedge economic exposures,
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal
Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether
Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection
Chapter 311 Introduction Often, theory and experience give only general direction as to which of a pool of candidate variables (including transformed variables) should be included in the regression model.
Dimensionality Reduction: Principal Components Analysis
Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research [email protected]
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research [email protected] Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
Final Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
Prices, Point Spreads and Profits: Evidence from the National Football League
Prices, Point Spreads and Profits: Evidence from the National Football League Brad R. Humphreys University of Alberta Department of Economics This Draft: February 2010 Abstract Previous research on point
Partial Least Squares (PLS) Regression.
Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component
Logistic Regression for Spam Filtering
Logistic Regression for Spam Filtering Nikhila Arkalgud February 14, 28 Abstract The goal of the spam filtering problem is to identify an email as a spam or not spam. One of the classic techniques used
Overview of Factor Analysis
Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,
Puck Pricing : Dynamic Hockey Ticket Price Optimization
ABSTRACT Paper 2863-2015 Puck Pricing : Dynamic Hockey Ticket Price Optimization Sabah Sadiq, Jing Zhao, Christopher Jones Deloitte Consulting LLP Dynamic pricing is a real-time strategy where corporations
W6.B.1. FAQs CS535 BIG DATA W6.B.3. 4. If the distance of the point is additionally less than the tight distance T 2, remove it from the original set
http://wwwcscolostateedu/~cs535 W6B W6B2 CS535 BIG DAA FAQs Please prepare for the last minute rush Store your output files safely Partial score will be given for the output from less than 50GB input Computer
LCs for Binary Classification
Linear Classifiers A linear classifier is a classifier such that classification is performed by a dot product beteen the to vectors representing the document and the category, respectively. Therefore it
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering
IEICE Transactions on Information and Systems, vol.e96-d, no.3, pp.742-745, 2013. 1 Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering Ildefons
