ISyE 2028 Basic Statistical Methods - Fall 2015 Analysis of NFL Team Performance Against the Spread Betting Line



Similar documents
2015 NFL Annual Selection Meeting R P O CLUB PLAYER POS COLLEGE ROUND 2

2011 AMERICAN CONFERENCE. East Division W L T Pct. Pts. OP New England# Washington North Division

Using Baseball Data as a Gentle Introduction to Teaching Linear Regression

P l at i n u m

The Performance of Betting Lines for Predicting the Outcome of NFL Games

SACKS-CHAMPIONSHIPS-GAMES WON ALL-TIME RECORDS TEAM RECORDS

Sam Schaefer April 2011

Sports Betting Systems

Arizona Cardinals Yellow

Organizing Topic: Data Analysis

43 RB DARREN SPROLES VITALS EAGLES SINGLE-SEASON PUNT RETURN YARDS

Volume 30, Issue 4. Market Efficiency and the NHL totals betting market: Is there an under bias?

Social Casino Postmortems. Derrick Morton CEO, FlowPlay

NFL Stadium Funding Information. Stadiums Opened Since 1997

2014 Super Bowl betting odds have Broncos tied with Patriots

nfl picks week 15 espn

Quality, Statistics, and Probability in Sports

The Effects of Atmospheric Conditions on Pitchers

A Contrarian Approach to the Sports Betting Marketplace

Beating the Book: Are There Patterns in NFL Betting Lines?

Fair Bets and Profitability in College Football Gambling

Does NFL Spread Betting Obey the E cient Market Hypothesis?

During the course of our research on NBA basketball, we found out a couple of interesting principles.

The NCAA Basketball Betting Market: Tests of the Balanced Book and Levitt Hypotheses

Herd Behavior and Underdogs in the NFL

Finding Inefficiency in Sports Betting Markets; a Look through NFL Confidence Pick Em Biases

Forecasting Accuracy and Line Changes in the NFL and College Football Betting Markets

THE EFFICIENT MARKET HYPOTHESIS AND GAMBLING ON NATIONAL FOOTBALL LEAGUE GAMES YOON TAE SUNG THESIS

THE ULTIMATE FOOTBALL BETTING GUIDE

Sport Hedge Millionaire s Guide to a growing portfolio. Sports Hedge

Prices, Point Spreads and Profits: Evidence from the National Football League

LIVE BETTING ULTRA RULES

THE DETERMINANTS OF SCORING IN NFL GAMES AND BEATING THE SPREAD

Bookmaker Bookmaker Online Bookmaker Offline

POINT SPREAD SHADING AND BEHAVIORAL BIASES IN NBA BETTING MARKETS. by Brad R. Humphreys *

The Secret To Making Consistent. Profits Betting On Sports

Solution Let us regress percentage of games versus total payroll.

Testing Efficiency in the Major League of Baseball Sports Betting Market.

Sin City. In poker, the facility to buy additional chips in tournaments. Total payout liability of a casino during any one game.

-- Special Ebook -- Bookie Buster: Secret Systems Used by Pro Sports Gamblers Finally REVEALED!

Pre-Match Betting Plus Rules & Policies

A Test for Inherent Characteristic Bias in Betting Markets ABSTRACT. Keywords: Betting, Market, NFL, Efficiency, Bias, Home, Underdog

EXAMINING NCAA/NFL MARKET EFFICIENCY

Additional details >>> HERE <<<

I. Oddset 2-Outcome means the version of Sports Action Oddset gameplay offering the choice between a Home Win or a Visitor Win.

Social Media and Professional Sports: An analysis of the National Basketball Association and National Football League teams strategies

Lodging, Rental Car and Meal Taxes on Travelers in the Top 50 U.S. Cities

International Statistical Institute, 56th Session, 2007: Phil Everson

MAJOR LEAGUE BASEBALL 2011 ATTENDANCE ANALYSIS. Compiled and Written by David P. Kronheim.

Charleston to Houston (Hobby) CHS - HOU Chicago Midway to Austin MDW - AUS Chicago Midway to Boston Logan MDW - BOS Chicago Midway

Refining the Point(s)-After- Touchdown Decision

Beating the NCAA Football Point Spread

Columbus, OH is the Number One Sports Town in the U.S. Boston, Buffalo and Pittsburgh Also Leading Sports Cities

The Determinants of Scoring in NFL Games and Beating the Over/Under Line. C. Barry Pfitzner*, Steven D. Lang*, and Tracy D.

Live Betting Extra Rules

Predicting sports events from past results

UP TO 40 % TEAMS AND PRODUCTS ALL PRODUCTS OFFICIALLY LICENSED

Home Bias in the NFL Pointspread Market. Matt Cundith Department of Economics California State University, Sacramento

An Analysis of Sportsbook Behavior and How to Profit. Chris Ludwiczak. Advisor: Dr. John Clark

Predicting the NFL Using Twitter. Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah Smith

Behavioural Biases in the European Football Betting Market

Betting Terms Explained

Seniors Housing Market Overview

Journal of Quantitative Analysis in Sports

SPORTS WAGERING RULES EDUCATION SESSION. SMC Compliance Office

Comparing & Contrasting. - mathematically. Ways of comparing mathematical and scientific quantities...

Basketball Market Efficiency and the Big Dog Bias. Ladd Kochman* and Randy Goodwin*

A Two-Stage Bayesian Model for Predicting Winners in Major League Baseball

SECRET BETTING CLUB US SPORTS TIPSTER REPORTS SAMPLE ISSUE

Pre-Match Betting Markets

VALIDATING A DIVISION I-A COLLEGE FOOTBALL SEASON SIMULATION SYSTEM. Rick L. Wilson

THE ULTIMATE BASKETBALL BETTING GUIDE

the sporting index group

A Professional Handicapper Shares His Secrets On Betting Football, Basketball and Baseball For A Living

Numerical Algorithms for Predicting Sports Results

Point Shaving in NCAA Basketball: Corrupt Behavior or Statistical Artifact?

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025.

UZH Business Working Paper Series (ISSN )

MARKET EFFECIENCY: IS THE NFL BETTING MARKET EFFICIENT? By: Alexander Kuper. Thesis Advisor: Professor Roger Craine

bet365 spread betting best odds

Andrew P. Weinbach, Ph.D. Assistant Professor Accounting, Finance, and Economics College of Business

How To Bet On An Nfl Football Game With A Machine Learning Program

The Odds X-Factor System

HOW TO BET ON FOOTBALL. Gambling can be addictive. Please play responsibly.

NATIONAL FOOTBALL LEAGUE

Tau Beta Pi Convention Bid Basic Information

Additional information >>> HERE <<<

Bookmaking. A practical study on how illegal bookmakers operate and their weak points. By: L. Burke Files lbfiles@feeinc.com.

DOES SPORTSBOOK.COM SET POINTSPREADS TO MAXIMIZE PROFITS? TESTS OF THE LEVITT MODEL OF SPORTSBOOK BEHAVIOR

The Fibonacci Strategy Revisited: Can You Really Make Money by Betting on Soccer Draws?

NATIONAL BASKETBALL ASSOCIATION National TV schedule As of Aug. 13, 2014

Forecasting in Financial and Sports Gambling -Markets

NCAA Student-Athlete Gambling Behaviors and Attitudes: Supplementary Tables May 2013

Using Past Performance to Predict NFL Outcomes: A Chartist Approach

Assignment #1: Spreadsheets and Basic Data Visualization Sample Solution

Transcription:

ISyE 2028 Basic Statistical Methods - Fall 2015 Analysis of NFL Team Performance Against the Spread Betting Line Introduction Author: Jonathan Pang Sports betting, although controversial, is an extremely popular form of gambling in the United States. Although many different types of bets can be placed on popular websites or in casinos, the most popular type of bet that is placed on the results of major American sports is the spread bet. A spread bet is a wager that the difference between the scores of two teams will be less than or greater than a certain value, which is specified by the bookmaker. For example, if the spread betting line for the Detroit Lions at Kansas City Chiefs game in Week 8 is +3.5 for the Detroit Lions, a spread bet placed on the Detroit Lion would be won if the score for the Detroit Lions (D) + the spread betting line (+3.5) is greater than the score for the Kansas City Chiefs (K). D + 3.5 > K If the score for the Detroit Lions (D) + the spread betting line (+3.5) is less than the score for the Kansas City Chiefs, the wager will be lost. D + 3.5 < K In other words, the bettors are wagering on whether or not the Kansas City Chiefs will outscore the Detroit Lions by +3.5 points. It is also noted that most sports scores are only represented by integers. Thus, oftentimes the bookmakers will set the betting line at half a point intervals, making it impossible for neither side to win the bet. Otherwise, if the Kansas City Chiefs outscored the Detroit Lions by exactly the spread betting line, a push would occur and neither side would win (or lose). In order to maximize the odds of winning a spread bet, historical data can be utilized to determine patterns in the final spread and the actual results in games. In this study, trends will be analyzed to determine whether or not any teams in the NFL have historically consistently beat the spread. This will be answered by hypothesis testing, comparing the spread betting lines and score differentials for each game a team plays. If the difference is significant, it is considered a safer bet to wager money on the team. However, it is expected that there is no significant difference between historical spread betting lines and score differentials for each game a team played. In addition, it is expected that historically no team has outperformed or underperformed the spread betting line on a consistent basis. An aggregate of historical and current sports betting information was found at the Odds Sharks website, on which the final point spreads and over under betting lines are recorded for professional sports such as the NFL (National Football League), MLB (Major League Baseball), NBA (National Basketball Association), and NHL (National Hockey League) as well as college sports such as NCAAF (National Collegiate Athletic Association Football) and NCAAB (National Collegiate Athletic Association Basketball). A web scraper was utilized in order to extract past sports betting lines as well as actual scores of games. Data Collection Data was collected from the Odds Sharks website using a CasperJS web scraper and processed by NodeJS. Relevant information was then inserted into a CSV document. Source code is attached in Appendix B. Data was collected for all regular season games played in the NFL since the 1995-1996 season. This resulted in approximately 20 years of data (including the present 2015-2016 season which has not yet been completed). 1

DATA PROCESSING This data was then processed using R to determine the average amount by which each team scored above the betting line as well as how often each team was able to beat the betting line. Data Processing The data collected from the Odds Shark website was first analyzed to determine the amount on average that a team overperformed the spread betting line in the past 20 seasons. This information is summarized in Appendix B in Table 1 and shown in graphical form in Figure 1. As can be seen in Table 1, it is evident that the team that most outperformed the spread betting line on average was the New England Patriots, which scored on average, 2.798761610 points above the spread. Other teams such as the Green Bay Packers and the Pittsburgh Steelers also performed well against the spread betting line. On the other hand, it is evident that the team that on average most underperformed the spread betting line was the Arizona Cardinals, which scored on average, 1.743710692, points below the spread. Likewise, the Cleveland Browns and St. Louis Rams also underperformed against the spread betting line. A two-tailed t-test with an alpha value of 0.05 was then constructed to determine if any teams in the past 20 years significantly outperformed or underperformed the spread betting lines. The t-tests were performed using R. It was determined that the following teams have historically performed significantly different than the spread betting lines: the Arizona Cardinals, the Cleveland Browns, the Green Bay Packers, and the New England Patriots. Both the Arizona Cardinals and the Cleveland Browns historically underperformed the spread betting lines and the Green Bay Packers and the New England Patriots historically outperformed the spread betting lines. The results of the t-tests are printed below. ARI One Sample t t e s t data : team games$pointsats t = 2.3237, df = 317, p value = 0.02077 a l t e r n a t i v e h y p o t h e s i s : true mean i s not equal to 0 95 percent c o n f i d e n c e i n t e r v a l : 3.2200848 0.2673366 sample e s t i m a t e s : mean o f x 1.743711 CLE One Sample t t e s t data : team games$pointsats t = 2.154, df = 277, p value = 0.0321 a l t e r n a t i v e h y p o t h e s i s : true mean i s not equal to 0 95 percent c o n f i d e n c e i n t e r v a l : 2.9534503 0.1328806 sample e s t i m a t e s : mean o f x 1.543165 GB One Sample t t e s t data : team games$pointsats 2

CONCLUSION t = 2. 4 4 5 6, df = 330, p value = 0.01498 a l t e r n a t i v e h y p o t h e s i s : true mean i s not equal to 0 95 percent c o n f i d e n c e i n t e r v a l : 0.3528634 3.2543874 sample e s t i m a t e s : mean o f x 1.803625 NE One Sample t t e s t data : team games$pointsats t = 3. 4 7 6 1, df = 322, p value = 0.0005785 a l t e r n a t i v e h y p o t h e s i s : true mean i s not equal to 0 95 percent c o n f i d e n c e i n t e r v a l : 1.214757 4.382766 sample e s t i m a t e s : mean o f x 2.798762 Thus, the data support that the Arizona Cardinals and Cleveland Browns were historically underperformed the spread betting line and the Green Bay Packers and New England Patriots historically outperformed the spread betting line. Though the previous observations can be interesting to bettors, a more relevant observation would not be how much a team scores above the spread betting line but rather how often a team scores above the spread betting line. Using R, the proportion of games in which a team was able to outperform the spread betting line was calculated and summarized in Table 1 and shown in graphical form in Figure 2. This time, it is noted that the team that most consistently outperformed the spread betting line was the Carolina Panthers while the New England Patriots again fared well as the second most consistent with the Philadelphia Eagles and Green Bay Packers close behind. On the other end, we find that the St. Louis Rams was the most consistent underperformer while the San Francisco 49ers and the Washington Redskins were marginally better. Furthermore, in order to determine if any team consistently outperformed or underperformed the spread betting line in the past 20 years, a z-test for proportions was performed on each of the sets of historical data. The two-tailed z-tests were constructed to compare the percentage of times that a team beat the spread with the null hypothesis that each team beat the spread 50% of the time. This was equivalent to stating that the probability of beating the spread was random. An alpha value of 0.05 was utilized and the tests were processed using R. It was determined that no team performed significantly different from the null hypothesis percentage of 50%. Thus, the hypothesis that no team has historically consistently outperformed or underperformed against the spread is supported. Conclusion The purpose of this study was to determine if any team significantly and consistently beat the spread betting lines placed on games in casinos and on popular betting websites. The spread betting lines are essentially a prediction of how much one team will beat another team and are moved throughout the week in order to maintain a 50% balance on both sides of the spread (50% betting under the spread and 50% betting over the spread). Since a team has no incentive to beat the spread on their own as players and team personnel are prohibited from betting on games, a team that is able to consistently beat the spread can be interpreted as a team that is consistently underrated by the bookmakers and by the general public. Likewise, a team that consistently underperforms against the spread betting line can be interpreted as a team that is consistently overrated by the bookmakers and by the general public. From this study, it was determined that four teams on average perform significantly different from the spread betting lines. These 3

CONCLUSION teams are the Arizona Cardinals and the Cleveland Browns, both of which on average underperform against the spread betting line and the Green Bay Packers and the New England Patriots both of which on average outperform against the spread betting line. However, of these four teams, none consistently outperform or underperform against the spread. This would mean that on the occassions that the teams do outperform or underperform the spread, the teams greatly outperform or underperform. However, these teams do not outperform or underperform the spread on a more consistent basis than do any other teams. In fact, the percentage of beating the spread is, for all these teams, very close to 50% which corresponds to random chance. In conclusion, no team was found to have historically outperformed or underperformed significantly against the spread and the proportion of beating the spread for all teams was found to be 50%. The data collected is not consistent with the expectation that there is no difference between score differentials and the spread betting line. The data collected, however, is consistent with the expectation that no team consistently outperformed or underperformed the spread. The sample size utilized was certainly large enough with 325 game results and spread betting lines collected for each team. The sample is representative of the true population and can be utilized by bettors in order to determine whether to bet above or below the spread when looking at certain teams when placing a spread bet. 4

APPENDIX A Appendix A Table 1: Points Scored Above the Spread for All Teams Team Mean Percent ARI -1.743710692 0.4779874 CLE -1.543165468 0.4580645 STL -1.312703583 0.4462541 OAK -1.117283951 0.4444444 DET -0.938906752 0.4533762 DAL -0.923870968 0.4580645 TEN -0.848056537 0.4558304 SF -0.838762215 0.4495114 WAS -0.656146179 0.4518272 CIN -0.492354740 0.4709480 NO -0.400974026 0.4837662 BUF -0.392857143 0.5031056 TB -0.374137931 0.4827586 JAC -0.360856269 0.4831804 ATL -0.335889571 0.4938650 HOU -0.239669421 0.5082645 NYG -0.112804878 0.4939024 MIN -0.022865854 0.5030488 MIA 0.007692308 0.4676923 CHI 0.253105590 0.4689441 SD 0.285493827 0.4845679 NYJ 0.328703704 0.4938272 DEN 0.387537994 0.4893617 PHI 0.419558360 0.5236593 IND 0.435975610 0.5030488 SEA 0.641033435 0.4924012 CAR 0.847692308 0.5292308 BAL 1.183121019 0.4904459 KC 1.290123457 0.5123457 PIT 1.307692308 0.5032051 GB 1.803625378 0.5226586 NE 2.798761610 0.5263158 5

APPENDIX A Figure 1: Mean Points Above Spread for Teams Figure 2: Mean Points Above Spread for Teams 6

Appendix B main.js var path = r e q u i r e ( path ) ; var c h i l d P r o c e s s = r e q u i r e ( c h i l d p r o c e s s ) ; var c h e e r i o = r e q u i r e ( c h e e r i o ) ; var f s = r e q u i r e ( f s ) ; var csvwriter = r e q u i r e ( csv write stream ) ; var s e p a r a t o r =, ; var newline = \n ; f s. w r i t e F i l e ( output. csv, date + s e p a r a t o r + away + s e p a r a t o r + away score + s e p a r a t o r + home + s e p a r a t o r + home score + s e p a r a t o r + r e s u l t + s e p a r a t o r + home spread + s e p a r a t o r + a t s + s e p a r a t o r + t o t a l + s e p a r a t o r + ou + newline, f u n c t i o n ( err, data ) { } ) ; var teams = [ Arizona, Atlanta, Baltimore, B u f f a l o, Carolina, Chicago, C i n c i n n a t i, Cleveland, D a l l a s, Denver, D e t r o i t, Green, Houston, I n d i a n a p o l i s, J a c k s o n v i l l e, Kansas, Miami, Minnesota, Giants, J e t s, England, Orleans, Oakland, P h i l a d e l p h i a, Pittsburgh, Diego, Francisco, S e a t t l e, 7

main.js team index = 1 ; Louis, Tampa, Tennessee, Washington ] ; while ( team index <= 32) { f o r ( i = 1 ; i < teams. l ength ; i++) { c o n s o l e. l o g ( i ) c o n s o l e. l o g ( teams [ 0 ] ) ; c o n s o l e. l o g ( teams [ i ] ) ; var r e s u l t = c h i l d P r o c e s s. execsync ( c a s p e r j s t e s t. j s + teams [ 0 ] + + teams [ i ] ) ; var $ = c h e e r i o. load ( r e s u l t ) ; var date, away, away score, home, home score, r e s u l t, home spread, ats, t o t a l, ou ; var j s o n = { date :, away :, away score :, home :, home score :, r e s u l t :, home spread :, a t s :, t o t a l :, ou : }; var n o e n t r i e s = $( #block system main t a b l e : nth of type ( 2 ) tbody ). f i n d ( tr ). length ; var no columns = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( 1 ). f i n d ( td ). length ; f o r ( j = 0 ; j < n o e n t r i e s ; j++){ date = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 0 ). t e x t ( ) ; away = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 1 ). t e x t ( ) ; away score = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 2 ). t e x t ( ) ; home = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 3 ). t e x t ( ) ; home score = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 4 ). t e x t ( ) ; r e s u l t = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 5 ). t e x t ( ) ; home spread = $( # block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 6 ). t e x t ( ) ; a t s = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 7 ). t e x t ( ) ; t o t a l = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 8 ). t e x t ( ) ; ou = $( #block system main t a b l e : nth of type ( 2 ) tbody t r ). eq ( j ). c h i l d r e n ( td ). eq ( 9 ). t e x t ( ) ; f s. appendfile ( output. csv, + date + + s e p a r a t o r + away + s e p a r a t o r + 8

main.js } ; } ; }) ; away score + s e p a r a t o r + home + s e p a r a t o r + home score + s e p a r a t o r + r e s u l t + s e p a r a t o r + home spread + s e p a r a t o r + a t s + s e p a r a t o r + t o t a l + s e p a r a t o r + ou + newline, f u n c t i o n ( e r r ) { c o n s o l e. l o g ( F i l e s u c c e s s f u l l y w r i t t e n! ) } ; var f i r s t = teams. s h i f t ( ) ; teams. push ( f i r s t ) ; team index = team index + 1 ; 9

test.js test.js var casper = r e q u i r e ( casper ). c r e a t e ({ verbose : true, l o g L e v e l : debug }) ; casper. s t a r t ( http : / /www. oddsshark. com/ n f l / database ) ; casper. thenevaluate ( f u n c t i o n ( ) { // Number o f Games document. getelementbyid ( games 30 h2h ). checked = checked ; document. getelementbyid ( l o c a t i o n home h2h ). checked = checked ; // Month document. getelementbyid ( chalk s e l e c t month h2h ). getelementsbytagname ( option ) [ 0 ]. s e l e c t e d = s e l e c t e d ; // Regular Season or Postseason document. getelementbyid ( chalk s e l e c t game type h2h ). getelementsbytagname ( option ) [ 1 ]. s e l e c t e d = s e l e c t e d ; // Favorite or Underdog document. getelementbyid ( chalk s e l e c t odds h2h ). getelementsbytagname ( option ) [ 4 ]. s e l e c t e d = s e l e c t e d ; // Line Range document. getelementbyid ( txt min h2h ). value = 100 ; document. getelementbyid ( txt max h2h ). value = 1 0 0 ; }) ; casper. then ( f u n c t i o n ( ) { // Team Name t h i s. sendkeys ( input#team search h2h, casper. c l i. get ( 0 ), { keepfocus : true }) ; casper. page. sendevent ( keypress, casper. page. event. key. Down) ; t h i s. c l i c k ( ul. ui autocomplete l i. ui menu item : f i r s t c h i l d a ) ; // Opponent Name t h i s. sendkeys ( input#opponent search h2h, casper. c l i. get ( 1 ), { keepfocus : true }) ; casper. page. sendevent ( keypress, casper. page. event. key. Down) ; casper. page. sendevent ( keypress, casper. page. event. key. Down) ; casper. page. sendevent ( keypress, casper. page. event. key. Enter ) ; }) ; casper. then ( f u n c t i o n ( ) { // Click Submit t h i s. c l i c k ( # submit h2h ) ; }) ; casper. then ( f u n c t i o n ( ) { c o n s o l e. l o g ( casper. getpagecontent ( ) ) ; }) ; casper. run ( ) ; 10

plots.r plots.r l i b r a r y ( ggplot2 ) ### Read CSV n f l d a t a < read. csv ( f i l e =.. / d a t a c o l l e c t i o n / output. csv, sep =,, header = TRUE) ### Set Date as Date Type n f l d a t a $ d a t e < as. Date ( n f l d a t a $ d a t e, %b %d, %Y ) ### Take 20 Year Subset o f Data n f l d a t a < subset ( n f l d a t a, subset = n f l d a t a $ d a t e >= 1995 07 01 ) ### Process Home Data Frame to I n c lude Only Relevant F i e l d s ### n f l h o m e d f < data. frame (home = nfl data$home, pointsats = n f l d a t a $ h o m e s c o r e + nfl data$home spread n f l d a t a $ a w a y s c o r e, a t s = n f l d a t a $ a t s ) ### Set ATS as Binary n f l h o m e d f $ a t s < i f e l s e ( n f l h o m e d f $ a t s == W, 1, 0) ### Aggregate Data Frame According t o Means nfl home mean df < setnames ( aggregate ( nfl home df$pointsats, by = l i s t ( nfl home df$home ), FUN = mean), c ( team, mean ) ) ### Make Bar Graph o f Means ggplot ( data = nfl home mean df, aes ( x = r e o r d e r ( team, mean), y = mean) ) + geom bar ( s t a t = i d e n t i t y ) + xlab ( Home Team ) + ylab ( Mean Points ATS ) + #g g t i t l e ( Mean Points Above Spread f o r Home Teams ) + c o o r d f l i p ( ) ggsave ( images /home team mean. png ) ### Aggregate Data Frame According to Percents n f l h o m e p e r c e n t d f < setnames ( aggregate ( n f l h o m e d f $ a t s, by = l i s t ( nfl home df$home ), FUN = mean), c ( team, percent ) ) ### Make Bar Graph o f Percents ggplot ( data = n f l h o m e p e r c e n t d f, aes ( x = r e o r d e r ( team, percent ), y = percent ) ) + geom bar ( s t a t = i d e n t i t y ) + xlab ( Home Team ) + ylab ( Percent ATS ) + ylim ( 0, 1) + #g g t i t l e ( Percent Above Spread f o r Home Teams ) + c o o r d f l i p ( ) ggsave ( images / home team percent. png ) ### Process Away Data Frame to I n clude Only Relevant F i e l d s ### n f l a w a y d f < data. frame ( away = nfl data$away, pointsats = n f l d a t a $ a w a y s c o r e nfl data$home spread nfl data$home score, a t s = n f l d a t a $ a t s ) ### Set ATS as Binary 11

plots.r n f l a w a y d f $ a t s < i f e l s e ( n f l a w a y d f $ a t s == L, 1, 0) ### Aggregate Data Frame According t o Means nfl away mean df < setnames ( aggregate ( nfl away df$pointsats, by = l i s t ( nfl away df$away ), FUN = mean), c ( team, mean ) ) ### Make Bar Graph o f Means ggplot ( data = nfl away mean df, aes ( x = r e o r d e r ( team, mean), y = mean) ) + geom bar ( s t a t = i d e n t i t y ) + xlab ( Away Team ) + ylab ( Mean Points ATS ) + #g g t i t l e ( Mean Points Above Spread f o r Away Teams ) + c o o r d f l i p ( ) ggsave ( images /away team mean. png ) ### Aggregate Data Frame According to Percents n f l a w a y p e r c e n t d f < setnames ( aggregate ( n f l a w a y d f $ a t s, by = l i s t ( nfl away df$away ), FUN = mean), c ( team, percent ) ) ### Make Bar Graph o f Percents ggplot ( data = n f l a w a y p e r c e n t d f, aes ( x = r e o r d e r ( team, percent ), y = percent ) ) + geom bar ( s t a t = i d e n t i t y ) + xlab ( Away Team ) + ylab ( Percent ATS ) + ylim ( 0, 1) + #g g t i t l e ( Percent Above Spread f o r Away Teams ) + c o o r d f l i p ( ) ggsave ( images / away team percent. png ) ### Process Home and Away Data Frames f o r Means nfl home sum mean df < setnames ( aggregate ( nfl home df$pointsats, by = l i s t ( nfl home df$home ), FUN = sum), c ( team, sum ) ) nfl away sum mean df < setnames ( aggregate ( nfl away df$pointsats, by = l i s t ( nfl away df$away ), FUN = sum), c ( team, sum ) ) n f l h o m e l e n g t h d f < setnames ( aggregate ( nfl home df$pointsats, by = l i s t ( nfl home df$home ), FUN = l e ngth ), c ( team, l ength ) ) n f l a w a y l e n g t h d f < setnames ( aggregate ( nfl away df$pointsats, by = l i s t ( nfl away df$away ), FUN = l e ngth ), c ( team, length ) ) n f l t o t a l m e a n d f < data. frame ( team = nfl home sum mean df$team, mean = ( nfl home sum mean df$sum + nfl away sum mean df$sum ) /( n f l h o m e l e n g t h d f $ l e n g t h + n f l a w a y l e n g t h d f $ l e n g t h ) ) n f l t o t a l m e a n d f < n f l t o t a l m e a n d f [ order ( n f l t o t a l m e a n d f $ m e a n ), ] ### Make Bar Graph o f Means ggplot ( data = n f l t o t a l m e a n d f, aes ( x = r e o r d e r ( team, mean), y = mean) ) + geom bar ( s t a t = i d e n t i t y ) + xlab ( Team ) + ylab ( Mean Points ATS ) + #g g t i t l e ( Mean Points Above Spread f o r Teams ) + c o o r d f l i p ( ) 12

plots.r ggsave ( images /team mean. png ) ### Process Home and Away Data Frames f o r Percents n fl h ome s u m p e r c e n t d f < setnames ( aggregate ( n f l h o m e d f $ a t s, by = l i s t ( nfl home df$home ), FUN = sum), c ( team, sum ) ) n f l a w a y s u m p e r c e n t d f < setnames ( aggregate ( n f l a w a y d f $ a t s, by = l i s t ( nfl away df$away ), FUN = sum), c ( team, sum ) ) n f l t o t a l p e r c e n t d f < data. frame ( team = nfl home sum percent df$team, percent = ( nfl home sum percent df$sum + nfl away sum percent df$sum ) /( n f l h o m e l e n g t h d f $ l e n g t h + n f l a w a y l e n g t h d f $ l e n g t h ) ) n f l t o t a l p e r c e n t d f < n f l t o t a l p e r c e n t d f [ order ( n f l t o t a l p e r c e n t d f $ p e r c e n t ), ] ### Make Bar Graph o f Percents ggplot ( data = n f l t o t a l p e r c e n t d f, aes ( x = r e o r d e r ( team, percent ), y = percent ) ) + geom bar ( s t a t = i d e n t i t y ) + xlab ( Team ) + ylab ( Percent ATS ) + ylim ( 0, 1) + #g g t i t l e ( Percent Above Spread f o r Teams ) + c o o r d f l i p ( ) ggsave ( images / team percent. png ) 13

t-test.r t-test.r ### Read CSV n f l d a t a < read. csv ( f i l e =.. / d a t a c o l l e c t i o n / output. csv, sep =,, header = TRUE) ### Set Date as Date Type n f l d a t a $ d a t e < as. Date ( n f l d a t a $ d a t e, %b %d, %Y ) ### Take 20 Year Subset o f Data n f l d a t a < subset ( n f l d a t a, subset = n f l d a t a $ d a t e >= 1995 07 01 ) ### Process Home Data Frame to I n c lude Only Relevant F i e l d s ### n f l h o m e d f < data. frame ( team = nfl data$home, pointsats = n f l d a t a $ h o m e s c o r e + nfl data$home spread n f l d a t a $ a w a y s c o r e, a t s = n f l d a t a $ a t s ) ### Set ATS as Binary n f l h o m e d f $ a t s < i f e l s e ( n f l h o m e d f $ a t s == W, 1, 0) ### Process Away Data Frame to I n clude Only Relevant F i e l d s ### n f l a w a y d f < data. frame ( team = nfl data$away, pointsats = n f l d a t a $ a w a y s c o r e nfl data$home spread nfl data$home score, a t s = n f l d a t a $ a t s ) ### Set ATS as Binary n f l a w a y d f $ a t s < i f e l s e ( n f l a w a y d f $ a t s == L, 1, 0) ### L i s t o f Teams teams < data. frame ( unique ( nfl data$home ) ) ; ### T t e s t s f o r Mean Points Above the Spread f o r ( i in seq ( 1, nrow ( teams ), by = 1) ) { home subset < subset ( nfl home df, subset = nfl home df$team == teams [ i, 1 ] ) away subset < subset ( n f l a w ay df, subset = nfl away df$team == teams [ i, 1 ] ) team games < rbind ( home subset, away subset ) t e s t < t. t e s t ( x = team games$pointsats, mu = 0, ), conf. l e v e l = 0. 9 5 ) a l t e r n a t i v e = c ( two. s i d e d } i f ( t e s t $ p. value < 0. 0 5 ) { p r i n t ( teams [ i, 1 ] ) p r i n t ( t e s t ) } ### Z Test f o r Proportion o f Time Beating the Spread n f l h o m e l e n g t h d f < setnames ( aggregate ( nfl home df$pointsats, by = l i s t ( nfl home df$home ), FUN = l e ngth ), c ( team, l ength ) ) n f l a w a y l e n g t h d f < setnames ( aggregate ( nfl away df$pointsats, by = l i s t ( nfl away df$away ), FUN = l e ngth ), c ( team, length ) ) n fl h ome s u m p e r c e n t d f < setnames ( aggregate ( n f l h o m e d f $ a t s, by = l i s t ( nfl home df$home ), FUN = sum), c ( team, sum ) ) n f l a w a y s u m p e r c e n t d f < setnames ( aggregate ( n f l a w a y d f $ a t s, by = l i s t ( nfl away df$away ), FUN = sum), c ( team, sum ) ) 14

t-test.r f o r ( i in seq ( 1, nrow ( teams ), by = 1) ) { t e s t < prop. t e s t ( x = nfl home sum percent df$sum [ i ] + nfl away sum percent df$sum [ i ], n = n f l h o m e l e n g t h d f $ l e n g t h [ i ] + n f l a w a y l e n g t h d f $ l e n g t h [ i ], p = rep ( 0. 5, times = 32) [ i ], a l t e r n a t i v e = two. s i d e d, conf. l e v e l = 0. 9 5 ) i f ( t e s t $ p. value > 0. 0 5 ) { p r i n t ( teams [ i, 1 ] ) p r i n t ( t e s t ) } } 15