All data originated from Pro Football Reference and nflFastR.
Normally, I don’t like to use stats to make specific forecasts about NFL wins before the season starts. Not only are the future drivers of events unknown, but the impact of those drivers is hard to estimate, even with plenty of historical data. As such, pre-season models just aren’t very good at predicting individual game outcomes.
However, if a model is built with minimal bias, then errors from those unknown drivers tend to be random and cancel each other out over multiple games. This means that even if a model doesn’t predict single-game outcomes very well, it might still do a decent job of predicting season results.
So I took a stab at that and built a simple 2020 season model. Let me stress “simple”. This is more an exercise in model building than it is a rigorous effort at forecasting.
MODEL AND TARGETS
The first thing I did was set a target to judge my model accuracy. Obviously 100% would be great, but clearly unattainable. Based on point spreads, Vegas has predicted about 65.7% of all regular season NFL winners since 2010.
Now, before anyone gets knots in their undergarments, I realize that Vegas does not actually predict games. They just set lines to try and get an even amount of money on either side of the proposition. However, when Team A is a point-spread favorite, that signifies that the money-weighted mean prediction of the betting population is that Team A is more likely to win, with Vegas acting as the clearinghouse and arbiter of that consensus. So, I can go into a protracted discussion about the wisdom of crowds and market efficiency, or I can just say “Vegas predicts”. It’s a distinction without a difference, and I choose simplicity.
Where was I? Oh yeah, Vegas picks winners 65.7% of the time and I’m not going to pretend I’m smarter than that, so that is a good ceiling of accuracy for any predictive model I make.
For this exercise, I’m going to use logistic regression, which takes variable inputs (like NFL stats) and assigns a 0% to 100% probability of a binary outcome (like a win or loss). If that is Greek to you, then the easy-button version is that the model converts my stats into a formula that I can use to calculate the probability of a team win for any given game. If the probability is greater than 50%, it predicts a win, otherwise it’s a loss.
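For anyone who wants to see the mechanics, here is a minimal sketch of that setup in Python with scikit-learn. The file and column names are placeholders, not my actual dataset:

```python
# Minimal sketch of logistic regression for win probability.
# File and column names are placeholders, not the actual dataset.
import pandas as pd
from sklearn.linear_model import LogisticRegression

games = pd.read_csv("games.csv")   # hypothetical: one row per team-game
X = games[["stat_1", "stat_2"]]    # variable inputs (NFL stats)
y = games["win"]                   # binary outcome: 1 = win, 0 = loss

model = LogisticRegression().fit(X, y)

# A 0% to 100% win probability for each game; over 50% is a predicted win
win_prob = model.predict_proba(X)[:, 1]
predicted_win = win_prob > 0.5
```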
VARIABLES
I know from many years of analysis that an NFL win is primarily determined by the passing efficiency differential between the teams. I also know that passing efficiency is best measured by EPA per dropback (EPA/db). So, I’ll start with 2 variables: EPA/db and opponent EPA/db.
QB performance varies from game to game (sometimes drastically), so to predict passing efficiency for a game, it’s best to use some type of trailing average. Instead of just picking an arbitrary time period, I calculated EPA/db rolling averages for every window from 3 to 30 weeks (even when those windows spanned multiple seasons or multiple QBs) and compared the predicted win performance of each window against actual wins.
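In code, that rolling-average step looks something like the sketch below, continuing the hypothetical `games` frame from earlier; the `shift(1)` ensures the average entering any week only uses games already played:

```python
# Trailing EPA/dropback averages for every window from 3 to 30 games.
games = games.sort_values(["team", "season", "week"])
for w in range(3, 31):
    games[f"epa_db_roll{w}"] = (
        games.groupby("team")["epa_db"]
             .transform(lambda s: s.shift(1).rolling(w, min_periods=w).mean())
    )
```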
[Chart: win-prediction accuracy by rolling-average window (3 to 30 weeks), passing metrics only]
That’s not a bad start. It’s much better than a coin flip, but it’s significantly below target, so I need to add more variables.
Notice that the different rolling periods have varying performance, seeming to peak around 9 to 14 weeks. There is a spike at 5 weeks that looks like a spurious result: why would 5 weeks be so much more accurate than 4 or 6? This illustrates the dangers of what is called overfitting the data. The accuracy of the 5-week period is likely an unrepeatable false pattern, like a guy who flips 10 heads in a row, but I’ll deal with overfitting later.
To boost the model accuracy, I’ll add defensive performance. I’ll use the rolling avg of EPA/db given up by each team, which means I now have 4 passing metrics:
- Rolling average Team EPA/db
- Rolling average Opponent EPA/db
- Rolling average Team EPA/db against
- Rolling average Opponent EPA/db against
I will also add in those 4 metrics for rushing (EPA/carry) and special teams (EPA/sp tm play), bringing the total to 12 variables and measuring all phases of the game. Finally, I’ll incorporate an indicator for home and away games.
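As a sketch, the final design matrix looks like this (the column names are mine, not the actual dataset’s):

```python
# The 13-variable design matrix: 4 passing metrics, the same 4 for rushing
# and special teams, plus a home/away indicator. Names are hypothetical.
features = [
    "team_epa_db", "opp_epa_db",
    "team_epa_db_against", "opp_epa_db_against",        # passing
    "team_epa_carry", "opp_epa_carry",
    "team_epa_carry_against", "opp_epa_carry_against",  # rushing
    "team_epa_st", "opp_epa_st",
    "team_epa_st_against", "opp_epa_st_against",        # special teams
    "is_home",                                          # 1 at home, 0 away
]
X = games[features]
y = games["win"]
```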
[Chart: win-prediction accuracy by rolling-average window, full 13-variable model]
Much better. The peak performance hovers between 64% and 65%, and while that is still below the target, I’m pretty happy with the numbers.
Keep in mind that Vegas lines use all available information right up to kickoff and I’m simply using a home/away designation and 12 game play stats calculated the week before. I’m not accounting for injuries, rested starters, roster changes, coaching changes, weather, stadiums, field type etc. If I were to build a rigorous model I would try to account for some of those variables, but for this, I’m keeping it simple.
OVERFITTING
I still need to pick an optimal rolling period, and while the above chart shows that a 21-week period gives the highest accuracy, I can’t simply pick that due to concerns about overfitting the data. In other words, the spike at 21 weeks might be an artifact in the data that won’t repeat on new data, so choosing it may actually give me worse performance.
One method to counteract overfitting is a process called 10-fold cross validation. Without getting into the details, this partitions the dataset into 10 folds: each pass uses 90% of the data to build the model and holds out a different 10% to test against. Averaging the accuracy scores across the 10 passes gives a less biased picture.
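In scikit-learn this is essentially a one-liner (reusing `X` and `y` from the sketches above):

```python
# 10-fold cross-validated accuracy for one candidate rolling period.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

scores = cross_val_score(LogisticRegression(max_iter=1000),
                         X, y, cv=10, scoring="accuracy")
print(scores.mean(), scores.std())  # mean and spread across the 10 folds
```

Repeating this for each rolling period produces the averages and standard deviations of the 10 test folds shown below.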
[Chart: 10-fold cross-validation test accuracy (mean and standard deviation) by rolling period]
Notice that 21 weeks is no longer the best performer. Based on these results, I am selecting 13 weeks as the optimal trailing period due to its high test accuracy and low standard deviation in the test samples (more stable).
ACCURACY
Using 2010 - 2019 regular season data (320 team seasons), my final model had a 64.7% overall accuracy. Here is the breakdown by year, showing a strong correlation with Vegas predictions:
[Chart: model accuracy vs. Vegas accuracy by season, 2010 - 2019]
The model also correctly identified 72.5% of playoff teams, compared to 73.3% for Vegas(1).
Since my model gives a win probability for each game, I can convert those probabilities to a point spread and use that as an additional accuracy metric. If point spreads reflect reality, then favored teams should cover 50% of the time.
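I’ll spare you my exact conversion, but for the curious, one standard mapping assumes the final margin is roughly normal around the true spread, with a standard deviation of about 13.5 points (a common rule of thumb for NFL margins, not a parameter from my model):

```python
# Illustrative probability-to-spread mapping under a normal-margin assumption.
from scipy.stats import norm

SIGMA = 13.5  # assumed std. dev. of NFL margins (rule of thumb, not fitted)

def prob_to_spread(win_prob: float) -> float:
    """Convert a win probability to a point spread (negative = favored)."""
    return -SIGMA * norm.ppf(win_prob)

print(round(prob_to_spread(0.693), 1))  # -6.8: a ~69% favorite lays ~7 points
```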
As a comparison, the Vegas cover rate is 48.8%(2). Using my win probabilities and derived point spreads, 1,250 favored teams covered out of 2,560 games for a 48.8% ratio, exactly matching Vegas results. So, that’s pretty great.
I also measured the number of games within a team season that the model correctly predicted and compared that against Vegas.
[Chart: number of correctly predicted games per team season, model vs. Vegas]
My model correctly predicted 11 or more game outcomes in a season about 50.6% of the time, whereas Vegas managed a 51.3% rate. The tiny blue column on the very far right is the 2017 Cleveland Browns, whom my model predicted to go winless; it was correct, going 16 for 16.
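The tally behind that chart is a simple groupby, again with hypothetical column names and reusing the fitted model from the sketches above:

```python
# Count correct picks per team-season and check the 11+ threshold.
games["pred_win"] = model.predict_proba(games[features])[:, 1] > 0.5
games["correct"] = games["pred_win"] == (games["win"] == 1)
correct_per_season = games.groupby(["season", "team"])["correct"].sum()

# Share of team-seasons with 11 or more correct calls
print((correct_per_season >= 11).mean())
```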
As I stated in my introduction, picking individual game winners is hard, but if I just try to predict season win totals and not necessarily the specific game outcomes, then the results are much better.
[Chart: predicted vs. actual season win totals]
I correctly predicted total season wins, not just for the 2017 Browns, but for 21% of all teams and I was within 1 win about 50% of the time.
I have to say, I was pleasantly surprised with how well this model performed overall. Of course, all I have done is predict the past, so the real test will be the upcoming season.
PREDICTIONS
For each team, I took the average performance from their last 13 games and assumed that is how they will perform each week in 2020. This actually deviates from the model as the algorithm updated its rolling averages with each new week, but I can’t update week 2 until week 1 has finished, so for now, this is the best I can do.
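A sketch of that step, with `model` and `features` carried over from earlier and every file and column name hypothetical:

```python
# Freeze each team's trailing-13-game averages and score the 2020 slate.
import pandas as pd

latest = pd.read_csv("trailing_13_game_averages.csv").set_index("team")
sched = pd.read_csv("schedule_2020.csv")  # week, team, opp, is_home

X_2020 = (sched.join(latest.add_prefix("team_"), on="team")
               .join(latest.add_prefix("opp_"), on="opp"))
sched["win_prob"] = model.predict_proba(X_2020[features])[:, 1]
```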
Here is the Colts’ season:
Week | Team | Opp | Home | Win prob | Wins |
---|---|---|---|---|---|
1 | IND | JAX | Away | 51.34% | 1 |
2 | IND | MIN | Home | 39.35% | 0 |
3 | IND | NYJ | Home | 59.75% | 1 |
4 | IND | CHI | Away | 46.06% | 0 |
5 | IND | CLE | Away | 42.39% | 0 |
6 | IND | CIN | Home | 72.89% | 1 |
8 | IND | DET | Away | 49.51% | 0 |
9 | IND | BAL | Home | 18.56% | 0 |
10 | IND | TEN | Away | 27.60% | 0 |
11 | IND | GB | Home | 45.65% | 0 |
12 | IND | TEN | Home | 40.97% | 0 |
13 | IND | HOU | Away | 37.27% | 0 |
14 | IND | LV | Away | 38.68% | 0 |
15 | IND | HOU | Home | 50.64% | 1 |
16 | IND | PIT | Away | 41.64% | 0 |
17 | IND | JAX | Home | 64.71% | 1 |
TOTAL | | | | 7.3 | 5 |
This predicts 5 Colts wins on the season. That seems low, but in the model’s defense, all it is doing is looking at the previous 13 games. In that span, the Colts ranked 28th in passing efficiency and 30th in special teams efficiency. It is just saying that if those trends continued in 2020, then 5 wins would be the result.
Also, that number relies on specific game predictions, which, as I have stated, are very noisy. There are 3 games where the win probability falls between 45% and 49%. If those probabilities are accurate, then it is unlikely that all 3 of them result in actual losses.
A better approach is to predict total season wins rather than summing individual game picks. To do that, just add the win probabilities for each game, which gives an expected win total of 7.3. That still isn’t great, but of course the Colts have had a QB change, and it is likely that Philip Rivers will improve on a 28th-ranked passing game. If I use his trailing 13-week numbers instead of Brissett’s and Hoyer’s, then the expected wins jump to 9.2.
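That 7.3 is nothing more than the sum of the “Win prob” column in the table above:

```python
# Expected wins = sum of per-game win probabilities (from the Colts table)
probs = [0.5134, 0.3935, 0.5975, 0.4606, 0.4239, 0.7289, 0.4951, 0.1856,
         0.2760, 0.4565, 0.4097, 0.3727, 0.3868, 0.5064, 0.4164, 0.6471]
print(round(sum(probs), 1))  # 7.3
```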
Rivers had a down year in 2019. Perhaps that is an accurate reflection of his aging skills, or perhaps it was a fluke. If I use his 3-year average passing efficiency as the predictor for 2020, then total Colts expected wins move to 10.1.
Personally, I expect Rivers to rebound and I think 10 wins sounds about right. In that scenario, I have forecasted JAX at 5.5 wins, HOU at 7.7 wins and TEN at 9.6 wins, so if Rivers can return to form then my model predicts a division championship . . . barely.
Here are some other expected win results based on different QB performance assumptions.
[Chart: Colts expected wins under different QB performance inputs]
OUTRO
I have stated multiple times that I don’t like predicting individual games, so having said that, here are my individual game predictions for Week 1.
I made a single tweak to the model inputs, changing the Colts’ passing efficiency to match Rivers’ 3-year average. Other than that, the rest is exactly what my blind model predicts based on 2019 numbers. These are the same numbers I would have come up with if I had done this math 8 months ago.
Team | H/A | Opp | Team Win prob | Pred Winner | My Spread | Vegas Spread |
---|---|---|---|---|---|---|
ARI | @ | SF | 21.3% | SF | 10.1 | 7.5 |
CHI | @ | DET | 46.7% | DET | 1.1 | 3.0 |
CLE | @ | BAL | 6.0% | BAL | 15.5 | 8.0 |
DAL | @ | LA | 52.5% | DAL | -0.9 | -3.0 |
GB | @ | MIN | 36.8% | MIN | 4.6 | 3.0 |
HOU | @ | KC | 31.3% | KC | 6.6 | 9.0 |
IND | @ | JAX | 69.3% | IND | -6.9 | -8.0 |
LAC | @ | CIN | 63.3% | LAC | -4.7 | -3.0 |
LV | @ | CAR | 61.8% | LV | -4.2 | -3.0 |
MIA | @ | NE | 23.9% | NE | 9.2 | 6.5 |
NYJ | @ | BUF | 31.3% | BUF | 6.6 | 6.5 |
PHI | @ | WAS | 71.6% | PHI | -7.7 | -6.0 |
PIT | @ | NYG | 53.1% | PIT | -1.1 | -5.5 |
SEA | @ | ATL | 51.6% | SEA | -0.6 | -1.5 |
TB | @ | NO | 25.0% | NO | 8.8 | 3.5 |
TEN | @ | DEN | 59.5% | TEN | -3.4 | -1.5 |
In the 10 years of model data, Vegas and I predicted the same winner in 83.2% of games. In Week 1, we see eye to eye on the winner of every single game, although we still disagree on the point spreads(3).
FOOTNOTES
1) Division champions and wild card spots determined by sum of weekly forecasted wins with season point spread totals used as tiebreakers.
2) According to 2010 - 2019 regular season point spreads from Pro Football Reference data, 1,205 favored teams out of 2,560 games covered, for a 47.07% ratio. However, Vegas is forced to round its point spreads to half-point increments, whereas my point spreads are unrounded real numbers. As such, Vegas experiences point spread pushes and I do not. To calculate a more apples-to-apples cover rate, I removed all pushes from the Vegas data, resulting in 1,205 favored teams covering out of 2,470 games, for a 48.78% ratio.
3) Vegas point spreads are as of the morning of Sep 8th.