Thanks to the nflFastR project and NFL NextGen Stats for the timely sources of data.
In part 1 of this series, I showed how advanced QB metrics — like EPA per dropback (epa/d) — embody a QB’s skillset (accuracy, mobility, vision, etc.). Of course, whether you are a stat geek, film watcher, armchair QB, or barstool fan, no single method of judging QB play is without flaws and epa/d is no exception.
A common objection I see to “trusting” QB stats is the impact of surrounding talent. For every TD bomb that a QB throws, there is a receiver that got open and made the catch. For every sack that doesn’t occur, there is an offensive line fighting to ensure the QB had time. A QB can’t put up any numbers without a team around him and a coaching organization developing and designing the offense. Because of this, some argue that QB efficiency is primarily a measure of team skill and not QB skill.
While there is some truth in that view, it is mostly incorrect. Obviously, surrounding talent is a significant variable in QB efficiency. If Peyton Manning had been surrounded by a bunch of 6th graders, his numbers would have been terrible. However, with NFL-level talent, QBs tend to maintain their efficiency over their careers even as the team around them changes. Consistently high-efficiency QBs seldom become low-efficiency QBs. And poor performers not named Ryan Tannehill don’t suddenly become good. Certainly, there are exceptions, but a good expectation of a QB’s future performance is his past performance, regardless of his coaches or teammates.
This is true even when a QB moves to a new team and experiences a 100% personnel change. Let me frame the argument numerically. Assume Team A has an average +0.10 epa/d, while Team B averages -0.10, and at the end of the season they swap QBs. The following year, if the QBs were to retain their same efficiency, then by definition, Team A would see a 0.20 drop in efficiency (+0.10 to -0.10), and Team B would see a 0.20 increase. However, if instead the teams' numbers were to remain constant, then the QBs would experience the 20-point swings. In reality, the numbers aren’t that definitive, but the point remains that if QBs drive efficiency more than teams/coaches do, then that should be measurable.
So, I measured it.
QB EFFICIENCY CHANGES
Since 2001, there have been 45 QBs that have started at least 14 games with 2 (or more) different teams. Here are a few:
To give some scope here, a crude rule of thumb is that a 5 point change in epa/d (+/- 0.05) equates to about a 3 spot change in QB rank for a season (e.g. 16th to 13th). So a 0.05 change isn’t all that much, but 0.15 is substantial.
Many of the QBs shown above didn’t see much change in their efficiency on different teams. Even QBs that played for 3 teams, like Fitzpatrick, stayed fairly consistent. That isn’t always the case though: Drew Brees saw a healthy improvement when he went to the Saints (+0.12) and Ryan Tannehill is a Tennessee phoenix rising from his Miami ashes (+0.27).
Here is a graph of all 45 QBs showing their efficiency changes between teams:
Each QB is represented by a pair of dots: the gray dot is his efficiency on the team he left and the blue dot is his efficiency on the team he joined. The arrows represent the amount of change: green arrows are improvements, red arrows are declines. The dashed line is the NFL average (+0.04).
Clearly, the arrows are not 0 length and there is a significant reversion to the mean, so the QBs do experience some efficiency changes, but is it a lot? The net average change is 0, but I don’t care about the net impact; it is the magnitude of those changes that matters. In other words, if QB 1 gains 0.20 and QB 2 loses 0.20, the important takeaway is that they both experienced 20-point changes, not that those changes net to 0.
The best way to measure this overall variance is with the standard deviation, which in this case is 0.104. 31 of the 45 QBs (68.9%) experienced an epa/d change of less than 0.104, so that lines up nicely with statistical theory (68.3% of the area under a normal curve is within 1 standard deviation of the mean).
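As a quick check of that comparison, here is a minimal Python sketch that sets the observed share (31 of 45) next to the theoretical one-standard-deviation share of a normal distribution. The variable names are mine; the counts are just the figures quoted above.

```python
import math

# Figures quoted in the text
within_one_sd = 31
total_qbs = 45

observed_share = within_one_sd / total_qbs

# Share of a normal distribution within 1 standard deviation of the mean:
# P(|Z| < 1) = erf(1 / sqrt(2))
normal_share = math.erf(1 / math.sqrt(2))

print(f"observed: {observed_share:.1%}, normal: {normal_share:.1%}")
# prints "observed: 68.9%, normal: 68.3%"
```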
However, there is a multitude of variables besides surrounding personnel that cause a variance in QB performance, so to have any relative meaning the 0.104 needs to be compared against “normal” variance. In the footnotes, I describe the process I used to determine that value (1) but the result was that baseline variance is not statistically different from 0.104.
This means that when switching teams and experiencing a 100% personnel change, a QB’s efficiency varies about the same as if they didn’t switch teams and had a more consistent surrounding cast. This suggests the surrounding players/coaches are not the primary driver of passing efficiency.
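The baseline procedure described in footnote 1 could be sketched roughly as follows. This is only an illustration under my own assumptions: the game data is randomly generated, and the function names (`split_games`, `epa_per_dropback`) and the fill-until-the-target-share approach are hypothetical stand-ins, not the exact process used for the article.

```python
import random

def split_games(games, share, rng):
    """Randomly split games so the first group holds roughly `share` of dropbacks.

    `games` is a list of (dropbacks, total_epa) tuples, one per game.
    """
    shuffled = games[:]
    rng.shuffle(shuffled)
    total = sum(d for d, _ in shuffled)
    first, second, running = [], [], 0
    for game in shuffled:
        # Fill the first group until it reaches the target share of dropbacks
        if running < share * total:
            first.append(game)
            running += game[0]
        else:
            second.append(game)
    return first, second

def epa_per_dropback(group):
    """Total EPA divided by total dropbacks for a group of games."""
    return sum(e for _, e in group) / sum(d for d, _ in group)

rng = random.Random(0)
# Made-up games for one QB on one team: (dropbacks, total EPA)
games = [(rng.randint(25, 45), rng.gauss(3.0, 8.0)) for _ in range(48)]

# Mimic a QB with 75% of his dropbacks on Team A and 25% on Team B
big, small = split_games(games, 0.75, rng)
differential = epa_per_dropback(small) - epa_per_dropback(big)
print(f"same-team epa/d differential: {differential:+.3f}")
```

Repeating this for every QB in the sample and taking the standard deviation of the differentials would give a "no team change" baseline to compare against 0.104.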
TEAM EFFICIENCY CHANGES
The table above shows that Cam Newton experienced a drop of 0.08 in efficiency when moving to NE, but you have to do some extra math to see that NE experienced a 0.17 drop with that signing. Peyton Manning saw no efficiency change when he went to Denver, but Denver sure did (+0.19).
Here are the efficiency changes from the team’s point of view for those 45 QBs. There are 90 data points because every QB move has a “from” team and a “to” team.
It may not be immediately apparent from the graph, but the arrows here are much longer than those in the QB graph. The standard deviation of these changes is 0.142, which is significantly higher than the 0.104 the QBs experienced(2).
So, whereas 2/3 of QBs that switched teams experienced < 0.10 change in epa/d, about 1/2 of the teams experienced at least that much of a change with their new QB.
This all suggests that QBs impact the team's performance far more than the teams impact the QB’s and thus passing efficiency is a QB measure NOT a team measure.
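For readers who want to see the size of that gap, here is a minimal sketch of the variance ratio behind an F-test, using only the two standard deviations quoted above. Turning the ratio into a p-value also requires the degrees of freedom for each sample, which I am not assuming here (with SciPy, something like `scipy.stats.f.sf(f_stat, dfn, dfd)` would do it once those are known).

```python
# Standard deviations quoted in the text
sd_team_changes = 0.142
sd_qb_changes = 0.104

# F statistic: ratio of the two variances (larger one on top)
f_stat = (sd_team_changes / sd_qb_changes) ** 2

print(round(f_stat, 2))  # prints 1.86
```

In other words, the variance of the team-side changes is a little under twice the variance of the QB-side changes.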
One flaw with the above analysis is that it ignores that when teams change QBs, they may significantly change other personnel as well. So, the QB is not the only variable being measured. To better isolate QB impact, I looked at teams that changed their QBs within a season. That way, the surrounding cast hardly changes at all.
For example, here are the teams from 2020 that had more than 1 QB play significant time:
|Team|Starter|Starter epa/d|Reason|Replacement|Replacement epa/d|Change|
|---|---|---|---|---|---|---|
|PHI|Carson Wentz|-0.14|Benched|Jalen Hurts|0.00|0.15|
|NYJ|Sam Darnold|-0.18|Injured|Joe Flacco|-0.05|0.13|
|WAS|Dwayne Haskins|-0.14|Benched|Alex Smith|-0.14|0.01|
|NO|Drew Brees|0.17|Injured|Taysom Hill|0.03|-0.13|
|DAL|Dak Prescott|0.13|Injured|Andy Dalton|-0.02|-0.15|
|SF|Jimmy Garoppolo|0.17|Injured|Nick Mullens|0.00|-0.16|
|CIN|Joe Burrow|0.07|Injured|Brandon Allen|-0.10|-0.17|
|CHI|Mitchell Trubisky|0.11|Benched|Nick Foles|-0.10|-0.21|
|MIA|Ryan Fitzpatrick|0.19|Benched|Tua Tagovailoa|-0.03|-0.22|
|JAX|Gardner Minshew|0.09|Injured|Mike Glennon|-0.15|-0.23|
Notice that every team but Washington saw large changes in their passing efficiency with a different QB. So, I’m just going to ask it: if Carson Wentz’s only problem was that his team was horrible, then why didn’t Jalen Hurts have that same problem?
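To put a rough number on those 2020 swings, here is a small sketch that takes the last column of the table and computes its spread. Two assumptions of mine: it uses the sample standard deviation around the mean, and it covers only these 10 teams, so it is not the figure for the full 2008-2020 sample.

```python
import statistics

# Final column of the 2020 table: change in epa/d after the QB switch, as listed
changes = {
    "PHI": 0.15, "NYJ": 0.13, "WAS": 0.01, "NO": -0.13, "DAL": -0.15,
    "SF": -0.16, "CIN": -0.17, "CHI": -0.21, "MIA": -0.22, "JAX": -0.23,
}

# Sample standard deviation of the changes (spread around their mean)
sd = statistics.stdev(list(changes.values()))

# Teams whose passing efficiency moved by at least 0.10 in either direction
big_movers = [team for team, change in changes.items() if abs(change) >= 0.10]

print(f"sd = {sd:.3f}, big movers = {len(big_movers)} of {len(changes)}")
```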
I wanted to separate injury replacements from benchings, but identifying that distinction was manual and took a lot of time, so I only went back to 2008. Using a requirement that both the starter and the replacement must have had at least 4 full games played, I identified 96 QB changes (192 QBs).
Again, to know how significant these numbers are, I will need a baseline of “normal” variance but I can’t use 0.104 as that was based on multiple years of games, whereas this measure needs to be based on partial seasons and will be much higher. Using an intra-season methodology, I found a baseline standard deviation of 0.153(3).
This means if the QB is NOT the primary driver of efficiency, then teams will experience volatility of about 0.153 when they switch signal-callers, but if the volatility is higher than 0.153, that suggests the QB is driving the stats and not the surrounding talent.
Let’s start with injuries. Here is what the data looks like for injury replacements:
The gray dots are the original starting QB and the blue dots are the replacement QB. The standard deviation of all those changes is 0.183 which is significantly higher than baseline(4).
This really should be a “no duh” result: backups are almost always worse than starters so of course the level of play changes . . . it gets worse. On average, the replacements’ epa/d was 0.08 lower.
The more interesting case is the benched QBs. I wasn’t really sure what to expect with this view. On one hand, if the QB is playing so poorly as to be benched then the expectation is the backup will do better. On the other hand, the backup shouldn’t have more skill so maybe it’s a wash and the efficiency stays relatively flat.
There is still a lot of movement. The standard regression to the mean is evident with poor performers (left side of the chart) having replacements that do better and better performers (right side of the chart) having replacements that do worse. But, the volatility to accomplish all of that is 0.182, almost exactly the same as it was for the injury replacements and much higher than the expected baseline(4).
So, just like the previous calculations, this all supports that teams don’t alter a new QB’s performance nearly as much as a new QB alters the team’s performance. I said it before and I’ll say it again: passing efficiency is a QB measure NOT a team measure.
(NOTE: This was written prior to the Carson Wentz injury. With Wentz now expected to miss meaningful time, the references to his expected play in 2021 are less applicable, but I’ll present the math anyway as originally written.)
What does this all mean? It means it is more likely than not that a QB will perform at a level similar to how they did in the past even if they switch teams. Therefore, measuring performance on the previous team can provide insight as to how they might perform in the future. This, of course, brings us to Carson Wentz.
Of the 32 QBs with the most passing attempts since Wentz entered the league, he ranks 23rd in passing efficiency with a +0.04 epa/d. For comparison, that is only a little better than what Jacoby Brissett did in 2019 (+0.02). If that is the Wentz we get this year, then the offense will be far worse than 2020. That efficiency would equate to about 4.5 fewer points per game than what Rivers managed(5), moving the Colts from a top 10 offense to the bottom 10.
Of course, not every QB that moves to a new team is locked into their career average and so Wentz might improve. To assign some sort of probability to it, if I assume epa efficiency follows a normal distribution and the NFL historical averages hold (0.04 mean, 0.104 standard deviation), then there is a 50% chance he improves over his career efficiency and about a 10% chance that he can achieve the same or better efficiency than Rivers did last year, in which case the offense scores as much or more than 2020.
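Those probabilities can be reproduced with a short sketch using Python's built-in normal distribution. It relies on the same assumptions stated above (normally distributed outcomes, mean 0.04, standard deviation 0.104) and takes Rivers' 2020 efficiency of 0.17 from footnote 5; the variable names are mine.

```python
from statistics import NormalDist

# Assumptions from the text: outcomes are normally distributed with
# mean 0.04 and standard deviation 0.104
outcomes = NormalDist(mu=0.04, sigma=0.104)

wentz_career = 0.04  # Wentz's career epa/d
rivers_2020 = 0.17   # Rivers' 2020 epa/d (footnote 5)

p_beats_career = 1 - outcomes.cdf(wentz_career)
p_matches_rivers = 1 - outcomes.cdf(rivers_2020)

print(f"{p_beats_career:.0%}, {p_matches_rivers:.1%}")  # prints "50%, 10.6%"
```

Rivers' mark sits 1.25 standard deviations above the mean, which is where the roughly 10% comes from.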
So, can the data provide some view into the future other than past averages? Yes, I believe so. When breaking apart Wentz’s numbers, I see room for optimism, but I will discuss that in part 3.
(1) For QBs that changed teams, I totaled the dropbacks they had on each team and used that ratio to split the games on the initial team into 2 groups. For example, if a QB had 1,500 dropbacks on team A and 500 dropbacks on team B, then I randomly split Team A’s games into a group with 75% of the dropbacks and a group with 25%. I then determined the standard deviation of the epa/d differentials between groups to be 0.111 and an F-test determined no significant difference ( P(F<=f) one-tail 0.22).
(2) ( P(F<=f) one-tail 0.049)
(3) I started with 250 team seasons where a QB played all 16 games and randomly split those seasons into 2 groups, forcing the same game-volume percentages as the sample group. For example, 5.2% of the sample teams had 1 QB play 6 games and then be replaced by a QB for 10 games. That same 5.2% was applied to the 250 so that 13 QB seasons were split randomly into 6 game groups and 10 game groups. I then determined the standard deviation of the epa/d differentials between groups to be 0.148 and an F-test determined no significant difference ( P(F<=f) one-tail 0.22).
(4) The standard deviation of efficiency changes for all intra-season QB replacements was 0.183, which is statistically different than the baseline of 0.153 ( P(F<=f) one-tail 0.015).
(5) A regression of epa/d against net points per drive results in a trendline formula of ppd = 3.48 * epa/d + 1.77. Rivers’ 0.17 epa/d equates to 2.36 ppd whereas a 0.04 epa/d would be 1.91, a difference of -0.45 ppd. The Colts average about 10 drives a game, so that difference comes to 4.5 points per game.
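The arithmetic in footnote 5 can be checked with a few lines of Python. The function name and constant are mine; the coefficients and the 10 drives per game come straight from the footnote.

```python
def points_per_drive(epa_per_dropback):
    """Trendline from footnote 5: ppd = 3.48 * epa/d + 1.77."""
    return 3.48 * epa_per_dropback + 1.77

DRIVES_PER_GAME = 10  # approximate Colts figure quoted in the footnote

rivers_ppd = points_per_drive(0.17)
wentz_ppd = points_per_drive(0.04)
points_lost_per_game = (rivers_ppd - wentz_ppd) * DRIVES_PER_GAME

print(f"{rivers_ppd:.2f} {wentz_ppd:.2f} {points_lost_per_game:.1f}")
# prints "2.36 1.91 4.5"
```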