It’s not easy being a football analytics guy. Gathering and managing data is time-consuming. Learning and using coding languages is tedious. Becoming adept at methods of analysis is a never-ending life’s pursuit. Putting it all together to create a hypothesis supported by data is usually met with a resounding “whatever”.
However, for me, the reward is completely worth the effort, as there is a story told by the numbers that most fans will never understand. While I don’t expect others to be as passionate about the data as I am, I do hold out hope that I can show that there is a structure in the data that provides a deeper perception of the game than what is provided by simply watching.
To that end, I’m going to address how QB efficiency stats account for multiple aspects of the passing game and why I rely on them heavily as a measure of QB skill.
I’ll use a single chart and add metrics to it as I go. It will show 2017 measures for Carson Wentz and the 31 other QBs with the most pass attempts.
As a walk-through, I’ll start with Air Yards per Completion (aYd/c), which is the distance from the line of scrimmage to the point of reception (how far the QB threw).
(NOTE: The points will have different colors, which you can ignore. They mean something to me, but it is irrelevant for this discussion.)
- The x-axis shows the stat that is being measured.
- The y-axis is the z-score for that stat: the number of standard deviations a QB sits above or below the average. If you don’t know what that means, just think of it as a ranking relative to an average of 0.
- The individual points represent each of the 32 QBs being measured.
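For anyone who wants to see the z-score mechanics, here is a minimal sketch. The five aYd/c values are made up for illustration, and I’m assuming a population standard deviation; a sample standard deviation would be just as defensible.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]

# Hypothetical aYd/c values for five QBs (illustration only, not real data).
air_yards = [5.0, 6.0, 6.0, 6.0, 7.0]
scores = z_scores(air_yards)

for value, score in zip(air_yards, scores):
    print(f"aYd/c = {value:.1f}  z = {score:+.2f}")
```

A QB sitting exactly at the average gets a z-score of 0, and the scores across all QBs always sum to 0.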
Wentz is highlighted with his ranking and his actual measure displayed below the x-axis label. In this case, he had 7.8 aYD/c, which was about +2.1 standard deviations higher than average and ranked 2nd longest of 32 QBs.
Let’s add 2 more measures.
Yards After the Catch (YAC) is the complement to aYd/c. While Wentz’s passes only managed the 25th longest YAC, that isn’t necessarily poor receiver performance. There is an inherent negative correlation between YAC and aYd/c (about -0.7), which makes sense, as screen passes have a lot of YAC and bombs into coverage don’t. So, it is normal to see this kind of disparity between aYd/c and YAC. What matters most is the aggregate total of those yards, which is Yards per Completion (Yd/c).
- Yd/c = aYd/c + YAC
Wentz had the 3rd longest Yd/c and in isolation, that doesn’t mean much. QB vision and aggressiveness certainly play a part in this measure, but this is a limited metric.
Not all passes hit their mark.
The longer the pass attempted, the less likely it is to be caught. Wentz’s completion rate only ranked 25th, but he also had the 2nd longest passes, so comparing that directly to other QBs isn’t all that meaningful.
Expected Completion Rate (XC%) is what it sounds like: it is the completion rate to be expected for any given situation (distance thrown, field position, etc.). With the 2nd longest passes, Wentz unsurprisingly had the 2nd lowest XC%.
Subtracting XC% from Cmp% yields Completion Percent Over Expected (CPOE); the higher the CPOE, the more accurate the QB. So, even though Wentz’s Cmp% was low, his CPOE ranked 14th, demonstrating about average accuracy.
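Here is a rough sketch of the CPOE arithmetic. It assumes each throw already carries an expected completion probability from some model; the model itself is out of scope here, and every number below is invented for illustration.

```python
def completion_pct_over_expected(attempts):
    """attempts: list of (was_completed, expected_completion_probability) pairs.

    Returns (Cmp%, XC%, CPOE) as fractions between 0 and 1.
    """
    n = len(attempts)
    cmp_pct = sum(1 for completed, _ in attempts if completed) / n
    xc_pct = sum(prob for _, prob in attempts) / n  # average expected rate
    return cmp_pct, xc_pct, cmp_pct - xc_pct

# Hypothetical throws: short passes get high expected probabilities,
# deep shots get low ones (made-up numbers, not real tracking data).
throws = [
    (True, 0.80), (True, 0.70), (False, 0.60), (True, 0.50), (False, 0.50),
    (True, 0.40), (True, 0.60), (False, 0.70), (True, 0.30), (False, 0.40),
]
cmp_pct, xc_pct, cpoe = completion_pct_over_expected(throws)
print(f"Cmp% = {cmp_pct:.1%}, XC% = {xc_pct:.1%}, CPOE = {cpoe:+.1%}")
```

A QB who throws deep will have a low XC%, so a modest Cmp% can still produce a positive CPOE.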
QB Accuracy is inherent to Yards per Attempt (YPA) as can be shown by writing the equation in different ways:
- YPA = Passing Yards / Attempts
- YPA = Yd/c * Cmp%
- YPA = (aYd/c + YAC) * (XC% + CPOE)
The last equation shows how increased accuracy (CPOE) leads to increased efficiency.
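To check that the three forms of YPA really are the same number, here is a small sketch using a hypothetical stat line (the totals and the XC% value are invented; only the formulas come from the equations above).

```python
from math import isclose

# Hypothetical season totals (illustration only).
attempts = 500
completions = 300
passing_yards = 3600
air_yards_on_completions = 2100
xc_pct = 0.58  # assumed output of an expected-completion model

# Component stats.
cmp_pct = completions / attempts                    # Cmp%
ayd_c = air_yards_on_completions / completions      # aYd/c
yac = (passing_yards - air_yards_on_completions) / completions  # YAC per completion
cpoe = cmp_pct - xc_pct                             # CPOE

# Three ways to write YPA.
ypa_direct = passing_yards / attempts
ypa_two_factor = (passing_yards / completions) * cmp_pct
ypa_decomposed = (ayd_c + yac) * (xc_pct + cpoe)

print(ypa_direct, ypa_two_factor, ypa_decomposed)
```

All three agree up to floating-point rounding, which is the point: accuracy and depth of target are both baked into YPA.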
Notice how all these measures are interrelated. If a QB throws farther, his increased air yards (aYd/c) will directionally increase YPA, but it would also likely decrease YAC and lower the expected completion %, which could then be countered by higher accuracy (CPOE).
Good QBs balance all these inputs to optimize overall passing yardage. Wentz had the 3rd longest completions but average accuracy dragged that rank down to the 11th best YPA.
Not all pass plays end with pass attempts.
A QB’s reaction to pressure and his pocket presence directly impact the number of sacks he takes. Wentz had a below-average sack rate (SCK%) and when he was sacked, he was good at minimizing lost yards (Y/sck). Extending YPA to include sack data results in Net Yards per Attempt:
- NY/a = (Passing Yards - Sack Yards ) / (Att + Sacks)
For Wentz, his 11th ranked NY/a isn’t different from his YPA ranking, but for a QB like Russell Wilson, who takes a lot of sacks, that lost yardage adds a lot of information about the quality of his play not captured by YPA alone.
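As a quick illustration of how sacks separate two QBs with identical YPA, here is a sketch with two hypothetical passers. Neither stat line belongs to a real player; the function name is mine.

```python
def net_yards_per_attempt(passing_yards, sack_yards, attempts, sacks):
    """NY/a = (Passing Yards - Sack Yards) / (Att + Sacks)."""
    return (passing_yards - sack_yards) / (attempts + sacks)

# Two hypothetical QBs with the same YPA (3600 / 500 = 7.2).
few_sacks = net_yards_per_attempt(3600, sack_yards=120, attempts=500, sacks=20)
many_sacks = net_yards_per_attempt(3600, sack_yards=300, attempts=500, sacks=45)

print(f"few sacks:  NY/a = {few_sacks:.2f}")
print(f"many sacks: NY/a = {many_sacks:.2f}")
```

The sack-prone QB loses twice: the lost yards shrink the numerator and the extra plays grow the denominator.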
Of course, sacks aren’t the only outcome of pressure.
Wentz had the 5th highest scramble rate (SCR%) of any QB — demonstrating his mobility — and on those scrambles, he gained above average yardage. Add those scrambles to NY/a and you get Net Yards per Dropback (NY/d):
- NY/d = (Passing Yards - Sack Yards + Scramble Yards) / (Att + Sacks + Scrambles)
That’s a fairly straightforward calculation, but let me complicate it horribly in terms of the stats discussed so far:
- NY/d = (XC% + CPOE) * (aYd/c + YAC) * (1 - SCK% - SCR%) - (Y/sck * SCK%) + (Y/scr * SCR%)
Now that is a ludicrous way to calculate NY/d, but it demonstrates that the measure includes aspects of accuracy, vision, pocket presence, mobility and other skills. Advanced QB efficiency measures are powerful because they include all of these individual abilities like voices in a chorus.
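If you don’t trust the ludicrous version, here is a sketch that computes NY/d both ways from one hypothetical stat line. The totals and the XC% value are invented for illustration; the two formulas are the ones written above.

```python
from math import isclose

# Hypothetical season totals (illustration only).
attempts, completions = 500, 300
passing_yards, air_yards = 3600, 2100
sacks, sack_yards = 30, 180
scrambles, scramble_yards = 20, 140
xc_pct = 0.58  # assumed output of an expected-completion model

dropbacks = attempts + sacks + scrambles

# Straightforward version.
ny_d_direct = (passing_yards - sack_yards + scramble_yards) / dropbacks

# Component stats for the decomposed version.
cpoe = completions / attempts - xc_pct
ayd_c = air_yards / completions
yac = (passing_yards - air_yards) / completions
sck_pct = sacks / dropbacks
scr_pct = scrambles / dropbacks
y_sck = sack_yards / sacks
y_scr = scramble_yards / scrambles

ny_d_decomposed = (
    (xc_pct + cpoe) * (ayd_c + yac) * (1 - sck_pct - scr_pct)
    - y_sck * sck_pct
    + y_scr * scr_pct
)

print(f"direct: {ny_d_direct:.4f}  decomposed: {ny_d_decomposed:.4f}")
```

The key step is that (1 - SCK% - SCR%) is just the share of dropbacks that end in a pass attempt, which rescales YPA from a per-attempt to a per-dropback basis.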
Some passing events have a value that is not captured by yards.
Touchdowns have a value beyond the yards it took to achieve them. Similarly, an interception or sack-fumble can have a devastating team impact but counts as a benign 0 yards. Even first down conversions, which extend drives, get no specific value in yardage efficiency stats. In 2017, Wentz led the league in TD rate, had a very low INT rate (26th), and was 11th best at moving the chains with his arm, but none of that is included in NY/d.
Combining yardage efficiency with these events will make an even stronger measure. Passer rating attempts this but it has serious flaws that can misrepresent the true value a QB adds. So, a better and more comprehensive metric is Adjusted Net Yards per Attempt (ANY/a).
ANY/a is simply NY/a with TD% and INT% added in using yardage equivalents:
- ANY/a = (Pass Yds - Sck Yds + 20 * Pass TDs - 45 * Int) / ( Att + Sacks)
This ignores QB scrambles though, so I’ll add them in to get what I’ll call Adjusted Net Yards per Dropback:
- ANY/d = (Pass Yds - Sck Yds + Scrm Yds + 20 * (Pass TDs + Scrm TDs) - 45 * (Int + Scrm Fum)) / ( Att + Sacks + Scrambles)
The last step is to add a yardage equivalent for QB first downs (1st = 9 yards):
- ANY/d = (Pass Yds - Sck Yds + Scrm Yds + 20 * (Pass TDs + Scrm TDs) - 45 * (Int + Scrm Fum) + 9 * (Pass 1st + Scrm 1st)) / ( Att + Sacks + Scrambles)
Whew! That is a monster, but the result is an efficiency stat that is a comprehensive measure of the passing game. Accuracy, vision, reaction to pressure, mobility, pocket presence, ability to move the chains, ball security, and scoring are all in there.
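The monster is easier to read as code. This is a minimal sketch of the final ANY/d formula; the yardage equivalents (20, 45, 9) are the ones used above, while the field names and the stat line are hypothetical.

```python
from dataclasses import dataclass

TD_YARDS = 20          # yardage equivalent of a touchdown
TURNOVER_YARDS = 45    # yardage equivalent of an interception or scramble fumble
FIRST_DOWN_YARDS = 9   # yardage equivalent of a first down

@dataclass
class QBSeason:
    attempts: int
    sacks: int
    scrambles: int
    pass_yards: int
    sack_yards: int
    scramble_yards: int
    pass_tds: int
    scramble_tds: int
    interceptions: int
    scramble_fumbles: int
    pass_first_downs: int
    scramble_first_downs: int

def any_per_dropback(s: QBSeason) -> float:
    """Adjusted Net Yards per Dropback, including the first-down bonus."""
    yards = s.pass_yards - s.sack_yards + s.scramble_yards
    bonus = TD_YARDS * (s.pass_tds + s.scramble_tds)
    penalty = TURNOVER_YARDS * (s.interceptions + s.scramble_fumbles)
    chains = FIRST_DOWN_YARDS * (s.pass_first_downs + s.scramble_first_downs)
    dropbacks = s.attempts + s.sacks + s.scrambles
    return (yards + bonus - penalty + chains) / dropbacks

# Hypothetical stat line (illustration only, not a real QB).
season = QBSeason(
    attempts=500, sacks=30, scrambles=20,
    pass_yards=3600, sack_yards=180, scramble_yards=140,
    pass_tds=30, scramble_tds=1,
    interceptions=8, scramble_fumbles=1,
    pass_first_downs=180, scramble_first_downs=10,
)
any_d = any_per_dropback(season)
print(f"ANY/d = {any_d:.2f}")
```

Keeping the yardage equivalents as named constants makes it easy to try different weights without touching the formula.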
When accounting for these events, Wentz's efficiency improves from 11th (NY/d) to 6th (ANY/d).
So far, I’ve written a lot about measuring QBs in terms of yards. Let’s throw that away.
Not all yards are equal. A 1 yard gain on 3rd & 1 is far more valuable than a 1 yard gain on 1st & 10. A pick-6 is a far worse outcome than a Hail Mary interception. This is why stat-nerds embrace Expected Points.
. . . if we look at all 1st and 10s from an offense’ own 20-yard line, the team on offense will score next slightly more often than its opponent. If we add up all the ‘next points’ scored for and against the offense’s team, whether on the current drive or subsequent drives, we can estimate the net point advantage an offense can expect for any football situation . . . These net point values are called Expected Points (EP), and every down-distance-field position situation has a corresponding EP value.
The difference in EP before and after a play is called Expected Points Added (EPA) and it measures the value of a play’s outcome in terms of points, not yards. For example, look at the following from a Colts game last year where the offense started a new series 11 yards from the end zone.
| Down | Distance | Field Pos. | Start EP | Result | End EP | EPA | Cumulative EPA |
|---|---|---|---|---|---|---|---|
| 1 | 10 | opp 11 | 5.41 | 3 yard run | 5.24 | -0.18 | -0.18 |
| 2 | 7 | opp 8 | 5.24 | 2 yard run | 4.64 | -0.59 | -0.77 |
| 3 | 5 | opp 6 | 4.64 | 6 yard pass: Touchdown | 7 | 2.36 | 1.59 |
On 1st down, 7 points is probable, but not guaranteed, so the EP is 5.41. After a 3 yard run, the EP drops to 5.24, reflecting a lowered TD probability, so the value of that play was 5.24 - 5.41 ≈ -0.18 EPA (the EP values shown are rounded).
Another run of only 2 yards earns -0.59 EPA. Since the team has burned 2 plays and is now less likely to score, the cumulative series EPA is -0.77. However, when Rivers throws a TD on 3rd down, the scoring likelihood is 100% and so the play earns +2.36 EPA.
Had that pass been incomplete or short of the first down, the EP would have dropped significantly representing the increased probability of a field goal attempt or a turnover on downs and Rivers would have earned negative EPA.
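The EPA arithmetic for that series can be sketched as follows. The EP values are the rounded ones from the table, so the first two per-play results come out 0.01 off the table’s EPA column; I’m assuming that gap is just rounding in the displayed EP values. The cumulative total matches.

```python
from itertools import accumulate

# (description, start EP, end EP) taken from the series in the table.
plays = [
    ("1st & 10, 3 yard run", 5.41, 5.24),
    ("2nd & 7, 2 yard run", 5.24, 4.64),
    ("3rd & 5, 6 yard pass: Touchdown", 4.64, 7.00),
]

# EPA is the change in Expected Points caused by the play.
epa = [end_ep - start_ep for _, start_ep, end_ep in plays]
cumulative = list(accumulate(epa))

for (description, _, _), play_epa, running in zip(plays, epa, cumulative):
    print(f"{description:<34} EPA = {play_epa:+.2f}  cumulative = {running:+.2f}")
```

Note that the series total is simply the final EP minus the starting EP (7 - 5.41), no matter how the value was distributed across the individual plays.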
If you take the average EPA of every passing play, you get EPA per dropback (EPA/d), which is the average value a QB adds. It is the translation of skills into value measurement.
In 2017, Wentz had the 2nd best EPA/d, while the #1 spot went to the MVP winner, Tom Brady. 17 of the last 19 MVP QBs finished the season #1 or #2 in EPA/d, and 20 of the last 22 Super Bowls were won by the team with the better EPA/d.
As these stats incorporate more and more QB skills, they explain offensive scoring better and better. Here are correlations of each metric to points scored.
The black bars compare the metrics to points scored within the game (explanatory), while the green bars are comparisons to points scored in different games (predictive).
aYd/c and YAC by themselves aren’t correlated well to points, but combine them into Yd/c and the correlations are better. Add accuracy measures to get YPA and the correlation grows dramatically. Add in mobility and reaction to pressure for NY/d and it’s stronger still. The most comprehensive skill stats (ANY/d, EPA/d) are the most related to scoring.
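For readers curious how an explanatory versus predictive correlation might be computed, here is one possible sketch. The per-game numbers are invented, and pairing each game’s metric with the next game’s points is only my assumption about what "different games" could mean; it is not necessarily how the chart was built.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical per-game EPA/d and points for one team (illustration only).
epa_per_dropback = [0.25, -0.10, 0.05, 0.30, -0.20, 0.15]
points = [31, 13, 20, 34, 10, 27]

# Explanatory: metric vs. points scored in the same game.
explanatory = pearson(epa_per_dropback, points)

# Predictive (assumed definition): metric vs. points in the following game.
predictive = pearson(epa_per_dropback[:-1], points[1:])

print(f"explanatory r = {explanatory:.2f}, predictive r = {predictive:.2f}")
```

Six games is far too small a sample to conclude anything; the sketch only shows the mechanics of the two comparisons.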
Advanced QB efficiency stats reflect how QB skills present on the field. However, in isolation, they don’t delineate individual strengths and weaknesses. A 0.05 EPA/d tells me the QB performance was average overall, but it doesn’t tell me if that QB was mobile, a bad deep ball passer, etc. More than a single stat is required to assess the whole skillset.
Additionally, QB efficiency is impacted by more than just QB talent. O-line, receivers, coaching, game plan, injury, opponent defense, even weather can play a part. The stats are never a pure QB skill measure.
As such, many tend to give little credence to them, claiming that they simply can’t capture the true performance of a QB. I’ll address that more in part 2, but for now let’s hope they are right because this was Carson Wentz in 2020 . . . yikes.
aYd/c: Air Yards per Completion. The distance from the line of scrimmage to the point of reception.
YAC: Yards After Catch. The distance between point of reception and the spot of the football at the end of the play.
Yd/c: Yards per Completion
Cmp%: Completion Rate
XC%: Expected Completion rate. Completion rate adjusted for distance thrown, field position and other variables.
CPOE: Completion Percent Over Expected. Cmp% - XC%
YPA: Yards per Attempt
SCK%: Sacks as a percentage of dropbacks
Y/sck: Avg Sack Yards
NY/a: Net Yards per attempt. (Passing Yards - Sack Yards) / (Att + Sacks)
SCR%: Scrambles as a percentage of dropbacks
Y/scr: Avg Scramble Yards
NY/d: Net Yards per dropback. (Passing Yards - Sack Yards + Scramble Yards) / (Att + Sacks + Scrambles)
TD%: Touchdown rate. Passing TDs / dropbacks
INT%: Interception rate. INTs / dropbacks
ANY/d: Adjusted Net Yards per dropback. (Pass Yds - Sck Yds + Scrm Yds + 20 * (Pass TDs + Scrm TDs) - 45 * (Int + QB Fum) + 9 * (Pass 1st + Scrm 1st)) / ( Att + Sacks + Scrambles)
EPA/d: Expected Points Added per dropback.