QB Stats Matter (pt. 1)

This article uses statistics pulled from Football Outsiders and the nflscrapR project.

I don’t discuss the effectiveness of quarterbacks without relying on statistics to make my arguments. Often in those discussions, the person with whom I am engaged will offer something along the lines of “stats don’t tell the whole story” or “I would rather have a good QB than a QB with good stats”. Usually, I will hear musings about "winners" and "clutchness" and other football woo.

I’m not saying that people with those types of beliefs about QBs are wrong . . . but they are wrong. While stats clearly can never tell the whole story, they certainly can tell the vast majority of it. And while a good QB is always the desire, good QBs simply don’t exist without good measurables.

Let’s walk through a simple example.

WIN RATE

I often see win record thrown around as proof that a QB can be more than his stats. Some QBs are just winners. The following chart plots average passing yards per game against win rate for every team from 2009 through 2019:

Notice that there is a slight upward slope in the trendline, which suggests that as QB yard totals increase, so does the team win rate. However, the line doesn’t fit the data all that well. The measure of this fit is shown with the “R-squared” value in the formula box in the lower right corner. The value of 0.2824 can be interpreted as 28.24% of the variation in team win rate being explained by passing yards.
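For anyone who wants to check an R-squared value by hand, here is a minimal Python sketch of how it falls out of a straight-line fit. The function name `r_squared` and the sample numbers are made up purely for illustration; they are not the data behind the chart.

```python
def r_squared(x, y):
    """R-squared of a least-squares straight-line fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope and intercept of the ordinary least-squares trendline.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the fit vs. total variation in y.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical numbers for illustration only, NOT real team data.
pass_yards_per_game = [210, 225, 232, 240, 255, 273]
win_rate = [0.40, 0.55, 0.60, 0.45, 0.58, 0.42]
print(round(r_squared(pass_yards_per_game, win_rate), 4))
```

A value near 1 means the points hug the trendline; a value near 0 means the line tells you almost nothing about where a given team lands.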

That’s not all that great, and it means there are plenty of teams whose passing yards per game don’t correspond with their win rate. I have highlighted Seattle and Detroit as 2 examples of this. The Seahawks have the 6th best win rate (59.7%) but rank 22nd in passing yards at 232 per game. Conversely, the Lions average the 5th most passing yards (273) but place 26th in win rate (42%). This suggests that Russell Wilson (since 2012) and Matt Stafford may have value not measured by these stats.

I guess stats really don’t tell the whole story.

EFFICIENCY

The problem with measuring a QB by passing yards is that game script biases the measure. When teams lead late in the game they tend to stop passing and when they trail they tend to pass more, which leads to a disconnect between passing yards and wins.

However, this isn’t true with passing efficiency stats. The simplest passing efficiency stat to consider is yards per attempt, but since all of you are more advanced than the casual fan (else why read this?), I’ll use an advanced efficiency stat. Instead of just measuring attempts, I will measure all dropbacks by including sacks and QB scrambles. The resulting stat is net yards per dropback (NY/db).
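A rough sketch of the arithmetic, for the curious. The article only says that sacks and scrambles are counted as dropbacks; subtracting sack yardage and adding scramble yardage to the numerator is an assumption here, and the function name and stat line are hypothetical.

```python
def net_yards_per_dropback(pass_yards, pass_attempts, sacks,
                           sack_yards_lost, scrambles, scramble_yards):
    """Net yards per dropback (NY/db).

    Assumption: net yards = passing yards - sack yards lost + scramble yards,
    and dropbacks = pass attempts + sacks + scrambles.
    """
    dropbacks = pass_attempts + sacks + scrambles
    if dropbacks == 0:
        raise ValueError("no dropbacks to average over")
    net_yards = pass_yards - sack_yards_lost + scramble_yards
    return net_yards / dropbacks


# Hypothetical single-game stat line, for illustration only.
ny_db = net_yards_per_dropback(
    pass_yards=250, pass_attempts=35,
    sacks=3, sack_yards_lost=20,
    scrambles=2, scramble_yards=15,
)
print(ny_db)       # 245 net yards over 40 dropbacks -> 6.125
print(250 / 35)    # plain yards per attempt, for comparison
```

The same game looks about a yard per play worse once the sacks are charged to the passing game, which is the point of counting every dropback.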

Notice that with NY/db, the R-squared value has jumped to 0.4561 indicating a much better fit of the data to the trend line. This is why data-geeks like me bang the drum so loudly about passing efficiency stats. They have much less bias.

With NY/db, the Seahawks’ 11th-ranked passing is much more in line with their 6th-ranked win rate, and the Lions’ data also makes more sense (18th-ranked NY/db, 26th win %).

However, even with the improvement, the data still has problems. The Seahawks’ data point sits well above the trendline, suggesting they win more games than passing efficiency alone can explain, and Detroit, being substantially below the line, wins far fewer games than they should. The Chargers have the 2nd best NY/db but barely win more than 50% of their games.

I guess stats really don’t tell the whole story.

POINTS

OR . . . maybe we should not measure a team result with a purely offensive metric. Between 2009-2019, Seattle teams had the 3rd best defenses measured by cumulative DVOA and conversely, Detroit’s defenses ranked 28th. So, maybe those extreme results had some impact on their win totals.

To test that, let’s forget about wins and instead use an offense-only measure: points, or more specifically points per drive (PPD) (1), since tempo dictates different drive volumes between teams.
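To see why a per-drive measure matters, here is a small Python sketch under made-up numbers. The `Drive` type, the `cut_off_by_half` flag and both drive logs are hypothetical; the only rule borrowed from the footnote is that drives cut short by the end of a half are left out.

```python
from typing import NamedTuple


class Drive(NamedTuple):
    points: int
    cut_off_by_half: bool = False  # True if the half expired mid-drive


def points_per_drive(drives):
    """Average points over drives that finished before the end of a half."""
    counted = [d for d in drives if not d.cut_off_by_half]
    if not counted:
        raise ValueError("no completed drives")
    return sum(d.points for d in counted) / len(counted)


# Hypothetical drive logs, for illustration only.
fast_team = [Drive(p) for p in (7, 0, 3, 7, 0, 0, 3, 7, 0, 0, 0, 0)]
slow_team = [Drive(p) for p in (7, 3, 0, 7, 0, 7, 0, 0, 0)]
slow_team.append(Drive(0, cut_off_by_half=True))  # excluded from the average

print(sum(d.points for d in fast_team), points_per_drive(fast_team))
print(sum(d.points for d in slow_team), points_per_drive(slow_team))
```

In this toy case the up-tempo team scores more total points (27 vs. 24) but the slower team is more productive per drive, which is the distortion PPD is meant to remove.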

A QB measure (e.g. NY/db) explains an offensive result (points per drive) far better than it explains a result shaped by both offense and defense (wins). In other words, this confirms that one cannot measure a team result solely with a QB metric. The dramatically increased R-squared tells us that 78% of the variation in PPD is explained by NY/db.

Notice that both Seattle and Detroit are now basically on the trend line. Hooray! The eternal mystery of why win rate doesn’t match passing efficiency stats has finally been solved. It’s not because some QBs are clutch winners or that the magical football fairy adds some value to QB play not captured in the data. It’s because defense matters. I mean, who saw that coming? I would have never guessed. Way outta left field.

What about the teams off of the trend line? The Chargers’ defense, which ranked 20th between 2009-2019, explained some, but not all, of the previous discrepancy. They edged closer to the trend line, but they are still significantly off it. Numerically, NY/db predicts they should have had 2.31 points per drive but they actually only had 2.10.

I guess stats really don’t tell the whole story.

EPA

Hold on, I’m not out of tricks yet. Yards may be the basic measure in football, but they are not the best measure. Their value changes with situation. Gaining 1 yard on 1st & 10 is not nearly as valuable as gaining 1 yard on 4th & 1. Similarly, an interception is counted as 0 yards in NY/db but the impact to the game is far greater than nothing.

To account for this, data analytics dudes use Expected Points Added (EPA) instead of yards. For those not familiar with EPA, here’s a link to a good overview written by Brian Burke, the guy who invented it. Basically, EPA accounts for down, distance, field position and a lot of other variables to give a comparable incremental value for each play. Here is the data again using EPA per dropback.
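As a rough sketch of the idea, EPA for a play is just the expected points after the play minus the expected points before it. The expected-points values below are invented placeholders, not output from any real model, and the function names are hypothetical.

```python
def epa(ep_before, ep_after):
    """Expected Points Added: change in expected points caused by one play."""
    return ep_after - ep_before


def epa_per_dropback(plays):
    """Average EPA over a list of (ep_before, ep_after) dropbacks."""
    if not plays:
        raise ValueError("no dropbacks")
    return sum(epa(before, after) for before, after in plays) / len(plays)


# HYPOTHETICAL expected-points values, chosen only to show the mechanics.
one_yard_on_1st_and_10 = (1.0, 0.75)  # now 2nd & 9: drive is slightly worse off
one_yard_on_4th_and_1 = (0.5, 2.0)    # converts: drive stays alive
interception = (1.5, -1.5)            # opponent now holds the expected points

for play in (one_yard_on_1st_and_10, one_yard_on_4th_and_1, interception):
    print(epa(*play))

print(epa_per_dropback(
    [one_yard_on_1st_and_10, one_yard_on_4th_and_1, interception]))
```

Under these placeholder values the two 1-yard gains get very different credit and the interception is heavily negative, even though a pure yardage stat would score them as 1, 1 and 0.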

Using a better measure, we get better results: 92% of the variation in a team’s points per drive is explained by our passing efficiency stat. In other words, the passing stat is describing the level of offensive output comprehensively.

If that were not the case, then those data points wouldn’t all line up so neatly and you would see some QB’s impact stick out like a sore thumb either far above or far below that trendline. Clearly, that doesn’t happen.

As for the Chargers, EPA/db predicts 2.25 points per drive, inching them closer to their actual 2.10. It’s not exact, but I’ve always said that stats don’t tell the whole story.
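For the numerically inclined, the Chargers' miss under each metric is simply actual minus predicted. This short sketch uses only the three figures quoted in this article; the `residual` helper is a hypothetical name.

```python
def residual(actual, predicted):
    """How far the actual result sits from the trendline's prediction."""
    return actual - predicted


chargers_actual_ppd = 2.10
chargers_predicted_ppd = {"NY/db": 2.31, "EPA/db": 2.25}

for metric, predicted in chargers_predicted_ppd.items():
    miss = residual(chargers_actual_ppd, predicted)
    print(f"{metric}: {miss:+.2f} points per drive")
```

Moving from NY/db to EPA/db shrinks the gap from about 0.21 to about 0.15 points per drive.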

CONCLUSION

Most QB data is not some crude estimate of what goes on in a game. It is literally the measure of events that transpire. There simply aren’t large chunks of missing player value outside of the stats like some sort of QB dark matter.

Now, you could argue that the measurements reflect the surrounding talent (teammates, coaches, etc.) as much as they reflect QB talent. You could say that strength of opponent makes it difficult to directly compare QB metrics, or raise the problem of comparing across different time periods. I would agree that all of those are very real confounding factors.

What I don’t agree with is the idea that there is some mystical, unmeasurable, undefinable effort put forth by QBs that elevates (or depresses) their game significantly beyond the stats. If their play on the field adds value, then it shows up in the numbers. If reading defenses and play-calling at the line adds value, then it shows up in the numbers. If leadership adds value, then it shows up in the numbers.

Bottom line, if it doesn’t show up in the numbers, then either you are using the wrong numbers or it isn’t a real thing.

FOOTNOTES

(1) Points per drive is calculated on all drives that complete before the end of a half. Point totals include points given on turnover scores and safeties. Successful field goals, extra points and 2-point conversions are also included.