Winning Stats - Explained!

During the offseason last year, we looked at an array of stats reaching across all aspects of the game. We found out which stats lead to the most wins (Drive Success Rate and Adjusted Net Passing Yards / Attempt), and which ones lead to the fewest wins (Yards / Carry and Net Punting Average), based on statistics from 2001 - 2008. We had lots of time to figure out which stats we wanted to look at, but I jumped into the analysis articles without a full explanation of my methodology.

Then, when I finished those around August, I had a whirlwind month where I basically had to put this stuff to the side until kickoff of Week 1, so I rushed into the season without a way of predicting games, which is ultimately my end goal in doing all this. I played around a little bit, settled on a pretty good method, ran with it, and gave a short explanation of what I was doing. I did the same thing with Adjusting for Opponents, but I don't think I did a good job of fully explaining it.

After the jump I'll fully explain everything I did, complete with pictures and examples, so that you can all become experts on this stuff, and share the knowledge with the ill-informed.

Power Rankings

In order to compare all 16 stats, with their varying sizes (3rd/4th Down Conversion Pct. is a much smaller value than Yards / Drive), and a few stats where smaller is better (Turnovers and 3 & Outs), we need to get all stats onto a common scale, namely a value between 0 and 1. This will be the easiest to work with, and the easiest to find. But how do we go about finding that value? It's time for Stats 101, and a look at distributions. For this explanation, I'm going to use ANPY/A (Adjusted Net Passing Yards / Attempt), since it is now the best stat we've found.

Two of the most common distributions are the Uniform and the Normal. The Uniform Distribution is the simplest: any value on the distribution has the same probability of occurring. For our example, it would be saying that an ANPY/A of 0 is just as likely as an ANPY/A of 5, which is just as likely as an ANPY/A of 15 (you get the point). It would look like this (don't worry about the scale or labels):

[Figure: Uniform Distribution]

The second common distribution is the Normal Distribution, better known as the "bell curve". It means that the probability is much higher near the average, and gets smaller the farther you get from the average. For our example, it would mean an ANPY/A of 5 (the average is about 5.4) is more likely than an ANPY/A of 15. Here's a picture of what this one looks like:

[Figure: Normal Distribution (bell curve)]

So which one am I going to use? Here's the histogram of ANPY/A from 2001 - 2008:


[Figure: histogram of ANPY/A, 2001 - 2008]

As you can tell quite easily, the histogram is bell-shaped, so I'll be using a Normal Distribution to compare the stats. Pretty cool, huh? From here I let Excel do the work: I give it the value I need converted, the average, and the standard deviation, and it gives me back a number between 0 and 1, based on where the value falls on the graph above (the share of the curve sitting to the left of that value).

Looking at the graph above, the new value gets higher as you go from left to right, so high ANPY/A -> high value.  So how do you get the value for defense, as you want the lowest ANPY/A?  Just subtract your value from 1.  I'll show you an example from Week 2 against the Dolphins:

| | ANPY/A | Dist. Value | New Value |
|---|---|---|---|
| Offense | 13.958 | 0.99874 | 0.99874 |
| Defense | 3.400 | 0.23821 | 0.76179 |

The offense was phenomenal, getting almost the full 1 point.  The defense was also pretty good that Monday Night (probably because the Dolphins ran the ball so well), but you see how you can't use the same calculation as the offense, as it would make a good performance look bad.  Just subtract that value from 1, and you've got your new, ready to be weighted, value.  This is also true for a stat like Turnovers, where the offense would need the 1-Value calculation, and the defense would not, since low = better for offense, and high = better for defense.
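For anyone who would rather script this than use Excel, here is a minimal Python sketch of the conversion and the 1 - Value flip. The mean (5.414) and standard deviation (2.83) are assumptions: neither is stated outright in this post, so they are back-fitted from the example numbers here. The function name and the `higher_is_better` flag are made up for illustration.

```python
from statistics import NormalDist

# ASSUMED parameters, back-fitted from the numbers in this post;
# they are not values published by the author.
ASSUMED_MEAN = 5.414
ASSUMED_SD = 2.83


def normalized_value(stat, mean, sd, higher_is_better=True):
    """Normal CDF of `stat`, flipped to 1 - value when lower is better."""
    value = NormalDist(mu=mean, sigma=sd).cdf(stat)
    return value if higher_is_better else 1.0 - value


# Week 2 against the Dolphins: a high ANPY/A is good for the offense,
# while a low ANPY/A allowed is good for the defense.
offense = normalized_value(13.958, ASSUMED_MEAN, ASSUMED_SD)
defense = normalized_value(3.400, ASSUMED_MEAN, ASSUMED_SD, higher_is_better=False)

print(f"Offense: {offense:.5f}")
print(f"Defense: {defense:.5f}")
```

With those assumed parameters the results land close to the 0.99874 and 0.76179 in the table; any small gap comes from the back-fitted inputs. For Turnovers you would flip the flag the other way: `higher_is_better=False` for the offense and `True` for the defense.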

Just to give you an idea what various ANPY/A values would be converted to:

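| ANPY/A | -4 | -2 | 0 | 2 | 4 | 5 | 5.5 | 6 | 6.5 | 7 | 8 | 10 | 12 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Norm. Value | 0.0004 | 0.0044 | 0.0278 | 0.1137 | 0.3085 | 0.4417 | 0.5120 | 0.5820 | 0.6494 | 0.7124 | 0.8196 | 0.9475 | 0.9900 | 0.9988 |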

You'll notice that the Normalized Value (statistical name) changes drastically between 4 and 7, but not so drastically when you get to the fringes. When you think about it, it makes sense. Is there really that much difference between getting 12 Y/A and 14 Y/A? Not really, but there is a big difference between 5 and 6 Y/A.
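To put numbers on that flattening at the fringes, here is a small Python sketch that regenerates the conversion table and compares the jump from 5 to 6 with the jump from 12 to 14. The mean and standard deviation are assumed values back-fitted from the table, not figures the post publishes.

```python
from statistics import NormalDist

# ASSUMED mean and standard deviation, back-fitted from the table above.
dist = NormalDist(mu=5.414, sigma=2.83)

# Regenerate the ANPY/A -> Normalized Value table.
for anpya in (-4, -2, 0, 2, 4, 5, 5.5, 6, 6.5, 7, 8, 10, 12, 14):
    print(f"{anpya:>5}: {dist.cdf(anpya):.4f}")

# One extra yard near the average versus two extra yards at the fringe.
gain_middle = dist.cdf(6) - dist.cdf(5)
gain_fringe = dist.cdf(14) - dist.cdf(12)
print(f"5 -> 6:   +{gain_middle:.4f}")
print(f"12 -> 14: +{gain_fringe:.4f}")
```

The gain near the average is more than ten times the gain at the fringe, which is the same story the table tells.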

OK, now we have our Normalized Values for all 16 stats, on both offense and defense. Our next step is to weight them according to how important they are. How important is each of them? That depends on how many wins they lead to, which I've been reporting each and every week. Here's how teams have done since 2001 when both the Offense and Defense are above average in a stat (league averages included as well):

| Statistic | Average | Record | Win % |
|---|---|---|---|
| ANPY/A | 5.441 | 1104-120 | 90.2% |
| DSR | 69.1% | 856-102 | 89.4% |
| Turnovers | 1.75 | 1037-223 | 82.3% |
| Yds/Drive | 28.49 | 812-177 | 82.1% |
| ToP/Drive | 2:39.3 | 961-249 | 79.4% |
| Yds/Play | 5.173 | 794-210 | 79.1% |
| First Downs/Drive | 1.63 | 755-230 | 76.6% |
| 3rd/4th Down | 39.1% | 849-270 | 75.9% |
| Avg Start Pos | 31.1 | 1016-373 | 73.1% |
| 3 and Outs | 3.91 | 656-275 | 70.5% |
| RZ Eff | 65.6% | 798-351 | 69.5% |
| Plays/Drive | 5.509 | 712-359 | 66.5% |
| Penalty Yds / Play | 0.816 | 625-393 | 61.4% |
| RB Success | 45.7% | 696-477 | 59.3% |
| Yds/Carry | 4.14 | 615-509 | 54.7% |
| Net Punts Yds/Game | 38.12 | 546-502 | 52.1% |

I take the weights from the Winning Percentage, making a few adjustments. This is the one area where I think I can improve the rankings the most, as I just kind of guessed when initially adjusting these numbers. I also want to make the weights such that I get an actual point value out of it, which I'm not doing as of now.

To get the final value for both the Offense and Defense, multiply each Normalized Value found earlier by its stat weight, then add together all 16 stats for the Offense and for the Defense, and you have a total value for each side of the ball. The Total is a simple addition of the two. Here's how the Colts finished this year:

| Team | Offense | Defense | Total |
|---|---|---|---|
| Colts | 26.516 | 10.735 | 37.250 |

After you get these values for each team, you can rank them, and that's how I got my Offensive, Defensive, and Total Power Rankings.
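The weight-and-sum step can be sketched in a few lines of Python. Everything in the example is a stand-in: the two teams and their Normalized Values are invented, only three of the 16 stats are shown, and the weights are the raw win percentages from the table above, not the adjusted weights actually used in the rankings (those aren't listed here).

```python
# Stand-in weights: raw win percentages for three stats, as fractions.
# The real rankings use adjusted weights that this post does not list.
weights = {"ANPY/A": 0.902, "DSR": 0.894, "Turnovers": 0.823}

# Hypothetical Normalized Values (0 to 1) for two made-up teams.
teams = {
    "Team A": {
        "off": {"ANPY/A": 0.80, "DSR": 0.70, "Turnovers": 0.55},
        "def": {"ANPY/A": 0.60, "DSR": 0.45, "Turnovers": 0.50},
    },
    "Team B": {
        "off": {"ANPY/A": 0.50, "DSR": 0.60, "Turnovers": 0.40},
        "def": {"ANPY/A": 0.70, "DSR": 0.65, "Turnovers": 0.60},
    },
}


def side_score(norm_values, weights):
    """Weighted sum of Normalized Values for one side of the ball."""
    return sum(norm_values[stat] * weights[stat] for stat in weights)


scores = {}
for name, sides in teams.items():
    offense = side_score(sides["off"], weights)
    defense = side_score(sides["def"], weights)
    scores[name] = {"off": offense, "def": defense, "total": offense + defense}

# Rank teams from best to worst by Total.
ranking = sorted(scores, key=lambda name: scores[name]["total"], reverse=True)
for rank, name in enumerate(ranking, start=1):
    s = scores[name]
    print(f"{rank}. {name}: {s['off']:.3f} + {s['def']:.3f} = {s['total']:.3f}")
```

Sorting on `"off"` or `"def"` instead of `"total"` would give the Offensive and Defensive rankings the same way.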

Adjusting for Opponent

Because of the 16-game schedule, it's impossible for an NFL team to play each of the other 31 teams, so playing an easier schedule can help pad a team's stats, especially with a relatively small sample size. In order to "level the playing field", we can adjust our stats for the opponent played, so each game is treated as if it were played against an "average" team. So how do you do that? Let's take a look...

First thing is to explain how we'll be looking at the numbers. Every number you see in this section, until the very end, will be the statistic relative to the overall average. That means you'll see both positive and negative numbers, depending on whether they are above or below average. We'll be using ANPY/A as our example again, which means on Offense: Positive -> Good, Negative -> Bad; on Defense: Negative -> Good, Positive -> Bad.

Here are the Colts' numbers for 2009:

| Team | Raw Off Avg | Opp Def Avg | Adj Off Avg | Raw Def Avg | Opp Off Avg | Adj Def Avg |
|---|---|---|---|---|---|---|
| Colts | 1.95243 | 0.23266 | 1.71977 | -0.49511 | 0.07257 | -0.56767 |

To get the Adjusted columns, I just subtracted the Opponent Average from the Raw Average (real advanced Math there).  I'll explain what these mean:

  • Raw Offensive Average was 1.95 above average (the ANPY/A average, from the table above, is 5.441). The defenses the Colts Offense faced this year averaged +0.23, meaning they allowed 0.23 more than average (remember, on Defense, Positive -> Bad), so the adjusted number should be slightly lower because, on average, the defenses faced were below average.
  • Raw Defensive Average was -0.495, or 0.495 better than average, and the offenses the Colts Defense faced were 0.07 above average, so the adjusted number gets slightly better, to -0.57 (0.57 better than average). Facing better offenses means a better Adjusted Defensive Average.

These same calculations are done for each team, so now each team has a new Adjusted Offensive and Defensive Average.  Now comes the tricky part... Initially, we used raw numbers for the Opponents Average, because that's all we had.  Now, however, we have these new Adjusted numbers, which is what we really want, right?  Let's take a look at what the Colts numbers look like after doing this:

| Team | Raw Off Avg | Opp Def Avg | Adj Off Avg | Raw Def Avg | Opp Off Avg | Adj Def Avg |
|---|---|---|---|---|---|---|
| Colts | 1.95243 | 0.12223 | 1.83020 | -0.49511 | -0.18188 | -0.31323 |

You can see the difference already after just one iteration. The defenses faced have gotten slightly better, and the offenses faced have gotten worse. But we're not close to being done yet. Once again, we do this for every team, and each team, once again, has a new Adjusted Offensive and Defensive Average. Wash, Rinse, Repeat. Here are the next few iterations:

| Team | Raw Off Avg | Opp Def Avg | Adj Off Avg | Raw Def Avg | Opp Off Avg | Adj Def Avg |
|---|---|---|---|---|---|---|
| Colts | 1.95243 | 0.15116 | 1.80127 | -0.49511 | -0.19714 | -0.29796 |
| Colts | 1.95243 | 0.15731 | 1.79512 | -0.49511 | -0.19112 | -0.30398 |
| Colts | 1.95243 | 0.16266 | 1.78976 | -0.49511 | -0.18582 | -0.30929 |
| Colts | 1.95243 | 0.16734 | 1.78509 | -0.49511 | -0.18116 | -0.31394 |
| Colts | 1.95243 | 0.17142 | 1.78100 | -0.49511 | -0.17708 | -0.31802 |

You'll notice that the differences between the Adjusted numbers from iteration to iteration are getting smaller and smaller, and they will continue to do so until the change is so small that the numbers are basically identical. It's a pretty cool process, and eventually you have a final Adjusted Average. Here are the Colts' final numbers:

| Team | Raw Off Avg | Opp Def Avg | Adj Off Avg | Raw Def Avg | Opp Off Avg | Adj Def Avg |
|---|---|---|---|---|---|---|
| Colts | 1.95243 | 0.20011 | 1.75232 | -0.49511 | -0.14840 | -0.34670 |
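Here is a rough Python sketch of that wash-rinse-repeat loop. It is not the Excel setup from PFR, just one way to write the same idea: start from the raw numbers, recompute every team's opponent averages from the latest adjusted numbers, and stop once nothing moves. The four-team round-robin league, its numbers, the tolerance, and the function name are all invented for illustration.

```python
def adjust_for_opponents(raw_off, raw_def, schedule, tol=1e-9, max_iter=1000):
    """Iteratively adjust offense/defense averages for opponents faced.

    raw_off, raw_def: team -> stat relative to the league average.
    schedule: team -> list of opponents faced (one entry per game).
    """
    # The first pass has nothing better to use than the raw numbers.
    adj_off, adj_def = dict(raw_off), dict(raw_def)
    for _ in range(max_iter):
        new_off, new_def = {}, {}
        for team, opps in schedule.items():
            opp_def_avg = sum(adj_def[o] for o in opps) / len(opps)
            opp_off_avg = sum(adj_off[o] for o in opps) / len(opps)
            new_off[team] = raw_off[team] - opp_def_avg
            new_def[team] = raw_def[team] - opp_off_avg
        # Largest change for any team on either side of the ball.
        change = max(
            max(abs(new_off[t] - adj_off[t]) for t in schedule),
            max(abs(new_def[t] - adj_def[t]) for t in schedule),
        )
        adj_off, adj_def = new_off, new_def
        if change < tol:
            break
    return adj_off, adj_def


# Hypothetical league: every team plays every other team once.
raw_off = {"A": 1.5, "B": 0.5, "C": -0.5, "D": -1.5}
raw_def = {"A": -0.6, "B": 0.2, "C": -0.4, "D": 0.8}
schedule = {t: [o for o in raw_off if o != t] for t in raw_off}

adj_off, adj_def = adjust_for_opponents(raw_off, raw_def, schedule)
for team in schedule:
    print(f"{team}: Adj Off {adj_off[team]:+.4f}, Adj Def {adj_def[team]:+.4f}")
```

This toy league settles quickly because everyone plays everyone. The Colts' numbers above move much more slowly from pass to pass, so a real NFL schedule will likely need many more iterations before the change drops below the tolerance.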

Now that we have the final Adjusted Averages, all we have to do is add back in the Overall League Average (5.441), and we have our Adjusted ANPY/A stats: Colts Offense -> 7.193, Colts Defense -> 5.094. Both stats were slightly worse than the raw numbers (7.393 and 4.946), meaning the Colts played a slightly below-average schedule in ANPY/A. Here's a game-by-game breakdown for the Colts:

| Opponent | Week | Offense | Defense | Opp Def | Opp Off |
|---|---|---|---|---|---|
| Jaguars | 1 | 1.48287 | -1.48354 | 1.92052 | 0.12280 |
| Dolphins | 2 | 8.54376 | -2.01457 | 0.90887 | -0.41280 |
| Cardinals | 3 | 6.41400 | -1.46629 | 0.00730 | 0.11771 |
| Seahawks | 4 | 3.07323 | -0.41457 | 1.48235 | -1.26869 |
| Titans | 5 | 1.94907 | -2.35901 | 0.32873 | -0.51767 |
| Rams | 7 | 3.26190 | -4.48354 | 2.07610 | -2.58277 |
| 49ers | 8 | 1.68158 | -1.05346 | -0.17824 | -1.10106 |
| Texans | 9 | -0.96174 | -0.28124 | 0.50408 | 1.56740 |
| Patriots | 10 | 1.38543 | 3.19907 | -0.20518 | 2.42205 |
| Ravens | 11 | 1.97253 | 0.61400 | -0.69262 | 0.15670 |
| Texans | 12 | -0.06322 | -0.32366 | 0.50408 | 1.56740 |
| Titans | 13 | 2.42327 | -0.23275 | 0.32873 | -0.51767 |
| Broncos | 14 | -1.48600 | 0.42634 | -1.21608 | 0.47670 |
| Jaguars | 15 | 6.01876 | -0.15267 | 1.92052 | 0.12280 |
| Jets | 16 | -0.20869 | -1.08124 | -2.24549 | -1.20342 |
| Bills | 17 | -4.24790 | 3.18543 | -2.24192 | -1.32391 |

This same process happens for each of the 16 stats, and now you have all the stats Adjusted for Opponent! You can then rank these accordingly, just like the Raw numbers. The nice thing about both the Adjusted and Raw stats is that I can rank them overall by team, weekly over the whole season, or even within a week. That's why having everything relative to the average works out so nicely.
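If you want to check the Colts' final ANPY/A numbers yourself, the game-by-game table above has everything needed. The Python sketch below averages each column, subtracts the opponent averages, and adds back the 5.441 league average. One assumption to flag: the Opp Def and Opp Off columns look like the opponents' final adjusted numbers, since their averages match the final Colts table, and the sketch treats them that way.

```python
LEAGUE_AVG = 5.441  # overall ANPY/A average from the stats table

# (opponent, week, offense, defense, opp_def, opp_off), all relative to average
games = [
    ("Jaguars", 1, 1.48287, -1.48354, 1.92052, 0.12280),
    ("Dolphins", 2, 8.54376, -2.01457, 0.90887, -0.41280),
    ("Cardinals", 3, 6.41400, -1.46629, 0.00730, 0.11771),
    ("Seahawks", 4, 3.07323, -0.41457, 1.48235, -1.26869),
    ("Titans", 5, 1.94907, -2.35901, 0.32873, -0.51767),
    ("Rams", 7, 3.26190, -4.48354, 2.07610, -2.58277),
    ("49ers", 8, 1.68158, -1.05346, -0.17824, -1.10106),
    ("Texans", 9, -0.96174, -0.28124, 0.50408, 1.56740),
    ("Patriots", 10, 1.38543, 3.19907, -0.20518, 2.42205),
    ("Ravens", 11, 1.97253, 0.61400, -0.69262, 0.15670),
    ("Texans", 12, -0.06322, -0.32366, 0.50408, 1.56740),
    ("Titans", 13, 2.42327, -0.23275, 0.32873, -0.51767),
    ("Broncos", 14, -1.48600, 0.42634, -1.21608, 0.47670),
    ("Jaguars", 15, 6.01876, -0.15267, 1.92052, 0.12280),
    ("Jets", 16, -0.20869, -1.08124, -2.24549, -1.20342),
    ("Bills", 17, -4.24790, 3.18543, -2.24192, -1.32391),
]


def column_mean(index):
    """Average of one numeric column across all games."""
    return sum(game[index] for game in games) / len(games)


raw_off, raw_def = column_mean(2), column_mean(3)
opp_def_avg, opp_off_avg = column_mean(4), column_mean(5)

# Adjusted = Raw - Opponent Average, then add the league average back in.
adj_off = raw_off - opp_def_avg
adj_def = raw_def - opp_off_avg
final_off = adj_off + LEAGUE_AVG
final_def = adj_def + LEAGUE_AVG

print(f"Adj Off Avg {adj_off:.5f} -> Adjusted ANPY/A {final_off:.3f}")
print(f"Adj Def Avg {adj_def:.5f} -> Adjusted ANPY/A allowed {final_def:.3f}")
```

The same averaging works within a single week or over any stretch of games, which is the payoff of keeping every number relative to the league average.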

This whole Adjusting for Opponents process came from Pro-Football-Reference, and a special thanks goes to Neil Paine, who sent me a video on how to set it up in Excel. If anyone wants help doing this, please let me know, and I'd be more than happy to help. You can also thank PFR for coming up with the fantastic Adjusted Net Passing Yards / Attempt stat, which I've used for this explanation.

I hope I've helped shed some light on these stats, rather than just confuse the hell out of you. I'll attempt to answer any questions you may have, as it'll mean much more if other people understand what we're looking at.