How much do the NFL elites regress to the mean?

Regression to the mean (called the "Fluke Rule" by ESPN NBA stathead John Hollinger and the "Plexiglass Principle" by the father of sabermetrics, Bill James) is the tendency for extreme results to move back toward the average on a second measurement.

An example from Wikipedia:

If you give a class of students a test on two successive days, the worst performers on the first day will tend to improve their scores on the second day, and the best performers on the first day will tend to do worse on the second day. The phenomenon occurs because each sample is affected by random variance. Student scores are determined in part by underlying ability and in part by purely stochastic, unpredictable chance. For the first test, some will be lucky, and score more than their ability, and some will be unlucky and score less than their ability. Some of the lucky students on the first test will be lucky again on the second test, but more of them will have (for them) average or below average scores. Therefore a student who was lucky on the first test is more likely to have a worse score on the second test than a better score. Similarly, students who score less than the mean on the first test will tend to see their scores increase for the second test.
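The two-test example is easy to try out in a quick simulation. The sketch below is a toy model, not real data: the class size, the average ability of 70, and the equal split between ability and luck are all assumptions picked for illustration. Each student's ability stays the same on both days and only the luck is redrawn.

```python
import random

def simulate_two_tests(n_students=10000, ability_sd=10.0, luck_sd=10.0, seed=0):
    """Return a list of (day1_score, day2_score) pairs, one per student."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_students):
        ability = rng.gauss(70.0, ability_sd)      # assumed: fixed underlying ability
        day1 = ability + rng.gauss(0.0, luck_sd)   # luck on the first test
        day2 = ability + rng.gauss(0.0, luck_sd)   # fresh, independent luck on the second
        scores.append((day1, day2))
    return scores

def mean(values):
    return sum(values) / len(values)

scores = simulate_two_tests()
scores.sort(key=lambda pair: pair[0])              # rank students by day-1 score
decile = len(scores) // 10
bottom, top = scores[:decile], scores[-decile:]

top_day1 = mean([d1 for d1, _ in top])
top_day2 = mean([d2 for _, d2 in top])
bottom_day1 = mean([d1 for d1, _ in bottom])
bottom_day2 = mean([d2 for _, d2 in bottom])

print(f"Top 10% on day 1:    {top_day1:.1f} -> {top_day2:.1f} on day 2")
print(f"Bottom 10% on day 1: {bottom_day1:.1f} -> {bottom_day2:.1f} on day 2")
```

Under these assumptions the day-1 leaders should come back toward the class average on day 2 and the day-1 stragglers should move up, even though nobody's ability changed.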

So what effect does regression to the mean have in the NFL, specifically on the teams who have been far above the mean, like our Colts?

 

In the 10 seasons from 1998 through 2007, 45 teams won 12 or more games. These teams averaged 12.9 wins. The following year the same franchises averaged 9.2 wins, a dropoff of nearly 4 games. Regression to the mean is a very real thing in the NFL, and it is compounded by first-place schedules and by high-performing teams picking later in the draft.

So does Indy have to worry about regression to the mean?

This statistical concept has been cited as a reason to doubt that the Colts will remain among the NFL's elite this upcoming season. Reading the definition and the example, it's clear why the idea is misapplied.

Regression to the mean is the result of luck not repeating itself: an extreme result was pushed away from the mean by random chance, and a second test doesn't get the same boost (or drag) from unusual luck. Going back to Wikipedia's student exam example:

Which student's high test score is more likely a result of luck, rather than skill? The student who also aced the previous 5 tests, or the student who has had mixed results before the high test score?

The Colts' resistance to regression to the mean in previous seasons is very strong evidence that skill, not luck, has put them on top. The less your extreme result owed to luck, the less danger you are in of regressing to the mean.
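The same logic can be sketched for football with a toy model. Nothing here uses real NFL data: the spread of team strength, the idea that a team's strength stays fixed for four straight seasons, and the helper names are all assumptions made for illustration. The only things borrowed from the discussion above are the 16-game season and the 12-win cutoff.

```python
import random

GAMES = 16        # regular-season games per team
ELITE_WINS = 12   # cutoff for an "elite" season

def season_wins(p, rng):
    """Wins in one season for a team that wins each game with probability p."""
    return sum(1 for _ in range(GAMES) if rng.random() < p)

def simulate(n_teams=40000, seed=1):
    """Average next-year dropoff for repeat elites vs. first-time elites."""
    rng = random.Random(seed)
    repeat_drops, fresh_drops = [], []
    for _ in range(n_teams):
        # Assumed: true strength is drawn once and held fixed for all four seasons.
        p = min(max(rng.gauss(0.5, 0.1), 0.05), 0.95)
        s1, s2, s3, s4 = (season_wins(p, rng) for _ in range(4))
        if s3 < ELITE_WINS:
            continue                      # only follow teams with a 12+ win season 3
        drop = s3 - s4                    # dropoff in the following season
        if s1 >= ELITE_WINS or s2 >= ELITE_WINS:
            repeat_drops.append(drop)     # also elite in one of the two prior seasons
        else:
            fresh_drops.append(drop)      # just broke out
    return (sum(repeat_drops) / len(repeat_drops),
            sum(fresh_drops) / len(fresh_drops))

repeat_drop, fresh_drop = simulate()
print(f"Repeat elites drop {repeat_drop:.2f} wins; breakout teams drop {fresh_drop:.2f} wins")
```

In this model both groups give back wins the next year, but the teams with a track record give back fewer, because a 12-win season backed by earlier 12-win seasons is more likely to reflect real strength than a lucky year.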

Personally, I think the term is being misapplied rather than misunderstood. Sportswriters are slapping a statistic-y name on their subjective opinion without looking past the name of the concept. It seems more likely to me that the sportswriters misusing the term are trying to sound smarter by using a term they don't actually understand than that they tried to get smarter and understood it incorrectly.

Back to the data for confirmation:

12+ win teams that had won 12+ games within the previous 2 years (16 teams) regressed by 3.1 games on average (winning an average of 13.1 games in season 1 and 9.9 in season 2).

12+ win teams that hadn't won 12+ games within the previous 2 years (29 teams) regressed by 4 games on average (winning an average of 12.8 games in season 1 and 8.8 games in season 2).

So teams that have had great seasons in one of the last two years are less susceptible to regression.
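As a quick sanity check, the two subgroups should add back up to the overall numbers for all 45 teams (12.9 wins, then 9.2 the next year). The short script below does that arithmetic using only the rounded averages quoted in this post; the variable names are just for illustration.

```python
# Rounded subgroup averages quoted in the post: (label, teams, season 1 wins, season 2 wins)
groups = [
    ("12+ wins within the previous 2 years", 16, 13.1, 9.9),
    ("no 12+ wins within the previous 2 years", 29, 12.8, 8.8),
]

total_teams = sum(teams for _, teams, _, _ in groups)

# Weight each subgroup average by the number of teams in it.
overall_season1 = sum(teams * s1 for _, teams, s1, _ in groups) / total_teams
overall_season2 = sum(teams * s2 for _, teams, _, s2 in groups) / total_teams

print(f"{total_teams} teams: {overall_season1:.1f} wins, then {overall_season2:.1f} the next year")
print(f"Overall dropoff: {overall_season1 - overall_season2:.1f} games")
```

The weighted averages round to the same 12.9 and 9.2 given for the full sample, so the split is consistent with the headline figures.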

Teams that also won 12+ games in the previous season (making for back-to-back 12+ win seasons; 12 teams in the sample) are likewise less susceptible to regression than teams not coming off back-to-back 12+ win seasons.

 

Regression to the mean is a part of the NFL and has a pronounced effect on team wins. However, teams that have been very successful in previous seasons are less likely to regress than those that have just broken out.

Coming soon: What factors affect elite team regression?

(Excel with all 12+ win teams (1998-2007), and their records the following year)