How much do the NFL elites regress to the mean?
Regression to the mean (called the "Fluke Rule" by ESPN NBA stathead John Hollinger and the "Plexiglass Principle" by the father of sabermetrics, Bill James) is the tendency for extreme results to trend back towards the average on a second measurement.
Example from Wikipedia:
if you give a class of students a test on two successive days, the worst performers on the first day will tend to improve their scores on the second day, and the best performers on the first day will tend to do worse on the second day. The phenomenon occurs because each sample is affected by random variance. Student scores are determined in part by underlying ability and in part by purely stochastic, unpredictable chance. For the first test, some will be lucky, and score more than their ability, and some will be unlucky and score less than their ability. Some of the lucky students on the first test will be lucky again on the second test, but more of them will have (for them) average or below average scores. Therefore a student who was lucky on the first test is more likely to have a worse score on the second test than a better score. Similarly, students who score less than the mean on the first test will tend to see their scores increase for the second test.
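The student example is easy to try out as a small simulation. This is only a sketch under made-up assumptions: every score is a fixed "ability" plus fresh random "luck" each day, and the class size, the average of 70 and both spreads are arbitrary numbers chosen for illustration, not anything from the Wikipedia article.

```python
import random
import statistics

def simulate_two_tests(n_students=10_000, ability_sd=10.0, luck_sd=10.0, seed=0):
    """Return (day1, day2) scores: same ability both days, new luck each day."""
    rng = random.Random(seed)
    # Hypothetical class: abilities centred on 70 (an arbitrary choice).
    abilities = [rng.gauss(70.0, ability_sd) for _ in range(n_students)]
    day1 = [a + rng.gauss(0.0, luck_sd) for a in abilities]
    day2 = [a + rng.gauss(0.0, luck_sd) for a in abilities]
    return day1, day2

def top_group_means(day1, day2, fraction=0.1):
    """Average day-1 and day-2 scores of the best day-1 performers."""
    order = sorted(range(len(day1)), key=lambda i: day1[i], reverse=True)
    top = order[: max(1, int(len(day1) * fraction))]
    return (statistics.mean(day1[i] for i in top),
            statistics.mean(day2[i] for i in top))

day1, day2 = simulate_two_tests()
overall = statistics.mean(day1)
top_day1, top_day2 = top_group_means(day1, day2)
print(f"class average on day 1:   {overall:.1f}")
print(f"top 10% average on day 1: {top_day1:.1f}")
print(f"same students on day 2:   {top_day2:.1f}")
```

With these assumptions the top group should still beat the class average on day 2, just by less than on day 1: they fall back towards the mean, not through it.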
So what effect does regression to the mean have in the NFL, specifically on the teams who have been far above the mean, like our Colts?
In the 10 seasons between 1998 and 2007, 45 teams won 12 or more games. These teams averaged 12.9 wins. The following year the same franchises won an average of 9.2 games, a dropoff of nearly 4 games. Regression to the mean is a very real thing in the NFL, and it is compounded by first-place schedules and by high-performing teams picking later in the draft.
So does Indy have to worry about regression to the mean?
This statistical concept has been cited as a reason to doubt that the Colts will remain among the NFL's elite this upcoming season. Reading the definition and the example, it's clear why the idea is misapplied.
Regression to the mean is the result of luck not repeating itself: an extreme result was pushed away from the mean by random chance, and a second measurement doesn't get the same boost (or drag) from unusual luck. Going back to Wikipedia's student exam example:
Which student's high test score is more likely a result of luck, rather than skill? The student who also aced the previous 5 tests, or the student who has had mixed results before the high test score?
The Colts' resistance to regression to the mean in previous seasons is very strong evidence that skill, not luck, has put them on top. If your extreme result didn't come from luck, then you aren't in danger of regression to the mean.
Personally, I think the term is being misapplied rather than misunderstood. Sportswriters are slapping a statistic-y name on their subjective opinions without looking past the name of the concept. It seems more likely to me that they are trying to sound smarter by using a term they don't actually understand than that they tried to get smarter and understood it incorrectly.
Back to the data for confirmation:
12+ win teams that had won 12+ games within the previous 2 years (16 teams) regressed by 3.1 games on average (winning an average of 13.1 games in season 1 and 9.9 in season 2).
12+ win teams that hadn't won 12+ games within the previous 2 years (29 teams) regressed by 4 games on average (winning an average of 12.8 games in season 1 and 8.8 games in season 2).
So teams that have had great seasons in one of the last two years are less susceptible to regression.
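As a quick sanity check, the two subgroup averages should combine back into the overall figures quoted earlier (12.9 wins, then 9.2). A minimal sketch of that arithmetic, using only the rounded numbers from this post (the variable names are mine):

```python
# (label, number of teams, avg wins in season 1, avg wins in season 2)
groups = [
    ("12+ wins within previous 2 years", 16, 13.1, 9.9),
    ("no 12+ wins within previous 2 years", 29, 12.8, 8.8),
]

total_teams = sum(n for _, n, _, _ in groups)

# Weight each group's average by how many teams are in it.
season1_avg = sum(n * s1 for _, n, s1, _ in groups) / total_teams
season2_avg = sum(n * s2 for _, n, _, s2 in groups) / total_teams

print(f"teams: {total_teams}")
print(f"season 1 average: {season1_avg:.1f}")
print(f"season 2 average: {season2_avg:.1f}")
print(f"average dropoff:  {season1_avg - season2_avg:.1f}")
```

The weighted averages line up with the 45-team totals. Note that the rounded averages for the first group differ by 3.2 rather than the quoted 3.1, which is presumably just rounding in the underlying spreadsheet.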
Teams that also won 12+ games in the previous season (making for back-to-back 12+ win seasons; 12 teams in the sample) are likewise less susceptible to regression than teams not coming off back-to-back 12+ win seasons.
Regression to the mean is a part of the NFL and has a pronounced effect on team wins. However, teams that have been very successful in previous seasons are less likely to regress than those that have just broken out.
Coming soon: what factors affect elite team regression?
(Excel with all 12+ win teams (1998-2007), and their records the following year)
Comments
Oh Shake...
I love it when you talk dirty!
:)
18to88.com
What do the stats say
When a team has won 12+ games 6 years in a row?
Oh wait…
NFC North and NFC South writer for SB Nation's NFL Draft blog: Mocking the Draft
Exactamundo
It says here, and I am reading from the text of Leibniz and Newton (in the original German and Ye Olde Englisshe) to get the quote right… “The mean IS 12 wins, jerkwads.” Then comes some chicken-scratch with lots of Greek letters, but the gist is 12 × 6 = 72, and 72 / 6 = 12
Those 18th Century guys had such pottye mouths, but they were goode at numbres.
I hate Joe Namath. That's how long I've been a Colts fan.
This is a simple answer...
A. Good teams are headed by good management with progressive and forward-thinking leadership.
B. Rookie salaries for the top 15 picks are wayyyyy out of whack. This causes bad teams to stay bad because they are forced to spend massive amounts of $$$ on players that really aren’t that much better than those drafted in the 2nd round. Basically the talent level for the top 100 players is all about even. Therefore, a team like the Colts who picks at the bottom of the 1st year after year is getting just as good college talent as a team with the 1st overall pick. The difference is the Colts are spending 1/4 of the $$$ a team like the Lions are paying. This system keeps bad teams bad and thus helps good teams stay good.
bad teams trend upwards like good teams trend downwards
there are factors that affect next year’s performance beyond just regression, both some that increase its effect, like the 2 examples I cited, and some that oppose it, like the two reasons you cited.
The top 100 being even is something I’d like to see support for, since it goes against common wisdom and the objective breakdowns I’ve seen
Half the game too lazy
still sleepin' on me
but I'm 'bout to wake 'em
-Lil' Wayne "Fireman"
by shake n bake on May 9, 2009 3:48 PM EDT
I love it when stats back up
what I already know.
Now a proud annoyance on Stampede Blue, 18to88, Indy Football Report, and Phil B's blog.
Man, I need a life...
Random fact of the week from the empty void that is my mind: This has to be the hardest (and funniest) video game known to man.
Excellent
Great write-up Shake! Now if you could just make that required reading for the stooges at ESPN, Pro Football Talk, etc. The same formula that makes the Patriots contenders year in and year out, i.e. Belichick making shrewd moves that pay off (Did I just say that? Oh well, if he’s willing to give the Colts some props, I’ll give him his.), should be applied to the Colts as well. We all know Polian has been an absolute monster when it comes to restocking this team to stay at the elite status it has, and that’s not just the players but his choices of coaches as well (Dungy and hopefully Caldwell).
Playing Devil's advocate (albeit, probably in a stupid manner)
Couldn’t it be argued that the Colts’ long string of success over the last decade or so is helping them regress to the mean set by the 2 decades of failure that preceded it?
Bullets Forever: A blog dedicated to the Washington Wizards with analysis, commentary, and more YouTube videos than your eyes can handle.
explaining (albeit, possibly incorrectly)
that sounds more like the Gambler’s Fallacy (a coin flip comes up heads 6 times in a row; what are the odds that if you flip one more time you’ll get a 7th heads in a row?)
Regression to the mean brings a single data point back towards the mean in later measurements. It moves teams towards the mean (who’d have guessed), not above it (from below) or below it (from above).
Half the game too lazy
still sleepin' on me
but I'm 'bout to wake 'em
-Lil' Wayne "Fireman"
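The coin flip question can be checked directly. A minimal sketch, assuming a fair coin and independent flips (the function name, the flip count and the seed are mine): it counts how often heads follows a run of at least six heads.

```python
import random

def heads_rate_after_streak(streak=6, flips=500_000, seed=0):
    """Fraction of flips that land heads right after `streak`+ heads in a row."""
    rng = random.Random(seed)
    run = 0          # current number of consecutive heads
    followups = 0    # flips made immediately after a long enough streak
    heads_after = 0  # how many of those flips were heads
    for _ in range(flips):
        heads = rng.random() < 0.5
        if run >= streak:
            followups += 1
            heads_after += heads
        run = run + 1 if heads else 0
    return heads_after / followups

rate = heads_rate_after_streak()
print(f"heads rate right after 6+ heads in a row: {rate:.3f}")
```

The rate should sit close to one half: the streak tells you nothing about the next flip. That is a different claim from regression to the mean, which is about a lucky result being followed by an ordinary one, not by an unlucky one.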
Eh...
This statistic, with all due respect, is rather meaningless. The statistic is a function of a number of factors that affect teams every year. If the team has a good coaching staff, stays healthy, has a solid quarterback, and outscores its opponents – they win. Looking at the statistic first and forming an opinion about a team blindly based on numbers won’t yield positive results, even if the numbers “tend to” suggest a pattern. Elites regress to the mean when a number of things happen – when they play a higher level of competition (see Miami this year), when they suffer many or important injuries (see New England), when their draft picks bust, etc. It’s the factors, the “real” stuff that happens, that creates this statistic.
I'm not really following what you are saying
by shake n bake on May 9, 2009 10:00 PM EDT
what did I say that you disagree with?
by shake n bake on May 9, 2009 10:03 PM EDT
What I'm saying is...
The general rule that this statistic suggests is that teams with great records one year tend to slip in following years and that teams with poor records one year tend to have better records the following year. Problem is, if you take that general idea and apply it to individual teams with individual circumstances facing each one, it won’t mean anything. We’ve had six straight 12 win seasons. Whether we improve or decline from that record is primarily related to team health, the success of the new coaches, and our schedule. Even if the team is not as good this year as last (I think it will be better), our schedule is much easier than it was last year, meaning we may end up with more wins but not be as good of a team. Make sense? It doesn’t mean anything.
That said, it is interesting to see how these things play out so I appreciate that work you put into it.
it won't be right every time, but it's far from meaningless
if you apply it across the board instead of subjectively, you’ll miss on some teams like the Colts, but in general over the whole league you’ll be closer than if you had assumed every team was as good or as bad as their record.
by shake n bake on May 9, 2009 10:30 PM EDT
and if you do things that allow you to apply it more specifically
to teams that are more likely to regress, instead of just every outlier (like I did with the number for teams with recent 12+ win seasons and will do, looking at team attributes that make a team more or less likely to regress) then you’ll get even better results.
by shake n bake on May 9, 2009 10:34 PM EDT
So...
How is it any different than looking at the teams, looking at their record last year, looking at their competition, and forecasting their record based on that? It sounds like it is the factors, like I was saying, that dictate the likely record, and not the statistics that dictate the likely success or failure of the teams. That’s all I was saying. The statistic shows records, but the story is deeper for each team, and it would take individual analysis of each team for the statistics to have any significance from the get-go. Yes?
more info would mean more significance
but just regression to the mean alone has some significance.
Suppose we made predictions right at the end of the season, before knowing what changes a team would make. Would you predict every team would stay at about the same level? And do you think that would have a better league-wide result than predicting every team regressed a ways towards the mean?
It’s far far far from a complete story, but it’s a factor in how teams’ performances change from year to year.
by shake n bake on May 9, 2009 11:33 PM EDT
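That question can be sketched as a toy experiment: predict next year's wins either as "same as last year" or as "last year, pulled partway back towards 8-8", and see which guess is closer on average. Everything here is an assumption for illustration and not fitted to real NFL data: each simulated team has a fixed chance of winning any game, that chance is spread around 50% with an arbitrary width, talent doesn't change between seasons, and "partway" means halfway.

```python
import random

LEAGUE_MEAN = 8.0  # average wins in a 16-game season

def season_wins(p, rng, games=16):
    """Simulate one season in which every game is won with probability p."""
    return sum(rng.random() < p for _ in range(games))

def compare_predictors(n_teams=20_000, talent_sd=0.10, weight=0.5, seed=0):
    """Mean absolute error of 'same record' vs 'regressed' predictions."""
    rng = random.Random(seed)
    err_same = 0.0
    err_regressed = 0.0
    for _ in range(n_teams):
        # Hypothetical true talent, kept inside sensible bounds.
        p = min(0.95, max(0.05, rng.gauss(0.5, talent_sd)))
        year1 = season_wins(p, rng)
        year2 = season_wins(p, rng)  # same talent, fresh luck
        guess_same = year1
        guess_regressed = LEAGUE_MEAN + weight * (year1 - LEAGUE_MEAN)
        err_same += abs(year2 - guess_same)
        err_regressed += abs(year2 - guess_regressed)
    return err_same / n_teams, err_regressed / n_teams

mae_same, mae_regressed = compare_predictors()
print(f"avg miss, 'same as last year':   {mae_same:.2f} wins")
print(f"avg miss, 'regressed to 8-8':    {mae_regressed:.2f} wins")
```

In this toy league the regressed guess misses by less on average, even though it knows nothing about injuries, schedules or coaching. It says nothing about any single team; it only shows why the blanket adjustment helps across a whole league.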
maybe an analogy will get my ideas on this across more clearly
using just regression to the mean to predict records is like using just homefield in picking games. It’s a simple system that ignores a lot of important information, BUT it does better than random chance, which shows that home field advantage is real and that homefield should be considered as a part of more complex and detailed predictions.
Regression to the mean happens and it should be considered. Now with the previous years’ data, and later with the team factors I’ll be looking at, I’m trying to refine it to make it a more useful component of a system for predicting records (objectively or subjectively).
by shake n bake on May 9, 2009 11:48 PM EDT
8-8
it might be a more intuitive name to call it “regression to true talent level” in the case of the NFL
by shake n bake on May 10, 2009 12:06 AM EDT
Okay...
In that case, I want you to use your statistics to give me a forecast for NFL records for teams this year. Do not consider outside factors because it undermines the relevance of the statistic. If you’re figuring out things the statistic does not tell you in raw numbers, looking deeper at what factors make a team likely to win or lose, you’re undermining the importance of your statistical research to future projections of record.
Once you have finished using the statistics to project records for all 32 NFL teams, get back to me and let me see what you’ve come up with. I’ll use my knowledge of each team, the team’s schedule, injuries, etc. and come up with my own forecast for team records this year. If yours is right more often than mine, I’ll say that the statistic may be useful. If it is consistently more accurate than my prediction, I’ll say it is something to strongly consider in making predictions.
However, the variable which we cannot control for is… do I know jack about the teams I’m forecasting, am I any good at predicting schedule strength, etc. Still, it would do something to prove to me that I should consider the statistic before considering all the kinds of things that I mentioned above and LukeNukem mentioned below.
when FO puts out theirs you can go toe to toe with their predictions just like I did last year
they have more in-depth stuff that I don’t have the time and resources for. If you want to pit subjective vs objective, they are the banner carriers for the objective predictions.
Though I will take you up on it. I’ll just want some time to build the system this summer.
by shake n bake on May 10, 2009 12:17 AM EDT
?
Do not consider outside factors because it undermines the relevance of the statistic. If you’re figuring out things the statistic does not tell you in raw numbers, looking deeper at what factors make a team likely to win or lose, you’re undermining the importance of your statistical research to future projections of record.
By this you mean I can’t use anything that isn’t an objective measure, and I have to combine the stats I use in the same way for each team?
by shake n bake on May 10, 2009 12:20 AM EDT
I guess I don't understand...
How there is any way to gauge the relevance of the statistics you’ve come up with here if you don’t use just the statistics, and the bigger point you were making about teams moving to the mean, to make the objective statistics drive the prediction of team records this year. If you use your head instead of the numbers, the numbers are irrelevant; anyone can see what a team’s record was last year and consider what things about that team impact its chances for the coming year. So… we’ll see if the statistic itself has any relevance or use for predicting team records by sticking to it. My hunch says it’ll be off, but we’ll see.
well of course it'll be off some
I doubt anyone anywhere will peg every team’s record no matter how they do it.
It’s not about statistics being perfect, it’s about them being an improvement over not using them. Personally I think that comparing the subjective predictions of someone who buys into advanced statistics with those of someone who doesn’t is also a test of the stats’ merit.
by shake n bake on May 10, 2009 12:43 AM EDT
I'm sure...
Anyone who is lucky enough to exactly predict team records is just that, lucky. However, I am confident that an objective statistic is less likely to “predict” a team’s record or level of success in the coming year because it is blind to circumstance.
I.e., we had 12 wins last year and 8 is the mean, so I would assume it means we’ll likely have 10 wins next year (just going by the numbers).
I would argue that anything less than 12 next year would be the result of something terrible that happens which the numbers cannot predict. If I’m right that the team will win 12 or more games next year… say 13 for the sake of argument… and I have reasons for that belief and the season tends to follow the reasons I had and our record is 12+ wins, then it goes against the statistic.
Same for every other team. We’ll just see how things work out.
I'm going to do it too
And I won’t be using anything from my head. It will be all numbers based off what I am finding now.
NFC North and NFC South writer for SB Nation's NFL Draft blog: Mocking the Draft
this sounds like a site contest
open it up to everyone: enter my, your, and FO’s objective predictions in addition to everyone’s subjective predictions, and see both who had the best subjective ideas of team abilities and how the objective systems match up.
by shake n bake on May 10, 2009 12:28 AM EDT
Possibly....
I bear the emotional scars of what might be called “the post-Bert Jones years,” but while the 95/96 teams, with Marchibroda leading the charge and Harbaugh pulling late wins from his colon, felt like karmic retribution (or reversion upward to the mean), 12+ wins six consecutive times appears to be a new paradigm.
As a wise coach said more than once, one time is a fluke. Twice is a coincidence. Three times is a trend.
I hate Joe Namath. That's how long I've been a Colts fan.
2 very excellent points. polian for one, restocking the talent and having the judgement to do this year in and year out. the polians of the nfl are few and far between
and the point of …players in the first round from top to bottom not being very different in talent. that has been the case the last few years and it’s even obvious to the fans who watch college football and see it.
in fact the many flops at the top have actually shown that the bottom of the first round has outperformed them. some by a lot.
also I do think it has a lot to do with the coaching. like a college coach who can take a walk-on and put him in a completely different position and he excels with it
great article. I’m not buying that we fail because of the law of averages.
Peyton won't lose...
I hate to break this down into a very simple argument, but I think winning and losing all comes down to the talent on a given team.
First of all, the Law of Averages does not apply in professional sports. As Shake mentioned, the gambler’s fallacy is a pretty close corollary to the NFL. Every year is a completely new set of circumstances, just as every coin flip is a completely new coin flip. With every flip, you have a 50/50 chance of landing heads/tails. If you flip a coin 99 times and land heads 99 times, the 100th time, there’s still a 50/50 chance that you’ll land heads. I don’t know why we’re arguing this. As far as “regressing to the mean,” well, the mean is a brand new number every year, first. Second, I think it’s fair to say that with all the parity in the NFL, it’s hard for a team to complete season after season of 12-win success stories. Thus, it’s likely that the Colts will regress a little, or a lot, in ANY given year (not necessarily this year), but it has nothing to do with what they’ve done in the past. This is basic statistics. There are lots of external factors that may play into the team’s record (whether or not they’ll progress or regress). But as far as arbitrarily assuming that “it’s time,” well, that just doesn’t compute statistically or logically.
Furthermore, as I said, good players make good teams, and good teams have good records. This team is chock-full of good players. There’s historical, statistical evidence to back up the premise that when a team has a Hall of Fame-caliber QB, it generally succeeds. Manning has had two losing seasons in his career. During Joe Montana’s 49ers career he had one losing season. Marino had one losing season. Tom Brady has not had a losing season. Steve Young did not have a losing season. John Elway had just two losing seasons. Warren Moon had just three (his first three as a starter). Brett Favre had just one losing season. I have just named the best statistical QBs of the NFL’s history. Yes, a lot more goes into winning in this league than QB play, but it’s hard to ignore the connection between THE BEST QB play and winning.
The conclusion: The Colts very well may win fewer than 12 games this year, or next, or the year after. But the chances of that happening have not increased or decreased thanks to any aggregate statistical research other than that which takes into account external factors (i.e. strength of schedule, injuries, coaching changes, loss of key players, etc…). If anything, Manning’s sheer presence on this team strengthens a statistical argument that it will be good (maybe not 12-wins good, but good…at least “above the mean good”) indefinitely.
"You're hitting the wrong person. Don't you know you're hitting Ron Artest?"
is me going back to previous seasons confusing people?
the purpose of that was to try to start making a distinction between teams whose true talent level is above the league average and those that have just had better than normal luck and are in for a regression.
by shake n bake on May 9, 2009 11:29 PM EDT
I think (and I think when I work on this further it'll be supported by the data)
that the Colts’ lack of regression has a lot to do with Peyton. The reason is that (as Mgrex03’s winning factors series has shown) passing success is more important to winning than running, and (according to FO’s studies) good offenses stay good for longer than good defenses (the reason they suggest is that a good D takes 11+ very talented players, while a good O can be driven by one outstanding player with decent talent around him).
by shake n bake on May 9, 2009 11:38 PM EDT
suggestions for what to look at
to see which types of teams regress (or regress more strongly)?
I’m doing
-Scoring O
-Scoring D (both based off FO’s studies showing offenses remain good more consistently than defenses, so I’d think this would show up on the full team level, with defense fueled elites falling faster and more likely than elites with high flying offenses)
-Team QB Rating (because of the QB related comments, which I do buy)
-Difference between Pythag and actual wins (trying a measure of mostly luck to see how much of the regression it’ll account for)
Half the game too lazy
still sleepin' on me
but I'm 'bout to wake 'em
-Lil' Wayne "Fireman"
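For anyone who hasn't seen the "Pythag" measure in that last item, here is a rough sketch of how the gap between expected and actual wins could be computed. The exponent of 2.37 is an assumption on my part (it is a value commonly quoted for the NFL, not something specified in this thread), the function names are mine, and the example team's points and record are invented.

```python
def pythagorean_wins(points_for, points_against, games=16, exponent=2.37):
    """Expected wins from points scored and allowed (Pythagorean expectation)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)

def luck_gap(actual_wins, points_for, points_against, games=16, exponent=2.37):
    """Actual wins minus Pythagorean wins; positive suggests good fortune."""
    expected = pythagorean_wins(points_for, points_against, games, exponent)
    return actual_wins - expected

# Hypothetical team: 400 points scored, 300 allowed, 12 actual wins.
expected = pythagorean_wins(400, 300)
gap = luck_gap(12, 400, 300)
print(f"expected wins: {expected:.1f}")
print(f"actual minus expected: {gap:+.1f}")
```

A team with a large positive gap won more games than its scoring margin suggests, which is the kind of team one would expect to regress hardest if the gap is mostly luck.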
Ah, statistics...
It was kind of odd seeing this post, after I just addressed the subject in the comments section in response to the Don Banks article. I honestly hadn’t intended to spend a whole lot of time talking about how Banks screwed up the principle of “regression to the mean”, but the more I kept typing, the more I had to say. With that experience in mind, I can totally understand how ‘Shake’ was able to create an entire post on the subject.
The Gambler’s Fallacy is important to the debate, I agree. I mean, after all, Banks and others are arguing that, because the Colts’ coin has come up heads six times in a row, the Colts’ coin is bound to land tails this time, causing the Colts to miss the playoffs. The Gambler’s Fallacy tells us right off the bat that – even if making the playoffs were purely up to chance – prior results have absolutely no bearing on the outcome of this coming season.
But more than that is the fact that making the playoffs is not a matter of chance. The NFL does not flip a coin to determine who will make the playoffs each year. Yes, there are surprises each year such as Miami, but there are also consistent, proven winners such as New England, Pittsburgh and Indy. These teams know what it takes to succeed in the league, even despite the NFL’s efforts towards parity, such as the draft and the salary cap. In fact, it would be an easy argument to state that the Colts’ (and Steelers’ and Patriots’) mastery of the draft and salary cap is part of what has led to their dominance over the last decade.
As Shake said, regression to the mean works as a tool when we’re talking about aberrations – it’s perfectly logical to assume that Miami will not benefit from the schedule they faced last year, and is thus primed for a fall. However, for a team like the Colts, who have a proven track record of a decade of success, regression to the mean is ineffectual, at best. The reality is that the mean for this team over the last six years has been 12+ wins. To predict that we will accomplish anything less than that, a pundit will have to provide a valid reason why we will perform at a level below what we have managed for the past six years. Schedule alone isn’t enough, as our schedule this year looks considerably weaker than it was last season. Additionally, our team this year looks considerably stronger (health-wise, if nothing else) than it was last season.
So while “regression to the mean” is important when it comes to unusual performances, the fact is that missing the playoffs would be an unusual performance for this Colts team.
The Don Banks article was what got me thinking about regression to the mean
the timing wasn’t odd at all. I just didn’t call him out by name because I had seen the same logic elsewhere as well.
by shake n bake on May 10, 2009 2:51 AM EDT
Thanks
This actually clears things up for me quite nicely. I can see how this would make the statistics Shake has come up with useful as a good baseline. The teams with the biggest numbers in the column farthest right are the ones with the greater likelihood of “regressing to the mean” in a coming year. There are factors, which I discussed and LukeNukem discussed, which can alter those numbers a bit when forming a forecast. However, teams with small numbers most of the time in the right column will likely only move a small number of wins or losses in the coming year. Teams with bigger numbers most of the time, or at least recently, in the right column will likely move a large number of wins or losses in the coming year. Generally, not specifically, and subject to the factors we’ve discussed. I can dig that.





























