Stampede Blue: An SB Nation Community

The Predictor is (initially) done

Remember back 2 months ago when I asked for some help with a class project?  After following the exact same path as before, I procrastinated my way into a couple of late nights this week to complete my project.  This is definitely a first cut at this, with many improvements to be made before the season starts.  I'll give you a few highlights of my findings, with a full report this weekend, after I've actually written my report for class.

  • I used 2003-2006 stats as the basis of my model.  I then predicted 2007 based on probabilities found in the previous 4 years.
  • I used an average of the previous 7 weeks' data to estimate what each team would do the next week.  Going back further than 7 weeks did not add anything significant.
  • I used Home/Away, Time of Year, Day of Week, and Opponent Group (Division, Conference, Non-Conference) as my Non-Mathematical stats.   I may try to incorporate weather as well, but did not have time, and only found a site with the information a few days ago.
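For anyone curious what the 7-week average looks like in practice, here is a minimal sketch in Python. It is not the code from the project: the function name, the choice to fall back to fewer weeks early in the season, and the rushing-yard numbers are all assumptions made up for illustration.

```python
from statistics import mean


def trailing_average(weekly_values, window=7):
    """Estimate each week's value as the mean of up to `window` prior weeks.

    The current week is never included in its own estimate. Week 1 has
    no history, so its estimate is None. Using fewer than `window` weeks
    early in the season is an assumption, not something the post states.
    """
    estimates = []
    for week in range(len(weekly_values)):
        prior = weekly_values[max(0, week - window):week]
        estimates.append(mean(prior) if prior else None)
    return estimates


# Hypothetical rushing yards per game for one team (made-up numbers).
rushing_yards = [100, 120, 80, 140, 110, 90, 130, 150]
estimates = trailing_average(rushing_yards)

for week, estimate in enumerate(estimates, start=1):
    print(f"Week {week}: estimate = {estimate}")
```

The week 8 estimate uses only weeks 1 through 7, so the game being predicted never leaks into its own input.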

Here's what I found out from 2007:

  • The Predictor was right 56% of the time, which is great for an initial stab at this.  Anything over 50% was going to be a victory for me.  I'll have all summer to tweak and make it better.
  • It got even better once we exclusively used stats from 2007 (week 7 on).  It was correct 62% of the time at the end of the year.
  • I tested out 4 teams individually:
    • Colts:  7-9  (Lots of room for improvement)
    • Redskins: 11-5 (Only predicted against them 4 times)
    • Giants: 10-6 (Started 1-5, finished 9-1)
    • Patriots: 12-4 (Picked the Colts to beat them, as they should have)
  • The four factors that caused the probability of winning to move the most:
    • Rushing Attempts
    • Rushing Yards
    • Turnovers
    • Time of Possession
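To put the four team records on the same scale as the overall percentage, here is a small Python sketch. It assumes each record reads as correct picks followed by incorrect picks, which is an interpretation of the list above rather than something stated outright; the variable and function names are made up for illustration.

```python
# Records from the list above, read as (correct picks, incorrect picks).
# That reading is an assumption.
records = {
    "Colts": (7, 9),
    "Redskins": (11, 5),
    "Giants": (10, 6),
    "Patriots": (12, 4),
}


def accuracy(correct, incorrect):
    """Fraction of picks that were correct."""
    return correct / (correct + incorrect)


team_accuracy = {team: accuracy(*record) for team, record in records.items()}

# Pooled accuracy across the four teams' 64 games.
total_correct = sum(correct for correct, _ in records.values())
total_games = sum(correct + incorrect for correct, incorrect in records.values())
combined = total_correct / total_games

for team, pct in team_accuracy.items():
    print(f"{team}: {pct:.1%}")
print(f"Combined: {combined:.1%}")
```

Under that reading, only the Colts fall below the 50% mark.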

Again, I haven't written up the full report yet, which is the project for tomorrow night.  If anyone is interested in reading it, just shoot me an email.  As I keep updating it throughout the summer, I'll keep you posted on how it is improving.  My goal is 70% before the season starts.

Comments

Stats

Stats like these are misleading. As an example, a rush-heavy team would have a higher number of attempts and theoretically a higher number of yards, along with a higher time of possession and a lower turnover rate (it is harder to generate turnovers on rushing plays than on passing plays). Conversely, a talented pass-first team that builds large first-half leads would rush more in the second half, and by virtue of a better offensive line would have a higher YPA average (and thus more yards). The same team would also, by virtue of its own success, have fewer turnovers and a higher time of possession.

Essentially, it’s the difference between the Vikings and the Colts.

I think better stats to follow would be YPA (passing and rushing) and sacks/pressures achieved and given up. I think you’d find that better teams have higher YPAs, and get more sacks while giving up fewer (thus proving the common football wisdom that success in the trenches is more important than success at the skill positions).

Bob Sanders eats a forest on Friday so he can lay the wood on Sunday.

by MonkeyBusiness on May 8, 2008 2:47 PM EDT

While I certainly agree there are potentially better stats

I wouldn’t call these stats “misleading.” The nature of this model is that it lets the data construct the model. It will find the greatest probability of a model given the data. I did not tell the model those were the most influential factors; the model told me that.

When I had the actual results of the game, stats included, the model was 220-36, which is 86%. That leads me to believe that, for this data, this model is correct. When I start trying other data, such as YPA, I’ll find a different model, which could lead to a better percentage (I hope).

There are obviously exceptions to every model, such as the one you presented. The Redskins are very much a run-oriented team, which is why they were picked so many times. They were picked to lose 4 times: 3 of which happened, and the 4th was week 17, when Dallas didn’t even show up. It picked the Redskins pretty accurately. The Patriots, on the other hand, were not a running team at all, yet were still picked 12 times.

by mgrex03 on May 8, 2008 3:19 PM EDT

Stats

I am a former analyst who has developed many models. You are on the right track here; it is just going to be very difficult to get much more accuracy because of the small sample of games that will be relevant to your analysis and the overall competitiveness of the league. I am also guessing that the predictive characteristics from the end of one year to the beginning of the next would be weakened because of player turnover.

Back in the day, I did a lot of similar work on the NBA to see if I could pick games, both to win and against the point spread. I figured it was the easiest sport to work with because of the large volume of games, consistent line-ups (unlike baseball, with a different starting pitcher each day), and predictable game results. Picking winners was quite easy, but against the point spread, only two of the nearly 100 characteristics I tried were relevant: teams did poorly against the spread when playing their 4th game in 5 nights, and also in their 1st game at home after a long road trip.
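A schedule factor like "4th game in 5 nights" is easy to turn into a flag that a model can use. Here is a rough Python sketch, not the code from that NBA work: the function name and the sample schedule are hypothetical, and it assumes "in 5 nights" means the first and fourth games fall within 5 consecutive calendar days.

```python
from datetime import date


def fourth_game_in_five_nights(game_dates):
    """Return the dates on which a team plays its 4th game in 5 nights.

    Assumption: four consecutive games count when the first and the
    fourth are at most 4 days apart (5 consecutive calendar days).
    """
    dates = sorted(game_dates)
    flagged = []
    for i in range(3, len(dates)):
        # dates[i - 3] through dates[i] are four consecutive games.
        if (dates[i] - dates[i - 3]).days <= 4:
            flagged.append(dates[i])
    return flagged


# Hypothetical schedule (made-up dates, not a real team's calendar).
schedule = [
    date(2008, 1, 1),
    date(2008, 1, 2),
    date(2008, 1, 4),
    date(2008, 1, 5),
    date(2008, 1, 8),
]
flagged = fourth_game_in_five_nights(schedule)
print(flagged)
```

In this made-up schedule only the January 5 game is flagged, since the January 8 game comes six days after the first of its four.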

Bottom line is that this is good as a learning exercise, but to get the best game prediction, you can’t do better than flipping to page 7 of the sports section and looking at the latest line.

by mmcrobe1115 on May 8, 2008 10:45 PM EDT

Brian Burke

Best statistical prediction model of the NFL I’ve ever found.
70.8% accurate straight up
59% accurate against the spread (without changing the model at all from the straight up picks)

mgrex03,
The experts hit on 66.7% last year so you’re getting close to something really useful.

my blog http://shakennbaken.blogspot.com

by shake n bake on May 10, 2008 10:31 AM EDT
