Advanced NFL Stats: Win Probability

Aug 7, 2008

Win Probability

Desperate for football, I'm watching the Hall of Fame Game tonight. There are four minutes left in the fourth quarter and the Redskins are up by 7 over the Colts. Can Jared Lorenzen lead his 4th-string squad to a comeback? I must be the only person in the country who cares. And I only care because I'm investigating Win Probability (WP) in NFL football.

WP is simply an in-game estimate of who's going to win based on the current score and other game variables. This post will examine the potential application of WP and will illustrate a first cut at actual WP for various scores and time remaining.

Two of my recent posts discussed measures of utility in football. I looked at first down probability and at point expectancy. First down probability analyzes how likely an offense is to convert their present down and distance situation to a first down. The success of a play can be judged based on how it changes the probability of a first down. Point expectancy measures how many points a team scores on average based on its field position. This technique not only measures success, but can provide coaches with a decision-making tool. We saw, however, that both of these techniques had their limitations.

Win Probability (WP) has been a facet of baseball sabermetrics for many years. In baseball it basically measures the probability one team will win based on score, inning, outs, and runners on base. I suppose the batter's count could be included in the calculations too.

The usefulness of WP goes beyond fan curiosity about whether the home team has a chance to win (but that's interesting in itself). Take a situation in baseball with a team down by 1 run in the 9th inning. There's a runner on first base and no outs. Should the manager call for the steal? WP should inform his decision. We could calculate the WP of the steal decision by totaling the WPs of the potential outcomes, each weighted by its probability. The WP of the steal decision would be:

WP(steal) = Pr(successful steal) * WP(runner on 2nd, no outs) + Pr(caught stealing) * WP(no runners, 1 out)

This would be compared to the various outcomes of the current at bat without stealing to decide which decision gives the team the best chance to win. There would be analogous applications of WP in decision-making in football.
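The steal calculation above is just a probability-weighted sum, which can be sketched in a few lines of Python. The function name `expected_wp` is my own invention for illustration, and the numbers plugged in are hypothetical placeholders, not real baseball WP values.

```python
def expected_wp(outcomes):
    """Total WP of a decision: sum of Pr(outcome) * WP(resulting state).

    `outcomes` is a list of (probability, win_probability) pairs that
    together cover every possible result of the decision.
    """
    total_pr = sum(pr for pr, _ in outcomes)
    if abs(total_pr - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(pr * wp for pr, wp in outcomes)


# Hypothetical placeholder values, for illustration only.
pr_success = 0.70          # assumed chance the steal succeeds
wp_runner_on_2nd = 0.45    # assumed WP with a runner on 2nd, no outs
wp_caught = 0.20           # assumed WP with no runners, 1 out

wp_steal = expected_wp([
    (pr_success, wp_runner_on_2nd),
    (1.0 - pr_success, wp_caught),
])
print(f"WP(steal) = {wp_steal:.3f}")
```

The same function would work for any decision with more than two outcomes, as long as the probabilities cover every case.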

Baseball is a sport well-suited to WP because it has a limited number of discrete states. There are 27 outs for each team, 3 bases, and 3 outs per inning, and there is enough historical data to accurately calculate the historical WP for each state. Football is far more complex, and its states are closer to continuous than discrete. For example, compare field position to runners on base. There are only eight (I think) combinations of base runners, but there are 99 yard lines. Or compare each baseball team's 27 outs (54 total) to the 3600 minute and second combinations in a 60 minute football game. There are literally over a billion potential combinations of score, field position, down and distance, and time remaining.

WP in football can be simplified, thankfully. For example, time remaining can be grouped into minute or 30 second increments. Field position could be grouped in chunks too. Even so, there are still so many combinations of states in a football game that a good WP model would need a lot of data. And I'm not talking about an entire season of play-by-play information. We'd need years of data, and even then we'd need a lot of mathematical smoothing and "best fit" estimation.
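To make the grouping idea concrete, here is a minimal sketch in Python of bucketing time remaining and taking the fraction of wins in each bucket. It is an illustration under my own assumptions, not the code behind any real model: the function names, the tuple layout, and the toy observations are all made up, and a real model would need far more data plus smoothing.

```python
from collections import defaultdict


def time_bucket(seconds_remaining, bucket_size=60):
    """Group time remaining into fixed-size increments (60 = one minute)."""
    return seconds_remaining // bucket_size


def empirical_wp(observations, bucket_size=60):
    """Fraction of wins for each (time bucket, score difference) state.

    `observations` holds (seconds_remaining, score_diff, won) tuples,
    taken from the point of view of the team with the ball.
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [wins, total]
    for seconds_remaining, score_diff, won in observations:
        state = (time_bucket(seconds_remaining, bucket_size), score_diff)
        counts[state][0] += int(won)
        counts[state][1] += 1
    return {state: wins / total for state, (wins, total) in counts.items()}


# Toy observations, invented purely to exercise the function.
sample = [
    (239, -3, False),
    (200, -3, True),
    (185, -3, False),
    (190, -3, False),
    (3599, 0, True),
]
wp_table = empirical_wp(sample)
print(wp_table[(3, -3)])  # 1 win in 4 observations -> 0.25
```

Passing `bucket_size=30` would give the 30 second increments mentioned above. With buckets this fine, many states would hold only a handful of games, which is why smoothing matters.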

Others have looked at WP before. The site ProTrade.com built a model and it was used on ESPN.com for a time, but it seems like they used 5 minute intervals for time, something that doesn't help understand what's going on in the ever-critical final minutes. This site (which I highly recommend) also has a model, but it assumes a continuous effect of score differences. This is a fatal flaw because football scoring is in chunks of 3 and (almost always) 7. A 4 point lead is very different than a 3 point lead, and a 7 point lead is not just 1/6 better than a 6 point lead. The amount of the lead affects the strategies of the teams, which results in unique WP curves for each point of score differential.

The graph below is my own first cut at WP for the NFL. It is based on all regular season games from the 2000 through 2007 seasons. For the most common score differentials, it plots the WP of the team with possession of the ball. For example, the top curve, labeled +7, is the WP for a team that is winning by 7 points and has the ball, at each minute remaining in a game. The -7 curve, at the bottom, is the WP for a team that is trailing by 7 points and has the ball.

It does not factor in field position or down and distance situations. So, this should be considered a baseline and not a finished product. But already we can see some interesting things.

A few things stand out to me right away. Notice the sudden drop off of the -7 curve. A team trailing by a touchdown with 20 minutes left in the game (5 min left in the 3rd quarter) sees its chance of winning fall dramatically. There are similar drop offs for teams trailing by 1 and 3 points at the beginning of the 4th quarter.

Also notice how the WP for a team down by 3 has a slight uptick, from .20 to .30, in the last few minutes of the game. I think this is because if they are able to score and at least force overtime, there is not much time left for the opponents to mount a scoring drive themselves.

One particular surprise is that teams down by 1 point early in the 4th quarter, who have the ball, are actually favored to win. By 7 minutes remaining, the team trailing by a point falls below even odds.

The application of WP could have profound consequences. Take a fairly common scenario. Your team is down by 3 points with 4 minutes left in the 4th quarter. You're facing a 4th down and goal from the 2 yard line. Conventional wisdom screams field goal! Get some points on the board and at least force OT. But WP might say something else. (Keep in mind field position is not considered yet, so this is rough).

If you kick the (virtually automatic) field goal, that ties the score and gives the opponent the ball with just less than 4 minutes remaining. Your opponent has a 70% chance of winning in this situation according to the graph. You've got a 30% chance.

Let's see what happens if you go for it. If a 4th down try for 2 yards would be successful about 30% of the time (which it is, at least for 2007), your WP is the total probability of the two outcomes:

WP(go for it) = 0.30 * WP(+4 point lead) + 0.70 * WP(-3 point deficit)
= 0.30 * 0.92 + 0.70 * 0.22
= 0.43

That's a 43% chance of winning by going for the TD vs. a 30% chance by going for the FG. I think this actually understates the chance of winning by going for it because if you fail, the opponent gets the ball very near his own goal line. So if you get the ball back, chances are you won't have far to go to get back into field goal range.
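The comparison above can be checked with a short Python sketch. It uses only the numbers already quoted from the graph (0.92, 0.22, 0.30) and the 30% conversion rate. The function names are hypothetical, and the break-even calculation is an extra step that follows from the same formula: it solves p * 0.92 + (1 - p) * 0.22 = 0.30 for p.

```python
def wp_go_for_it(p_convert, wp_success, wp_fail):
    """WP of going for it: weighted sum over converting and failing."""
    return p_convert * wp_success + (1.0 - p_convert) * wp_fail


def break_even_conversion(wp_kick, wp_success, wp_fail):
    """Conversion rate at which going for it equals kicking.

    Solves p * wp_success + (1 - p) * wp_fail = wp_kick for p.
    """
    return (wp_kick - wp_fail) / (wp_success - wp_fail)


# Values read off the graph in the post (field position not considered).
WP_KICK = 0.30      # tie game, opponent has the ball
WP_SUCCESS = 0.92   # up by 4 after the touchdown
WP_FAIL = 0.22      # still down by 3, opponent has the ball
P_CONVERT = 0.30    # 4th-and-2 success rate quoted for 2007

go = wp_go_for_it(P_CONVERT, WP_SUCCESS, WP_FAIL)
breakeven = break_even_conversion(WP_KICK, WP_SUCCESS, WP_FAIL)

print(f"WP(go for it) = {go:.2f}")      # 0.43
print(f"WP(field goal) = {WP_KICK:.2f}")
print(f"break-even conversion rate = {breakeven:.2f}")  # 0.11
```

Under these rough numbers, going for it would be the better call at any conversion rate above roughly 11%, so the conclusion does not hinge on the 30% estimate being exactly right.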

A lot of work still needs to be done, but the potential for WP in football is enormous.

14 comments:

Shake'n'bake said...: Wow, something besides a fat joke came from watching Jared Lorenzen play football.; Thursday, August 07, 2008
Tarr said...: I was in Vegas last weekend, and let me tell you, you were absolutely NOT the only person who cared about the result. Although I suppose most Vegas folk were more concerned about "Redskins -4.5" than Redskins.

Personally, it's always a little strange for me to watch Colts vs. Redskins. I'm happy since nobody got hurt.

I like the analysis. It's good to be getting back to football.; Thursday, August 07, 2008
Anonymous said...: The way I see it, you should be able to take the +'s and -'s and apply one to you and the other to your opponent. So there are 10 minutes left in the 4th quarter and you are up by one point.

Shouldn't your chance to win when up by 1 point + your opponent's chance to win when down by 1 point for any given minute of the game add up to 100%?

I'd expect that the + curves and the - curves would be exact reverses of each other.

With 10 minutes left in the 4th quarter the team that's up by 1 has a 57% chance of winning and the team that's down by 1 has a 65% chance of winning. That adds up to 122%. Seems wrong to me but everything I know about statistics I learned from reading NFL stat sites so I could be wrong :); Thursday, August 07, 2008
Brian Burke said...: Anon-No, the probabilities for +1 and -1 would not add up to 100% in this case. But good question and this is something I should clarify.

The probability curves in the graph consider possession. So the probability curve labeled +1 is for a team that is up by 1 point and has the ball. Conversely, the curve labeled -1 is for a team that is down by a point and has the ball.

The extra 22% you point out could be considered the value of simply possessing the ball, in terms of probability of winning at that point in the game.

Thanks for the question.; Thursday, August 07, 2008
miles said...: Can you explain the +0 line? In that situation, both teams should have it, so, I don't grok how the line could deviate from 50%?; Tuesday, August 12, 2008
miles said...: Argh. Didn't read through all the comments. The "has the ball" was the part I missed.; Tuesday, August 12, 2008
Western Spartans said...: I understand how having the ball could give you a >50% chance of winning with -1 or 0 pt deficit, but how can being down 1 point (vs. 0 or +1) have a better WP?

In other words how can the purple line ever be above the green (let alone red). Ditto green over red for that mystery spot at the beginning of the 4th quarter.; Friday, August 29, 2008
Brian Burke said...: WS-Very perceptive. Regular reader JonnyMo pointed that out to me. See the explanation in my article "The End Game." Bottom line is that teams with very small leads play too conservatively and teams with small deficits play more aggressively, which may be closer to the generally optimum level of risk/reward balance.; Friday, August 29, 2008
Mark Kamal said...: Hey,

I just stumbled upon this here. I was the one who created Protrade's Win Probability model. Even though our business model has steered us away from analytics a bit for the time being, I still see it as my baby :-) It took a LONG time to build.

Some comments:
* we used about 7 years of Play by Play data (about 2M plays)
* we did not group into 5 minute intervals, and did account for end of game situations
* we attempt to account for discontinuous effects (scoring comes in 3/7 pt chunks and down by 4 vs 5 late is very similar)
* our model takes many factors into account (score differential, down, distance, field position, time, timeouts, field types,...)
* the output of our WP model does agree with Roemer that coaches are too conservative on 4th down.

Keep up the good work Brian! All these articles are very interesting.

-Mark mkamal@protrade.com; Wednesday, September 10, 2008
Brian Burke said...: Mark-
I really miss your WP graphs on espn.com. They were a lot of fun. It was amazing to see how a single play could sometimes swing the game so much. Another thing I'm re-realizing now is how tv announcers pretend the game is still within reach, even when a trailing team only has a sliver of a chance. I suppose that's their job.

I don't have a full model yet, and if I account for down and distance I have to fudge it manually for now. I'm not even close to including TOs or field types(!) yet.

I'm thinking I could account for TOs indirectly by adjusting the time remaining--adding time for having more than the average # of TOs, and subtracting time for fewer.

I'm using 8 yrs of regular season data, but so far just looking at 1st and 10s for WP.

Glad to hear (from your other comment) that you're using your model to consult. It's reassuring some teams have a clue, and all your work is not going to waste.; Wednesday, September 10, 2008
Anonymous said...: Great site. How do you generate the win probabilities?; Thursday, December 04, 2008
Brian Burke said...: Thanks.

The win probabilities are derived empirically. I simply look at the database and compare all games with the same time remaining/score difference/field position/etc. Whatever percent of the time a team in the same (or very similar) situation wins becomes the WP.

There's some data smoothing and interpolation too, but mostly that's it.; Thursday, December 04, 2008
Anonymous said...: Is it safe to assume that at 30 min remaining the score differential reflects the halftime score and is unaffected by field position and down?; Tuesday, March 17, 2009
Brian Burke said...: Yes. But that is a general average that does not account for the fact one team is due to receive the kick off.; Tuesday, March 17, 2009