Ohio State has been a source of controversy during the last few years, as they have won several games by narrow margins or in overtime. This earns them poor rankings in systems that use margin of victory ("predictive" rankings) and high rankings in systems that do not. I guess I should start with the usual disclaimer: as I have explained before, I am not advocating a downward revision of Ohio State's BCS ranking or claiming that they did not deserve to be in the 2002 BCS title game.

The standard statistical interpretation is that Ohio State isn't as good a team as their record indicates. In any of their narrow victories, they might just as easily have lost, and thus the predictive rankings are correct to rate them poorly. Ohio State supporters generally counter that Tressel's style of play favors low-scoring, close games, and as such close wins will be the norm. They also claim that the narrow wins result from his team's ability to elevate its play and do whatever it takes to win. I call this the "Tressel Factor".

Clearly this isn't always the case; an example that immediately comes to mind is their championship, which would not have been won had a controversial penalty not been called. Now I don't care to argue the merits of the pass interference call; I am merely pointing out that the official's decision whether or not to throw that flag had nothing to do with any "Tressel Factor"; it was a break that went in Ohio State's direction, over which they had no control.

However, even if Tressel isn't able to summon his magic in every single game, it is still worth examining whether or not there is evidence that the Tressel Factor is real. I address this question in two ways.


Tressel's History

As described above, the hallmark of a team showing the Tressel Factor is that its predictive ranking should be consistently worse than its standard ranking. Here is a summary of the Tressel coaching years that are covered in my rankings.

Year  School         Standard  Med. Likely  Predictive
2003  Ohio State           11           10          21
2002  Ohio State            1            1          11
2001  Ohio State           37           37          31
2000  Youngstown St        15           15          21
1999  Youngstown St         3            2           9
1998  Youngstown St        26           25          34
1997  Youngstown St         4            4           2
1996  Youngstown St        18           17           8
1995  Youngstown St        64           63          49

What we see is that, in only 5 of 9 years, Tressel's team has had a win/loss ranking that is better than its predictive (score-based) ranking. In effect, Tressel's teams show no indication that they have been systematically underrated by a predictive ranking.
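For reference, the 5-of-9 count can be checked directly against the table. A quick sketch (rank numbers copied from above; a lower rank number is better):

    # (year, standard rank, predictive rank) copied from the table above
    rows = [(2003, 11, 21), (2002, 1, 11), (2001, 37, 31),
            (2000, 15, 21), (1999, 3, 9), (1998, 26, 34),
            (1997, 4, 2), (1996, 18, 8), (1995, 64, 49)]

    # Count years where the win/loss (standard) ranking beats the
    # score-based (predictive) ranking; lower numbers are better.
    better = sum(std < pred for _, std, pred in rows)
    print(f"{better} of {len(rows)} years")  # -> 5 of 9 years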

I haven't shown the data, but there is clear evidence that Tressel's style favors low-scoring games, in that my scoring rating for Tressel's teams has been consistently on the low side of the spectrum. This fact relates directly to the team's predictive ranking, as the following example shows.

Imagine that the average score in a football game is 20 points. Suppose that team A has an average offense but a superb defense that allows only half as many points as average. In other words, if team A plays an average team, they will win on average by a score of 20-10, which implies that they will win 75% of the time and have a predictive rating of +0.667. Now imagine team B, which has average defense but a superb offense that scores twice as many points. Team B would beat an average team by 40-20, meaning that they win 84% of the time and have a predictive rating of +0.987. In other words, team B is the better team; for A to be equally good they would have to win by an average of 20-6. (Details on the game analysis calculations can be found here.)
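The exact figures above come from the game analysis calculations linked in the text, but the qualitative point is easy to reproduce. Below is a minimal Python sketch under my own simplifying assumption that the final margin is roughly normal with variance proportional to the expected total points; the scale factor (7 points of variance per expected point) is a hypothetical choice made here to land near the article's figures, not the author's calibration.

    from math import erf, sqrt

    # Assumption (not the author's model): the final margin is roughly
    # normal, with variance proportional to the expected total points.
    # The scale factor is an illustrative choice that happens to land
    # near the 75% / 84% figures quoted in the text.
    VARIANCE_PER_POINT = 7.0

    def win_probability(points_for, points_against):
        """P(win) for a team expected to beat an average opponent by
        points_for to points_against."""
        margin = points_for - points_against
        total = points_for + points_against
        z = margin / sqrt(VARIANCE_PER_POINT * total)
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF

    print(f"Team A (20-10): {win_probability(20, 10):.1%}")  # ~75%
    print(f"Team B (40-20): {win_probability(40, 20):.1%}")  # ~84%
    print(f"Team A (20-6):  {win_probability(20, 6):.1%}")   # comparable to B

Note how the margin grows linearly with the scoring level while the random variation grows only as its square root; that is why the same 2-to-1 dominance is worth more to a high-scoring team.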

This is a fundamental problem with defensive football and smashmouth offense. While it may conjure up nostalgic images of how the game "should be played", a team that wins its games 20-10 is inferior to one that wins its games 40-20 because one bad break is more likely to result in a loss.


Other Teams

Although there is no clear sign of a Tressel Factor in Tressel's historical results, I have only nine seasons of rankings for his teams. It is conceivable that something more subtle exists that could be detected by studying a larger number of teams.

What I wish to examine is the possibility that my statistical model for evaluating game results could be slightly wrong. As noted in the ranking descriptions, the significance given to a win is roughly equal to the score difference divided by the square root of the number of points. This means that a 17-14 win is more convincing than a 16-15 win to the same degree that a 16-15 win is more convincing than a 15-16 loss. If there really is a significant difference between winning and losing ("the better team just wanted it more and got the job done"), I should be able to improve the accuracy of my win-loss predictions by adjusting this formula slightly to put more weight on whether or not a team won. In fact, I find this not to be the case. Adjusting the game evaluation formula to give more significance to a narrow win, even slightly, measurably diminishes the accuracy of the predictions.
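As a concrete check, here is a small sketch of that formula, assuming "number of points" means the two teams' combined score. The win_bonus parameter is a hypothetical stand-in for the kind of adjustment described above, which in practice made the predictions worse for any positive value:

    from math import sqrt

    def significance(team_pts, opp_pts, win_bonus=0.0):
        """Significance of a result: score difference divided by the
        square root of the total points (per the ranking description).
        win_bonus is a hypothetical extra weight for simply winning."""
        diff = team_pts - opp_pts
        base = diff / sqrt(team_pts + opp_pts)
        if diff > 0:
            return base + win_bonus
        if diff < 0:
            return base - win_bonus
        return base

    # All three games total 31 points, so the steps are evenly spaced:
    for game in [(17, 14), (16, 15), (15, 16)]:
        print(game, round(significance(*game), 3))
    # (17, 14) 0.539
    # (16, 15) 0.18
    # (15, 16) -0.18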

I find this to be somewhat surprising. I have to believe that there is some degree of clutch performance in football; after all, different people react differently when the pressure is on. However, football is a team sport, and it seems that the overall clutch ability of one team is comparable to that of any other. More to the point, there are a lot of random elements -- bounces of the ball, calls or non-calls, mistakes, etc. -- that dominate the outcome of a close game. Put differently, a team that drove for a last-minute score to win a game was probably just as likely to lose as to win. This shouldn't detract from the excitement of a close game or taint the win; it just means that the result doesn't tell you as much about the "character" of the players as common wisdom would indicate.


To summarize briefly:

- Tressel's nine seasons show no consistent pattern of standard (win/loss) rankings beating predictive rankings, so his own history provides no evidence of a Tressel Factor.
- A low-scoring style of play makes a team more vulnerable to a single bad break, and thus genuinely worse, rather than underrated.
- Putting extra weight on simply winning, rather than on the score, measurably worsens my predictions, which suggests that close games are decided largely by chance rather than by clutch ability.



Note: if you use any of the facts, equations, or mathematical principles introduced here, you must give me credit.

copyright ©2003 Andrew Dolphin