RPI/Basketball Notes (2003)

March 16, 2003

Final notes on this year's men's selection process:

I correctly picked 32 of the 34 at-large bids in the men's bracket. For comparison, ESPN also picked 32 correctly, while collegerpi picked only 31 correctly. Of the 63 teams I correctly predicted to be in the tournament, my seeds were within two spots in 60 cases. ESPN again matched that total (60), while collegerpi predicted only 56 teams' seeds to within two spots.
The committee's selection of LSU, Auburn, and Alabama but not Tennessee is an obvious mistake (most experts agree with me on that point). I was also wrong in predicting that UNLV would make the bracket, which is good because they didn't deserve it. My projection was based on their 21 wins and #42 RPI ranking, a combination that would normally get you in from a major or near-major conference.
In their places were Southern Illinois and Gonzaga. Southern Illinois deserved its invite, and I'm glad to see they were invited instead of UNLV.
Gonzaga's selection and #9 seed are, frankly, a gift. The Zags certainly deserve the gift after their treatment last season, but nevertheless there was no good reason for picking them over Tennessee. Yes, Gonzaga had a moderate RPI (#44), but Tennessee finished tied for third in the nation's toughest conference.
For some unknown reason, the committee didn't think very highly of the Mountain West Conference. Its two at-large bids (Utah and BYU) were seeded much lower than I, or anyone else for that matter, predicted.

Final notes on this year's women's selection process:

I correctly picked 32 of the 33 at-large bids. For comparison, ESPN and collegerpi each picked 31 of 33 correctly.
My lone error was the committee's selection of Miami instead of St. Joseph's. Miami was the 8th-place team in the Big East, was 7-8 in conference, and was #62 in the RPI. St. Joseph's was the regular-season champ of the northeast conference, with an RPI of #40. ESPN and collegerpi both agree with my assessment; ESPN had gone so far as to pick St. Joseph's as a 9-seed.

March 15, 2003

With only the four final championship games to play, here is an update to my predictions.

Automatic (27): Pitt, Louisville, Dayton, Creighton, Oregon, Central Michigan, Penn, Holy Cross, UW-Milwaukee, UNC-Wilmington, W. Kentucky, Manhattan, Weber State, Troy State, Tulsa, Austin Peay, Colorado St, Utah St, San Diego, Sam Houston St, E. Tennessee, Wagner, Vermont, Indiana-Purdue, SC State, Texas Southern, UNC-Asheville
Locks (34): Kentucky, Arizona, Wake Forest, Kansas, Texas, Oklahoma, Syracuse, Florida, Duke, Marquette, Illinois, Mississippi State, Missouri, Notre Dame, Xavier, Oklahoma State, Wisconsin, Stanford, Memphis, Utah, UConn, BYU, LSU, St Joseph's (PA), Cal, Maryland, Colorado, Auburn, Michigan State, Purdue, Indiana, Arizona State, Cincinnati, UNLV
Bubble/in (4): Tennessee, Alabama, Butler, NC State
Bubble/out: Texas Tech, Southern Illinois, Boston College, Seton Hall, Gonzaga, North Carolina, Ohio State, Providence, Wyoming, St Louis, DePaul, Minnesota

What changed:

Tennessee was the only team downgraded from "lock" status. With Georgia removed, they were tied for third in the SEC this season, ahead of LSU and Auburn, both of whom are locks. Only about once in two seasons does the selection committee skip over a team in this kind of case, so normally I'd call them a lock. However, with the Georgia mess, the committee might deviate from their typical behavior. I don't think they will, so I've put Tennessee on the bubble/in list.
Colorado, Auburn, Michigan State, and Indiana were all upgraded from bubble/in to locks. All but Michigan State won their 20th games this week. Michigan State moved up to #32 in the RPI, which along with their third-place finish in the Big 10 (ahead of Indiana) should secure their bid.
UNLV improved to 21-10 this week, while Wyoming lost early. UNLV has moved from bubble/out to lock; Wyoming from bubble/in to bubble/out. The committee may deviate from typical behavior since UNLV would be the fourth Mountain West team, but I doubt it.
Colorado State, to many people's surprise, beat several "lock" teams and won its conference tournament. This knocks a bubble team out that would have been in (see Southern Illinois); a second such case would happen if Ohio State wins tomorrow.
NC State has been moved from bubble/out to bubble/in. Obviously this is moot if they beat Duke tomorrow, but I feel they will be in unless they completely embarrass themselves. They might not quite have the credentials of other schools on the bubble, but their 4th-place finish in the ACC and win over Wake should garner enough respect from the committee to get them in.
Perhaps I'm being overly hopeful here, but Butler should get an invite. If the RPI did not have its schedule approximation error, Butler would be ranked #23 in the RPI and a solid lock. Butler is also #28 in my ratings and #35 in Sagarin's.
I still believe Alabama will be given an invite, despite its 7-9 SEC record. While teams under 0.500 in conference play rarely get invites, Alabama is #35 in the RPI and #37 in Sagarin's ratings, both of which weigh heavily in the selection process. In addition, the SEC was the toughest conference this season. That said, if Ohio State wins tomorrow, Alabama is probably the team that gets bumped.
Southern Illinois gets knocked out of the bracket. This is a shame. Southern Illinois should have been #31 in the RPI and thus a lock, but is instead #38 and on the bubble. I would love for the committee to realize this and put them in instead of Alabama, but I don't see that happening.
Seton Hall was likewise removed from my projected bracket, after getting knocked out early by UConn in its tournament. Unlike Southern Illinois, they don't have a compelling case to argue in their favor.
I keep hearing talk of how Boston College and Gonzaga are locks. Neither is. Both have poor RPIs and poor Sagarin rankings. Gonzaga could get in as a way to make up for getting a poor seed last year, but given the tough bubble I don't see it happening.
Texas Tech has turned heads with its tournament run and has a good RPI and good Sagarin ranking, but compiled a 6-10 record in conference play. If any team with a conference record under 0.500 gets in, it would probably be Alabama.

Women's Tournament:

Automatic (31): LSU, Duke, Villanova, Texas, Purdue, Stanford, Louisiana Tech, GWU, New Mexico, TCU, UT-Chattanooga, ODU, Pepperdine, Harvard, UW-Green Bay, UCSB, Austin Peay, Liberty, Holy Cross, W. Kentucky, Weber State, Manhattan, SW Missouri, Hampton, W. Michigan, Alabama St, St Francis (PA), Valpo, Boston U, Georgia St, SW Texas
Locks (29): Tennessee, UConn, UNC, Miss State, Kansas St, Texas Tech, Vanderbilt, Rutgers, South Carolina, Arkansas, Penn State, Boston College, Georgia, Minnesota, Ohio State, Virginia Tech, Colorado, Arizona, Georgia Tech, Notre Dame, Utah, Illinois, Oklahoma, Cincinnati, DePaul, Washington, Virginia, St Joseph's, Xavier
Bubble/in (4): Tulane, BYU, Michigan State, UNC-Charlotte
Bubble/out: Baylor, Colorado St, Miami, UCLA, Auburn

March 12, 2003

With most of the midmajor tournaments in the books and the major tournaments gearing up, here's a first stab at how the NCAA tournament field will look:

Automatic (14): Creighton, Penn, UW-Milwaukee, UNC-Wilmington, Manhattan, W. Kentucky, Troy State, Weber State, Austin Peay, San Diego, E. Tennessee, Wagner, Indiana-Purdue, UNC-Asheville
Locks (33): Kentucky, Arizona, Texas, Wake Forest, Florida, Syracuse, Kansas, Oklahoma, Pitt, Marquette, Duke, Xavier, Notre Dame, Louisville, Wisconsin, Illinois, Stanford, Oklahoma State, Mississippi State, Dayton, Utah, BYU, Memphis, Maryland, Cal, Missouri, St Joseph's (PA), LSU, UConn, Purdue, Tennessee, Arizona State, Cincinnati
Bubble/in (10): Alabama, Colorado, Butler, Auburn, Michigan State, Oregon, Indiana, Southern Illinois, Seton Hall, Wyoming
Bubble/out: Boston College, Gonzaga, UNLV, NC State, Central Michigan, Holy Cross, St Louis, DePaul, Minnesota, Providence

In general, it will be tougher than usual for teams to secure at-large berths. The problem is the overall weakness of midmajor teams. Last season there were plenty of midmajor teams (about 10) that would have had good shots at at-large berths; this year there are only three (and two are in the MVC). This means that roughly five more "bad" teams are given automatic bids, and thus there are fewer spots available for "good" teams. This is slightly mitigated by Georgia's withdrawal from the postseason, but nevertheless means that some teams that would have made it last season will be left out. Note that the above total is not 65; there are 8 midmajor conference tournaments pending, none of which will be won by teams I consider likely at-large invitees.

Most of the locks are pretty self-explanatory. LSU, UConn, Purdue, and Tennessee would be better off with at least one win in their conference tourneys, but even if they fail to do so I can't see them being left out.
Cincinnati lost tonight, but is protected by its high RPI rating. No team in recent memory has had an RPI of #32 or better and not been selected; Cincinnati's was #24 before tonight's game but shouldn't go down that dramatically.
Alabama is an odd case. They have an RPI rating of #32, but were 7-9 in conference play. My guess is that they'll get in, especially if the committee ignores their loss to ineligible Georgia.
Colorado and Auburn should be locks with one win in their respective tournaments, and Michigan State and Indiana with two wins.
Butler really threw a wrench in things by losing its conference title game. I think they deserve a tournament invite regardless, especially given that their RPI ranking *should* be #24. However, due to the same statistical oddity that bit them last year, their current RPI is #37 and thus out of lock contention. Hopefully the NCAA will someday fix the schedule approximation error (see below)...
Oregon has already achieved the 20-win mark, but needs to keep its RPI from sagging. It is at #51, which I consider the borderline for "lock" status for major conference 20-game winners; it would have to win at least two games in the Pac-10 tournament to be assured of an NCAA invite.
Seton Hall is ranked only #49 in my rankings and #60 in my improved RPI ranking, but is at #35 in the real RPI. Look for them to sneak into the tournament.
I can't see Wyoming and UNLV both being given at-large bids, even if both eclipse the 20-win mark. While UNLV has the better RPI and is the better team, Wyoming won both games the teams played against each other and has the better record. This could be different after Saturday, but for now I'd pick Wyoming.
Southern Illinois is a very strong candidate but, given the number of bubble teams that can be selected, is right on the edge of going or not. They would be a lock if the NCAA used my improved RPI, where they are ranked #31 instead of #36. Gonzaga falls in this category as well, with an RPI of #42 but an improved RPI of #32.
Boston College and NC State are the two best major conference teams I don't see getting in. Both have relatively poor RPIs (#50 and #63), and don't have the records to put them ahead of the other bubble teams.
Let's all root for Central Michigan and Holy Cross, which are ranked #43 and #44 in my rankings. This puts them ahead of Cincinnati, Auburn, Michigan State, Seton Hall, and Wyoming. Yet their RPI rankings will keep them out unless they win their conference tournaments.
St. Louis, DePaul, Minnesota, and Providence are technically alive on the bubble, but need to do great things in their conference tournaments to get serious consideration. The catch, of course, is that the best any of these teams can do is to go 2-1 or 3-1 unless they win it all and get the automatic bid.

February 22, 2003

With Selection Sunday only a few weeks away, the college basketball selection process, and the RPI in particular, are beginning to be noticed. ESPN.com writer Andy Katz recently ran a piece critical of the RPI's listing of BYU ahead of Maryland and Pitt. This is certainly a valid criticism, but unfortunately his analysis is largely incorrect. Below I describe the various aspects of the RPI that can cause "problems", in decreasing order of importance.

Margin of Victory. The RPI ratings are calculated solely from knowledge of who beat whom, and not by how much. I fully agree with such a philosophy, as using scores in an important rating system causes some coaches to run up scores, thus invalidating the use of scores as an accurate and unbiased rating factor. So while the NCAA is correct in using only wins and losses, it means that the resulting ratings do not use all available data, and thus do not consider everything we see when we watch games. The difference is clear from my own ratings. Ignoring margin of victory in team ratings, I rank Pitt #13, Maryland #31, and BYU #35. Including scores, the ratings instead become Pitt #5, Maryland #9, and BYU #26. Maryland's remarkable difference is due to the fact that their average winning margin has been 23 points while their average losing margin has been 8 points. Thus they are better than their record would indicate. Given that people have a very good understanding of who is better than whom, the lack of score data in the RPI is the #1 reason why somebody would do a double take at the RPI ratings.
Schedule Strength. To a reasonable approximation, the RPI computes schedule strength as a straight average of the RPI ratings of a team's opponents. This is not the best approach. You can find a detailed explanation in my predictive ratings page; the bottom line is that a team's schedule strength should be weighted in favor of games against comparably-matched opponents. The reason is that a game against an opponent that you have virtually no chance of beating (or one that you have no chance of losing to) tells you nothing about your team's strength. However, the RPI does not recognize this fact, and a team would improve its schedule strength as much from playing the #162 team instead of the #323 team as it would from playing the #1 team instead of the #162 team. Realistically, a top-flight program should have virtually no problem dispatching either the #162 or #323 teams, while the #1 team would be a tough challenge. The end result is that, in my ratings, BYU has the #58 schedule strength, compared with #10 for Maryland and #13 for Pitt. As BYU's record was comparable to Maryland's and worse than Pitt's, this would have been enough to put the three teams in a more "reasonable" order.
Schedule Approximation. The previous two issues are well-understood and are generally adjusted for correctly by the committee to produce reasonable selections and seeds. What is almost universally overlooked is that the RPI's accuracy is based on an assumed correlation between the average record of a team's opponents and the average record of a team's opponents' opponents' opponents, opponents' opponents' opponents' opponents, and beyond. Overall the approximation is a good one -- because a team plays over half of its games against conference foes, there is a good correlation between the strength of its opponents and that of its opponents' opponents' opponents and beyond. However, in single cases, this can be a very poor approximation, in that a team will profit from playing teams with weak opponents. What is worse, the selection committee does not appear to be aware of this fact, and is thus unlikely to make mental adjustments for teams benefiting from or hurt by this inaccuracy.
Game Location. The most common (and most overrated) complaint about the RPI is that it does not take game locations into account. In reality, a typical opponent is 0.033 RPI points more difficult when at home, and is 0.033 RPI points easier when on the road. Since a typical tournament-bound team will play over 30 games during the season, its RPI is lowered by about 0.001 points for every road game and improved by about 0.001 points for every home game. On average, this equates to one spot in the RPI rankings per extra game at home or on the road. What is particularly interesting is that Katz's article featured the BYU coach complaining about this aspect of the RPI. Apparently he needs some math lessons, as BYU had played 11 home games, 8 road games, and 4 neutral-site games at the time the article was written.
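The game-location arithmetic above is easy to check by hand. Here is a minimal Python sketch of it, using only the figures quoted in the text (0.033 RPI points per opponent, a 30-game season as a round number, and BYU's 11 home and 8 road games). The function names are mine and purely illustrative; this is a back-of-the-envelope estimate, not how the NCAA computes anything.

```python
# Figures quoted in the text; the 30-game season is a round-number assumption.
HOME_COURT_SHIFT = 0.033  # RPI points an opponent is tougher at home / easier on the road
GAMES_PER_SEASON = 30


def per_game_effect(shift=HOME_COURT_SHIFT, games=GAMES_PER_SEASON):
    """RPI change attributable to one game's location, averaged over a season."""
    return shift / games


def net_location_effect(home_games, road_games):
    """Approximate RPI boost (+) or penalty (-) from a home/road imbalance.

    Neutral-site games are treated as having no effect.
    """
    return (home_games - road_games) * per_game_effect()


# BYU at the time of the article: 11 home, 8 road, 4 neutral-site games.
byu = net_location_effect(home_games=11, road_games=8)
print(f"per-game effect: {per_game_effect():.4f}")
print(f"BYU net effect:  {byu:+.4f}")
```

With three more home games than road games, BYU comes out roughly 0.003 RPI points ahead, or about three ranking spots at the rule of thumb of one spot per game. In other words, the location effect was working in BYU's favor.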

OK, so now that the issues are explained, how do these things really affect the RPI ratings?

There's really no complaint here. Scores should not be used as part of the RPI, as doing so would encourage unsportsmanlike behavior from teams on the bubble. What bothers me is that the selection committee also uses the Sagarin ratings, which are 50% based on scores and 50% on win-loss information. The effect of scores on ratings can be seen by comparing my "predictive" and "standard" ratings.
The schedule strength error penalizes teams that have several weak opponents on their schedule. I doubt anyone will shed tears over Maryland being penalized for playing cupcakes out of conference; who really gets hurt are the elite midmajor teams who are forced to play weak conference opponents. You can judge the effect of this by comparing the "improved RPI" with my standard ratings.
Schedule approximation is a sneaky one to account for, as it is primarily based on how tough a schedule your opponents are playing. You get helped if your opponents rack up lots of wins against easy opponents, since your opponents' record counts as 50% of the RPI while your opponents' opponents' record counts as only 25%. Similarly, you are penalized if you play teams that scheduled tough opponents. The most glaring recent example is Butler, whose RPI was in the bottom-70's (out of contention) but should have been in the mid-50's. You can judge the effect of this by comparing the "RPI" and the "improved RPI" in my ratings.
As noted above, your RPI is improved 0.001 points per home game and lowered 0.001 points per road game. Realistically, nobody plays all that imbalanced of a schedule, so the effect is minimal. I can't foresee anybody's rating being messed up by more than 5-6 RPI spots because of this.
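To make the schedule approximation point concrete, here is a small Python sketch of the 50%/25% weighting mentioned above. It assumes the commonly published form of the RPI, in which the remaining 25% is the team's own winning percentage. The two teams and all of their numbers are made up for illustration, and the sketch skips details such as excluding a team's own games from its opponents' records.

```python
def win_pct(wins, losses):
    """Plain winning percentage; 0.0 if no games have been played."""
    games = wins + losses
    return wins / games if games else 0.0


def rpi(wp, owp, oowp):
    """Commonly published RPI weighting (assumed here):
    25% own record, 50% opponents' record, 25% opponents' opponents' record."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Two hypothetical teams with identical 20-8 records.
team_wp = win_pct(20, 8)

# Team A's opponents piled up wins against weak schedules:
# good opponents' record (0.60), poor opponents' opponents' record (0.45).
rpi_a = rpi(team_wp, owp=0.60, oowp=0.45)

# Team B's opponents have lesser records (0.52) earned against
# tougher schedules (0.58).
rpi_b = rpi(team_wp, owp=0.52, oowp=0.58)

print(f"Team A: {rpi_a:.4f}")
print(f"Team B: {rpi_b:.4f}")
print(f"A - B:  {rpi_a - rpi_b:+.4f}")
```

Team A comes out ahead because its opponents' inflated records count twice as much as the weak schedules that produced them. Since the formula stops at opponents' opponents, nothing further down the chain corrects for this.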


Note: if you use any of the facts, equations, or mathematical principles introduced here, you must give me credit.