Thursday, April 21, 2011

A Tale of Two Demons

We now turn our attention to one of the most vexing aspects of RPI, illustrated this season by the Tale of Two Demons:  the first being the DePaul Blue Demons and the second being the Northwestern State Demons.

DePaul University finished the season 7-24 and a miserable 1-18 in conference.  DePaul's wins included 4-25 Chicago St., 8-21 Northern Illinois and 9-21 Central Michigan.

In contrast, Northwestern St. finished the season 16-13 with a respectable 10-7 record in conference.  They split home-and-home with Southland East Division champion McNeese St. and came within 1 point of winning the conference tournament and advancing to the NCAAs.

Yet curiously, DePaul has an RPI of 0.4590 and Northwestern State an RPI of 0.4566!  How does this happen?

Recall the oft-cited formula for RPI: 
RPI = (WP * 0.25) + (OWP * 0.50) + (OOWP * 0.25)
The biggest factor in this equation is the Opponents' Winning Percentage (OWP), which carries twice the weight of either the team's own Winning Percentage (WP) or its Opponents' Opponents' Winning Percentage (OOWP).  But critically, OWP is calculated from all of a team's opponents, whether the team beat them or lost to them.  So DePaul benefits more from its 18 losses to strong Big East opponents than Northwestern State does from its winning record in the Southland Conference.
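
As a rough illustration of the formula above, here is a minimal Python sketch. The function names and the (winner, loser) tuple representation of a game are assumptions made for this example; this is not the code used to produce the results in this post. It also assumes the common convention that an opponent's winning percentage leaves out its games against the team being rated.

```python
def win_pct(team, games, exclude=None):
    """Winning percentage of `team`, skipping any game that involves `exclude`."""
    wins = losses = 0
    for winner, loser in games:
        if exclude in (winner, loser):
            continue
        if winner == team:
            wins += 1
        elif loser == team:
            losses += 1
    total = wins + losses
    return wins / total if total else 0.0


def opponents(team, games):
    """One entry per game played, so a repeat opponent is counted each time."""
    opps = []
    for winner, loser in games:
        if winner == team:
            opps.append(loser)
        elif loser == team:
            opps.append(winner)
    return opps


def owp(team, games):
    """Average winning percentage over ALL opponents, beaten or not."""
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(win_pct(o, games, exclude=team) for o in opps) / len(opps)


def oowp(team, games):
    """Average OWP of the team's opponents."""
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(owp(o, games) for o in opps) / len(opps)


def rpi(team, games, weights=(0.25, 0.50, 0.25)):
    """RPI = WP * 0.25 + OWP * 0.50 + OOWP * 0.25 with the default weights."""
    w_wp, w_owp, w_oowp = weights
    return (w_wp * win_pct(team, games)
            + w_owp * owp(team, games)
            + w_oowp * oowp(team, games))


# Toy three-team schedule (made-up data): A beats B and C, B beats C.
games = [("A", "B"), ("B", "C"), ("A", "C")]
print(rpi("A", games))  # 0.625
```

Note that `owp` never looks at whether the team won or lost a given game, which is exactly why a pile of losses to strong opponents can lift a team's RPI.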

No doubt when the NCAA concocted this aspect of the RPI formula, they were thinking of the case where a team has run up a good record against a bunch of patsies.  In that case, the team's RPI gets docked because the OWP is low; and that makes sense because those are also (mostly) opponents the team has beaten.  (The NCAA might also have been intentionally motivating teams to play strong out-of-conference schedules.)  But it certainly seems counter-intuitive to give a team more credit for being beaten by good teams than for beating mediocre teams.  And surely it makes RPI a worse predictor.

Doesn't it?

Well, time to roll out the code and test.  A reasonable first approach is to calculate OWP as the average winning percentage of only the opponents a team actually beat, rather than of all its opponents.  (Similar reasoning applies to OOWP.)  Let us try that approach using our current best RPI formula as well as the "Infinite Depth" RPI:

Predictor                      % Correct    MOV Error
1-Bit                          62.6%        14.17
RPI (unw,15+15+70)             75.4%        11.49
RPI (unw,15+15+70,winners)     72.0%        12.20
RPI (infinite)                 74.6%        11.33
RPI (infinite,winners)         74.0%        11.97

In both cases, this change makes for a worse predictor!  That's hard to comprehend.  Essentially, this says that losing to good teams makes a team more likely to win future games.  While I can certainly come up with a rationalization for that (e.g., playing good teams makes you a better team, even if you lose), it's hard to put much faith in it.  Still, the numbers don't lie.
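
To make the "winners" variant concrete, here is a minimal Python sketch of an OWP computed only over beaten opponents, next to the standard all-opponents version. The function names, the (winner, loser) game tuples, and the choice to return 0.0 for a team with no wins are all assumptions for illustration, not the code behind the numbers above.

```python
def win_pct(team, games, exclude=None):
    """Winning percentage of `team`, skipping any game that involves `exclude`."""
    wins = losses = 0
    for winner, loser in games:
        if exclude in (winner, loser):
            continue
        if winner == team:
            wins += 1
        elif loser == team:
            losses += 1
    total = wins + losses
    return wins / total if total else 0.0


def owp_all(team, games):
    """Standard OWP: average over every opponent, win or lose."""
    opps = [l if w == team else w for w, l in games if team in (w, l)]
    if not opps:
        return 0.0
    return sum(win_pct(o, games, exclude=team) for o in opps) / len(opps)


def owp_beaten_only(team, games):
    """'Winners' OWP: average over only the opponents the team beat."""
    beaten = [loser for winner, loser in games if winner == team]
    if not beaten:
        return 0.0  # assumption: a winless team gets no OWP credit
    return sum(win_pct(o, games, exclude=team) for o in beaten) / len(beaten)


# Made-up schedule: A beats a mediocre B but loses to a strong C.
games = [("A", "B"), ("C", "A"), ("B", "D"), ("C", "D"), ("C", "B")]
print(owp_all("A", games))          # 0.75
print(owp_beaten_only("A", games))  # 0.5
```

In this toy schedule, dropping the loss to the strong team C lowers A's OWP from 0.75 to 0.5, which is the effect the "winners" rows in the table are measuring.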

As a second approach, we can break out the OWP separately for the opponents we've beaten and the opponents that beat us.  So we'll have a "Winners OWP" and a "Losers OWP".  Instead of completely dropping the OWPs of the teams that beat us, we can include them as a less-weighted factor.  Weighting the beaten opponents by about 3x the "lost to" opponents does improve performance over using just the beaten opponents (to 369/11.47 for the unweighted, 15+15+70 version of RPI) but not enough to make it better than the standard versions.  So (at least for this approach) we have to conclude that RPI may be right about the relative strengths of the DePaul Blue Demons and the Northwestern St. Demons.  Apparently playing good opponents, even if you lose to them, makes you a stronger team.
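
One way to read the weighted version is as a per-game weighted average, where each beaten opponent counts 3 times as much as each opponent that beat us. The Python sketch below assumes that reading; the exact way the two OWPs were combined for the results above may differ, and the function names, parameter names, and (winner, loser) game tuples are hypothetical.

```python
def win_pct(team, games, exclude=None):
    """Winning percentage of `team`, skipping any game that involves `exclude`."""
    wins = losses = 0
    for winner, loser in games:
        if exclude in (winner, loser):
            continue
        if winner == team:
            wins += 1
        elif loser == team:
            losses += 1
    total = wins + losses
    return wins / total if total else 0.0


def weighted_owp(team, games, beaten_weight=3.0, lost_to_weight=1.0):
    """OWP where beaten opponents and 'lost to' opponents get different weights."""
    total = weight_sum = 0.0
    for winner, loser in games:
        if winner == team:
            opp, weight = loser, beaten_weight
        elif loser == team:
            opp, weight = winner, lost_to_weight
        else:
            continue
        total += weight * win_pct(opp, games, exclude=team)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Made-up schedule: A beats a mediocre B but loses to a strong C.
games = [("A", "B"), ("C", "A"), ("B", "D"), ("C", "D"), ("C", "B")]
print(weighted_owp("A", games))  # 0.625
```

Setting both weights to 1 recovers the standard OWP (0.75 in this toy schedule), and setting `lost_to_weight` to 0 recovers the winners-only version (0.5), so the 3x weighting sits between the two.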

We've about beaten RPI to death at this point, but we'll take a look at one more tweak before moving on to look at some other ratings that also make use of only won-loss records.

2 comments:

  1. I've thought about this before too. Seems like if a team won a game, you'd only want to attribute to them the wins of their opponent (and the wins of teams beaten by that opponent, etc), and if they lost a game, you'd only want to attribute to them the losses of their opponent (and the losses of teams that lost to that opponent, etc). And so you don't end up with a bunch of undefeated, perfectly ranked teams, you could start each team off with an arbitrary 1-1 record against "the average team" or something.

  2. Exactly. So it seems very surprising to me that RPI works better without that sort of approach. Some of the other approaches I'll delve into after RPI take a more sophisticated approach to this problem, so maybe they'll prove to be more accurate predictors.
