Wednesday, March 21, 2012

Upset Review

For the past three years that the Pain Machine has participated in the Machine Madness contest, I've maintained (without any real justification) that the proper strategy is to pick the correct upsets -- as opposed to simply picking the most likely outcome, which will be the higher seed in every case where the committee hasn't completely blown the seeding.  In light of that, I wanted to review the PM's upset-picking strategy and see how it has worked out this year.

The PM predicts the Margin of Victory for each tournament game.  With two exceptions this year, the predicted winner was the higher-seeded team.  Historically, we know that the upset rate in the first round has been around 22%, and the upset rate for the whole tournament around 15%.  (An upset is where a team seeded at least 2 lower than its opponent wins the game.  A #9 over a #8 is not considered an upset.)  In light of this, I force the PM's tournament picks to include 6 upsets in the first round and 5 more in the rest of the tournament.

The picking strategy is fairly straightforward.  First, any game where the PM itself predicts an upset is marked as an upset.  After that, the PM fills out the rest of the 6 first-round upsets by marking the remaining first-round games with the lowest predicted MOVs.  Then, after recalculating the rest of the bracket based upon those upsets, it fills out the 5 later-round upsets by the same criterion.
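The selection step for a single round can be sketched in a few lines of Python.  This is only a minimal illustration, not the PM's actual code: the `Game` class, the `pick_upsets` function and the `quota` parameter are names made up for the example, the game list is just a subset of the first round using predicted MOVs quoted later in this post, and the step of recalculating the rest of the bracket is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    """Hypothetical record for one game (not the PM's real data model)."""
    underdog: str
    favorite: str
    predicted_mov: float  # predicted margin for the favorite; negative = model picks the underdog


def pick_upsets(games, quota):
    """Mark model-predicted upsets first, then fill the quota with the closest games."""
    # Step 1: every game where the model itself predicts the underdog to win.
    predicted = [g for g in games if g.predicted_mov < 0]
    # Step 2: fill whatever is left of the quota with the lowest predicted MOVs.
    slots_left = max(quota - len(predicted), 0)
    closest = sorted(
        (g for g in games if g.predicted_mov >= 0),
        key=lambda g: g.predicted_mov,
    )
    return predicted + closest[:slots_left]


# Subset of first-round games, with predicted MOVs taken from this post.
first_round = [
    Game("UNC-Asheville", "Syracuse", 16.1),
    Game("Lehigh", "Duke", 12.8),
    Game("Belmont", "Georgetown", 6.3),
    Game("Ohio", "Michigan", 7.9),
    Game("Cal/USF", "Temple", 1.4),
    Game("Texas", "Cincinnati", -0.6),
    Game("Purdue", "St. Mary's", 3.3),
    Game("UConn", "Iowa St.", 3.6),
    Game("NC State", "SDSU", 1.9),
    Game("WVU", "Gonzaga", 3.3),
    Game("VCU", "Wichita St.", 9.6),
    Game("Xavier", "Notre Dame", 7.7),
    Game("Norfolk St.", "Missouri", 23.2),
]

picks = pick_upsets(first_round, quota=6)
for g in picks:
    print(f"{g.underdog} over {g.favorite} ({g.predicted_mov})")
```

Ties in predicted MOV are broken by list order here, which is an arbitrary choice made for the sketch.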

This year, that resulted in these upset picks (predicted MOV shown in parentheses, correct picks marked with an asterisk) for the first round:

(11) Texas over (6) Cincinnati (-0.6)
* (12) Cal/USF over (5) Temple (1.4)
* (11) NC State over (6) SDSU (1.9)
* (10) Purdue over (7) St. Mary's (3.3)
(10) WVU over (7) Gonzaga (3.3)
(9) UConn over (8) Iowa St. (3.6)

The PM picked 3 of these 6 upsets correctly: USF, NC State and Purdue.  Texas shot just 16% in the first half and still managed to tie the game in the second half, but couldn't finish the rally.  The other two games were not very close.  Still, getting 50% of its upset picks correct is probably a pretty good performance.

The PM has the following upsets picked in later rounds:

(2) OSU over (1) Syracuse (-0.8)
(2) Kansas over (1) Kentucky (0.6)
(11) Texas over (3) FSU (2)
(5) New Mexico over (4) Louisville (2.7)
(6) Baylor over (2) Duke (3)

The FSU and Duke upsets can no longer happen, since Texas and Duke both lost in the first round.  The New Mexico upset did not happen.  The other two games have not yet been played.

We can also look at (say) the most likely upset at each seed position.  These were:

(16) UNC-Asheville vs. (1) Syracuse (16.1)
(15) Lehigh vs. (2) Duke (12.8)
(14) Belmont vs. (3) Georgetown (6.3)
(13) Ohio vs. (4) Michigan (7.9)
(12) Cal/USF over (5) Temple (1.4)
(11) Texas over (6) Cincinnati (-0.6)
(10) Purdue over (7) St. Mary's (3.3)
(9) UConn over (8) Iowa State (3.6)

Again, the PM got 50% correct.

Of course, the PM also missed a number of upsets:

(12) VCU over (5) Wichita St. (9.6)
(10) Xavier over (7) Notre Dame (7.7)
(15) Norfolk St. over (2) Missouri (23.2)
(11) NC State over (3) Georgetown (5.4)

The Norfolk State win really stands out here as the outlier -- it was at least twice as unlikely as the Duke-Lehigh upset.  I don't have the statistic handy, but 23-point upsets have to be rarer than 1 in 1,000 historically.  (The beating Norfolk St. took in the next round is indicative of how anomalous the first round upset was.)  VCU was a darling upset pick for many, in part due to their Cinderella status last year.  This year's VCU team was considerably weaker, and the win over Wichita State was another very unlikely result.  The Georgetown upset was the least surprising.  The 5-point differential is well within the ~10-point error margin of the PM's predictions.
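As a rough illustration of how a predicted MOV and an error margin combine into an upset chance, here is a minimal sketch that treats the PM's prediction error as normally distributed with a standard deviation of 10 points.  That distribution, the `SIGMA` constant and the `upset_probability` function are all assumptions made for the example, not how the PM actually measures its error, and the result is a model-implied chance rather than a historical upset rate.

```python
import math

# Assumption for illustration only: prediction error ~ Normal(0, SIGMA).
SIGMA = 10.0


def upset_probability(predicted_mov, sigma=SIGMA):
    """Chance that the actual margin falls below zero, i.e. the favorite loses."""
    z = -predicted_mov / sigma
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Predicted MOVs for three of the upsets discussed in this post.
for label, mov in [
    ("NC State over Georgetown", 5.4),
    ("Lehigh over Duke", 12.8),
    ("Norfolk St. over Missouri", 23.2),
]:
    print(f"{label}: predicted MOV {mov:.1f}, upset chance {upset_probability(mov):.3f}")
```

Under this assumed model a larger predicted MOV always maps to a smaller upset chance, so the ordering of the three games matches the discussion above; the exact numbers depend entirely on the choice of `SIGMA`.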

Overall, I give the PM a very positive grade for its upset picks.  It's clearly able to identify games where upsets are likely.  I may have to work on how it selects upsets, though.  There isn't a strong correlation between the magnitude of MOV and the likelihood of upset when MOV is under about 6 points, so it may not make sense to pick the games with the lowest MOVs.  It may make more sense to pick upsets based upon other factors.

1 comment:

  1. Upsets would seem more likely when one or both teams have greater variance in their performance. This year, for example, Kansas was dominant, but they also lost badly 3 or 4 times to teams that were not good. Their performance is more variable, so they would seem more likely to be an upset candidate.
