Friday, March 4, 2016

Scoring the Kaggle Contest

In a previous post, I talked briefly about whether competitors in the Kaggle March Madness contest should "gamble" with their entries.  The short answer is "yes" -- if your goal is to win money, then your best strategy is to gamble with at least some of your game predictions.  (How many games you should gamble with is an interesting question for another time.)  In my opinion, that's a sign that the contest is broken.  Rather than testing who can make the best predictions about the NCAA Tournament, the contest is testing who can formulate the best meta-strategy for winning the contest.

So, is it possible to fix the contest so that the results more accurately identify the best predictions?

The log-loss scoring function asks competitors to provide a confidence in their predictions, and then scores them based upon how confident they were in each correct (or incorrect) prediction.  If you analyze this scoring approach, you find that the best strategy in the long run is for competitors to set their confidence in each prediction to exactly what they really believe the probability to be.  And that's exactly what you want for a fair and accurate scoring system.
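
To make that concrete, here is a minimal Python sketch of log-loss for a single game, along with a check that the expected (long-run average) score is best when the submitted confidence matches the true probability.  The function names and the 70% example are my own illustration, not anything taken from the contest's scoring code.

```python
import math

def log_loss(p, outcome):
    """Log-loss for one game.

    p       -- submitted probability that team A wins
    outcome -- 1 if team A actually won, 0 otherwise
    Lower is better.
    """
    eps = 1e-15                      # clip so log(0) never happens
    p = min(max(p, eps), 1 - eps)
    return -(outcome * math.log(p) + (1 - outcome) * math.log(1 - p))

def expected_log_loss(p_submitted, p_true):
    """Long-run average loss if team A really wins with probability p_true."""
    return (p_true * log_loss(p_submitted, 1)
            + (1 - p_true) * log_loss(p_submitted, 0))

# Hypothetical game: team A truly wins 70% of the time.
p_true = 0.70
candidates = [i / 100 for i in range(1, 100)]   # 0.01, 0.02, ..., 0.99
best = min(candidates, key=lambda p: expected_log_loss(p, p_true))
print(best)
```

Among the candidate confidences, the one with the lowest expected loss is the true probability itself, which is the "long run" property described above.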

The problem is that this contest is not a "long run".  In fact, it's anything but a long run -- there are only 63 games being scored.  That's a lot compared to predicting (say) just the Super Bowl, but for a contest like this it's not nearly long enough to ensure that true predictions are the best strategy.
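
A rough back-of-the-envelope calculation gives a feel for how short 63 games is.  The sketch below assumes, purely for illustration, that every game has a 70% favorite.  It compares the luck in an honest predictor's average log-loss over 63 games with the long-run penalty paid by a predictor who submits 65% for every game instead.  The numbers and names here are assumptions of mine, not contest data.

```python
import math

def loss_mean_and_sd(p_submitted, p_true):
    """Mean and standard deviation of the per-game log-loss."""
    loss_if_win = -math.log(p_submitted)        # favorite wins
    loss_if_lose = -math.log(1 - p_submitted)   # favorite loses
    mean = p_true * loss_if_win + (1 - p_true) * loss_if_lose
    second_moment = p_true * loss_if_win ** 2 + (1 - p_true) * loss_if_lose ** 2
    return mean, math.sqrt(second_moment - mean ** 2)

N_GAMES = 63     # games scored in the contest
P_TRUE = 0.70    # assumed true win probability for every favorite

honest_mean, honest_sd = loss_mean_and_sd(P_TRUE, P_TRUE)
sloppy_mean, _ = loss_mean_and_sd(0.65, P_TRUE)

# Luck: standard deviation of the honest predictor's 63-game average.
noise = honest_sd / math.sqrt(N_GAMES)
# Skill: how much worse the 65% predictor is on average, per game.
skill_gap = sloppy_mean - honest_mean

print(f"noise in 63-game average: {noise:.4f}")
print(f"long-run skill gap:       {skill_gap:.4f}")
```

Under these assumptions the luck in a 63-game average is several times larger than the gap between the two predictors, which is why the better predictor is far from guaranteed to finish ahead.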

So, how can we fix the scoring to better reward true predictions?

The obvious fix of having the teams play a few thousand games is probably a non-starter.  But it does point towards the necessary condition:  We want the competitors to be making many choices instead of just 63.  My suggestion is to have the competitors predict the Margin of Victory (MOV) for each game, and score them on how close they get to the actual MOVs.  Now instead of making 63 binary predictions, the competitors are making 63 predictions with many more choices, and -- crucially -- they don't have control over how much they will win/lose on each prediction.
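
As a minimal sketch of what MOV scoring could look like, the Python below scores an entry by its mean absolute error against the actual margins.  The choice of mean absolute error is my assumption (squared error would be another reasonable option), and the three games are made-up numbers for illustration only.

```python
def mov_score(predicted, actual):
    """Mean absolute error between predicted and actual margins of victory.

    Margins are from one team's point of view (negative means that team
    lost).  Lower scores are better.
    """
    if len(predicted) != len(actual):
        raise ValueError("need one prediction per game")
    if not predicted:
        raise ValueError("need at least one game")
    total_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total_error / len(predicted)

# Hypothetical three-game example.
predicted_margins = [5.5, -3.0, 12.0]
actual_margins = [7, -1, 20]
score = mov_score(predicted_margins, actual_margins)
print(round(score, 2))
```

Note that there is no confidence knob here: the only way to change how much a game can cost you is to change the predicted margin itself.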

It should be obvious that this makes it more difficult to "gamble" for an improved score.  Consider last year, where Kentucky was viewed as an overwhelming favorite coming into the Tournament.  Under the current scoring system there was an easy and obvious "gambling" strategy -- predict that Kentucky would win every game and set your confidence in each of those games very high.  (And in fact, if Kentucky had won the championship game, a gambling strategy would probably have won the contest.)  However, under the Margin of Victory scoring system, how would you "gamble" to improve your chances of winning the contest?  It's hard to imagine any approach that would work better than submitting your actual best predictions.

The Kaggle contest is a fun diversion and I think the results have provided some interesting insight into predicting college basketball games.  But I think the contest would be improved by using a scoring system that more accurately identified the best predictor, and I'll continue my low-key lobbying efforts (*) for that change.

(*) Which consist entirely of posting something like this every year :-)


  1. To me the bigger issue is the reward structure. Because the rewards are concentrated in the top few spots and everything else is zero, there will always be incentives to gamble because trading off an increased chance at a lower finish for an increased chance at a higher finish yields higher expected value.

    While predicting more games or predicting MOV would help some, it still wouldn't eliminate the issue. Consider a contest of a million games where all contestants know the exact correct probabilities. It will obviously still be better to gamble (albeit to a lesser extent) to separate yourself from the pack.

    So to me the only real solution would be to pay out prizes proportionally to the scores for each entry. And that's just assuming that prize money is the only consideration. Factoring in the prestige of "winning", etc., could still lead to gambling being the optimal strategy.

  2. I think you're right that the reward structure is another distorting factor. Certainly I think this year's structure is a move in the right direction to address that.

  3. Right, although taking it to the extreme that I discussed, while encouraging true predictions, would be really boring!

  4. I like this idea. Though, I don't plan on entering regardless.