## Wednesday, August 17, 2011

### Possessions, Part 3

In a comment on the last posting, ProbablePicks suggested trying to predict the number of possessions in a game by a regression on the average number of possessions for both teams in their previous games. That was an excuse to add the ability to create averaged stats to my processing framework, so I put that in, debugged it for a while, and then created the suggested regression. Here is its performance:

| Predictor | Error |
|---|---|
| Possessions (67) | 6.30 |
| Possessions (Split Model) | 5.20 |
| Possessions (Regression on Averages) | 5.10 |

It does a little better than the Split Model. The regression equation looks like this:

    Poss = 0.665*HPossave + 0.620*APossave - 20.540

This weights the home team slightly more (about 7%) than the away team. I speculated on this possibility with the Split Model but didn't see a performance improvement in that case.
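To make the equation concrete, here is a minimal Python sketch that applies it. The coefficients are the ones from the equation above; the function and variable names are hypothetical and are not taken from my processing framework.

```python
# Coefficients copied from the regression equation above.
HOME_COEF = 0.665
AWAY_COEF = 0.620
INTERCEPT = -20.540


def predict_possessions(h_poss_avg, a_poss_avg):
    """Predict game possessions from each team's average possessions
    in its previous games (hypothetical helper, for illustration only)."""
    return HOME_COEF * h_poss_avg + AWAY_COEF * a_poss_avg + INTERCEPT


# Relative extra weight on the home average compared to the away average.
home_weight_edge = HOME_COEF / AWAY_COEF - 1.0
```

For two teams that both average 67 possessions, `predict_possessions(67.0, 67.0)` comes out at about 65.6, and `home_weight_edge` is about 0.073, which is where the "about 7%" figure comes from.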

One speculation I've had is that possessions might be harder to predict in close games. There will usually be more fouling and aggressive defense, which might create turnovers and additional possessions. We can (sort of) look at that by restricting the test to games where the MOV (margin of victory, positive when the home team wins) was beyond some cutoff:

| Predictor | Error |
|---|---|
| Possessions (Regression on Averages) | 5.10 |
| Possessions (MOV > 8) | 4.97 |
| Possessions (MOV > 12) | 4.77 |
| Possessions (MOV < -8) | 5.17 |
| Possessions (MOV < -12) | 5.19 |

There's an interesting result here: we can do a better job predicting possessions when the home team is winning a blowout (4.77 for MOV > 12 versus 5.10 overall), but we do slightly worse when the away team is winning a blowout (5.19 for MOV < -12). I'd be inclined to think the improvement was because games with MOV > 12 (say) are going to skew to the top end of the possessions range anyway, and the compression of the range of possible results will reduce the error. But that's contradicted by the results for away team blowouts, so there's presumably some other explanation. Of course, for predictive purposes this doesn't matter because we won't know the MOV anyway, but it's an intriguing result nonetheless.
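For anyone who wants to try the same kind of breakdown, here is a rough Python sketch of measuring the error on MOV-filtered subsets. It makes two assumptions that are not from the results above: it uses RMSE as the error measure, and the three games in the list are made-up numbers that only show the data layout. The function and field names are hypothetical.

```python
import math


def predict_possessions(h_poss_avg, a_poss_avg):
    # Regression equation from this post.
    return 0.665 * h_poss_avg + 0.620 * a_poss_avg - 20.540


def rmse_for(games, keep):
    """RMSE of the prediction over games whose MOV satisfies keep(mov).

    RMSE is an assumption here; swap in another error measure if needed.
    Returns None when no games pass the filter.
    """
    errors = [
        predict_possessions(g["h_avg"], g["a_avg"]) - g["poss"]
        for g in games
        if keep(g["mov"])
    ]
    if not errors:
        return None
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Made-up games for illustration only; MOV is home score minus away score.
games = [
    {"h_avg": 68.0, "a_avg": 66.0, "poss": 70, "mov": 15},
    {"h_avg": 64.0, "a_avg": 65.0, "poss": 61, "mov": -14},
    {"h_avg": 70.0, "a_avg": 69.0, "poss": 66, "mov": 3},
]

home_blowouts = rmse_for(games, lambda mov: mov > 12)
away_blowouts = rmse_for(games, lambda mov: mov < -12)
```

Passing the filter as a function keeps the cutoffs (8, 12, or anything else) out of the error calculation, so the same code covers both home and away blowouts.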