Monday, February 16, 2009

Why I look at the numbers

Michael Lewis, author of the seminal book "Moneyball", recently penned a piece for the NY Times on under-appreciated basketball player Shane Battier. It's a long, involved eight-pager that discusses the "new stats" movement in major sports in general, and provides a few quotables on why I, and some of the more numerically inclined bloggers on whom I rely so much, employ seemingly foreign and arcane figures and percentages to discuss and analyze the sport. A taste:

In 2005, the Houston Rockets’ owner, Leslie Alexander, decided to hire new management for his losing team and went looking specifically for someone willing to rethink the game. “We now have all this data,” Alexander told me. “And we have computers that can analyze that data. And I wanted to use that data in a progressive way. When I hired Daryl, it was because I wanted somebody that was doing more than just looking at players in the normal way. I mean, I’m not even sure we’re playing the game the right way.”

The virus that infected professional baseball in the 1990s, the use of statistics to find new and better ways to value players and strategies, has found its way into every major sport. Not just basketball and football, but also soccer and cricket and rugby and, for all I know, snooker and darts — each one now supports a subculture of smart people who view it not just as a game to be played but as a problem to be solved. Outcomes that seem, after the fact, all but inevitable — of course LeBron James hit that buzzer beater, of course the Pittsburgh Steelers won the Super Bowl — are instead treated as a set of probabilities, even after the fact. The games are games of odds. Like professional card counters, the modern thinkers want to play the odds as efficiently as they can; but of course to play the odds efficiently they must first know the odds. Hence the new statistics, and the quest to acquire new data, and the intense interest in measuring the impact of every little thing a player does on his team’s chances of winning. In its spirit of inquiry, this subculture inside professional basketball is no different from the subculture inside baseball or football or darts.


The article is also interesting because basketball resembles hockey far more than, say, baseball does. Baseball is more static than the other two, and the dynamic nature of hockey and basketball has created challenges in teasing apart individual effects on winning (and has often been used as a bludgeon by anti-new-stats folks to denounce efforts to understand the game quantitatively):

There are other things Morey has noticed too, but declines to discuss as there is right now in pro basketball real value to new information, and the Rockets feel they have some. What he will say, however, is that the big challenge on any basketball court is to measure the right things. The five players on any basketball team are far more than the sum of their parts; the Rockets devote a lot of energy to untangling subtle interactions among the team’s elements. To get at this they need something that basketball hasn’t historically supplied: meaningful statistics. For most of its history basketball has measured not so much what is important as what is easy to measure — points, rebounds, assists, steals, blocked shots — and these measurements have warped perceptions of the game. (“Someone created the box score,” Morey says, “and he should be shot.”) How many points a player scores, for example, is no true indication of how much he has helped his team. Another example: if you want to know a player’s value as a ­rebounder, you need to know not whether he got a rebound but the likelihood of the team getting the rebound when a missed shot enters that player’s zone.

There's no question that hockey analysis is just taking its first few stumbling steps in this direction. We're only now starting to learn which stats are truly indicative (and predictive) of winning and which are merely effects of other processes (such as chance). When I started this thing a few years ago, I was introduced to the parsing of stats by situation (ES, PP, PK) and the tracking of offensive efficiency (i.e., rates) rather than summing totals. Recently, the focus has shifted to percentages, the role of chance and variance over spans of time, and how those can result in the over- or undervaluation of a player.
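To make the "rates rather than totals" idea concrete, here's a minimal sketch in Python. Everything in it is made up for illustration: the two players, their point totals, their ice time, and the helper names `per_60` and `rates` are all hypothetical, and points per 60 minutes is just one common way to express a rate.

```python
from dataclasses import dataclass


@dataclass
class SituationLine:
    situation: str   # "ES" (even strength), "PP" (power play) or "PK" (penalty kill)
    points: int      # points scored in that situation
    minutes: float   # ice time in that situation


def per_60(points: int, minutes: float) -> float:
    """Points per 60 minutes of ice time; 0.0 if the player never played."""
    if minutes <= 0:
        return 0.0
    return points * 60.0 / minutes


def rates(lines: list[SituationLine]) -> dict[str, float]:
    """Scoring rate for each situation, rounded for display."""
    return {l.situation: round(per_60(l.points, l.minutes), 2) for l in lines}


# Hypothetical season lines (invented numbers, not real players).
player_a = [SituationLine("ES", 30, 1200.0), SituationLine("PP", 25, 300.0)]
player_b = [SituationLine("ES", 40, 1200.0), SituationLine("PP", 5, 60.0)]

total_a = sum(l.points for l in player_a)
total_b = sum(l.points for l in player_b)

print(total_a, rates(player_a))
print(total_b, rates(player_b))
```

In this invented case Player A wins on raw totals, but Player B scores at a better clip at even strength and at the same clip on the power play; he simply gets far less power play time. That's the sort of thing summed totals hide.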

We aren't there yet and some mistakes are going to be made along the way, I'm sure, but my education in the quantitative methods of understanding hockey has actually informed and improved the way I watch the games now, I think. Perhaps what might be most interesting in the long run, however, is how methods of analysis will converge across different (yet similar) sports:

One well-known statistic the Rockets’ front office pays attention to is plus-minus, which simply measures what happens to the score when any given player is on the court. In its crude form, plus-minus is hardly perfect: a player who finds himself on the same team with the world’s four best basketball players, and who plays only when they do, will have a plus-minus that looks pretty good, even if it says little about his play. Morey says that he and his staff can adjust for these potential distortions — though he is coy about how they do it — and render plus-minus a useful measure of a player’s effect on a basketball game. A good player might be a plus 3 — that is, his team averages 3 points more per game than its opponent when he is on the floor. In his best season, the superstar point guard Steve Nash was a plus 14.5. At the time of the Lakers game, Battier was a plus 10, which put him in the company of Dwight Howard and Kevin Garnett, both perennial All-Stars. For his career he’s a plus 6. “Plus 6 is enormous,” Morey says. “It’s the difference between 41 wins and 60 wins.” He names a few other players who were a plus 6 last season: Vince Carter, Carmelo Anthony, Tracy McGrady.

Corsi, anyone?
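
For anyone who hasn't run into it, Corsi is the same bookkeeping applied to shot attempts instead of points: attempts for minus attempts against while a player is on the ice. Below is a rough sketch of the crude, unadjusted version in Python. The stint data, the player labels and the function name `raw_differential` are all invented for illustration, and this is emphatically not whatever adjustment Morey's staff applies, which the article doesn't describe.

```python
from collections import defaultdict


def raw_differential(stints):
    """Sum (events for - events against) over every stint a player was on for.

    `stints` is an iterable of (players_on, events_for, events_against).
    The events can be points (basketball plus-minus) or shot attempts (Corsi).
    """
    totals = defaultdict(int)
    for players_on, events_for, events_against in stints:
        for player in players_on:
            totals[player] += events_for - events_against
    return dict(totals)


# Hypothetical stints with invented numbers.
stints = [
    ({"Star", "Passenger"}, 12, 5),   # Passenger only ever plays with Star
    ({"Star", "Grinder"}, 9, 6),
    ({"Grinder"}, 4, 8),              # Grinder also gets the tough minutes alone
]

print(raw_differential(stints))
```

With these made-up numbers, Passenger comes out at +7 while Grinder sits at -1, even though the data says nothing about which of them actually played better. That's the teammate distortion the article describes, and it's why the raw number needs adjusting before it means much.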