Thursday, November 08, 2012

Math! Numbers! Signal! Noise!

There's been a fair amount of gloating about how Nate Silver's election models were vindicated by the actual results of the election.  Silver's defenders have claimed this is a victory for numbers, data, math, science, and rationality over intuition and sentimentality.

This plays nicely into the current preferred narratives regarding the parties.  The GOP is home to creationists and climate change deniers, so the Democrats see themselves as the defenders of science, and Republicans as the troglodytes favoring what they wish were true over what has been empirically demonstrated to be the case.  Republicans criticizing a complicated election model predicting an Obama victory aligns perfectly with this vision.

Of course, the parties don't play these roles in all cases.  For example, Republicans are more likely to push for evaluating schools and teachers based on test scores, whereas Democrats are inclined to be more skeptical.  This isn't (necessarily) because Democrats are opposed to numerical analysis, but because they believe that standardized tests do not capture the whole of a school's or teacher's effectiveness, that student achievement is influenced by a number of factors, and that it is unfair to base our evaluation of teachers and schools on outcomes they may have little or no control over.

This echoes the more thoughtful criticisms of Nate Silver's models.  Yes, there were some suggesting that big rally crowds were a more important indicator than all the polling data.  But others challenged whether the inputs were valid -- specifically, whether the polls that were the main input to the model were correctly accounting for party ID.  (Not having a particular dog in this fight, I didn't follow the debate closely, and am not inclined to relive it now, but my impression was that this was the main beef.)  The critics turned out to be wrong about that.  But their error was not a refusal to accept data, numbers, and empirical evidence; it was a mistaken judgment about which inputs mattered, compounded by a degree of wishful thinking.

As has been noted, Silver worked in baseball sabermetrics* before modeling elections.  That all baseball teams now incorporate statistical analysis into their decision making is cited as vindication of those methods.

And yes, it is true that the validity of statistical analysis is now widely accepted.  But that doesn't mean there weren't missteps along the way.  The Oakland Athletics were the first to adopt this model, but weren't able to parlay that early advantage into world championships.  The most successful franchises of the past decade include some, notably the Boston Red Sox, that embraced these tactics, and others that didn't.

In any case, in his 2000 Baseball Abstract, Bill James included a listing of the top 50 players at each position using his newly developed Win Shares system.  James started by explaining the system, which included a "subjective component" that allowed him to fudge the numbers a bit for players whose impact on the game may have escaped the statistics.  An obvious application was African American players who arrived in the majors shortly after Jackie Robinson and likely lost a good chunk of their careers to segregation.

James pointed out that this was not the truly subjective portion of the formula.  The truly subjective portion was James coming up with the formula for the ostensibly objective parts.  This necessarily included his assumptions about which statistics mattered in evaluating a player's performance, which did not, and the relative weights of those that did.
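James's point can be made concrete with a toy sketch.  This is an invented composite rating, not the actual Win Shares formula: two analysts can each build a perfectly "objective" weighted formula, yet reach different verdicts on the same player, because the choice of weights is itself a judgment call.

```python
# Toy illustration -- an invented composite rating, NOT Bill James's
# actual Win Shares formula. The "objective" part of any such rating
# rests on subjective choices: which statistics count, and how much.

def rate_player(stats, weights):
    """Weighted sum of a player's statistics under a chosen weighting."""
    return sum(weights[stat] * value for stat, value in stats.items())

player = {"obp": 0.400, "slg": 0.550, "defense": 0.70}

# Two equally "empirical" formulas that embody different judgments:
hitting_heavy = {"obp": 3.0, "slg": 2.0, "defense": 0.5}
defense_heavy = {"obp": 2.0, "slg": 1.5, "defense": 2.5}

print(rate_player(player, hitting_heavy))  # ~2.65
print(rate_player(player, defense_heavy))  # ~3.38 -- same player, different verdict
```

Both formulas are pure arithmetic once the weights are fixed; the disagreement lives entirely in the weights.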

I think this is an important lesson that I hope we don't forget in light of Silver's vindication.

Just because something is a numerical model doesn't mean it must be blindly accepted by all wishing to claim to be rational.  I could have constructed a model of the presidential election based entirely on the height of the candidates, or their weight, or the populations of their home states, or any other tangential data points (even the size of the crowds at rallies the week before the election).  I could have done extensive analysis of how these numbers correlate with election results and come up with a predictive numerical answer: a probability for each candidate's victory.
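A minimal sketch of such a junk model (every number below is invented for illustration): an ordinary least-squares fit of vote share against a candidate's height advantage produces a real regression line and a real "prediction," even though the input is tangential.

```python
# A deliberately silly "model" -- all data below is made up.
# Regress past two-party vote share on the winner's height advantage
# (in inches), then "predict" a new race. The arithmetic is perfectly
# empirical; the input is noise dressed up as signal.

height_edge = [2, -1, 4, 0, 3, -2]                 # invented height gaps
vote_share = [52.1, 49.4, 53.0, 50.2, 51.7, 48.9]  # invented outcomes (%)

n = len(height_edge)
mean_x = sum(height_edge) / n
mean_y = sum(vote_share) / n

# Ordinary least-squares slope and intercept.
slope = sum((x - mean_x) * (y - mean_y)
            for x, y in zip(height_edge, vote_share))
slope /= sum((x - mean_x) ** 2 for x in height_edge)
intercept = mean_y - slope * mean_x

predicted = intercept + slope * 1.5   # candidate 1.5 inches taller
print(f"'predicted' vote share: {predicted:.1f}%")
```

Nothing in the arithmetic flags the input as irrelevant; that judgment has to come from outside the model.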

But even though such a model would be based on data and evidence, it would be close to worthless, and criticizing it would not be synonymous with rejecting rationality and science.  The current practice of giving teachers raises based on seniority and certifications is empirical in its own way; it's just based on data that many believe have little to do with effective teaching.

In fact, many would claim that the opinion of an expert observer would likely be superior.  My (non-empirical!) suspicion is that the continuum of accuracy interleaves models of varying quality with opinions from people of varying expertise.  The best scout is better than the worst statistical model, and the best statistical model is better than the worst scout.

Nate Silver's success isn't grounded merely in his willingness to base his predictions on data and math; it lies in selecting the data points that are relevant, rejecting those that are not, and weighting the relevant ones appropriately (hence the title of his book, "The Signal and the Noise").

The model is based on numbers.  But which numbers matter is determined by a person, and can be subject to feedback and refinement. It is unlikely someone would get this exactly right the first time. It is both valid and proper to challenge the assumptions of certain models. In baseball, the test is the game. In this case, the test is the election. Silver passed; his critics failed.

If the lesson we draw from this is that we must trust all conclusions from models based on data, that's the wrong lesson.  Rather, we should credit Silver and others for basing their models on the right data.




* I have my own reservations about sabermetrics.  I don't have a problem with it informing team management decisions, but I reject the notion that fan conversations, and even All-Star elections and postseason awards, must be statistically based.  Being a sports fan is supposed to be fun, not a grim business analysis.  If a guy wins the Triple Crown and leads his team to the World Series, give him the MVP, for crying out loud, even if RBI is a team-dependent statistic and batting average is not a good proxy for offensive effectiveness.

