(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)
Zach Shahn saw this and writes:
I just heard a talk by Peter Bartlett about model selection in “unlimited” data situations that essentially addresses this curve.
He talks about the problem of model selection given a computational budget (rather than given a sample size). You can either use your computational budget to get more data or fit a more complex model. He shows that you can get oracle inequalities for model selection algorithms under this paradigm (as long as the candidate models are nested).
My reaction: I can’t follow all the details but it looks cool! This is what they should be teaching in theoretical statistics class, instead of sufficient statistics and the Neyman-Pearson lemma and all that other old stuff.
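To make the budget idea concrete, here is a toy sketch of the tradeoff Zach describes. This is not Bartlett's algorithm and it proves nothing about oracle inequalities. It only shows what it means to spend a fixed budget on either more data or a more complex model. Everything in it is an assumption made for illustration: the sine-curve target, the noise level, the nested family of piecewise-constant fits on 1, 2, 4, ... bins, and above all the cost model, which charges one unit per sample per bin. The holdout set used for scoring is not charged against the budget, which is another simplification.

```python
import math
import random


def true_f(x):
    # Hypothetical target function, chosen only for illustration.
    return math.sin(2 * math.pi * x)


def sample(n, rng):
    """Draw n noisy observations (x, y) with x uniform on [0, 1)."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, true_f(x) + rng.gauss(0.0, 0.3)))
    return data


def fit_bins(data, k):
    """Piecewise-constant regression: the mean of y in each of k equal-width bins."""
    sums = [0.0] * k
    counts = [0] * k
    for x, y in data:
        j = min(int(x * k), k - 1)
        sums[j] += y
        counts[j] += 1
    overall = sum(sums) / max(sum(counts), 1)
    # Empty bins fall back to the overall mean.
    return [sums[j] / counts[j] if counts[j] else overall for j in range(k)]


def mse(model, data):
    """Mean squared prediction error of a fitted bin model on data."""
    k = len(model)
    return sum((y - model[min(int(x * k), k - 1)]) ** 2 for x, y in data) / len(data)


def select_under_budget(budget, bin_counts, rng, holdout_size=2000):
    """For each complexity k, buy as many samples as the budget allows,
    fit, and score on a shared holdout set. Returns the best k and all results."""
    holdout = sample(holdout_size, rng)
    results = {}
    for k in bin_counts:
        n = budget // k  # assumed cost model: one unit per sample per bin
        if n < 1:
            continue
        model = fit_bins(sample(n, rng), k)
        results[k] = (n, mse(model, holdout))
    best_k = min(results, key=lambda k: results[k][1])
    return best_k, results


if __name__ == "__main__":
    rng = random.Random(0)
    best_k, results = select_under_budget(4096, [1, 2, 4, 8, 16, 32, 64], rng)
    for k, (n, err) in sorted(results.items()):
        print(f"bins={k:3d}  samples={n:5d}  holdout MSE={err:.4f}")
    print("selected number of bins:", best_k)
```

With the budget fixed, the simplest model gets the most data but cannot follow the curve, and the most complex model can follow the curve but gets too little data to estimate it. The selected model sits somewhere in between, and where exactly depends on the assumed cost model.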
Zach also asks:
I have a question about political science. I always hear that the direction of the economy is one of the best predictors of election outcome. What’s your thinking about the causal mechanism(s) behind the success of economic trend indicators as predictors? Are voters reacting to what they hear on the news about how the economy is doing? Or are they reacting to an observed improvement in their quality of life and the quality of life of those around them? Or both or neither?
My response: Many people have written about this but I’m not up on the latest literature. Here’s a discussion with Doug Hibbs, Larry Bartels, and Jim Campbell on related issues. My best answer to your question is that different voters are reacting to different things, and that personal experiences count as news to people too. After the 1992 election some Republicans tried to make the case that George H. W. Bush lost reelection because the economic numbers were bad, even though the actual economy was improving. But it’s tough to separate the effect of the economic news from the effect of the economy itself.