(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)
Jordan Ellenberg writes:
Lots of people sharing this today.
Isn’t this exactly the kind of situation where they should have done some kind of shrinkage towards the national mean, as in that thing you wrote about kidney cancer rates by county? i.e. you see, just as you might expect, the extreme values of “proportion of people who said they were gay” are disproportionately taken by small states.
My reply: If I don’t have the individual-level survey data that would allow me to do full-scale Mister P (multilevel regression and poststratification), then yes, I’d fit a multilevel model to the state-level averages. I wouldn’t just partially pool toward the national mean, though; I think it would make sense to include some state-level predictors, so that each state is pulled toward what the regression predicts for it.
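Here is a minimal sketch of what partial pooling of state-level averages toward a regression on state-level predictors could look like. It is not the analysis behind any published numbers. The function name `partial_pool`, the predictor column, the group-level standard deviation `tau`, and all of the input values are made up for illustration, and a real analysis would estimate `tau` from the data (for example with a full multilevel model) rather than fix it by hand.

```python
import numpy as np

def partial_pool(y, se, X, tau):
    """Shrink noisy state averages toward a regression on state-level predictors.

    y   : observed state-level proportions, shape (J,)
    se  : their standard errors, shape (J,)
    X   : state-level predictors with an intercept column, shape (J, K)
    tau : ASSUMED sd of true state values around the regression line
    """
    y = np.asarray(y, dtype=float)
    se = np.asarray(se, dtype=float)
    X = np.asarray(X, dtype=float)
    # Weighted least squares: each state's total variance is se^2 + tau^2.
    w = 1.0 / (se**2 + tau**2)
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    fit = X @ beta
    # Weight on the state's own data: close to 1 when the estimate is
    # precise (big sample), close to 0 when it is noisy (small sample).
    lam = tau**2 / (tau**2 + se**2)
    pooled = lam * y + (1.0 - lam) * fit
    return pooled, fit, lam

# Hypothetical inputs, NOT real survey results: four "states" of
# increasing sample size, with one invented state-level predictor.
n = np.array([200, 500, 5000, 20000])
y = np.array([0.051, 0.019, 0.034, 0.036])
se = np.sqrt(y * (1 - y) / n)  # assumes simple random sampling
X = np.column_stack([np.ones(4), [0.9, 0.3, 0.6, 0.7]])

pooled, fit, lam = partial_pool(y, se, X, tau=0.005)
for row in zip(n, y, pooled, lam):
    print("n=%6d  raw=%.3f  pooled=%.3f  weight on own data=%.2f" % row)
```

The small states get little weight on their own data and move most of the way to the regression prediction, while the large states barely move. That is the pattern you would expect if the extreme raw values are mostly noise.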
In any case, I think it’s tacky to report poll numbers to fractional percentage points. That kind of precision simply isn’t there.
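To put a rough number on that, take purely hypothetical values: a state sample of n = 1000 respondents and an estimated proportion of p = 0.035, under the simplifying assumption of simple random sampling. The standard error is then

$$
\mathrm{se}(\hat p) = \sqrt{\frac{\hat p\,(1-\hat p)}{n}} = \sqrt{\frac{0.035 \times 0.965}{1000}} \approx 0.0058,
$$

or about 0.6 percentage points, so the uncertainty is already at the scale of the first decimal place. Real surveys with weighting and design effects will typically be noisier than this.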
P.S. More discussion of variances of large and small states in the comments.