This was something we used a few years ago in one of our research projects and in the paper *Difficulty of selecting among multilevel models using predictive accuracy* (with Wei Wang), but didn't follow up on. I think it's such a great idea that I want to share it with all of you.

We were applying a statistical method to survey data, and we had one survey to work with. So far, so usual: a real-data application, but just one case. Our trick was to evaluate the method separately on 71 different survey responses, taking each in turn as the outcome.

So now we have 71 cases, not just 1. But it takes very little extra work because it’s the same survey and the same poststratification variables each time.
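To make the idea concrete, here is a minimal sketch of the loop. Everything in it is assumed for illustration: the cell structure, the column names, and the simple poststratified cell-mean estimator, which stands in for whatever model is actually being evaluated. The point is only that the survey frame and poststratification cells are built once, and only the outcome column changes across the 71 runs.

```python
import random

random.seed(0)

# Shared survey frame: the same respondents and the same
# poststratification cells for every analysis.
N_RESPONDENTS = 500
N_OUTCOMES = 71
cells = ["18-29", "30-44", "45-64", "65+"]  # hypothetical age groups
respondents = [random.choice(cells) for _ in range(N_RESPONDENTS)]

# Population cell shares (assumed known, as in poststratification).
pop_share = {"18-29": 0.20, "30-44": 0.25, "45-64": 0.35, "65+": 0.20}

# 71 binary outcomes, all answered by the same respondents.
outcomes = {
    f"q{j}": [random.random() < 0.5 for _ in range(N_RESPONDENTS)]
    for j in range(N_OUTCOMES)
}

def poststratified_estimate(cell_of, y, pop_share):
    """Cell means reweighted to population shares.

    A stand-in for the real model: any method that maps
    (cells, outcome) to an estimate could be dropped in here.
    """
    by_cell = {c: [] for c in pop_share}
    for c, yi in zip(cell_of, y):
        by_cell[c].append(yi)
    return sum(
        share * (sum(by_cell[c]) / len(by_cell[c]))
        for c, share in pop_share.items()
    )

# One evaluation per question; the frame-building work is never repeated.
estimates = {
    q: poststratified_estimate(respondents, y, pop_share)
    for q, y in outcomes.items()
}
print(len(estimates))
```

Swapping in a different survey would mean rebuilding `respondents`, `cells`, and `pop_share` from scratch; swapping in a different question is just a different entry of `outcomes`, which is why the 71-outcome corpus is so cheap.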

In contrast, applying our method to 71 different surveys would be a lot of work, as it would require wrangling each dataset, dealing with different question wordings and codings, etc.

The corpus formed by these 71 questions is not *quite* the same as a corpus of 71 different surveys. For one thing, the respondents are the same, so if the particular sample happens to overrepresent Democrats, or Republicans, or whatever, then that bias will carry through all 71 analyses. But this problem is somewhat mitigated if the 71 responses are on different topics, so that nonrepresentativeness in any particular dimension won't be relevant for all the questions.