# Power analysis and NIH-style statistical practice: What’s the implicit model?

So. Following up on our discussion of “the 80% power lie,” I was thinking about the implicit model underlying NIH’s 80% power rule.

Several commenters pointed out that, to have your study design approved by NIH, it’s not required that you demonstrate that you have 80% power for real; what’s needed is to show 80% power conditional on an effect size of interest, and also you must demonstrate that this particular effect size is plausible. On the other hand, in NIH-world the null hypothesis could be true: indeed, in some way the purpose of the study is to see, well, not if the null hypothesis is true, but if there’s enough evidence to reject the null.

So, given all this, what’s the implicit model? Let “theta” be the parameter of interest, and suppose the power analysis is performed assuming theta = 0.5, say, on some scale.

My guess, based on how power analysis is usually done and on how studies actually end up, is that in this sort of setting the true average effect size is more like 0.1, with a lot of variation: perhaps it’s -0.1 in some settings and +0.3 in others.
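To get a feel for what that gap means, here’s a rough sketch in Python. The assumptions are mine, not NIH’s: a normally distributed estimate, a two-sided test at the 5% level, and a standard error chosen so that power is exactly 80% at the design value theta = 0.5. The helper names `se_for_power` and `power_at` are made up for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def se_for_power(theta_design, power=0.80, alpha=0.05):
    """Standard error that gives the stated power at theta_design
    (assumed: normal estimate, two-sided test, ignoring the tiny
    chance of rejecting in the wrong direction)."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return theta_design / (z_alpha + z_power)

def power_at(theta, se, alpha=0.05):
    """Probability of a two-sided rejection when the true effect is theta.
    This counts rejections in either direction, including wrong-sign ones."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    shift = theta / se
    return (1 - Z.cdf(z_alpha - shift)) + Z.cdf(-z_alpha - shift)

se = se_for_power(0.5)  # design assumption from the power analysis
for theta in (0.5, 0.3, 0.1, -0.1):
    print(f"theta = {theta:+.1f}: power = {power_at(theta, se):.2f}")
```

Under these assumptions, a study designed for 80% power at theta = 0.5 has less than 10% power if the true effect is 0.1, and some of those rejections would have the wrong sign.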

But forget about what I think. Let’s ask: what does the NIH think, or what distribution for theta is implied by NIH’s policies and actions?

To start with, if the effect is real, we’re supposed to think that theta = 0.5 is a conservative estimate. So maybe we can imagine some distribution of effect sizes like normal with mean 0.75, sd 0.25, so that the effect is probably larger than the minimal level specified in the power analysis.

Next, I think there’s some expectation that the effect is probably real; let’s say there’s at least a 50% chance of there being a large effect as hypothesized.

Finally, the NIH accepts that the researcher’s model could’ve been wrong, in which case theta is some low value. Not exactly zero, but maybe somewhere in a normal distribution with mean 0 and standard deviation 0.1, say.

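Put this together and you get a bimodal distribution: roughly a 50/50 mixture of a normal with mean 0 and sd 0.1 and a normal with mean 0.75 and sd 0.25.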
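Here’s a minimal sketch of that implied distribution in Python, taking the numbers above literally. The 50/50 weight is an assumption (the text only says “at least” 50%), and the names `implied_density` and `implied_cdf` are just for this illustration.

```python
from statistics import NormalDist

p_real = 0.5                               # assumed chance the effect is "real"
real = NormalDist(mu=0.75, sigma=0.25)     # effect sizes if the hypothesis holds
null = NormalDist(mu=0.0, sigma=0.1)       # effect sizes if the model was wrong

def implied_density(theta):
    """Density of the two-component mixture at theta."""
    return p_real * real.pdf(theta) + (1 - p_real) * null.pdf(theta)

def implied_cdf(theta):
    """Cumulative probability of the mixture up to theta."""
    return p_real * real.cdf(theta) + (1 - p_real) * null.cdf(theta)

# Density at the two modes and at a point in the valley between them
for theta in (0.0, 0.35, 0.75):
    print(f"density at theta = {theta:.2f}: {implied_density(theta):.3f}")

# Probability that theta falls in the in-between range 0.2 to 0.5
middle = implied_cdf(0.5) - implied_cdf(0.2)
print(f"P(0.2 < theta < 0.5) = {middle:.3f}")
```

With these particular numbers the density dips well below both peaks around theta = 0.35, and less than a tenth of the probability sits between 0.2 and 0.5. Different weights or standard deviations would change the details but not the two-humped shape.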

And this doesn’t typically make sense, that something would either have a near-zero, undetectable effect, or a huge effect with little possibility of anything in between. But that’s what the NIH is implicitly assuming, I think.