(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)
From my new article in the journal Epidemiology:
Sander Greenland and Charles Poole accept that P values are here to stay but recognize that some of their most common interpretations have problems. The casual view of the P value as posterior probability of the truth of the null hypothesis is false and not even close to valid under any reasonable model, yet this misunderstanding persists even in high-stakes settings (as discussed, for example, by Greenland in 2011). The formal view of the P value as a probability conditional on the null is mathematically correct but typically irrelevant to research goals (hence, the popularity of alternative—if wrong—interpretations). A Bayesian interpretation based on a spike-and-slab model makes little sense in applied contexts in epidemiology, political science, and other fields in which true effects are typically nonzero and bounded (thus violating both the “spike” and the “slab” parts of the model).
I find Greenland and Poole’s perspective to be valuable: it is important to go beyond criticism and to understand what information is actually contained in a P value. These authors discuss some connections between P values and Bayesian posterior probabilities. I am not so optimistic about the practical value of these connections. Conditional on the continuing omnipresence of P values in applications, however, these are important results that should be generally understood.
Greenland and Poole make two points. First, they describe how P values approximate posterior probabilities under prior distributions that contain little information relative to the data:
This misuse [of P values] may be lessened by recognizing correct Bayesian interpretations. For example, under weak priors, 95% confidence intervals approximate 95% posterior probability intervals, one-sided P values approximate directional posterior probabilities, and point estimates approximate posterior medians.
I used to think this way, too (see many examples in our books), but in recent years have moved to the position that I do not trust such direct posterior probabilities. Unfortunately, I think we cannot avoid informative priors if we wish to make reasonable unconditional probability statements. To put it another way, I agree with the mathematical truth of the quotation above, but I think it can mislead in practice because of serious problems with apparently noninformative or weak priors. . . .
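To make the point above concrete, here is a minimal sketch that is not taken from the article. It assumes the simplest possible setup: a single estimate that is normally distributed around the true effect with known standard error, and a normal prior centered at zero. The function names and the numbers (an estimate of 2 with standard error 1, a prior standard deviation of 1) are illustrative assumptions only. Under a flat prior, the one-sided P value and the posterior probability that the effect is negative are the same number; under an informative prior they can differ a lot.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def one_sided_p_value(estimate, se):
    """One-sided P value for the null theta = 0 against theta > 0,
    assuming estimate ~ Normal(theta, se^2) with known se."""
    return 1.0 - STD_NORMAL.cdf(estimate / se)


def prob_effect_negative(estimate, se, prior_sd=float("inf")):
    """Posterior Pr(theta < 0 | estimate) under a Normal(0, prior_sd^2) prior.

    prior_sd = inf corresponds to a flat (noninformative) prior.
    This is the standard conjugate normal-normal update.
    """
    data_precision = 1.0 / se**2
    prior_precision = 1.0 / prior_sd**2  # 0.0 when prior_sd is inf
    post_var = 1.0 / (data_precision + prior_precision)
    post_mean = post_var * data_precision * estimate  # prior mean is 0
    return STD_NORMAL.cdf(-post_mean / post_var**0.5)


if __name__ == "__main__":
    est, se = 2.0, 1.0  # hypothetical estimate, two standard errors from zero
    print("one-sided P value:            ", one_sided_p_value(est, se))
    print("Pr(theta < 0), flat prior:    ", prob_effect_negative(est, se))
    print("Pr(theta < 0), prior sd = 1.0:", prob_effect_negative(est, se, prior_sd=1.0))
```

With these made-up inputs, the P value and the flat-prior posterior probability are both about 0.023, while the prior with standard deviation 1 pulls the estimate toward zero and gives a probability of about 0.079 that the effect has the opposite sign. The choice of prior scale here is arbitrary; in a real analysis it would have to come from subject-matter knowledge about plausible effect sizes.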
I really like this article. At its center are three examples: “A P value that worked” (to dismiss a hypothesis of fraud in a local election), “A P value that was reasonable but unnecessary” (in our estimates of the effects of redistricting), and “A misleading P value” (from the notorious Daryl Bem). My statistical thinking has changed a lot in the past few years. More and more, I’ve been favoring informative priors, and in that way I’m going along with the entire statistical and machine learning communities, which have been moving away from least squares and toward regularization. Sander Greenland has been a big influence on my attitudes here, so it was great to have an opportunity to explore these ideas in the context of his paper, and in a journal where I’d never published before (#97).
Greenland and Poole’s original article does not appear to be available online, but here’s the abstract, and here’s their rejoinder to my discussion. One reason my article came out so well is that, after writing it, I sent it to Greenland, who pointed out a number of places where I’d misunderstood what he’d written. We went through a few iterations. It was annoying at first, but at any point I could’ve stopped and just published what I had. Instead I stuck it out, swallowed my pride, and ended up with something much improved.
Greenland is one tough town, indeed.