Daryl Bem and Arthur Conan Doyle

July 12, 2017

(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Daniel Engber wrote an excellent news article on the replication crisis, offering a historically-informed perspective similar to my take in last year’s post, “What has happened down here is the winds have changed.”

The only thing I don’t like about Engber’s article is its title, “Daryl Bem Proved ESP Is Real. Which means science is broken.” I understand that “Daryl Bem Proved ESP Is Real” is kind of a joke, but to me this is a bit too close to the original reporting on Bem, back in 2011, where people kept saying that Bem’s study was high quality, state-of-the-art psychology, etc. Actually, Bem’s study was crap. It’s every bit as bad as the famously bad papers on beauty and sex ratio, ovulation and voting, elderly-related words and slow walking, etc.

And “science” is not broken. Crappy science is broken. Good science is fine. If “science” is defined as bad articles published in PPNAS—himmicanes, air rage, ages ending in 9, etc.—then, sure, science is broken. But if science is defined as the real stuff, then, no, it’s not broken at all. Science could be improved, sure. And, to the extent that some top scientists operate on the goal of tabloid publication and TED-talk fame, then, sure, the system of publication and promotion could be said to be broken. But to say “science is broken” . . . I think that’s going too far.

Anyway, I agree with Engber on the substance and I admire his ability to present the perspectives of many players in this story. A grabby if potentially misleading title is not such a big deal.

But what about that Bem paper?

One of the people who pointed me to Engber’s article knows some of the people involved and assured me that the Journal of Personality and Social Psychology editor who handled Bem’s paper is, and was, no fool.

So how obvious were the problems in that original article?

Here, I’m speaking not of problems with Bem’s theoretical foundation or with his physics—I won’t go there—but rather with his experimental design and empirical analysis.

I do think that paper is terrible. Just to speak of the analysis, the evidence is entirely from p-values but these p-values are entirely meaningless because of forking paths. The number of potential interactions to be studied is nearly limitless, as we can see from the many many different main effects and interactions mentioned in the paper itself.
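The forking-paths point can be illustrated with a minimal simulation (my sketch, not Bem’s actual analysis): generate pure-noise data, then test many possible subgroup comparisons and keep the best-looking p-value. Even though every null hypothesis is true, the chance that at least one comparison comes out “statistically significant” is far above the nominal 5%.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def min_p_over_analyses(n=100, n_analyses=10):
    """Simulate one 'study' under the null and return the smallest p-value
    across n_analyses arbitrary subgroup comparisons (the forking paths)."""
    y = rng.standard_normal(n)  # outcome: pure noise, no real effects
    pvals = []
    for _ in range(n_analyses):
        # Each 'analysis' splits the same data by a fresh, irrelevant
        # binary covariate (stand-in for one of many possible interactions).
        x = rng.integers(0, 2, n)
        a, b = y[x == 0], y[x == 1]
        se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        z = (a.mean() - b.mean()) / se
        # Two-sided normal-approximation p-value.
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        pvals.append(p)
    return min(pvals)

n_sims = 2000
hits = sum(min_p_over_analyses() < 0.05 for _ in range(n_sims))
print(hits / n_sims)  # roughly 0.4, not the nominal 0.05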
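```

With 10 roughly independent comparisons, the false-positive rate is near 1 − 0.95¹⁰ ≈ 0.40. And this simulation is conservative: it assumes the researcher committed in advance to exactly 10 analyses, whereas with forking paths the set of potential comparisons is, as noted above, nearly limitless.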

But then the question is, how could smart people miss these problems?

Here’s my answer: It’s all obvious in retrospect but wasn’t obvious at the time. Remember, Arthur Conan Doyle was fooled by amateurish photos of fairies. The JPSP editor was no fool either. Much depends on expectations.

Here are the fairy photos that fooled Doyle, along with others. The photos are obviously faked, and it was obvious at the time too. Doyle just really really wanted to believe in fairies. From everything I’ve heard about the publication of Bem’s article, I doubt that the journal editor really really wanted to believe in ESP. But I wouldn’t be surprised if this editor really really wanted to believe that an eminent psychology professor would not do really bad research.

P.S. I wrote the post a few months ago and it just happened to appear the day after a post of mine on why “Clinical trials are broken.” So we’ll need to discuss further.

P.P.S. Just to clarify the Bem issue, here are a few more quotes from Engber’s article:

Even with all that extra care, Bem would not have dared to send in such a controversial finding had he not been able to replicate the results in his lab, and replicate them again, and then replicate them five more times. His finished paper lists nine separate ministudies of ESP. Eight of those returned the same effect.

Bem’s paper has zero preregistered replications. What he has are “conceptual replications,” which are open-ended studies that can be freely interpreted as successes through the garden of forking paths.

Here’s Engber again:

But for most observers, at least the mainstream ones, the paper posed a very difficult dilemma. It was both methodologically sound and logically insane.

No, the paper is not methodologically sound. Its conclusions are based on p-values, which are statements regarding what the data summaries would look like, had the data come out differently, but Bem offers no evidence that, had the data come out differently, his analyses would’ve been the same. Indeed, the nine studies of his paper feature all sorts of different data analyses.

Engber gets to these criticisms later in his article. I just worry that people who read only the beginning will take the above quotes at face value.


