Should we talk less about bad social science research and more about bad medical research?

Paul Alper pointed me to this news story, “Harvard Calls for Retraction of Dozens of Studies by Noted Cardiac Researcher: Some 31 studies by Dr. Piero Anversa contain fabricated or falsified data, officials concluded. Dr. Anversa popularized the idea of stem cell treatment for damaged hearts.”

I replied: Ahhh, Harvard . . . the reporter should’ve asked Marc Hauser for a quote.

Alper responded:

Marc Hauser’s research involved “cotton-top tamarin monkeys” while Piero Anversa was falsifying and spawning research on damaged hearts:

The cardiologist rocketed to fame in 2001 with a flashy paper claiming that, contrary to scientific consensus, heart muscle could be regenerated. If true, the research would have had enormous significance for patients worldwide.

I, and I suspect virtually all of the other contributors to your blog, know nothing** about cotton-top tamarin monkeys but are fascinated and interested in stem cells and heart regeneration. Consequently, are Hauser and Anversa separated by a chasm, or should they be lumped together in the Hall of Shame? Put another way, do we have yet another instance of crime and appropriate punishment?

**Your blog audience is so broad that there well may be cotton-top tamarin monkey mavens out there dying to hit the enter key.

Good point. It’s not up to me at all: I don’t administer punishment of any sort; as a blogger I function as a very small news organization, and my only role is to sometimes look into these cases, bring them to others’ notice, and host discussions. If it were up to me, David Weakliem and Jay Livingston would be regular New York Times columnists, and Mark Palko and Joseph Delaney would be the must-read bloggers that everyone would check each morning. Also, if it were up to me, everyone would have to post all their data and code—at least, that would be the default policy; researchers would have to give very good reasons to get out of this requirement. (Not that I always or even usually post my data and code; but I should do better too.) But none of these things are up to me.

From Harvard’s point of view, perhaps the question is whether they should go easy on people like Hauser, a person who is basically an entertainer, and whose main crime was to fake some of his entertainment—a sort of Doris Kearns Goodwin, if you will—and be tougher on people such as Anversa, whose misdeeds can cost lives. (I don’t know where you should put someone like John Yoo who advocated for actual torture, but I suppose that someone who agreed with Yoo politically would make a similar argument against, say, old-style apologists for the Soviet Union.)

One argument for not taking people like Hauser, Wansink, etc., seriously, even in their misdeeds, is that after the flaws in their methods were revealed—after it turned out that their blithe confidence (in Wansink’s case) or attacks on whistleblowers (in Hauser’s case) were not borne out by the data—these guys just continued to say their original claims were valid. So, for them, it was never about the data at all; it was always about their stunning ideas. Or, to put it another way, the data were there to modify the details of their existing hypotheses, or to allow them to gently develop and extend their models, in a way comparable to how Philip K. Dick used the I Ching to decide what would happen next in his books. (Actually, that analogy is pretty good, as one could just as well say that Dick used randomness not so much to “decide what would happen” but rather to “discover what would happen” next.)

Anyway, to get back to the noise-miners: The supposed empirical support was just there for them to satisfy the conventions of modern-day science. So when it turned out that the promised data had never been there . . . so what, really? The data never mattered in the first place, as these researchers implicitly admitted by not giving up on any of their substantive claims. So maybe these profs should just move into the Department of Imaginative Literature and the universities can call it a day. The medical researchers who misreport their data: That’s a bigger problem.

And what about the news media, myself included? Should I spend more time blogging about medical research and less time blogging about social science research? It’s a tough call. Social science is my own area of expertise, so I think I’m making more of a contribution by leveraging that expertise than by opining on medical research that I don’t really understand.

A related issue is accessibility: people send me more items on social science, and it takes me less effort to evaluate social science claims.

Also, I think social science is important. It does not seem that there’s any good evidence that elections are determined by shark attacks or the outcomes of college football games, or that subliminal smiley faces cause large swings in opinion, or that women’s political preferences vary greatly based on time of the month—but if any (or, lord help us, all) of these claims were true, then this would be consequential: it would “punch a big hole in democratic theory,” in the memorable words of Larry Bartels.

Monkey language and bottomless soup bowls: I don’t care about those so much. So why have I devoted so much blog space to those silly cases? Partly it’s from a fascination with people who refuse to admit error even when it’s staring them in the face, partly because it can give insights into general issues in statistics and science, and partly because I think people can miss the point in these cases by focusing on the drama and missing out on the statistics; see for example here and here. But mostly I write more about social science because social science is my “thing.” Just like I write more about football and baseball than about rugby and cricket.

P.S. One more thing: Don’t forget that in all these fields—social science, medical science, whatever—the problem is not just with bad research, cheaters, or even incompetents. No, there are big problems even with solid research done by honest researchers who are doing their best but are still using methods that misrepresent what we learn from the data. For example, the ORBITA study of heart stents, where p=0.20 (actually p=0.09 when the data were analyzed more appropriately) was widely reported as implying no effect. Honesty and transparency—and even skill and competence in the use of standard methods—are not enough. Sometimes, as in the above post, it makes sense to talk about flat-out bad research and the prominent people who do it, but that’s only one part of the story.