What is needed to do good research (hint: it’s not just the avoidance of “too much weight given to small samples, a tendency to publish positive results and not negative results, and perhaps an unconscious bias from the researchers themselves”)

May 15, 2017

(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

[cat picture]

In a news article entitled, “No, Wearing Red Doesn’t Make You Hotter,” Dalmeet Singh Chawla recounts the story of yet another Psychological Science / PPNAS-style study (this one actually appeared back in 2008 in Journal of Personality and Social Psychology, the same prestigious journal which published Daryl Bem’s ESP study a couple years later).

Chawla’s article is just fine, and I think these non-replications should continue to get press, as much press as the original flawed studies.

I have just two problems. The first is when Chawla writes:

The issues at hand seem to be the same ones surfacing again and again in the replication crisis—too much weight given to small samples, a tendency to publish positive results and not negative results, and perhaps an unconscious bias from the researchers themselves.

I mean, sure, yeah, I agree with the above paragraph. But there are deeper problems going on. First, any effects being studied are small and highly variable: there are some settings where red will do the trick, and other settings where red will make you less appealing. Color and attractiveness are context-dependent, and it’s just inherently more difficult to study phenomena that are highly variable. Second, the experiment in question used a between-person design, thus making things even noisier (see here for more on this topic; a simulation sketch follows below). Third, the treatment itself was minimal, of the “priming” variety: the color of the background of a photo that was seen for five seconds. It’s hard enough to appear attractive to someone in real life: we can put huge amounts of effort into the task, and so it’s a bit of a stretch to think that this sort of five-second intervention could do much of anything.
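To make the between-person point concrete, here’s a minimal simulation sketch (my own illustrative numbers, not anything from the study): a tiny “red vs. neutral” effect, large rater-to-rater variation, and the same number of measurements analyzed two ways. The within-person comparison cancels each rater’s baseline; the between-person comparison doesn’t, so its estimates bounce around far more.

```python
# Illustrative sketch (made-up numbers, not from the red/attractiveness study):
# why a between-person design is noisier than a within-person design when the
# effect is small and person-to-person variation is large.
import numpy as np

rng = np.random.default_rng(0)
true_effect = 0.1   # tiny shift in attractiveness rating from the red background
person_sd = 1.0     # large differences between raters' baseline ratings
noise_sd = 0.5      # trial-to-trial measurement noise
n = 100             # raters per condition
n_sims = 2000

between_estimates, within_estimates = [], []
for _ in range(n_sims):
    # Between-person: different raters see the red vs. neutral backgrounds
    red = rng.normal(0, person_sd, n) + true_effect + rng.normal(0, noise_sd, n)
    neutral = rng.normal(0, person_sd, n) + rng.normal(0, noise_sd, n)
    between_estimates.append(red.mean() - neutral.mean())

    # Within-person: each rater sees both conditions, so their baseline cancels
    baseline = rng.normal(0, person_sd, n)
    red_w = baseline + true_effect + rng.normal(0, noise_sd, n)
    neutral_w = baseline + rng.normal(0, noise_sd, n)
    within_estimates.append((red_w - neutral_w).mean())

print("sd of between-person estimates:", np.std(between_estimates))  # approx. 0.16
print("sd of within-person estimates: ", np.std(within_estimates))   # approx. 0.07
```

With an underlying effect of 0.1, the between-person estimate’s noise is bigger than the thing being estimated, which is the design problem in a nutshell.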

Put it together, and you’re studying a highly variable phenomenon using a minimal treatment, using a statistically inefficient design. The study is dead on arrival. Sure, small samples, the garden of forking paths, and the publication process make it all worse, but there’s no getting around the kangaroo problem. Increase your sample and publish everything, and you still won’t be doing useful science; you’ll just be publishing piles of noise. Better than what was done before—I’d much prefer JPSP to publish piles of noisy raw data than to fake people out with ridiculous claims—but still not good science.
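And here’s a quick sketch of why a bigger sample alone doesn’t rescue you (again, my own made-up numbers, not data from any study): if the true effect flips sign from setting to setting, each individual study can estimate its own setting’s effect very precisely, yet the collection of published estimates still scatters all over the place, and no single one tells you “the” effect of red.

```python
# Illustrative sketch: a context-dependent effect with large samples per study.
# Each study's standard error is tiny, but the estimates still disagree because
# the underlying effect itself varies from setting to setting.
import numpy as np

rng = np.random.default_rng(1)
n_studies = 20
effects_by_setting = rng.normal(0.0, 0.2, n_studies)  # effect varies, sign flips
n_per_study = 10_000                                   # "large sample" per study
noise_sd = 1.0

estimates = [
    rng.normal(effect, noise_sd, n_per_study).mean()
    for effect in effects_by_setting
]
print("spread of study estimates:", np.std(estimates))              # roughly the 0.2 setting-to-setting sd
print("within-study standard error:", noise_sd / np.sqrt(n_per_study))  # 0.01
```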

My second problem is with this final quote:

In an interview with Slate, Elliot admitted that sample sizes in his earlier works were “too small relative to contemporary standards.” He added, “I have an inclination to think that red does influence attraction, but it is important for me to be open to the possibility that it does not.”

My reply: of course red influences attraction! So does blue! So does the cut of your clothes and whether you chewed on breath mints recently. All these things have effects. But . . . trying to study the effect of red in isolation, using the background of an image . . . that’s just hopeless. That’s the outmoded, button-pushing model of social and behavioral science, which is tied to an outmoded, significance-testing model of statistics.
