The “Psychological Science Accelerator”: it’s probably a good idea but I’m still skeptical

Asher Meir points us to this post by Christie Aschwanden entitled, “Can Teamwork Solve One Of Psychology’s Biggest Problems?”, which begins:

Psychologist Christopher Chartier admits to a case of “physics envy.” That field boasts numerous projects on which international research teams come together to tackle big questions. Just think of CERN’s Large Hadron Collider or LIGO, which recently detected gravitational waves for the first time. Both are huge collaborations that study problems too big for one group to solve alone. Chartier, a researcher at Ashland University, doesn’t think massively scaled group projects should only be the domain of physicists. So he’s starting the “Psychological Science Accelerator,” which has a simple idea behind it: Psychological studies will take place simultaneously at multiple labs around the globe. Through these collaborations, the research will produce much bigger data sets with a far more diverse pool of study subjects than if it were done in just one place.

Aschwanden continues:

The accelerator approach eliminates two problems that can contribute to psychology’s much-discussed reproducibility problem, the finding that some studies aren’t replicated in subsequent studies. It removes both small sample sizes and the so-called weird samples problem . . .

So far, the project has enlisted 183 labs on six continents. The idea is to create a standing network of researchers who are available to consider and potentially take part in study proposals . . .

Studies that are selected go through a collaborative process in which researchers hammer out protocols and commit to publish a research plan in advance — a process known as preregistration . . .

The idea isn’t to pump out a bunch of studies, but to produce a lot of data, Chartier said: “Everything is open and transparent from the start, so what we’re going to end up with is a really solid data set on a specific hypothesis.”

This all sounds fine. But . . . what concerns me is the weakness of the underlying theory. This is all a step forward and a great idea, but the big difference between the physics particle accelerator and the psychological science accelerator is that in physics there are strong theories that make precise predictions, and in social science we’re mostly stumbling in the dark. So, yeah, go for it, but who knows if much of anything useful will come from all this.

I say this not with any intention to criticize the particular research projects mentioned in the linked article; I just want to say Whoa on the comparison to physics here: if it works, great, but let’s not be surprised if the data come out pretty damn noisy.

P.S. I wrote this post around 6 months ago. The topic just happened to come up recently in comments, where someone pointed to an article by James Coyne from 2016, Replication initiatives will not salvage the trustworthiness of psychology, that made similar points. Coyne’s article is stunning: it’s nothing that regular readers of this blog haven’t seen before, but it’s kind of amazing to see it all in one place. We discussed the article when it came out, but I guess that back then I thought that major progress was around the corner; I’d underestimated how slow the change would continue to be.

P.P.S. Alexander Aarts saw this post on the schedule and sends along these thoughts:

The people behind the initiative have recently posted a pre-print about it.

I (Aarts) am also skeptical, because I worry it will (1) not really (optimally) accelerate anything, and (2) make the same sort of mistakes that have probably happened in recent decades.

Concerning (1), I think they will not really be optimally accelerating anything because:

# It seems to me that the studies they will be performing are very different in topic.
# If I understood things correctly, there is no clear follow-up planned for the studies and the relevant theories.
# If I understood things correctly, lots of participants will be used in these studies (possibly too many?).

Taken together, these things make me think this might not be the optimal way to (a) perform psychological research, and (b) accelerate psychological research. I reason this might be much better accomplished by research programs that use an “optimal” number of participants and directly follow up previous studies in order to (re-)formulate and test theories, design better measurements, etc. I reason smaller groups of researchers might be way more efficient at performing research and accelerating psychological science (e.g., here).

Concerning (2), I think they will possibly make the same sort of mistakes that have probably happened in recent decades:

# It seems to me that they favor a hierarchical approach, with talk of a “study selection committee,” “experts,” etc. This worries me in the sense that a lot of power will be handed to relatively few people. History, and science, have shown that this might not be a good idea. They are, in effect, building a network of “research assistant” labs that will execute what the “principal investigator/professor” lab has come up with. To me, that is not what a true collaboration is about.

# Their idea of “collaboration” is somewhat different from mine. I would view, and call, their collaboration style via the Psychological Science Accelerator as more “dictatorial,” in the sense that one idea/researcher/study determines what the rest of the labs will do. My idea of collaboration (as I attempted to explain in the link to your blog post called “stranger than fiction”) is more “democratic,” in the sense that more ideas/studies by more different researchers will be performed.

# I am a fan of a “democratic” approach to science in the sense that I reason everyone should be able to try to contribute to science, but not in the sense that everyone’s contribution is equal, or that scientists should vote on which study to perform. The science should be most important, and the science should lead the scientists, not the other way around. I think the Psychological Science Accelerator will make the same mistakes that probably happened in past decades: many factors concerning the scientists that should have nothing to do with the actual science will play a role.

Concerning the possible use of way too many participants: I posted a comment on your blog that might suggest a way to determine what the “optimal” number of labs/participants could be for a collaboration project. I don’t know much about computers and statistics, but I reason the information from all the large-scale collaborations in the Registered Replication Reports could be used for the idea presented in the link, and could be an argument for or against the use of that many labs/participants.
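As a rough illustration of the diminishing returns at stake here, consider a minimal sketch under a simple random-effects model. All the numbers below (within-lab noise, between-lab heterogeneity, participants per lab) are invented for illustration, not taken from any actual RRR:

```python
# Diminishing returns of adding labs under a simple random-effects model;
# sigma, tau, and n_per_lab are assumed values, not estimates from any RRR.
import math

sigma = 1.0      # assumed within-lab sd of the outcome
tau = 0.05       # assumed between-lab sd of the true effect
n_per_lab = 100  # assumed participants per lab

for n_labs in [5, 20, 50, 200]:
    # variance of the pooled effect estimate:
    # within-lab sampling variance over total N, plus heterogeneity over labs
    se = math.sqrt(sigma**2 / (n_labs * n_per_lab) + tau**2 / n_labs)
    print(f"{n_labs:4d} labs: se of pooled effect = {se:.4f}")
```

Under these made-up numbers, going from 20 labs to 200 multiplies the cost by ten but cuts the standard error only by about a factor of three, which is exactly the kind of tradeoff the “optimal number of labs” question is asking about.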

David A. Kenny & Charles M. Judd wrote:

All of this leads to very different ideas about the conduct of research and the quest to establish the true effect in the presence of random variation. Replication research, it seems to us, should search to do more than simply confirm or disconfirm earlier results in the literature. Replication researchers should not strive to conduct the definitive large N study in an effort to establish whether a given effect exists or not. The goal of replication research should instead be to establish both typical effects in a domain and the range of possible effects, given all of what Campbell called the “heterogeneity of irrelevancies” (Cook, 1990) that affect studies and their results. Many smaller studies that vary those irrelevancies likely serve us better than one single large study. Moreover, in this era of increasing preregistration and collaborative research efforts, multiple studies by different groups of researchers is increasingly feasible.

Now if I understand this correctly, it fits nicely with the Psychological Science Accelerator (PSA), and with my smaller-groups collaboration format, in the sense that multiple labs will perform the same (replication) research. However, it is not clear to me how many labs the PSA will use per (replication) study, and whether “enough is enough” at some point. For instance, I read that around 200 labs have now signed up for the PSA. I think it might be a giant waste of resources if all of these labs will be executing the same study (perhaps even more so if it’s not a replication of a well-known and influential effect, as has been the case with the “Registered Replication Reports,” but some sort of relatively new study).

I think the PSA does some things possibly right (I like collaboration, only a different version of it), but it also might be doing some things wrong (possibly wasting resources, not following up research in a way that, in my view, optimally accelerates psychological science via theory (re-)formulation and testing, etc.). I reason the part of my critique that may be most useful, and most worth listening to, concerns the possible “optimal” number of labs/participants per study.

I thought the idea of using the data from all the RRRs performed thus far could provide a possibly useful argument for, or against, the smaller-groups format I described. Should the data from the RRRs provide an argument against super-large-scale collaboration projects and for smaller collaborations, I reason it could form the basis of a pretty strong case as to why smaller collaboration groups might be much better at optimally accelerating psychological science. For instance, to me it makes much more sense to (1) have immediate follow-up research concerning a certain theory/phenomenon/effect (e.g., see my comment here), and (2) have researchers involved with this whole process who have been working on the specific theory/phenomenon/effect under investigation. I reason both will increase the chances of optimally (re-)formulating theories, coming up with useful experiments and measurement tools, etc., and if I understood things correctly the PSA doesn’t have any of this.
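Kenny and Judd’s argument above, that many smaller studies varying the “irrelevancies” can serve us better than one definitive large-N study, has a simple statistical core that a quick simulation can show. Here is a minimal sketch (the effect size, heterogeneity, and sample sizes are all made up for illustration): many small studies let you estimate both the typical effect and the between-study spread, while one giant study with the same total number of subjects pins down only a single realized effect and says nothing about the spread.

```python
# A quick simulation of Kenny and Judd's point; all numbers are
# invented for illustration, not estimates from any real literature.
import numpy as np

rng = np.random.default_rng(0)
mu, tau, sigma = 0.20, 0.10, 1.0   # assumed mean effect, heterogeneity, outcome sd
n_studies, n_per_study = 25, 200   # 25 small studies, 5000 subjects in total

# each small study draws its own true effect from around mu, then
# estimates it with sampling noise
true_effects = rng.normal(mu, tau, n_studies)
estimates = true_effects + rng.normal(0, sigma / np.sqrt(n_per_study), n_studies)

v = sigma**2 / n_per_study                      # within-study sampling variance
tau2_hat = max(estimates.var(ddof=1) - v, 0.0)  # method-of-moments estimate of tau^2
print(f"many small studies: mean effect {estimates.mean():.3f}, "
      f"estimated heterogeneity {np.sqrt(tau2_hat):.3f}")

# one big study with the same total N lands on a single realized effect;
# it estimates that one effect very precisely but cannot estimate tau at all
one_big = rng.normal(mu, tau) + rng.normal(0, sigma / np.sqrt(n_studies * n_per_study))
print(f"one big study:      effect estimate {one_big:.3f}, heterogeneity not estimable")
```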

Why are we talking about this in the first place??

When considering these arguments, I think at some point we need to step back and consider why we are studying psychology at all. Here are some motivations for psychology research:
– Understanding and treatment of severe mental illness;
– Improvement of the lives of people who do not have debilitating psychological problems but still have difficulties in their lives;
– Enabling the smoother functioning of modern life (this would include a lot of things such as psychometrics, employee evaluation, nudges, etc.);
– Understanding problems of modern life (for example, studies of bias and stereotypes);
– Pure science (with most of the examples relating to the applications above).

Lots of the famous examples of failed replications are on topics that are uninteresting and unimportant. Or, we could say, interesting if true but uninteresting if not true. For example, if Cornell students had ESP, that would be kind of amazing as it would overturn much of what we knew about science. But if we learn that there’s no evidence that Cornell students have ESP, that’s pretty boring. For another example, in the above-linked paper, James Coyne writes:

These problems are compounded by the publicity machines of professional organizations and journals screaming “Listen up consumers, here are scientific results that you must accommodate in your life.” . . . For instance, consider a 2011 press release from the Association for Psychological Science, “Life is one big priming experiment”:

Scientists have shown again and again that they can very subtly cue people’s unconscious minds to think and act certain ways. These cues might be concepts—like cold or fast or elderly—or they might be goals like professional success; either way, these signals shape our behavior, often without any awareness that we are being manipulated. This is humbling, especially when you think about what it means for our everyday beliefs and actions. The priming experiments take place in laboratories, using deliberately contrived signals, but in fact our world is full of cues that act on our minds all the time, for better or for worse. Indeed, many of our actions are reactions to random stimuli outside our consciousness, meaning that the lives we lead are much more automated than we like to acknowledge.

Interesting, maybe important—if true. If not true, though, this claim is about as interesting as flat-earth beliefs in physics, creationism in biology, or Obama-birther conspiracy theories in political science: that is, what’s interesting is not the theories themselves but rather the fact that influential people believe in them. (OK, I guess nobody influential believes in the flat earth, but that’s kind of interesting too, that this particular theory does not happen to have any powerful adherents.) The interesting thing about that APS statement, or about later claims such as that notorious Harvard statement that “The replication rate in psychology is quite high—indeed, it is statistically indistinguishable from 100%,” is that influential people in the field of psychology were saying it.

Anyway, my point in bringing up that foolish APS press release is that much of the discussion of replication has focused on silly topics such as social priming, a subject which for historical reasons has been important in psychology’s replication crisis but which is ultimately uninteresting and unimportant in itself. It could be helpful to step back and remember why we care about psychology research in the first place.
