When are complicated models helpful in psychology research and when are they overkill?

January 23, 2013

(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Nick Brown is bothered by this article, “An unscented Kalman filter approach to the estimation of nonlinear dynamical systems models,” by Sy-Miin Chow, Emilio Ferrer, and John Nesselroade. The introduction of the article cites a bunch of articles in serious psych/statistics journals. The question is, are such advanced statistical techniques really needed, or even legitimate, with the kind of very rough data that is usually available in psych applications? Or is it just fishing in the hope of discovering patterns that are not really there?

I wrote:

It seems like a pretty innocuous literature review. I agree that many of the applications are silly (for example, they cite the work of the notorious John Gottman in fitting a predator-prey model to spousal relations (!)), but overall they just seem to be presenting very standard ideas for the mathematical-psychology audience. It’s not clear whether advanced techniques are always appropriate here, but they come in through a natural progression: you start with simple models (linear regression, errors-in-variables regression), these simple models don’t quite fit, you make the models more complicated, the complicated models aren’t quite right either, etc. Ultimately you can get pretty big complicated models because the alternative is even worse. Whether this all makes sense is another question. Two areas in psychology where it does seem to make sense to use complicated models are: (1) personality types (we really are complicated multidimensional people) and (2) educational testing (where many different skills and abilities are tested at once).

Nick replied:

I wouldn’t have found the article objectionable, had it not been for the glowing write-up it received in a recent book chapter (Algoe, Fredrickson & Chow, in “Designing positive psychology: Taking stock and moving forward”), and this quote in particular:

First, be willing to leave the pack, think outside the box, all the while attending to the subtle yet recurrent patterns whispered by your data. Keep in mind that advances often represent bold and risky departures from current understanding. . . . Second, be open to capitalize on the rapid advances in measurement tools and mathematical and statistical models. Armed with these new . . . advances, while maintaining empirical and methodological rigor, emotion scientists working in positive psychology will be better equipped than ever before to find practical answers to age-old questions about what makes life good.

It seems to me that the authors are encouraging their readers to go on fishing expeditions, much like Bem (2000) (literally) did in a paragraph that Wagenmakers et al cited in a review of Bem’s appalling “psi is real” paper in 2011:

Examine [the data] from every angle. Analyze the sexes separately. Make up new composite indexes. If a datum suggests a new hypothesis, try to find further evidence for it elsewhere in the data. If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. If there are participants you don’t like, or trials, observers, or interviewers who gave you anomalous results, place them aside temporarily and see if any coherent patterns emerge. Go on a fishing expedition for something–anything–interesting.

Chow et al’s use of some pre-existing empirical data to “validate” their model – the data coming from a previous study by Nesselroade – suggests to me that there is some cherrypicking going on. Without confirmatory analysis on a fresh data set, this is just so much circular reasoning.

To which I replied:

There’s a lot of misunderstanding here, but I don’t think the paper you sent is particularly bad; it’s just part of a general attitude people have that there is a high-tech solution to any problem. This attitude is not limited to psychologists. For example, Banerjee and Duflo are extremely well-respected economists, but they have a very naive view of what is important in statistics (unfortunately, a view that is common among economists, especially, I believe, among high-status economists, for whom it’s important to be connected with what they view as the most technically advanced statistics). See the P.S. here.

As I wrote a couple years ago, the problem, I think, is that they (like many economists) think of statistical methods not as a tool for learning but as a tool for rigor. So they gravitate toward math-heavy methods based on testing, asymptotics, and abstract theories, rather than toward complex modeling. The result is a disconnect between statistical methods and applied goals.

For the psychologists you’re looking at, the problem is somewhat different: they do want to use statistics to learn, they’re just willing to learn things that aren’t true.
