When does the quest for beauty lead science astray?

Under the heading, “please blog about this,” Shravan Vasishth writes:

This book by a theoretical physicist [Sabine Hossenfelder] is awesome. The book trailer is here.

Some quotes from her blog:

“theorists in the foundations of physics have been spectacularly unsuccessful with their predictions for more than 30 years now.”

“Everyone is happily producing papers in record numbers, but I go around and say this is a waste of money. Would you give me a job? You probably wouldn’t. I probably wouldn’t give me a job either.”

“The story here isn’t that theorists have been unsuccessful per se, but that they’ve been unsuccessful and yet don’t change their methods.”

“And that’s the real story here: Scientists get stuck on unsuccessful methods.”

She deserves to be world famous.

I have no idea who deserves to be world famous, but Shravan’s email was intriguing enough to motivate me to follow the link and read Hossenfelder’s blog, which had some discussion of the idea that the quest for beauty in mathematical theories has led physics astray.

Hossenfelder also draws some connections between the crisis in physics and the reproducibility crisis in social and behavioral science. The two crises are different—in physics, the problem (as I see it from the outside) is that the theories are so complicated and so difficult to test with data (requiring extremely high energies, etc.), whereas in the human sciences many prominent theories are so ridiculous and so easy to refute that this creates an entirely different sort of crisis or panic.

Hossenfelder writes, “In my community [in physics], it has become common to justify the publication of new theories by claiming the theories are falsifiable. But falsifiability is a weak criterion for a scientific hypothesis. It’s necessary, but certainly not sufficient, for many hypotheses are falsifiable yet almost certainly wrong.” Yup. Theories in the human sciences are typically too vague to ever be wrong, exactly; instead, they are set up to make predictions which, when falsified, are simply folded back into the theory (as in this recent example). Rather than say I think a particular social science theory is false, I prefer to say that I expect the phenomenon of interest to be highly variable, with effects that depend unpredictably on the context, hence trying to verify or estimate parameters of these theories using a black-box experimental approach will typically be hopeless. Again, I can’t really comment on how this works in physics.

Back to beauty

I will not try to define what is beauty in a scientific theory; instead I’ll point you toward Hossenfelder’s discussions in her blog. I’ll share a few impressions, though. Newton’s laws, relativity theory, quantum mechanics, classical electromagnetism, the second law of thermodynamics, the ideal gas law: these all do seem beautiful to me. At a lesser level, various statistical theories such as the central limit theorem, stable laws, the convergence of various statistical estimators, Bayes’ theorem, regression to the mean, they’re beautiful too. And I have a long list of statistics stories that I keep misplacing . . . they’re all beautiful, at a slightly lower level than the above theorems. I don’t want to pit theories against each other in a beauty context; I’m just listing the above to acknowledge that I too think about beauty when constructing and evaluating theories.

And I do see Hossenfelder’s point, that criteria of beauty do not always work when guiding research choices.

Let me give some examples. They’re all in my own subfields of research within statistics and political science, so not really of general interest, but I bet you could come up with similar stories in other fields. Anyway, these are my examples where the quest for beauty can get in the way of progress:

1. Conjugate prior distributions. People used to use inverse-gamma(1,1) or inverse-gamma(0.001, 0.001) priors because of their seeming naturalness; it was only relatively recently realized that these priors embodied very strong information. Similarly with Jeffreys priors, noninformative priors, and other ideas out there that had mathematical simplicity and few if any adjustable parameters (hence, some “beauty” in the sense of physics models). It took me awhile to see the benefit of weakly informative priors. The apparent ugliness of user-specified parameters gives real benefits in allowing one to include constraining information for regularization. That said, I do think that transforming to unit scale can make sense, and so, once we understood what we were doing, we could recover some of the lost beauty.

2. Predictive information criteria. AIC was fine under certain asymptotic limits and for linear models with flat priors but does not work in general; hence DIC, WAIC, and other alternatives. Aki and I spent a lot of time trying to figure out the right formula for effective number of parameters, and then we suddenly realized that there was no magic formula. Freed from this alchemic goal, we were able to attack the problem of predictive model evaluation directly using leave-one-out cross-validation and leave the beautiful formulas behind.

3. Lots of other examples: R-hat, R-squared, all sorts of other things. Sometimes there’s a beautiful formula, sometimes not. Using beauty as a criterion is not so terrible, as long as you realize that sometimes the best solution, at least for now, is not so beautiful.

4. Five ways to write the same model. That’s the title of section 12.5 of my book with Jennifer, and it represents a breakthrough for us: after years spent trying to construct the perfect general notation, we realized the (obvious in retrospect) point that different notations will make sense for different problems. And this in turn freed us when writing our new book to be even more flexible with our notation for regression.

5. Social science theories. Daniel Drezner once memorably criticized “piss-poor monocausal social science”—but it’s my impression that, to many people who have not thought seriously about the human sciences, monocausal explanations are more beautiful. A naive researcher might well think that there’s something clean and beautiful about the theory that women are much more likely to support Barack Obama during certain times of the month, or that college students with fat arms are more likely to support redistribution of income, without realizing the piranha problem that these monocausal theories can’t coexist in a single world. In this case, it’s an unsophisticated quest for beauty that’s leading certain fields of science astray—so this is different than physics, where the ideas of beauty are more refined (I mean it, I’m not being sarcastic here; actually this entire post and just about everything I write is sincere; I’m just emphasizing my sincerity right here because I’m thinking that there’s something about talking about a “refined” idea of beauty that might sound sarcastic, and it’s not). But maybe it’s another version of the same impulse.

Why care about beauty in theories?

Let’s try to unpack this a bit.

Constructing and choosing theories based on mathematical beauty and simplicity has led to some famous successes, associated with the names of Copernicus, Kepler, Newton, Laplace, Hamilton, Maxwell, Einstein, Dirac, and Gell-Mann—just to name a few! Or, to move away from physics, there’s Pasteur, Darwin, Galton, Smith, Ricardo, etc. The so-called modern synthesis in biology reconciling Mendelian inheritance and natural selection—that’s beautiful. The unification of various laws of chemistry based on quantum mechanics: again, a seeming menagerie is reframed as the product of an underlying simple structure.

Then, more recently, the quest of mathematical beauty has led to some dead ends, as discussed above. From a sociology-of-science perspective, that makes sense: if a method yields success, you keep pushing it to its limits until it stops working.

The question remains: Is there some fundamental reason why it makes sense can make sense to prefer beautiful theories, beyond slogans such as “Occam’s razor” (which I hate; see here and here) or “the unreasonable effectiveness of mathematics” or whatever? I think there is.

Here’s how I see it. Rather than focusing on beautiful theories that make us happy, let’s consider theories that are not so beautiful and make us uncomfortable. Where does this discomfort come from? Putting on my falsificationist Bayesian hat for a moment (actually, I never took it off; I even sleep with it on!), I’d say the discomfort has to come from some clash of knowledge, some way in which the posterior distribution corresponding to our fitted model in not in accord with some prior belief we have. Similar to the Why ask why? problem in social science.

But where exactly is this happening? Let’s think about some examples. In astronomy, the Copernican or Keplerian system is more pleasant to contemplate than an endless set of spinning epicycles. In economics, the invisible hand of Smith etc. seems like a better explanation, overall, than “the benevolence of the butcher, the brewer, or the baker.” Ugly theories are full of moving parts that all have to fit together just right to work. In contrast, beautiful theories are more self-sustaining. The problem is that the world is complicated, and beautiful theories only explain part of what we see. So we’re always in some intermediate zone, where our beautiful theories explain some stylized facts about the world, but some ugliness is needed to explain the rest.

P.S. In response to the above post, Hossenfelder writes:

I think though you have some things backwards. Take the epicycles: People held onto these because they thought that circles were beautiful. Also, I do not consider Occam’s razor to be an argument from beauty. Besides, note that the context in which I am discussing problems with arguments from beauty is in the development of new theories. The ‘discomfort’ that you refer to (which I discuss in my book as an argument from experience) is something you can apply to when you refer to theories you know, but not if you want to develop new ones.

That’s interesting that epicycles could be considered beautiful. I suppose that the characterization of beauty will depend a lot on context.

The post When does the quest for beauty lead science astray? appeared first on Statistical Modeling, Causal Inference, and Social Science.