[image of Schrödinger’s cat, of course]
Stan collaborator Michael Betancourt wrote an article, “The Convergence of Markov chain Monte Carlo Methods: From the Metropolis method to Hamiltonian Monte Carlo,” discussing how various ideas of computational probability moved from physics to statistics.
Three things I wanted to add to Betancourt’s story:
1. My paper with Rubin on R-hat, that measure of mixing for iterative simulation, came in part from my reading of old papers in the computational physics literature, in particular Fosdick (1959), which proposed a multiple-chain approach to monitoring convergence. What we added in our 1992 paper was the within-chain comparison: instead of simply comparing multiple chains to each other, we compared the between-chain variance to the within-chain variance. This made the diagnostic much more automatic. (A minimal sketch of the computation appears just after point 3 below.)
2. Related to point 1 above: It’s my impression that computational physics is all about hard problems, each of which is a research effort on its own. In contrast, computational statistics often involves relatively easy problems: not so easy that they have closed-form solutions, but easy enough that they can be solved with suitable automatic iterative algorithms. It could be that the physicists didn’t have much need for an automatic convergence diagnostic such as R-hat because they were working so hard on each problem that they already had a sense of the solution and how close they were to it. In statistics, though, there was an immediate need for automatic diagnostics.
3. It’s also my impression that computational physics problems typically centered on computing the normalizing constant or partition function, Z(theta), or related functions such as (d/dtheta) log Z(theta). But in Bayesian statistics we usually aren’t so interested in that; indeed, we’re not really looking for expectations at all. Rather, we want random draws from the posterior distribution. This changes our focus, in part because we’re just trying to get close; we’re not looking for high precision. (I spell out the partition-function connection just below.)
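To make point 1 concrete, here is a minimal sketch of the between/within comparison, in Python with NumPy. This is the original 1992-style statistic computed on whole chains; the function name and toy data are mine, for illustration, and modern practice (as in Stan) also splits each chain in half and rank-normalizes the draws first:

```python
import numpy as np

def rhat(chains):
    """Gelman-Rubin R-hat from an (m, n) array: m chains, n draws each.

    Compares the between-chain variance to the within-chain variance;
    values near 1 suggest the chains have mixed.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)        # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()  # within-chain variance
    var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
    return np.sqrt(var_hat / W)

# Toy check: well-mixed chains give R-hat near 1; an offset chain inflates it.
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))
stuck = mixed + np.array([0.0, 0.0, 0.0, 3.0])[:, None]  # one wayward chain
print(rhat(mixed))  # close to 1
print(rhat(stuck))  # well above 1
```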
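And here is the partition-function connection from point 3, written in the canonical-ensemble form (my notation, following the physics convention, with theta playing the role of an inverse temperature). Differentiating log Z yields an expectation:

```latex
Z(\theta) = \int e^{-\theta E(x)} \, dx,
\qquad
\frac{d}{d\theta} \log Z(\theta)
  = \frac{1}{Z(\theta)} \int \bigl(-E(x)\bigr) \, e^{-\theta E(x)} \, dx
  = -\,\mathbb{E}_{\theta}\bigl[E(x)\bigr].
```

So a physicist after thermodynamic quantities needs Z, or its derivatives, to high precision. A Bayesian running Metropolis or HMC, by contrast, only ever uses ratios or gradients of the log density, so the normalizing constant cancels and the unnormalized posterior p(y|theta) p(theta) is enough.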
I think points 1, 2, and 3 help to explain some of the differences between the simulation literatures in physics and statistics. In short: physicists work on harder problems and so have developed fancier algorithms; statisticians work on easier problems and so have made more advances in automatic methods.
And we can help each other! From one direction, statisticians use physics-developed methods such as Metropolis and HMC; from the other, physicists use Stan to do applied modeling more effectively, so that they are less tied to specific conventional modeling choices.