The study of American politics as a window into understanding uncertainty in science
We begin by discussing recent American elections in the context of political polarization, and we consider similarities and differences with European politics. We then discuss statistical challenges in the measurement of public opinion: inference from opinion polls with declining response rates has much in common with challenges in big-data analytics. From here we move to the recent replication crisis in science, and we argue that Bayesian methods are well suited to resolve some of these problems, if researchers can move away from inappropriate demands for certainty. We illustrate with examples in many different fields of research, our own and others’.
Some background reading:
19 things we learned from the 2016 election (with Julia Azari). http://www.stat.columbia.edu/~gelman/research/published/what_learned_in_2016_5.pdf
The mythical swing voter (with Sharad Goel, Doug Rivers, and David Rothschild). http://www.stat.columbia.edu/~gelman/research/published/swingers.pdf
The failure of null hypothesis significance testing when studying incremental changes, and what to do about it. http://www.stat.columbia.edu/~gelman/research/published/incrementalism_3.pdf
Honesty and transparency are not enough. http://www.stat.columbia.edu/~gelman/research/published/ChanceEthics14.pdf
The connection between varying treatment effects and the crisis of unreplicable research: A Bayesian perspective. http://www.stat.columbia.edu/~gelman/research/published/bayes_management.pdf
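To give a flavor of the survey-adjustment problem mentioned in the abstract, here is a minimal simulation (my own toy numbers, not from any of the papers above) of why raw poll averages mislead when response rates differ across groups, and how poststratification on known population shares corrects the estimate:

```python
# Toy illustration: differential nonresponse biases the raw poll average;
# poststratifying on known population cell shares removes the bias.
import numpy as np

rng = np.random.default_rng(0)

# Two demographic cells with known population shares (e.g., from a census).
pop_share = np.array([0.5, 0.5])
support   = np.array([0.40, 0.60])   # true candidate support within each cell
response  = np.array([0.05, 0.15])   # cell 2 is three times likelier to respond

# Simulate which cells the respondents come from.
n = 10_000
cell_prob = pop_share * response
cell_prob /= cell_prob.sum()
cells = rng.choice(2, size=n, p=cell_prob)
votes = rng.random(n) < support[cells]

raw = votes.mean()  # biased toward the high-response cell

# Poststratify: estimate support within each cell, then reweight
# by the known population shares.
cell_means = np.array([votes[cells == k].mean() for k in range(2)])
post = float(pop_share @ cell_means)

print(f"true: {pop_share @ support:.3f}, raw: {raw:.3f}, poststratified: {post:.3f}")
```

With these made-up numbers the true support is 0.50, but the raw average lands near 0.55 because the more supportive cell responds at three times the rate; the poststratified estimate recovers the truth. Real polls face the harder version of this, where the relevant cells and their response rates are not known in advance.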
Methods in statistics and data science are often framed as solutions to particular problems, in which a particular model or method is applied to a dataset. But good practice typically requires multiplicity, in two dimensions: fitting many different models to better understand a single dataset, and applying a method to a series of different but related problems. To understand and make appropriate inferences from real-world data analysis, we should account for the set of models we might fit, and for the set of problems to which we would apply a method. This is known as the reference set in frequentist statistics or the prior distribution in Bayesian statistics. We shall discuss recent research of ours that addresses these issues, involving the following statistical ideas: Type M errors, the multiverse, weakly informative priors, Bayesian stacking and cross-validation, simulation-based model checking, divide-and-conquer algorithms, and validation of approximate computations.
Some background reading:
Beyond power calculations: Assessing Type S (sign) and Type M (magnitude) errors (with John Carlin). http://www.stat.columbia.edu/~gelman/research/published/retropower_final.pdf
Increasing transparency through a multiverse analysis (with Sara Steegen, Francis Tuerlinckx, and Wolf Vanpaemel). http://www.stat.columbia.edu/~gelman/research/published/multiverse_published.pdf
Prior choice recommendations wiki (with Daniel Simpson and others). https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations
Using stacking to average Bayesian predictive distributions (with Yuling Yao, Aki Vehtari, and Daniel Simpson). http://www.stat.columbia.edu/~gelman/research/published/stacking_paper_discussion_rejoinder.pdf
Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC (with Aki Vehtari and Jonah Gabry). http://www.stat.columbia.edu/~gelman/research/published/loo_stan.pdf
Expectation propagation as a way of life: A framework for Bayesian inference on partitioned data (with Aki Vehtari, Tuomas Sivula, Pasi Jylanki, Dustin Tran, Swupnil Sahai, Paul Blomstedt, John Cunningham, David Schiminovich, and Christian Robert). http://www.stat.columbia.edu/~gelman/research/unpublished/ep_stan_revised.pdf
Yes, but did it work?: Evaluating variational inference (with Yuling Yao, Aki Vehtari, and Daniel Simpson). http://www.stat.columbia.edu/~gelman/research/published/Evaluating_Variational_Inference.pdf
Visualization in Bayesian workflow (with Jonah Gabry, Daniel Simpson, Aki Vehtari, and Michael Betancourt). http://www.stat.columbia.edu/~gelman/research/published/bayes-vis.pdf
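As a concrete illustration of the Type S / Type M idea from the Gelman and Carlin paper above, here is a small simulation (with parameter values I have chosen for illustration, not taken from the paper): when the true effect is small relative to the standard error, statistically significant estimates greatly exaggerate the effect and sometimes even get its sign wrong.

```python
# Sketch of Type S (sign) and Type M (magnitude) errors: condition on
# statistical significance in a low-powered design and see what happens.
import numpy as np

rng = np.random.default_rng(1)

true_effect = 0.1   # small true effect
se = 0.5            # standard error of the estimate (a noisy, low-powered study)

# Replicate the study many times.
est = rng.normal(true_effect, se, size=100_000)
signif = np.abs(est) > 1.96 * se          # "statistically significant" results

power = signif.mean()
type_s = (est[signif] * true_effect < 0).mean()          # wrong sign, given significance
exaggeration = np.abs(est[signif]).mean() / true_effect  # Type M: expected overestimate

print(f"power ~ {power:.2f}, Type S ~ {type_s:.2f}, exaggeration factor ~ {exaggeration:.1f}")
```

With these numbers, power is only about 5%, roughly a quarter of the significant estimates have the wrong sign, and the significant estimates overstate the true effect by an order of magnitude. That is the sense in which conditioning on significance, rather than significance testing itself, drives unreplicable findings.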
I felt that in Vienna I really should be speaking on this paper, but I don’t think I’ll be talking to an audience of philosophers. I guess the place has changed a bit since 1934.
P.S. I was careful to arrange zero overlap between the two talks. Realistically, though, I don’t expect many people to go to both!