Stan is Turing complete. The post Stan is Turing complete appeared first on Statistical Modeling, Causal Inference, and Social Science.

Aki, Jonah, and I have released the much-discussed paper on LOO and WAIC in Stan: Efficient implementation of leave-one-out cross-validation and WAIC for evaluating fitted Bayesian models. We (that is, Aki) now recommend LOO rather than WAIC, especially now that we have an R function to quickly compute LOO using Pareto smoothed importance sampling. In […] The post New papers on LOO/WAIC and Stan appeared first on Statistical Modeling, Causal…

We've officially opened registration for our short course on Bayesian methods in health economics (this is a link to last year's edition, with a little more information than the official webpage for this year's course). When we decided to do this, we a...

At the recent International Symposium on Forecasting, held in Riverside, California, Tilmann Gneiting gave a great talk on “Evaluating forecasts: why proper scoring rules and consistent scoring functions matter”. It will be the subject of an IJF invited paper in due course. One of the things he talked about was the “Murphy diagram” for comparing forecasts, […]

This is cool. The #1 psychology department in the world is looking for a quantitative researcher: The Department of Psychology at the University of Michigan, Ann Arbor, invites applications for a tenure-track faculty position. The expected start date is September 1, 2016. The primary criterion for appointment is excellence in research and teaching. We are […] The post Psych dept: “We are especially interested in candidates whose research program contributes…

In our previous post in this series, we introduced sessionization, or converting log data into a form that’s suitable for analysis. We looked at basic considerations, like dealing with time, choosing an appropriate dataset for training models, and choosing appropriate (and achievable) business goals. In that previous example, we sessionized the data by considering all … Continue reading Working with Sessionized Data 2: Variable Selection →

The prior distribution p(theta) in a Bayesian analysis is often presented as a researcher’s beliefs about theta. I prefer to think of p(theta) as an expression of information about theta. Consider this sort of question that a classically-trained statistician asked me the other day: If two Bayesians are given the same data, they will come […] The post Prior information, not prior belief appeared first on Statistical Modeling, Causal Inference,…

Recently a SAS customer asked how to Winsorize data in SAS. Winsorization is best known as a way to construct robust univariate statistics. The Winsorized mean is a robust estimate of location. The Winsorized mean is similar to the trimmed mean, and both are described in the documentation for PROC […] The post How to Winsorize data in SAS appeared first on The DO Loop.
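
To make the comparison between the Winsorized mean and the trimmed mean concrete, here is a minimal sketch in Python rather than SAS. It is not the code from the original post: the function names `winsorized_mean` and `trimmed_mean` are hypothetical, and the convention of treating `k` observations in each tail is an assumption made for illustration.

```python
def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest values
    with the nearest remaining value in each tail (assumed convention)."""
    xs = sorted(values)
    n = len(xs)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n/2")
    low, high = xs[k], xs[n - k - 1]
    # Clip every observation into [low, high]; the sample size stays n.
    clipped = [min(max(x, low), high) for x in xs]
    return sum(clipped) / n


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    xs = sorted(values)
    n = len(xs)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n/2")
    kept = xs[k:n - k]  # the sample size shrinks to n - 2k
    return sum(kept) / len(kept)


# Made-up data with one extreme value, for illustration only.
data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 95]
plain = sum(data) / len(data)
wins = winsorized_mean(data, 1)
trim = trimmed_mean(data, 1)
print(plain, wins, trim)
```

With this made-up sample, the ordinary mean is pulled toward the extreme value, while both robust estimates stay near the bulk of the data. The difference between the two is that Winsorizing replaces the tail values and keeps all n observations, whereas trimming discards them.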

Last week, I gave one of the visualization primer talks at BioVis in Dublin. My goal was to show people some examples, but also criticize the rather poor visualization culture in bioinformatics and challenge people to do better. Here is a write-up of that talk. Seán O’Donoghue introduced me by calling me “infamous” for speaking … Continue reading Talk: How to Visualize Data

Spot the fallacy! The power of a test is the probability of correctly rejecting the null hypothesis. Write it as 1 – β. So, the probability of incorrectly rejecting the null hypothesis is β. But the probability of incorrectly rejecting the null is α (the type 1 error probability). So α = β. I’ve actually […]

“…our findings shall lead us to be critical of certain current practices. Specifically, most papers seem content with comparing some new algorithm with Gibbs sampling, on a few small datasets, such as the well-known Pima Indians diabetes dataset (8 covariates). But we shall see that, for such datasets, approaches that are even more basic than […]

(Sent to all the American Politics faculty at Columbia, including me) RE: Donald Trump presidential candidacy Hi, Firstly, apologies for the group email but I wasn’t sure who would be best prized to answer this query as we’ve not had much luck so far. I am a Dubai-based reporter for **. Donald Trump recently announced […] The post Awesomest media request of the year appeared first on Statistical Modeling, Causal…

Yphtach Lelkes points us to a recent article on survey weighting by three economists, Gary Solon, Steven Haider, and Jeffrey Wooldridge, who write: We start by distinguishing two purposes of estimation: to estimate population descriptive statistics and to estimate causal effects. In the former type of research, weighting is called for when it is needed […] The post Survey weighting and regression modeling appeared first on Statistical Modeling, Causal Inference,…

One of the most misguided and dangerous ideas floated around by a group of Big Data enthusiasts is the notion that it is not important to understand why something happens, just because "we have a boatload of data". This is one of the central arguments in the bestseller Big Data, and it reached the mainstream much earlier when Chris Anderson, then chief editor of Wired, published his flamboyantly-titled op-ed proclaiming…