A most interesting link I got when reading Le Monde, about MATLAB proposing deep learning tools…

An interesting question about stratified sampling came up on X validated last week, namely how to optimise a Monte Carlo estimate based on two subsequent simulations, one, X, from a marginal and one or several Y from the corresponding conditional given X, when the costs of producing those two simulations significantly differ. When looking at […]
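
A minimal sketch of the cost trade-off described above, under assumptions that are not taken from the original post: the estimator averages h(X_i, Y_ij) over n draws of X and m draws of Y per X, `var_between` stands for Var(E[h|X]), `var_within` stands for E[Var(h|X)], and `cost_x`, `cost_y` are the per-draw costs. All names and numbers are hypothetical. Under that setup, the variance for a fixed budget is proportional to (var_between + var_within/m)(cost_x + m·cost_y), and minimising this over m gives the usual two-stage sampling allocation.

```python
import math


def optimal_inner_draws(cost_x, cost_y, var_between, var_within):
    """Real-valued m minimising (var_between + var_within/m) * (cost_x + m*cost_y)."""
    return math.sqrt((cost_x * var_within) / (cost_y * var_between))


def work(m, cost_x, cost_y, var_between, var_within):
    """Variance-times-cost per outer draw when m inner draws of Y are used."""
    return (var_between + var_within / m) * (cost_x + m * cost_y)


# Hypothetical numbers: simulating X is 100 times more expensive than simulating Y.
m_star = optimal_inner_draws(cost_x=100.0, cost_y=1.0, var_between=1.0, var_within=4.0)
print(m_star)
```

In practice m has to be rounded to an integer and the two variance components have to be estimated from a pilot run, so this is a rough guide rather than a recipe.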

Mikhail Balyasin writes: I have come across this paper by Jacob Westfall and Tal Yarkoni, “Statistically Controlling for Confounding Constructs Is Harder than You Think.” I think it talks about issues very similar to those you raise on your blog, but in this case they advise using SEM [structural equation models] to control for confounding constructs. […] The post “In Bayesian regression, it’s easy to account for measurement error” appeared first…

Writing a book is a sacrifice. It takes a lot of time, represents a lot of missed opportunities, and does not (directly) pay very well. If you do a good job it may pay back in goodwill, but producing a serious book is a great challenge. Nina Zumel and I definitely troubled over possibilities for …

Dominic on stan-users writes: I was reading through http://arxiv.org/pdf/1410.5110v1.pdf and came across a term with which I was not familiar: "paracompact." I wrote a short blog post about it: https://idontgetoutmuch.wordpress.com/2016/04/17/every-manifold-is-paracompact. It may be of interest to other folks reading the aforementioned paper. I would have used a partition of unity to justify the corollary myself […]

Max Joseph writes: Conditional autoregressive (CAR) models are popular as prior distributions for spatial random effects with areal spatial data. Historically, MCMC algorithms for CAR models have benefitted from efficient Gibbs sampling via full conditional distributions for the spatial random effects. But, these conditional specifications do not work in Stan, where the joint density needs […]

Here's a pair of figures from a 2003 report by the National Academies 'Committee to Review the Scientific Evidence on the Polygraph' (full text), which includes several well-known statisticians. The figure below shows the sensitivity versus false-positive rate for 52 controlled laboratory studies of naive examinees, untrained in polygraph countermeasures. Each study examinee was assigned …

Statistics can be useful, even if its idealizations fall apart on close inspection. For example, take English letter frequencies. These frequencies are fairly well known. E is the most common letter, followed by T, then A, etc. The string of letters “ETAOIN SHRDLU” comes from the days of Linotype when letters were arranged in that order, […]
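
To make the idea of letter frequencies concrete, here is a small sketch that tallies the letters A to Z in a piece of text. The function name `letter_frequencies` and the sample string are made up for illustration; a short sample like this one will not reproduce the E, T, A ordering, which only emerges in large bodies of English text.

```python
from collections import Counter


def letter_frequencies(text):
    """Return relative frequencies of the letters A-Z in text, most common first."""
    # Keep only ASCII letters, ignoring case, digits, spaces and punctuation.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    # An empty input yields an empty dict, so no division by zero occurs.
    return {letter: n / total for letter, n in counts.most_common()}


freqs = letter_frequencies("Hello, World")
print(freqs)
```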

It started with a project that Sharad Goel is doing, comparing decisions of judges in an urban court system. Sharad was talking with Avi Feller, Art Owen, and me about estimating the effect of a certain decision option that judges have, controlling for pre-treatment differences between defendants. Art: I'm interested in what that data shows […]

Sharad Goel writes: We just launched an experiment about online privacy, and I was wondering if you could post this on your blog. In a nutshell, people upload their browsing history, which we then fingerprint and compare to the profiles of 100s of millions of Twitter users to find a match. Browsing history is something […]

1996: On the 2nd of September 1996, Statistics Switzerland published its brand-new website, www.bfs.admin.ch. It was one of the first (if not the first) of the Swiss Administration (www.admin.ch), in three languages and already with quite rich structure and content. The Wayback Machine (https://archive.org/web/) shows the developments since 1996. 1996: Handmade with Frontpage …

Introduction: Suppose we have the task of predicting an outcome y given a number of variables v1,...,vk. We often want to "prune variables" or build models with fewer than all the variables. This can be to speed up modeling, decrease the cost of producing future data, improve robustness, improve explainability, even reduce over-fit, and improve …

I will tell a story and then ask a question. The story: "Thousands of Americans are alive today because they were luckily selected to be in the placebo arm of the study" Paul Alper writes: As far as I can tell, you have never written about Tambocor (Flecainide) and the so-called CAST study. A locally […]

It has been well known since at least 1969, when Bates and Granger wrote their famous paper on “The Combination of Forecasts”, that combining forecasts often leads to better forecast accuracy. So it is helpful to have a couple of new R packages which do just that: opera and forecastHybrid. opera: Opera stands for “Online Prediction […]
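
As a rough illustration of why combining can help, the sketch below computes the textbook minimum-variance weighting of two unbiased forecasts from their error variances and covariance. This is not the API of opera or forecastHybrid; the function names and the numbers are hypothetical, and in practice the variances would have to be estimated from past forecast errors.

```python
def combination_weight(var_a, var_b, cov_ab=0.0):
    """Weight w on forecast A that minimises the error variance of w*A + (1-w)*B."""
    return (var_b - cov_ab) / (var_a + var_b - 2.0 * cov_ab)


def combined_variance(var_a, var_b, cov_ab=0.0):
    """Error variance of the combined forecast at the minimising weight."""
    w = combination_weight(var_a, var_b, cov_ab)
    return w ** 2 * var_a + (1.0 - w) ** 2 * var_b + 2.0 * w * (1.0 - w) * cov_ab


# Hypothetical case: forecast A has error variance 1, forecast B has 4, errors uncorrelated.
w = combination_weight(1.0, 4.0)
v = combined_variance(1.0, 4.0)
print(w, v)
```

With these made-up numbers the combined error variance comes out below that of the better individual forecast, which is the point of the exercise; strongly correlated errors shrink that gain.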

Nadia Hassan writes: I saw your article in Slate. For what it's worth, this new article, "Ideologically Extreme Candidates in U.S. Presidential Elections, 1948–2012," by Marty Cohen, Mary McGrath, Peter Aronow, and John Zaller, looks at ideology-based extremism and finds weak effects of ideology. Like the high end is 1980 and the authors estimate Carter […]

Mike Carniello writes: I wondered what you make of this. I pay for the NYT online and tablet – but not paper, so I don't know how they're representing this content in two dimensions. I've paged through the thing a couple of times, not sure how useful it is – it seems like a series […]

My friend John R. sent me this excellent Buzzfeed feature on music playlists. Here are some choice quotes to whet your appetite: In 2014, when Tim Cook explained Apple’s stunning $3 billion purchase of Beats by repeatedly invoking its “very rare and hard to find” team of music experts, he was talking about these guys. And their efforts since, which have pointed toward curated playlists (specifically, an industrial-scale trove of…