Big data is all the rage, but sometimes you don’t have big data. Sometimes you don’t even have average size data. Sometimes you only have eleven unique socks: Karl Broman is here putting forward a very interesting problem. Interesting, not onl...

A few years ago I wrote an article that shows how to compute the log-determinant of a covariance matrix in SAS. This computation is often required to evaluate a log-likelihood function. My algorithm used the ROOT function in SAS/IML to compute a Cholesky decomposition of the covariance matrix. The Cholesky […]
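
As a rough illustration of the idea in the excerpt above (not the author's SAS/IML code), the log-determinant of a symmetric positive definite matrix can be read off its Cholesky factor: if A = L L^T with L lower triangular, then log det(A) = 2 * sum(log(L[i][i])). The sketch below is plain Python; the function names `cholesky_lower` and `log_det_spd` and the example matrix are made up for this illustration.

```python
import math


def cholesky_lower(a):
    """Return lower-triangular L with L L^T = a.

    Assumes `a` is a symmetric positive definite matrix given as a
    list of lists; raises ValueError if a non-positive pivot appears.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def log_det_spd(a):
    """Log-determinant of a symmetric positive definite matrix.

    det(a) = prod(L[i][i])**2, so the log is twice the sum of the
    logs of the Cholesky diagonal. Summing logs avoids the overflow
    or underflow that can occur when the determinant is formed directly.
    """
    L = cholesky_lower(a)
    return 2.0 * sum(math.log(L[i][i]) for i in range(len(a)))


# Hypothetical 2x2 covariance matrix with determinant 4*3 - 2*2 = 8.
cov = [[4.0, 2.0], [2.0, 3.0]]
log_det = log_det_spd(cov)
```

Working on the log scale is the main design point here: for large covariance matrices the determinant itself can overflow or underflow even when its logarithm is an ordinary number.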

The hts package for R allows for forecasting hierarchical and grouped time series data. The idea is to generate forecasts for all series at all levels of aggregation without imposing the aggregation constraints, and then to reconcile the forecasts so they satisfy the aggregation constraints. (An introduction to reconciling hierarchical and grouped time series is […]
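
To make the reconciliation step in the excerpt above concrete, here is a minimal sketch for the smallest possible hierarchy, total = a + b. It is not the hts API and not necessarily the method hts uses by default; it assumes a plain ordinary least squares projection of the three base forecasts onto the set of forecasts that add up. The function name `reconcile_ols` and the numbers are hypothetical.

```python
def reconcile_ols(total_hat, a_hat, b_hat):
    """OLS-reconcile base forecasts for the hierarchy total = a + b.

    With summing matrix S = [[1, 1], [1, 0], [0, 1]], the reconciled
    bottom-level forecasts are (S'S)^{-1} S' y, where y is the vector
    (total_hat, a_hat, b_hat). Here S'S = [[2, 1], [1, 2]], whose
    inverse is (1/3) * [[2, -1], [-1, 2]].
    """
    # S' y: each bottom series collects its own forecast plus the total.
    u = total_hat + a_hat
    v = total_hat + b_hat
    # Apply (S'S)^{-1} to get reconciled bottom-level forecasts.
    a_rec = (2.0 * u - v) / 3.0
    b_rec = (2.0 * v - u) / 3.0
    # The reconciled total is the sum of the reconciled bottom level,
    # so the aggregation constraint holds by construction.
    return a_rec + b_rec, a_rec, b_rec


# Hypothetical base forecasts that do not add up: 60 + 30 != 100.
total_rec, a_rec, b_rec = reconcile_ols(100.0, 60.0, 30.0)
```

In this example the discrepancy of 10 between the total and the sum of the parts is split evenly across the three series: the total comes down by 10/3 and each bottom-level forecast goes up by 10/3. Base forecasts that already satisfy the constraint are returned unchanged.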

When you think of a conference, does sitting around a lot come to mind? Lots of food? Bad coffee? No time to work out? For the first time in VIS history, there will be a way to exercise your body, not just your mind. The VIS Sports Authority, which is totally an official thing that I didn’t just make up,…

I received the following email from the Social Science Research Network, which is a (legitimate) preprint server for research papers: Dear Andrew Gelman: Your paper, “WHY HIGH-ORDER POLYNOMIALS SHOULD NOT BE USED IN REGRESSION DISCONTINUITY DESIGNS”, was recently listed on SSRN’s Top Ten download list for: PSN: Econometrics, Polimetrics, & Statistics (Topic) and Political Methods: […] The post “Your Paper…

The following is from Nathan Schachtman’s legal blog, with various comments and added emphases (by me). He will try to reply to comments/queries. “Courts Can and Must Acknowledge Multiple Comparisons in Statistical Analyses” Nathan Schachtman, Esq., PC * October 14th, 2014 In excluding the proffered testimony of Dr. Anick Bérard, a Canadian perinatal epidemiologist in the […]

Haynes Goddard writes: Reviewing my notes and books on categorical data analysis, the term “nominal” is widely employed to refer to variables without any natural ordering. I was a language major in UG school and knew that the etymology of nominal is the Latin word nomen (from the Online Etymological Dictionary: early 15c., “pertaining to […] The post Hoe noem…

Jason May writes: I’m in Northwestern’s Predictive Analytics grad program. I’m working on a project providing Case Studies of how companies use certain analytic processes and want to use Bayesian Analysis as my focus. The problem: I can find tons of work on how one might apply Bayesian Statistics to different industries but very little […] The post How do…

In the medical sciences, there is a discipline called "evidence based medicine". The basic idea is to study the actual practice of medicine using experimental techniques. The reason is that while we may have good experimental evidence about specific medicines or …

Anna Dreber Almenberg writes: The second prediction market project for the reproducibility project will soon be up and running – please participate! There will be around 25 prediction markets, each representing a particular study that is currently being replicated. Each study (and thus market) can be summarized by a key hypothesis that is being tested, which […] The post Prediction Market…

Consider the following question: Is there a reproducibility/replication crisis in epidemiology? I think there are only two possible ways to answer that question: No, there is no replication crisis in epidemiology because no one ever believes the result of an …

This paper by Weixuan Zhu, Juan Miguel Marín [from Carlos III in Madrid, not to be confused with Jean-Michel Marin, from Montpellier!], and Fabrizio Leisen proposes an alternative to our 2013 PNAS paper with Kerrie Mengersen and Pierre Pudlo on empirical likelihood ABC, or BCel. The alternative is based on Davison, Hinkley and Worton’s (1992) […]

Statistical communication includes graphing data and fitted models, programming, writing for specialized and general audiences, lecturing, working with students, and combining words and pictures in different ways. The common theme of all these interactions is that we need to consider our statistical tools in the context of our goals. Communication is not just about conveying […] The post Statistical Communication…

We will study and practice many different aspects of statistical communication, including graphing data and fitted models, programming in Rrrrrrrr, writing for specialized and general audiences, lecturing, working with students and colleagues, and combining words and pictures in different ways. You learn by doing: each week we have two classes that are full of student […] The post My course…

In our recent discussion of publication bias, a commenter linked to a recent paper, “Star Wars: The Empirics Strike Back,” by Abel Brodeur, Mathias Le, Marc Sangnier, Yanos Zylberberg, who point to the notorious overrepresentation in scientific publications of p-values that are just below 0.05 (that is, just barely statistically significant at the conventional level) […] The post The Fault…

[Update Oct 2014: Due to some changes to the Bayes factor calculator webpage, and as I understand BFs much better now, this post has been updated ...] I started to familiarize myself with Bayesian statistics. In this post I’ll show some insights ...