In the Northern Hemisphere, it is the start of the school year, and thousands of eager students are beginning their study of statistics. I know this because this is the time of year when lots of people watch my video, … Continue reading →

Principal-components regression (PCR) is routine in applied time-series econometrics. Why so much PCR, and so little ridge regression? Ridge and PCR are both shrinkage procedures involving PC's. The difference is that ridge effectively includes all...
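
The contrast between the two procedures can be made concrete with the standard textbook shrinkage factors. This is a minimal sketch, not taken from the truncated post: it assumes the singular values of the centered design matrix are already known, and the helper name `shrinkage_factors`, the penalty `lam`, and the cutoff `k` are hypothetical. Relative to least squares, ridge scales the j-th principal-component direction by d_j^2 / (d_j^2 + lam), while PCR keeps the first k directions unshrunk and drops the rest.

```python
def shrinkage_factors(singular_values, lam, k):
    """Per-component shrinkage relative to OLS (hypothetical helper).

    singular_values: singular values d_j of the centered design matrix,
                     sorted from largest to smallest (assumed input).
    lam:             ridge penalty (assumed value).
    k:               number of principal components PCR retains.
    """
    # Ridge keeps every component but shrinks low-variance ones the most.
    ridge = [d * d / (d * d + lam) for d in singular_values]
    # PCR is all-or-nothing: factor 1 for the first k components, 0 after.
    pcr = [1.0 if j < k else 0.0 for j in range(len(singular_values))]
    return ridge, pcr


# Illustrative singular values, not real data.
d = [5.0, 3.0, 1.0, 0.5, 0.1]
ridge, pcr = shrinkage_factors(d, lam=1.0, k=2)
for j, (r, p) in enumerate(zip(ridge, pcr), start=1):
    print(f"PC{j}: ridge factor {r:.3f}, PCR factor {p:.0f}")
```

Read this way, PCR is a hard-threshold version of the smooth shrinkage that ridge applies.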

Political scientist Brian Silver points me to this post by economist Paul Romer, who writes: The style that I [Romer] am calling mathiness lets academic politics masquerade as science. Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in […] The post Flamebait: “Mathiness” in economics and political science appeared first on Statistical Modeling,…

Junk Science (as first coined).* Have you ever noticed in wranglings over evidence-based policy that it’s always one side that’s politicizing the evidence—the side whose policy one doesn’t like? The evidence on the near side, or your side, however, is solid science. Let’s call those who first coined the term “junk science” Group 1. For […]

Two weeks ago I used STAN to create predictions after just throwing in all independent variables. This week I aim to refine the STAN model. For this it is convenient to use the loo package (Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian...

Yesterday I learned something interesting from a talk given by Professor Bikas K Sinha. The following is an excerpt from the reference [1], which exactly shows the interesting point of the problem. “A population consisting of an unknown number of distinct species is searched by selecting one member at a time. No a priori information is available concerning […]

John Tukey preferred the term “data analysis” over “statistics.” In his paper Data Analysis, Computation and Mathematics, he explains why. My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does not speak of “mathematics”, but only last. Why? … My brother-in-squared-law, Francis J. Anscombe has commented on my use of […]

The "This is Statistics" campaign has a nice profile of Shannon Cebron, a data scientist working at the Baltimore-based Pegged Software. What advice would you give to someone thinking of a career in data science? Take some advanced statistics courses if you want to see what it’s like to be a statistician or data scientist.

Ed Green writes: I have fitted 5 models in Stan and computed WAIC and its standard error for each. The standard errors are all roughly the same (all between 209 and 213). If WAIC_1 is within one standard error (of WAIC_1) of WAIC_2, is it fair to say that WAIC is inconclusive? My reply: No, […] The post Comparing Waic (or loo, or any other predictive error measure) appeared first…

Attention conservation notice: I have no taste. Roland and Sabrina Michaud, Mirror of the Orient The Michauds' gorgeous photos from the 1960s and 1970s — mostly of Afghanistan, but also Turkey, Iran, and India — aptly paired with Persian...

Attention conservation notice: I have no taste. Linda Nagata, The Trials Sequel to First Light, where the consequences of that adventure come home to roost. — If I say that these novels are near-future military hard science fiction, full of de...

Bill Gillespie, of Metrum, is giving a tutorial next week at ACoP: Getting Started with Bayesian PK/PD Modeling Using Stan: Practical use of Stan and R for PK/PD applications Thursday 8 October 2015, 8 AM — 5 PM, Crystal City, VA This is super cool for us, because Bill’s not one of our core developers […] The post Stan PK/PD Tutorial at the American Conference on Pharmacometrics, 8 Oct 2015…

If you missed it the first time around, here’s a link to: Stan Puzzle 1: Inferring Ability from Streaks First, a hat-tip to Mike, who posted the correct answer as a comment. So as not to spoil the surprise for everyone else, Michael Betancourt (different Mike), emailed me the answer right away (as he always […] The post Solution to Stan Puzzle 1: Inferring Ability from Streaks appeared first on…

During one of our Department's weekly biostatistics "clinics", a visitor was interested in creating confidence bands for a Gaussian density estimate (or a Gaussian mixture density estimate). The mean, variance, and two "nuisance" parameters, were simultaneously estimated using least-squares. Thus, the approximate sampling variance-covariance matrix (4x4) was readily available. The two nuisance parameters do not … Continue reading Delta Method Confidence Bands for Gaussian Density →
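
A delta-method band of this kind can be sketched in a few lines. This is a rough illustration under stated assumptions, not the post's own code: it treats a single Gaussian density, uses only the 2x2 covariance block for the estimated mean and variance (the nuisance parameters are left out here), and the function names and example numbers are hypothetical. The approximate variance of the fitted density at a point x is the gradient of the density with respect to (mean, variance), sandwiched around that covariance block.

```python
import math


def gaussian_density(x, mu, var):
    """Gaussian density with mean mu and variance var, evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def delta_band(x, mu, var, cov, z=1.96):
    """Pointwise delta-method band for the density at x (hypothetical helper).

    cov is the 2x2 covariance matrix of (mu_hat, var_hat), assumed given.
    Returns (lower, estimate, upper).
    """
    f = gaussian_density(x, mu, var)
    # Partial derivatives of the density with respect to mu and var.
    g_mu = f * (x - mu) / var
    g_var = f * ((x - mu) ** 2 / (2 * var ** 2) - 1 / (2 * var))
    # Quadratic form: gradient' * cov * gradient.
    v = (g_mu * g_mu * cov[0][0]
         + 2 * g_mu * g_var * cov[0][1]
         + g_var * g_var * cov[1][1])
    se = math.sqrt(max(v, 0.0))
    return f - z * se, f, f + z * se


# Illustrative estimates and covariance, not real data.
cov = [[0.01, 0.0], [0.0, 0.02]]
for x in (-2.0, 0.0, 2.0):
    lo, f, hi = delta_band(x, mu=0.0, var=1.0, cov=cov)
    print(f"x={x:+.1f}: {lo:.4f} <= {f:.4f} <= {hi:.4f}")
```

The band is pointwise rather than simultaneous, and the lower limit is not constrained to stay above zero; computing the band on the log-density scale and transforming back is one common way around the second issue.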

Differential privacy was originally developed to facilitate secure analysis over sensitive data, with mixed success. It’s back in the news again now, with exciting results from Cynthia Dwork, et al. (see references at the end of the article) that apply results from differential privacy to machine learning. In this article we’ll work through the definition … Continue reading A Simpler Explanation of Differential Privacy

On October 18, 2015 Swiss voters will elect a new Parliament for the next four years. There are some very useful and also beautiful visual tools that help voters to get informed about developments in the political landscape and about candidates. Background: The Swiss Political System The full picture of Switzerland’s political institutions and … Continue reading Elections, visual

Actually the course is called Statistical Communication and Graphics, but I was griping about how few students were taking the class, and someone suggested the title Communicating Data and Statistics as being a bit more appealing. So I’ll go with that for now. I love love love this class and everything that’s come from it […] The post Syllabus for my course on Communicating Data and Statistics appeared first on…