Summarising with Box and Whisker plots

October 5, 2015

In the Northern Hemisphere, it is the start of the school year, and thousands of eager students are beginning their study of statistics. I know this because this is the time of year when lots of people watch my video, … Continue reading →

Whither Econometric Principal-Components Regressions?

October 4, 2015

Principal-components regression (PCR) is routine in applied time-series econometrics. Why so much PCR, and so little ridge regression? Ridge and PCR are both shrinkage procedures involving PCs. The difference is that ridge effectively includes all...

Cointegration & Granger Causality

October 4, 2015

Today, I had a query from a reader of this blog regarding cointegration and Granger causality. Essentially, the email said: "I tested two economic time-series and found them to be cointegrated. However, when I then tested for Granger causalit...

Flamebait: “Mathiness” in economics and political science

October 4, 2015

Political scientist Brian Silver points me to this post by economist Paul Romer, who writes: The style that I [Romer] am calling mathiness lets academic politics masquerade as science. Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in […] The post Flamebait: “Mathiness” in economics and political science appeared first on Statistical Modeling,…

Will the Real Junk Science Please Stand Up?

October 4, 2015

Junk Science (as first coined).* Have you ever noticed in wranglings over evidence-based policy that it’s always one side that’s politicizing the evidence—the side whose policy one doesn’t like? The evidence on the near side, or your side, however, is solid science. Let’s call those who first coined the term “junk science” Group 1. For […]

Predicting Titanic deaths on Kaggle VII: More Stan

October 4, 2015

Two weeks ago I used STAN to create predictions after just throwing in all independent variables. This week I aim to refine the STAN model. For this it is convenient to use the loo package (Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian...

Field Statistics

October 3, 2015

Yesterday I learned something interesting from a talk given by Professor Bikas K Sinha. The following is an excerpt from the reference [1], which exactly shows the interesting point of the problem. “A population consisting of an unknown number of distinct species is searched by selecting one member at a time. No a priori information is available concerning […]

Data analysis vs statistics

October 3, 2015

John Tukey preferred the term “data analysis” over “statistics.” In his paper Data Analysis, Computation and Mathematics, he explains why. My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does speak of “mathematics”, but only last. Why? … My brother-in-squared-law, Francis J. Anscombe, has commented on my use of […]

RuleFit: When disassembled trees meet Lasso

October 3, 2015

The RuleFit algorithm from Friedman and Popescu is an interesting regression and classification approach that uses decision rules in a linear model. RuleFit is not a completely new idea, but it combines a bunch of algorithms in a cl...

Profile of Data Scientist Shannon Cebron

October 3, 2015

The "This is Statistics" campaign has a nice profile of Shannon Cebron, a data scientist working at the Baltimore-based Pegged Software. What advice would you give to someone thinking of a career in data science? Take some advanced statistics courses if you want to see what it’s like to be a statistician or data scientist.

Comparing Waic (or loo, or any other predictive error measure)

October 3, 2015

Ed Green writes: I have fitted 5 models in Stan and computed WAIC and its standard error for each. The standard errors are all roughly the same (all between 209 and 213). If WAIC_1 is within one standard error (of WAIC_1) of WAIC_2, is it fair to say that WAIC is inconclusive? My reply: No, […] The post Comparing Waic (or loo, or any other predictive error measure) appeared first…

Books to Read While the Algae Grow in Your Fur, August 2015

October 3, 2015

Attention conservation notice: I have no taste. Roland and Sabrina Michaud, Mirror of the Orient The Michauds' gorgeous photos from the 1960s and 1970s — mostly of Afghanistan, but also Turkey, Iran, and India — aptly paired with Persian...

Books to Read While the Algae Grow in Your Fur, September 2015

October 3, 2015

Attention conservation notice: I have no taste. Linda Nagata, The Trials Sequel to First Light, where the consequences of that adventure come home to roost. — If I say that these novels are near-future military hard science fiction, full of de...

Stan PK/PD Tutorial at the American Conference on Pharmacometrics, 8 Oct 2015

October 2, 2015

Bill Gillespie, of Metrum, is giving a tutorial next week at ACoP: “Getting Started with Bayesian PK/PD Modeling Using Stan: Practical use of Stan and R for PK/PD applications,” Thursday 8 October 2015, 8 AM to 5 PM, Crystal City, VA. This is super cool for us, because Bill’s not one of our core developers […] The post Stan PK/PD Tutorial at the American Conference on Pharmacometrics, 8 Oct 2015…

Solution to Stan Puzzle 1: Inferring Ability from Streaks

October 2, 2015

If you missed it the first time around, here’s a link to Stan Puzzle 1: Inferring Ability from Streaks. First, a hat-tip to Mike, who posted the correct answer as a comment. So as not to spoil the surprise for everyone else, Michael Betancourt (different Mike) emailed me the answer right away (as he always […] The post Solution to Stan Puzzle 1: Inferring Ability from Streaks appeared first on…

Delta Method Confidence Bands for Gaussian Density

October 2, 2015

During one of our Department's weekly biostatistics "clinics", a visitor was interested in creating confidence bands for a Gaussian density estimate (or a Gaussian mixture density estimate). The mean, variance, and two "nuisance" parameters were simultaneously estimated using least-squares. Thus, the approximate sampling variance-covariance matrix (4x4) was readily available. The two nuisance parameters do not … Continue reading Delta Method Confidence Bands for Gaussian Density →

A Simpler Explanation of Differential Privacy

October 2, 2015

Differential privacy was originally developed to facilitate secure analysis over sensitive data, with mixed success. It’s back in the news again now, with exciting results from Cynthia Dwork et al. (see references at the end of the article) that apply results from differential privacy to machine learning. In this article we’ll work through the definition … Continue reading A Simpler Explanation of Differential Privacy

Elections, visual

October 2, 2015

On October 18, 2015, Swiss voters will elect a new Parliament for the next four years. There are some very useful and also beautiful visual tools that help voters to get informed about developments in the political landscape and about candidates. Background: The Swiss Political System The full picture of Switzerland’s political institutions and … Continue reading Elections, visual

Illustrating Spurious Regressions

October 2, 2015

I've talked a bit about spurious regressions in some earlier posts (here and here). I was updating an example for my time-series course the other day, and I thought that some readers might find it useful. Let's begin by reviewing what is usually m...

Syllabus for my course on Communicating Data and Statistics

October 2, 2015

Actually the course is called Statistical Communication and Graphics, but I was griping about how few students were taking the class, and someone suggested the title Communicating Data and Statistics as being a bit more appealing. So I’ll go with that for now. I love love love this class and everything that’s come from it […] The post Syllabus for my course on Communicating Data and Statistics appeared first on…

