## Predicting Titanic deaths on Kaggle VII: More Stan

October 4, 2015

Two weeks ago I used STAN to create predictions after just throwing in all independent variables. This week I aim to refine the STAN model. For this it is convenient to use the loo package (Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian...

## Field Statistics

October 3, 2015

Yesterday I learned something interesting from a talk given by Professor Bikas K Sinha. The following is an excerpt from the reference [1], which exactly shows the interesting point of the problem. “A population consisting of an unknown number of distinct species is searched by selecting one member at a time. No a priori information is available concerning […]

## Data analysis vs statistics

October 3, 2015

John Tukey preferred the term “data analysis” over “statistics.” In his paper Data Analysis, Computation and Mathematics, he explains why. My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does speak of “mathematics”, but only last. Why? … My brother-in-squared-law, Francis J. Anscombe has commented on my use of […]

## RuleFit: When disassembled trees meet Lasso

October 3, 2015

The RuleFit algorithm from Friedman and Popescu is an interesting regression and classification approach that uses decision rules in a linear model. RuleFit is not a completely new idea, but it combines a bunch of algorithms in a cl...

## Profile of Data Scientist Shannon Cebron

October 3, 2015

The "This is Statistics" campaign has a nice profile of Shannon Cebron, a data scientist working at the Baltimore-based Pegged Software. What advice would you give to someone thinking of a career in data science? Take some advanced statistics courses if you want to see what it’s like to be a statistician or data scientist.

## Comparing Waic (or loo, or any other predictive error measure)

October 3, 2015

Ed Green writes: I have fitted 5 models in Stan and computed WAIC and its standard error for each. The standard errors are all roughly the same (all between 209 and 213). If WAIC_1 is within one standard error (of WAIC_1) of WAIC_2, is it fair to say that WAIC is inconclusive? My reply: No, […] The post Comparing Waic (or loo, or any other predictive error measure) appeared first…

## Books to Read While the Algae Grow in Your Fur, August 2015

October 3, 2015

Attention conservation notice: I have no taste. Roland and Sabrina Michaud, Mirror of the Orient The Michauds' gorgeous photos from the 1960s and 1970s — mostly of Afghanistan, but also Turkey, Iran, and India — aptly paired with Persian...

## Books to Read While the Algae Grow in Your Fur, September 2015

October 3, 2015

Attention conservation notice: I have no taste. Linda Nagata, The Trials Sequel to First Light, where the consequences of that adventure come home to roost. — If I say that these novels are near-future military hard science fiction, full of de...

## Stan PK/PD Tutorial at the American Conference on Pharmacometrics, 8 Oct 2015

October 2, 2015

Bill Gillespie, of Metrum, is giving a tutorial next week at ACoP: Getting Started with Bayesian PK/PD Modeling Using Stan: Practical use of Stan and R for PK/PD applications Thursday 8 October 2015, 8 AM — 5 PM, Crystal City, VA This is super cool for us, because Bill’s not one of our core developers […] The post Stan PK/PD Tutorial at the American Conference on Pharmacometrics, 8 Oct 2015…

## Solution to Stan Puzzle 1: Inferring Ability from Streaks

October 2, 2015

If you missed it the first time around, here’s a link to: Stan Puzzle 1: Inferring Ability from Streaks First, a hat-tip to Mike, who posted the correct answer as a comment. So as not to spoil the surprise for everyone else, Michael Betancourt (different Mike), emailed me the answer right away (as he always […] The post Solution to Stan Puzzle 1: Inferring Ability from Streaks appeared first on…

## Delta Method Confidence Bands for Gaussian Density

October 2, 2015

During one of our Department's weekly biostatistics "clinics", a visitor was interested in creating confidence bands for a Gaussian density estimate (or a Gaussian mixture density estimate). The mean, variance, and two "nuisance" parameters, were simultaneously estimated using least-squares. Thus, the approximate sampling variance-covariance matrix (4x4) was readily available. The two nuisance parameters do not … Continue reading Delta Method Confidence Bands for Gaussian Density →

## A Simpler Explanation of Differential Privacy

October 2, 2015

Differential privacy was originally developed to facilitate secure analysis over sensitive data, with mixed success. It’s back in the news again now, with exciting results from Cynthia Dwork et al. (see references at the end of the article) that apply results from differential privacy to machine learning. In this article we’ll work through the definition … Continue reading A Simpler Explanation of Differential Privacy

## Elections, visual

October 2, 2015

On October 18, 2015 Swiss voters will elect a new Parliament for the next four years. There are some very useful and also beautiful visual tools that help voters to get informed about developments in the political landscape and about candidates. Background: The Swiss Political System The full picture of Switzerland’s political institutions and … Continue reading Elections, visual

## Illustrating Spurious Regressions

October 2, 2015

I've talked a bit about spurious regressions in some earlier posts (here and here). I was updating an example for my time-series course the other day, and I thought that some readers might find it useful. Let's begin by reviewing what is usually m...

## Syllabus for my course on Communicating Data and Statistics

October 2, 2015

Actually the course is called Statistical Communication and Graphics, but I was griping about how few students were taking the class, and someone suggested the title Communicating Data and Statistics as being a bit more appealing. So I’ll go with that for now. I love love love this class and everything that’s come from it […] The post Syllabus for my course on Communicating Data and Statistics appeared first on…

## Not So Standard Deviations: Episode 2 – We Got it Under 40 Minutes

October 2, 2015

Episode 2 of my podcast with Hilary Parker, Not So Standard Deviations, is out! In this episode, we talk about user testing for statistical methods, navigating the Hadleyverse, the crucial significance of rename(), and the secret reason for creating the podcast (hint: it rhymes with "bee"). Also, I erroneously claim that Bill Cleveland is way older than

## Balls and urns Part 2: Multi-colored balls

October 2, 2015

In a previous post I described how to simulate random samples from an urn that contains colored balls. The previous article described the case where the balls can be either of two colors. In that case, all the distributions are univariate. In this article I examine the case where the […] The post Balls and urns Part 2: Multi-colored balls appeared first on The DO Loop.

## What NOT To Do When Data Are Missing

October 1, 2015

Here's something that's very tempting, but it's not a good idea. Suppose that we want to estimate a regression model by OLS. We have a full sample of size n for the regressors, but one of the values for our dependent variable, y, isn't available. Rather...

## A glass half full interpretation of the replicability of psychological science

October 1, 2015

tl;dr: 77% of replication effects from the psychology replication study were in (or above) the 95% prediction interval based on the original effect size. This isn't perfect and suggests (a) there is still room for improvement, (b) the scientists who did the replication study are pretty awesome at replicating, (c) we need a better definition of

## Jason Chaffetz is the Garo Yepremian of the U.S. House of Representatives, and I don’t mean that in a good way.

October 1, 2015

Mike Spagat and Paul Alper point us to this truly immoral bit of graphical manipulation, courtesy of U.S. Representative Jason Chaffetz. Here’s the evil graph: Here’s the correction: From the news article by Zachary Roth: As part of a contentious back-and-forth in which Chaffetz repeatedly cut off [Planned Parenthood president Cecile] Richards, the congressman displayed […] The post Jason Chaffetz is the Garo Yepremian of the U.S. House of Representatives,…

## Balke et al. on Real-Time Nowcasting

October 1, 2015

Check out the new paper, "Incorporating the Beige Book in a Quantitative Index of Economic Activity," by Nathan Balke, Michael Fulmer and Ren Zhang (BFZ). [The Beige Book (BB) is a written description of U.S. economic conditions, produced by the Federal...