## Le Monde puzzle [#857]

March 21, 2014
A rather bland case of Le Monde mathematical puzzle: Two positive integers x and y are turned into S=x+y and P=xy. If Sarah and Primrose are given S and P, respectively, how can the following dialogue happen? "I am sure you cannot find my number." "Now you told me that, I can, it is […]"
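The teaser is cut off, so the following is only a sketch of how such a dialogue could be checked by brute force. It assumes that Sarah (who holds the sum) speaks first, that Primrose (who holds the product) answers, and that x and y are any positive integers with no upper bound. All function names are hypothetical and this is not the code from the original post.

```python
from math import isqrt


def factor_pairs(p):
    """Unordered pairs (x, y) of positive integers with x <= y and x * y == p."""
    return [(d, p // d) for d in range(1, isqrt(p) + 1) if p % d == 0]


def sarah_is_sure(s):
    """Assumed reading of the first line: for every split s = x + y,
    the product x * y has more than one factor pair, so Primrose
    cannot recover the sum from the product alone."""
    return s >= 2 and all(
        len(factor_pairs(x * (s - x))) > 1 for x in range(1, s // 2 + 1)
    )


def sums_primrose_keeps(p):
    """Sums compatible with product p once Sarah's statement is known.
    Primrose can answer exactly when this list has a single element."""
    return [x + y for x, y in factor_pairs(p) if sarah_is_sure(x + y)]
```

Under these assumptions, `sums_primrose_keeps(4)` leaves a single candidate sum, so a product of 4 is one case where the second line of the dialogue becomes possible. The actual puzzle may add constraints that change this.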

## Death of A. L. Nagar

March 21, 2014
I was saddened to learn that Anirudh Lal Nagar passed away on 4 February 2014. Nagar was an exceptional Indian statistician and econometrician who made many fundamental contributions to our discipline. He was 83 years old at the time of his ...

## The sampling frame is a list, but not every list is a sampling frame

March 21, 2014
Yesterday and today, I spent some time marking the in-course assessment (ICA) for my course (the teaching term is over next week, yay!). The course is called "Social Statistics" and it's intended to deal with surveys and sampling. However, sinc...

## Random matrices in the news

March 21, 2014
From 2010: Mark Buchanan wrote a cover article for the New Scientist on random matrices, a heretofore obscure area of probability theory that his headline writer characterizes as "the deep law that shapes our reality." It's interesting stuff, and he gets into some statistical applications at the end, so I'll give you my take on it. But

## Some chart types are not scalable

March 21, 2014
Peter Cock sent this Venn diagram to me via twitter. (Original from this paper.) For someone who doesn't know genetics, it is very hard to make sense of this chart. It seems like there are five characteristics that each unit...

## R: Text Mining on Twitter #PrayForMH370 Malaysia Airlines

March 21, 2014
It has been two weeks of search and rescue operations for Malaysia Airlines Flight MH370, after it vanished from radar on March 8, 2014. And wherever they are, we hope and pray for them. Photo from VENUS - Wall of Hope & Prayers for MH370. In th...

## Pre-processing for approximate Bayesian computation in image analysis

March 20, 2014
With Matt Moores and Kerrie Mengersen, from QUT, we wrote this short paper just in time for the MCMSki IV Special Issue of Statistics & Computing. And arXived it, as well. The global idea is to cut down on the cost of running an ABC experiment by removing the simulation of a humongous state-space vector, […]

## It is time for RData files to become the standard for Data Transfer

March 20, 2014
It is time RData files become the primary means of disseminating publicly available data online. 1. R is the most efficient statistical software at compressing data. I was recently attempting to download weather data from the US government and found mysel...

## How to make an absurd twitter bot in python

March 20, 2014
In my last post, I outlined the steps I took to programmatically mimic the wine reviews of a dilettante sommelier. In this post, I'll explain the steps I took to create the twitter bot @HorseWineReview which combines a random wine

## Teaching Bayesian applied statistics to graduate students in political science, sociology, public health, education, economics, . . .

March 20, 2014
One of the most satisfying experiences for an academic is when someone asks a question that you've already answered. This happened in the comments today. Daniel Gotthardt wrote: So for applied stat courses like for sociologists, political scientists, psychologists and maybe also for economics, what do we actually want to accomplish with our intro courses?

## Visualize coverage for targeted NGS (exome) experiments

March 20, 2014
I'm calling variants from exome sequencing data and I need to evaluate the efficiency of the capture and the coverage along the target regions. This sounds like a great use case for bedtools, your swiss-army knife for genomic arithmetic and interval man...

## Big Data and SQL

March 20, 2014
I happen to think that SQL is a very viable option for analyzing big data. I was thinking about this when I read a book review recently: For instance, Siegel reports, people who buy small felt pads that adhere to the bottom of chair legs (to protect th...

## The 80/20 rule of statistical methods development

March 20, 2014
Developing statistical methods is hard and often frustrating work. One of the underappreciated rules in statistical methods development is what I call the 80/20 rule (maybe it could even be the 90/10 rule). The basic idea is that the first

## Seasonal, or periodic, time series

March 20, 2014
Monday, in our MAT8181 class, we discussed seasonal unit roots from a practical perspective (the theory will be briefly mentioned in a few weeks, once we’ve seen multivariate models). Consider some time series, for instance traffic on ...

## The candy weighing demonstration, or, the unwisdom of crowds

March 20, 2014
From 2008: The candy weighing demonstration, or, the unwisdom of crowds. My favorite statistics demonstration is the one with the bag of candies. I've elaborated upon it since including it in the Teaching Statistics book and I thought these tips might be useful to some of you. Preparation: Buy 100 candies of different sizes and

## Machine Learning Lesson of the Day – Overfitting and Underfitting

Overfitting occurs when a statistical model or machine learning algorithm captures the noise of the data. Intuitively, overfitting occurs when the model or the algorithm fits the data too well. Specifically, overfitting occurs if the model or algorithm shows low bias but high variance. Overfitting is often a result of an excessively complicated model, and […]
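A small simulation can make the "low bias but high variance" description concrete. The sketch below is an illustration of my own choosing, not taken from the post: it fits k-nearest-neighbour regression to noisy samples of a sine curve (the curve, the noise level, and the values of k are all assumptions). With k = 1 the model reproduces every training point, noise included, whereas a larger k averages some of the noise away.

```python
import math
import random


def make_data(rng, n, noise_sd=0.5):
    """Noisy observations of sin(x) on [0, 2*pi] (an assumed toy setting)."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0 * math.pi)
        data.append((x, math.sin(x) + rng.gauss(0.0, noise_sd)))
    return data


def knn_predict(train, x, k):
    """Average the responses of the k training points closest to x."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of the k-NN fit built on `train`, evaluated on `data`."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in data) / len(data)


rng = random.Random(1)
train = make_data(rng, 200)
test = make_data(rng, 500)

for k in (1, 10):
    print(f"k={k:2d}  train MSE={mse(train, train, k):.3f}  "
          f"test MSE={mse(train, test, k):.3f}")
```

The k = 1 fit has zero training error because each training point is its own nearest neighbour. Its error on fresh data is what reveals the overfitting.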

## Mathematical and Applied Statistics Lesson of the Day – The Central Limit Theorem Applies to the Sample Mean

Having taught and tutored introductory statistics numerous times, I often hear students misinterpret the Central Limit Theorem by saying that, as the sample size gets bigger, the distribution of the data approaches a normal distribution. This is not true. If your data come from a non-normal distribution, their distribution stays the same regardless of the […]
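The distinction can be checked with a quick simulation. The sketch below assumes exponential data and uses sample skewness as a rough measure of non-normality; both choices are mine, not the post's. The raw observations stay strongly skewed however many are drawn, while the means of repeated samples of size 50 are close to symmetric.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)
n = 50  # size of each sample whose mean is recorded (assumed value)

# A large batch of raw observations from an exponential distribution.
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Means of 2000 independent samples of size n from the same distribution.
means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(2000)
]

print("skewness of raw data:    ", round(skewness(raw), 2))
print("skewness of sample means:", round(skewness(means), 2))
```

Increasing the number of raw observations only sharpens the estimate of their skewness. It is increasing `n` that pulls the distribution of the sample mean toward normality.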

## MCMC for Econometrics Students – III

March 20, 2014
As its title suggests, this post is the third in a sequence of posts designed to introduce econometrics students to the use of Markov Chain Monte Carlo (MCMC, or MC2) methods for Bayesian inference. The first two posts can be found here and here, and...

## fine-sliced Poisson [a.k.a. sashimi]

March 19, 2014
As my student Kévin Guimard had not mailed me his own Poisson slice sampler of a Poisson distribution, I could not tell why the code was not working! My earlier post prompted him to do so and a somewhat optimised version is given below: As you can easily check by running the code, it does […]
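The code referred to in the teaser is not reproduced here. For readers unfamiliar with the idea, the following is a generic slice sampler for a Poisson target, written as a sketch under my own assumptions. It is not the student's version or the optimised one from the post, which may rely on a different construction. Each step draws a height u uniformly under the probability mass at the current state, then draws the next state uniformly from the integers whose mass exceeds u. Because the Poisson mass function is unimodal, that set is a contiguous range around the mode.

```python
import math
import random


def poisson_pmf(k, lam):
    """Poisson probability mass at k, computed on the log scale for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def slice_step(x, lam, rng):
    """One slice-sampling update starting from the current state x."""
    u = rng.random() * poisson_pmf(x, lam)  # height drawn under the mass at x
    mode = int(lam)                          # floor(lam) is a mode of the Poisson
    lo = hi = mode
    # Expand outwards from the mode while the mass stays above u.
    while lo > 0 and poisson_pmf(lo - 1, lam) > u:
        lo -= 1
    while poisson_pmf(hi + 1, lam) > u:
        hi += 1
    return rng.randint(lo, hi)               # uniform draw on the slice


def slice_sample(lam, n, seed=0):
    """Run the chain for n steps, starting at the mode."""
    rng = random.Random(seed)
    x = int(lam)
    draws = []
    for _ in range(n):
        x = slice_step(x, lam, rng)
        draws.append(x)
    return draws


draws = slice_sample(4.0, 20000)
print("empirical mean:", sum(draws) / len(draws))
```

The empirical mean and variance of the draws should both be near the rate parameter, which gives a simple sanity check on the sampler.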

## LEGO Calendar: a Tangible Wall-Mounted Planner that Can be Digitized

March 19, 2014
The LEGO Calendar [vitaminsdesign.com], developed by design and invention studio Vitamins, is a wall-mounted time planner that simply can be photographed to create an online, digital counterpart. The calendar is big, visible, tactile and flexible, as...

## The time traveler’s challenge.

March 19, 2014
Editor's note: This has nothing to do with statistics. I do a lot of statistics for a living and would claim to know a relatively large amount about it. I also know a little bit about a bunch of other scientific …

## How Americans vote

March 19, 2014
An interview with me from 2012: You're a statistician and wrote a book, Red State, Blue State, Rich State, Poor State, looking at why Americans vote the way they do. In an election year I think it would be a good time to revisit that question, not just for people in the US, but anyone around