# Blog Archives

## Interim analysis, futility monitoring, and predictive probability

October 19, 2016
By

An interim analysis of a clinical trial is an unusual analysis. At the end of the trial you want to estimate how well some treatment X works. For example, you want to how likely is it that treatment X works better than the control treatment Y. But in the middle of the trial you want to know something more subtle. It’s […]

## Gentle introduction to R

October 13, 2016
By

The R language is closely tied to statistics. It’s ancestor was named S, because it was a language for Statistics. The open source descendant could have been named ‘T’, but its creators chose to call it’R.’ Most people learn R as they learn statistics: Here’s a statistical concept, and here’s how you can compute it in R. […]

## Uncertainty in a probability

September 20, 2016
By

Suppose you did a pilot study with 10 subjects and found a treatment was effective in 7 out of the 10 subjects. With no more information than this, what would you estimate the probability to be that the treatment is effective in the next subject? Easy: 0.7. Now what would you estimate the probability to be […]

## Insufficient statistics

September 12, 2016
By

Experience with the normal distribution makes people think all distributions have (useful) sufficient statistics [1]. If you have data from a normal distribution, then the sufficient statistics are the sample mean and sample variance. These statistics are “sufficient” in that the entire data set isn’t any more informative than those two statistics. They effectively condense […]

## ETAOIN SHRDLU and all that

September 2, 2016
By

Statistics can be useful, even if it’s idealizations fall apart on close inspection. For example, take English letter frequencies. These frequencies are fairly well known. E is the most common letter, followed by T, then A, etc. The string of letters “ETAOIN SHRDLU” comes from the days of Linotype when letters were arranged in that order, […]

## Mittag-Leffler function and probability distribution

July 17, 2016
By

The Mittag-Leffler function is a generalization of the exponential function. Since k!= Γ(k + 1), we can write the exponential function’s power series as and we can generalize this to the Mittag=Leffler function which reduces to the exponential function when α = β = 1. There are a few other values of α and β for […]

## Sparsely populated zip codes

June 30, 2016
By

The dormitory I lived in as an undergraduate had its own five-digit zip code at one time. It was rumored to be the largest dorm in the US, or maybe the largest west of the Mississippi, or something like that. There were about 3,000 of us living there. Although the dorm had enough people to justify […]

## Cepstrum, quefrency, and pitch

May 18, 2016
By

John Tukey coined many terms that have passed into common use, such as bit (a shortening of binary digit) and software. Other terms he coined are well known within their niche: boxplot, ANOVA, rootogram, etc. Some of his terms, such as jackknife and vacuum cleaner, were not new words per se but common words he […]

## Continuum between anecdote and data

March 4, 2016
By

The difference between anecdotal evidence and data is overstated. People often have in mind this dividing line where observations on one side are worthless and observations on the other side are trustworthy. But there’s no such dividing line. Observations are data, but some observations are more valuable than others, and there’s a continuum of value. I believe […]

## The empty middle: why no one is average

February 20, 2016
By

In 1945, a Cleveland newspaper held a contest to find the woman whose measurements were closest to average. This average was based on a study of 15,000 women by Dr. Robert Dickinson and embodied in a statue called Norma by Abram Belskie. Out of 3,864 contestants, no one was average on all nine factors, and fewer than 40 […]