Blog Archives

Uncertainty in a probability

September 20, 2016

Suppose you did a pilot study with 10 subjects and found a treatment was effective in 7 out of the 10 subjects. With no more information than this, what would you estimate the probability to be that the treatment is effective in the next subject? Easy: 0.7. Now what would you estimate the probability to be […]
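The excerpt cuts off before the post explains how to quantify the uncertainty around that 0.7, so here is a minimal sketch of one common approach (an assumption on my part, not necessarily the post's): treat the 7 successes and 3 failures as updating a uniform prior, which gives a Beta(8, 4) posterior for the probability of effectiveness.

```python
# Sketch: Beta posterior for the success probability after 7 successes
# and 3 failures, starting from a uniform prior. This is one standard
# approach, assumed here; the excerpt doesn't say which the post uses.
from scipy.stats import beta

successes, failures = 7, 3
posterior = beta(successes + 1, failures + 1)   # Beta(8, 4)

print(posterior.mean())           # posterior mean = 8/12, about 0.67
print(posterior.interval(0.95))   # central 95% credible interval
```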

Read more »

Insufficient statistics

September 12, 2016

Experience with the normal distribution makes people think all distributions have (useful) sufficient statistics [1]. If you have data from a normal distribution, then the sufficient statistics are the sample mean and sample variance. These statistics are “sufficient” in that the entire data set isn’t any more informative than those two statistics. They effectively condense […]
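As a rough illustration of what "sufficient" means here, the sketch below (my own, not taken from the post) checks that the normal log-likelihood computed from the raw sample agrees with the one computed from nothing but the sample mean and sample variance.

```python
# Sketch: for normal data the log-likelihood depends on the sample only
# through the sample mean and sample variance, so those two numbers are
# "sufficient". Illustration only.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=2.0, size=50)
n = len(x)
xbar, s2 = x.mean(), x.var(ddof=1)          # the two sufficient statistics

def loglik_full(mu, sigma):
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - np.sum((x - mu)**2) / (2 * sigma**2)

def loglik_from_stats(mu, sigma):
    ss = (n - 1) * s2 + n * (xbar - mu)**2  # sum of squares recovered from the two stats
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - ss / (2 * sigma**2)

print(np.isclose(loglik_full(9.5, 1.8), loglik_from_stats(9.5, 1.8)))  # True
```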

Read more »

ETAOIN SHRDLU and all that

September 2, 2016

Statistics can be useful, even if its idealizations fall apart on close inspection. For example, take English letter frequencies. These frequencies are fairly well known. E is the most common letter, followed by T, then A, etc. The string of letters “ETAOIN SHRDLU” comes from the days of Linotype when letters were arranged in that order, […]
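A quick way to see such frequencies for yourself (not part of the post) is to count letters in any reasonably long English text; with enough text the ranking typically starts E, T, A.

```python
# Sketch: count letter frequencies in a piece of English text.
# `your_text` is a placeholder; substitute any long English passage.
from collections import Counter

your_text = "Replace this placeholder with a long English passage."
counts = Counter(c for c in your_text.lower() if c.isalpha())
print(counts.most_common(10))   # with a large corpus, expect E, T, A, ... near the top
```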

Read more »

Mittag-Leffler function and probability distribution

July 17, 2016

The Mittag-Leffler function is a generalization of the exponential function. Since k! = Γ(k + 1), we can write the exponential function’s power series as exp(z) = Σ z^k / Γ(k + 1), and we can generalize this to the Mittag-Leffler function E_{α,β}(z) = Σ z^k / Γ(αk + β), which reduces to the exponential function when α = β = 1. There are a few other values of α and β for […]
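Here is a small sketch (mine, not from the post) of the truncated series, which can be checked against exp(z) at α = β = 1.

```python
# Sketch: truncated Mittag-Leffler series E_{α,β}(z) = Σ z^k / Γ(αk + β).
# With α = β = 1 the sum reduces to the exponential series.
from math import gamma, exp

def mittag_leffler(z, alpha, beta, terms=50):
    return sum(z**k / gamma(alpha * k + beta) for k in range(terms))

print(mittag_leffler(1.5, 1.0, 1.0))   # ≈ exp(1.5)
print(exp(1.5))
```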

Read more »

Sparsely populated zip codes

June 30, 2016

The dormitory I lived in as an undergraduate had its own five-digit zip code at one time. It was rumored to be the largest dorm in the US, or maybe the largest west of the Mississippi, or something like that. There were about 3,000 of us living there. Although the dorm had enough people to justify […]

Read more »

Cepstrum, quefrency, and pitch

May 18, 2016

John Tukey coined many terms that have passed into common use, such as bit (a shortening of binary digit) and software. Other terms he coined are well known within their niche: boxplot, ANOVA, rootogram, etc. Some of his terms, such as jackknife and vacuum cleaner, were not new words per se but common words he […]

Read more »

Continuum between anecdote and data

March 4, 2016

The difference between anecdotal evidence and data is overstated. People often have in mind this dividing line where observations on one side are worthless and observations on the other side are trustworthy. But there’s no such dividing line. Observations are data, but some observations are more valuable than others, and there’s a continuum of value. I believe […]

Read more »

The empty middle: why no one is average

February 20, 2016

In 1945, a Cleveland newspaper held a contest to find the woman whose measurements were closest to average. This average was based on a study of 15,000 women by Dr. Robert Dickinson and embodied in a statue called Norma by Abram Belskie. Out of 3,864 contestants, no one was average on all nine factors, and fewer than 40 […]
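A back-of-the-envelope sketch (my own numbers, not the study's) shows why this outcome is unsurprising: even if the chance of being near average on any one measurement were 30%, and the nine measurements were independent, the chance of being near average on all nine would be about 0.3⁹, or roughly 2 × 10⁻⁵.

```python
# Sketch: probability of being "near average" on all nine measurements,
# assuming (purely for illustration) a 30% chance per measurement and independence.
p_single = 0.30
p_all_nine = p_single ** 9
print(p_all_nine)          # ≈ 2.0e-05
print(3864 * p_all_nine)   # expected matches among 3,864 contestants: well under 1
```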

Read more »

Bayesian and nonlinear

February 13, 2016

Someone said years ago that you’ll know Bayesian statistics has become mainstream when people no longer put “Bayesian” in the titles of their papers. That day has come. While the Bayesian approach is still the preferred approach of a minority of statisticians, it’s no longer a novelty. If you want people to find your paper interesting, the substance […]

Read more »

Improving on Chebyshev’s inequality

February 12, 2016

Chebyshev’s inequality says that the probability of a random variable being more than k standard deviations away from its mean is no more than 1/k². In symbols, P(|X − μ| ≥ kσ) ≤ 1/k². This inequality is very general, but also very weak. It assumes very little about the random variable X but it also gives a loose bound. If we assume slightly more, […]
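To see how loose the bound is, the sketch below (not from the post) compares it with the exact tail probability of a standard normal distribution.

```python
# Sketch: Chebyshev's bound 1/k² versus the exact two-sided tail probability
# P(|Z| ≥ k) for a standard normal. The bound is far from tight.
from scipy.stats import norm

for k in (2, 3, 4):
    chebyshev_bound = 1 / k**2
    exact_normal_tail = 2 * norm.sf(k)
    print(k, chebyshev_bound, exact_normal_tail)
```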

Read more »

