Blog Archives

Why do linear prediction confidence regions flare out?

June 26, 2017

Suppose you’re tracking some object based on its initial position x0 and initial velocity v0. The initial position and initial velocity are estimated from normal distributions with standard deviations σx and σv. (To keep things simple, let’s assume our object is moving in only one dimension and that the distributions around initial position and velocity […]
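
As a quick illustration (a minimal sketch, assuming the position and velocity errors are independent), the predicted position x0 + v0·t has variance σx² + t²σv², so the standard deviation of the prediction grows roughly linearly with t, and the confidence region widens accordingly:

```python
# Sketch of why the region flares: if the predicted position is
# x0 + v0 * t with independent errors, Var = sigma_x**2 + t**2 * sigma_v**2,
# so the prediction's standard deviation grows roughly linearly in t.
import numpy as np

sigma_x, sigma_v = 1.0, 0.5          # hypothetical standard deviations
for t in [0, 1, 2, 5, 10]:
    sd = np.sqrt(sigma_x**2 + t**2 * sigma_v**2)
    print(f"t = {t:2d}   prediction sd = {sd:.2f}")
```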

Read more »

Extreme beta distributions

June 20, 2017

A beta probability distribution has two parameters, a and b. You can think of these as the number of successes and failures out of a+b trials. The PDF of a beta distribution is approximately normal if a and b are approximately equal and a + b is large. If a and b are close, they don’t have to be very large for the beta […]
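
As a rough illustration (my sketch, not code from the post), you can compare a Beta(a, b) density to the normal density with the same mean and variance:

```python
# Compare a Beta(a, b) density to the normal density with matching
# mean and variance; with a and b roughly equal and moderately large,
# the two are close.
import numpy as np
from scipy.stats import beta, norm

a, b = 20, 20                      # roughly equal, moderately large
mean = a / (a + b)
var = a * b / ((a + b) ** 2 * (a + b + 1))

print("x      beta pdf   normal pdf")
for xi in np.linspace(0.01, 0.99, 9):
    print(f"{xi:4.2f}   {beta.pdf(xi, a, b):8.4f}   "
          f"{norm.pdf(xi, mean, np.sqrt(var)):8.4f}")
```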

Read more »

Quantile-quantile plots and powers of 3/2

April 2, 2017

This post serves two purposes. It will empirically explore a question in number theory and demonstrate quantile-quantile (q-q) plots. It will shed light on a question raised in the previous post. And if you’re not familiar with q-q plots, it will serve as an introduction to such plots. The previous post said that for almost all x > […]
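
For readers who want to see a q-q plot before clicking through, here is a minimal generic sketch (illustrative data only, not the post's number-theoretic sequence): plot the sample quantiles against the quantiles of a reference distribution and look for points near the diagonal.

```python
# Minimal, generic q-q plot: sorted sample values against the
# quantiles of a reference distribution (uniform on [0, 1] here).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = np.sort(rng.uniform(size=500))          # sample quantiles
reference = (np.arange(1, 501) - 0.5) / 500    # theoretical uniform quantiles

plt.plot(reference, data, ".", markersize=3)
plt.plot([0, 1], [0, 1], "k--")                # y = x line for comparison
plt.xlabel("uniform quantiles")
plt.ylabel("sample quantiles")
plt.title("q-q plot: points near the line suggest a uniform distribution")
plt.show()
```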

Read more »

Freudian hypothesis testing

March 23, 2017

In his paper Mindless statistics, Gerd Gigerenzer uses a Freudian analogy to describe the mental conflict researchers experience over statistical hypothesis testing. He says that the “statistical ritual” of NHST (null hypothesis significance testing) “is a form of conflict resolution, like compulsive hand washing.” In Gigerenzer’s analogy, the id represents Bayesian analysis. Deep down, a […]

Read more »

Big data and the law

February 2, 2017

Excerpt from the new book Big Data of Complex Networks: Big Data and data protection law provide for a number of mutual conflicts: from the perspective of Big Data analytics, a strict application of data protection law as we know it today would set an immediate end to most Big Data applications. From the perspective of […]

Read more »

Subjectivity in statistics

December 15, 2016

Andrew Gelman on subjectivity in statistics: Bayesian methods are often characterized as “subjective” because the user must choose a prior distribution, that is, a mathematical expression of prior information. The prior distribution requires information and user input, that’s for sure, but I don’t see this as being any more “subjective” than other aspects of a […]

Read more »

Interim analysis, futility monitoring, and predictive probability

October 19, 2016

An interim analysis of a clinical trial is an unusual analysis. At the end of the trial you want to estimate how well some treatment X works. For example, you want to know how likely it is that treatment X works better than the control treatment Y. But in the middle of the trial you want to know something more subtle. It’s […]
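
One common way to make that "something more subtle" concrete is a predictive probability calculation. The sketch below is a generic beta-binomial version with made-up numbers, not necessarily the calculation the post describes:

```python
# Hedged sketch of a predictive-probability calculation at an interim
# analysis (beta-binomial model, flat prior, hypothetical numbers).
import numpy as np

rng = np.random.default_rng(0)
n, s = 20, 8          # interim: 8 responses out of 20 patients (hypothetical)
N = 50                # planned final sample size
threshold = 0.5       # hypothetical success criterion on the response rate

# Posterior on the response rate after n patients, uniform prior: Beta(s+1, n-s+1)
p = rng.beta(s + 1, n - s + 1, size=100_000)

# Simulate the remaining N - n patients and ask how often the final
# observed response rate would clear the threshold.
future = rng.binomial(N - n, p)
final_rate = (s + future) / N
print("predictive probability of success:", np.mean(final_rate > threshold))
```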

Read more »

Gentle introduction to R

October 13, 2016

The R language is closely tied to statistics. Its ancestor was named S, because it was a language for Statistics. The open source descendant could have been named ‘T’, but its creators chose to call it ‘R.’ Most people learn R as they learn statistics: Here’s a statistical concept, and here’s how you can compute it in R. […]

Read more »

Uncertainty in a probability

September 20, 2016

Suppose you did a pilot study with 10 subjects and found a treatment was effective in 7 out of the 10 subjects. With no more information than this, what would you estimate the probability to be that the treatment is effective in the next subject? Easy: 0.7. Now what would you estimate the probability to be […]
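
One standard way to put uncertainty around that 0.7 (a sketch, not necessarily the post's approach) is a Beta posterior under a uniform prior:

```python
# Hedged sketch: quantify the uncertainty in the 7-out-of-10 estimate
# with a Beta posterior under a uniform prior.
from scipy.stats import beta

a, b = 7 + 1, 3 + 1            # Beta(8, 4) posterior: 7 successes, 3 failures
print("posterior mean:", beta.mean(a, b))                  # about 0.667
print("95% credible interval:", beta.interval(0.95, a, b)) # fairly wide, reflecting only 10 observations
```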

Read more »

Insufficient statistics

September 12, 2016

Experience with the normal distribution makes people think all distributions have (useful) sufficient statistics [1]. If you have data from a normal distribution, then the sufficient statistics are the sample mean and sample variance. These statistics are “sufficient” in that the entire data set isn’t any more informative than those two statistics. They effectively condense […]
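
A small sketch of what "sufficient" means here (assuming a normal model and the 1/n variance convention): the log-likelihood computed from the raw data matches the log-likelihood computed from the sample mean and variance alone.

```python
# For normal data the log-likelihood depends on the sample only through
# the sample mean and variance, illustrating sufficiency.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=100)
mu, sigma = 2.5, 1.8                      # arbitrary parameter values to evaluate

# Log-likelihood from the full data set
ll_full = np.sum(norm.logpdf(x, mu, sigma))

# Same quantity from the sufficient statistics alone
n, xbar, s2 = len(x), x.mean(), x.var()   # s2 uses the 1/n convention
ll_stats = -0.5 * n * np.log(2 * np.pi * sigma**2) \
           - n * (s2 + (xbar - mu)**2) / (2 * sigma**2)

print(ll_full, ll_stats)                  # agree up to floating point error
```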

Read more »

