Posts Tagged ‘Probability and Statistics’

Reproducible randomized controlled trials

February 1, 2016

“Reproducible” and “randomized” don’t seem to go together. If something was unpredictable the first time, shouldn’t it be unpredictable if you start over and run it again? As is often the case, we want incompatible things. But the combination of reproducible and random can be reconciled. Why would we want a randomized controlled trial (RCT) to […]
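The excerpt is cut off before it says how the two are reconciled. One common approach, assumed here rather than taken from the post, is to draw assignments from a pseudorandom generator with a recorded seed. The function name `assign_arms` and the arm labels are hypothetical; this is a minimal sketch, not the post's method.

```python
import random


def assign_arms(n_subjects, seed, arms=("treatment", "control")):
    """Return a list of arm assignments, one per subject.

    A private generator seeded with `seed` is used, so the same seed
    always yields the same sequence (assumption: seeding is how the
    trial is made reproducible).
    """
    rng = random.Random(seed)  # independent of the global random state
    return [rng.choice(arms) for _ in range(n_subjects)]


if __name__ == "__main__":
    first_run = assign_arms(10, seed=2016)
    second_run = assign_arms(10, seed=2016)
    print(first_run)
    print("Reproduced exactly:", first_run == second_run)
```

The assignments look unpredictable to anyone who does not know the seed, yet anyone who has the seed can regenerate them exactly.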

Random number generator seed mistakes

January 29, 2016

Long run or broken software? I got a call one time to take a look at randomization software that wasn’t randomizing. My first thought was that the software was working as designed, and that the users were just seeing a long run. Long sequences of the same assignment are more likely than you think. You […]
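To get a feel for how likely long runs are, the sketch below computes the exact probability that a sequence of assignments contains a run of at least k identical values. It assumes two equally likely arms assigned independently, which the excerpt does not state; the function name `prob_run_at_least` is hypothetical.

```python
def prob_run_at_least(n, k):
    """Probability that n independent 50/50 assignments contain a run
    of at least k identical assignments in a row."""
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    if k == 1:
        return 1.0
    if n < k:
        return 0.0
    # state[r] = probability that no run of length k has occurred yet
    # and the current run has length r + 1 (so r ranges over 0..k-2).
    state = [0.0] * (k - 1)
    state[0] = 1.0  # after the first assignment the run length is 1
    for _ in range(n - 1):
        new = [0.0] * (k - 1)
        new[0] = 0.5 * sum(state)      # next assignment differs: run resets
        for r in range(k - 2):
            new[r + 1] = 0.5 * state[r]  # next assignment matches: run grows
        state = new                    # mass that would reach length k is dropped
    return 1.0 - sum(state)


if __name__ == "__main__":
    for k in (5, 7, 10):
        print(k, prob_run_at_least(100, k))
```

Under these assumptions, comparing the observed run length against this probability is one way to judge whether a complaint reflects chance or a real defect.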

Big p, Little n

January 7, 2016

Statisticians use n to denote the number of subjects in a data set and p to denote nearly everything else. You’re supposed to know from context what each p means. In the phrase “big n, little p” the symbol p means the number of measurements per subject. Traditional data sets are “big n, little p” […]

The longer it has taken, the longer it will take

December 21, 2015

Suppose project completion time follows a Pareto (power law) distribution with parameter α. That is, for t > 1, the probability that completion time is bigger than t is t^(-α). (We start out time at t = 1 because that makes the calculations a little simpler.) Now suppose we know that a project has lasted […]
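The excerpt stops before its own calculation, so the sketch below only illustrates standard consequences of the stated survival function P(T > t) = t^(-α). The function names are hypothetical, and the expected-remaining-time formula additionally assumes α > 1 so that the mean is finite.

```python
def pareto_survival(t, alpha):
    """P(T > t) = t**(-alpha) for t >= 1; the probability is 1 for t < 1."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 if t <= 1 else t ** (-alpha)


def conditional_survival(t, s, alpha):
    """P(T > t | T > s) for t >= s >= 1, which simplifies to (t/s)**(-alpha)."""
    if not (t >= s >= 1):
        raise ValueError("require t >= s >= 1")
    return pareto_survival(t, alpha) / pareto_survival(s, alpha)


def expected_remaining(s, alpha):
    """E[T - s | T > s] = s / (alpha - 1), valid only for alpha > 1 and s >= 1."""
    if alpha <= 1:
        raise ValueError("the conditional mean is infinite unless alpha > 1")
    if s < 1:
        raise ValueError("require s >= 1")
    return s / (alpha - 1)


if __name__ == "__main__":
    alpha = 2.0
    for s in (1, 2, 5, 10):
        print(s, expected_remaining(s, alpha))
```

The expected remaining time grows in proportion to the time already elapsed, which matches the post's title.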

Skin in the game for observational studies

November 4, 2015

The article “Deming, data and observational studies” by S. Stanley Young and Alan Karr opens with: “Any claim coming from an observational study is most likely to be wrong.” They back up this assertion with data about observational studies later contradicted by prospective studies. Much has been said lately about the assertion that most published results are false, particularly […]

Balancing profit and learning in A/B testing

October 28, 2015

A/B testing, or split testing, is commonly used in web marketing to decide which of two design options performs better. If you have so many visitors to a site that the number of visitors used in a test is negligible, conventional randomization schemes are the way to go. They’re simple and effective. But if you […]

A rose by any other name: Data science etc.

October 14, 2015

“I help people make decisions in the face of uncertainty.” Sounds interesting. “I’m a data scientist.” Not sure what that means, but it sounds cool. “I study machine learning.” Hmm. Maybe interesting, maybe a little ominous. “I’m into big data.” Exciting or passé, depending on how many times you’ve heard the term. Even though each […]

Data analysis vs statistics

October 3, 2015

John Tukey preferred the term “data analysis” over “statistics.” In his paper Data Analysis, Computation and Mathematics, he explains why: My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does not speak of “mathematics”, but only last. Why? … My brother-in-squared-law, Francis J. Anscombe has commented on my use of […]

Why not statistics

April 9, 2015

Jordan Ellenberg’s parents were both statisticians. In his interview with Strongly Connected Components, Jordan explains why he went into mathematics rather than statistics: I tried. I tried to learn some statistics actually when I was younger and it’s a beautiful subject. But at the time I think I found the shakiness of the philosophical underpinnings […]

Bayes factors vs p-values

March 31, 2015

Bayesian analysis and frequentist analysis often lead to the same conclusions by different routes. But sometimes the two forms of analysis lead to starkly different conclusions. The following illustration of this difference comes from a talk by Luis Pericchi last week. He attributes the example to “Bernardo (2010)” though I have not been able to find the exact […]
