Posts Tagged ‘Significance’

How Do You Know if Your Data Has Signal?

August 10, 2015

An all too common approach to modeling in data science is to throw all possible variables at a modeling procedure and “let the algorithm sort it out.” This is tempting when you are not sure what are the true causes or predictors of the phenomenon you are …

Statistically significant. What does it mean?

July 22, 2015

Andrew Gelman has a great post about the concept of statistical significance, starting with a published definition by the Department of Health that is technically wrong on many levels. (link) Statistical significance is one of the most important concepts in statistics. In recent years, a vocal group has claimed that this idea is misguided and/or useless. But what they are angry about is the use (and frequently, misuse) of…

It is possible to not learn real causes from some A/B tests

July 20, 2015

It is conventional wisdom that A/B testing (or in proper terms, randomized controlled experiments) is the gold standard for causal analysis, meaning if you run an A/B test, you know what caused an effect. In practice, this is not always true. Sometimes, the A/B test only provides a statistical understanding of causes but not an average Joe's understanding. Let's start with a hypothetical example in which both definitions are aligned.…

Is data privacy a fundamental right?

July 4, 2015

This piece is part of the StatBusters column written jointly with Andrew Gelman. Hope they fix the labeling soon. In it, we talk about two recent studies on data privacy, which lead to contradictory conclusions. How should the media report such surveys? Is the brand name of the organization enough? In addition, we debunk the notion that consumers will definitely get something valuable out of sharing their data.

Some statistics about nutrition statistics

May 26, 2015

I only read nutrition studies in the service of this blog but otherwise, I don't trust them or care. Nevertheless, the health beat of most media outlets is obsessed with printing the latest research on coffee or eggs or fats or alcohol or what have you. Now, the estimable John Ioannidis has published an editorial in BMJ titled "Implausible Results in Human Nutrition Research". John previously told us about the…

Story time, known unknowns and the endowment effect in an HBR article on customer data

May 6, 2015

Harvard Business Review devotes a long article to customer data privacy in the May issue (link). The article raises important issues, such as the low degree of knowledge about what data are being collected and traded, the value people place on their data privacy, and so on. In a separate post, I will discuss why I don't think the recommendations issued by the authors will resolve the issues they raised.…

Gelman speed read

April 23, 2015

For those who have found it tough to keep up with Andrew Gelman's prolificacy, here are some brief summaries of several recent posts: On people obsessed with proving the statistical significance of tiny effects: "they are trying to use a bathroom scale to weigh a feather—and the feather is resting loosely in the pouch of a kangaroo that is vigorously jumping up and down." (link) [I left a comment. In…

Yet another popular nutrition headline doesn’t stand up to scrutiny

April 1, 2015

Are science journalists required to take one good statistics course? That is the question in my head when I read this Science Times article, titled "One Cup of Coffee Could Offset Three Drinks a Day" (link). We are used to seeing rather tenuous conclusions such as "Four Cups of Coffee Reduces Your Risk of X". This headline takes it up another notch. A result is claimed about the substitution effect…

One place not to use the Sharpe ratio

March 23, 2015

Having worked in finance, I am a public fan of the Sharpe ratio. I have written about this here and here. One thing I have often forgotten (driving some bad analyses) is: the Sharpe ratio isn’t appropriate for models of repeated events that already have linked mean and variance (such as Poisson or Binomial models) …

Optimizely Stats Engine 2: what about advanced users?

February 9, 2015

In Part 1, I covered the logic behind recent changes to the statistical analysis used in standard reports by Optimizely. In Part 2, I ponder what this change means for more sophisticated customers--those who are following the proper protocols for classical design of experiments, such as running tests of predetermined sample sizes, adjusting for multiple comparisons, and constructing and analyzing multivariate tests using regression with interactions. For this segment, the…

