## Probabilities and P-Values

December 2, 2013
P-values seem to be the bane of a statistician’s existence. I’ve seen situations where entire narratives are written without p-values and report only the effects. A p-value can also be used as a data reduction tool, but ultimately it reduces the world to a binary system: yes/no, accept/reject. Not only that, but the binary threshold is […]

## Simudidactic

November 23, 2013
auto·di·dact n. A self-taught person. From Greek autodidaktos, self-taught : auto-, auto- + didaktos, taught; + sim·u·late v. To create a representation or model of (a physical system or particular situation, for example). From Latin simulāre, simulāt-, from similis, like; = (If you can get past the mixing of Latin and Greek roots) sim·u·di·dactic adj. To learn by creating a representation or model of a physical system or […]

## Probability and geometry

November 17, 2013
$\mathbb{P}(Y=y)=\sum_x \mathbb{P}(Y=y,X=x)$

One of the most important formulas in probability (in my opinion) is the “law of total probability”, which simply states that $\mathbb{P}(Y=y)=\sum_x \mathbb{P}(Y=y,X=x)$, and which can also be rewritten using Bayes’ formula. One consequence of this result is the “law of total expectation”, often called the double projection theorem, which is usually written in a shortened form (in the right-hand formula, the first symbol is an expectation, it is…
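
For reference, the usual textbook statements of the two results named in this excerpt are given below. They are a reconstruction from the standard forms, not a copy of the post's own formulas, which the excerpt does not show. The first rewrites the joint probability through the conditional one; the second is the law of total expectation:

$\mathbb{P}(Y=y)=\sum_x \mathbb{P}(Y=y\mid X=x)\,\mathbb{P}(X=x)$

$\mathbb{E}(Y)=\mathbb{E}\big(\mathbb{E}(Y\mid X)\big)$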

## Generating functions

November 8, 2013
$F(x)=1-e^{-x}/3$

Today, I wanted to publish a post on generating functions, based on discussions I have had a couple of times with Jean-Francois over coffee after lunch. The other reason is that I am publishing this post just as my students have finished their Probability exam (and there were a few questions on generating functions). A short introduction (back on a specific exercise) In the Probability exam, I included an exercise we’ve…

## Halloween and candies (a ballot problem)

October 31, 2013
This year, for Halloween, a post on candies (I promise, next year I will write another post on zombies). But I don’t want to focus on the kids’ problems (last year, we tried to minimize their walking distance to collect as many candies as possible, with part 1 and part 2); I want to discuss my own problems. Because usually, the kids wear their costumes, and they go out into the streets, they knock on the…

## Detecting Unfair Dice in Casinos with Bayes’ Theorem

Introduction I saw an interesting problem that requires Bayes’ Theorem and some simple R programming while reading a bioinformatics textbook.  I will discuss the math behind solving this problem in detail, and I will illustrate some very useful plotting functions to generate a plot from R that visualizes the solution effectively. The Problem The following question is […]

## Follow up to Johnson et al Post

October 21, 2013
Last week I posted a comment on a paper by Neil Johnson and colleagues that I now regret. The comment amounted to a bit of statistical pedantry on my part regarding some of the wording in the paper. It was my wording in this post, and specifically the title, which would have benefited from some […]

## P-value fallacy alive and well: Latest case in Scientific Reports

October 17, 2013
Erratum (10/17/13): The paper was published in Scientific Reports, an OA journal from the publishers of Nature, and not in the journal Nature as originally reported. Clarification (10/17/13): The paper discussed here is quite good overall and very interesting. I do not believe that anything in this post calls into question any of its main […]

## Calculating AUC the hard way

October 10, 2013
The Area Under the Receiver Operating Characteristic Curve (AUC) is a commonly used metric of model performance in machine learning and many other binary classification/prediction problems. The idea is to generate a threshold-independent measure of how well a model is able to distinguish between two possible outcomes. Threshold-independent here just means that for any model […]
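
One way to see why AUC needs no threshold is its pairwise reading: it equals the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as one half. The sketch below is a minimal illustration of that reading, not necessarily the method used in the post; the function name `auc_by_pairs` and the toy data are assumptions made for this example.

```python
def auc_by_pairs(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes; scores: model scores, same length.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc_example = auc_by_pairs(labels, scores)
print(auc_example)
```

The double loop costs time proportional to the number of positives times the number of negatives, which is fine for a small illustration; a rank-based computation would be the usual choice for large data sets.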

## Estimating rates from a single occurrence of a rare event

October 5, 2013
Elon Musk’s writing about a Tesla battery fire reminded me of some of the math related to trying to estimate the rate of a rare event from a single occurrence of the event (plus many non-event occurrences). In this article we work through some of the ideas. Elon Musk wrote that the issues of the […]