Here's another from xkcd.com, on our "good graphics" theme.

I will be leading a training session on Monday the 28th, from 14:00 to 16:00, in room N-6320 at UQAM, on the topic of an introduction to classification trees. The session is organized as part of the seminars on quantitative analysis methods...

The New York Times recently published an article on education titled "Parental Involvement Is Overrated". Most research in this area supports the opposite view, but the authors claim that "evidence from our research suggests otherwise". Before you stop helping your children …

Nelson Villoria writes: I find the multilevel approach very useful for a problem I am dealing with, and I was wondering whether you could point me to some references about poolability tests for multilevel models. I am working with time series of cross sectional data and I want to test whether the data supports cross […] The post If you get to the point of asking, just do it. But…

How is it possible that it has taken a podcast called Data Stories 35 episodes to get to the topic of data storytelling? Alberto Cairo and I helped get the topic straightened out, and I think we even convinced Moritz that stories are not the enemy of exploration. It was a fun episode to record, and it touches on many interesting topics.

Ethan Siegel wrote a post entitled The Math of the Fastest Human Alive five years ago, using regressions. An alternative is to use extreme value models (a few years ago I wrote a post on the maximum length of a tennis match using extreme value theory). In 2009, John Einmahl and Sander Smeets wrote a great article entitled Ultimate 100m world records through extreme-value theory. The article is…

Here I derive a simple formula for probability distributions general enough for Statistical Mechanics and Classical Statistics in which the roles, meanings, and interpretations between the Information Entropy and Boltzmann’s Entropy are as clear ...

Prakash Nayak writes: I work as a musculoskeletal oncologist (surgeon) in Mumbai, India and am keen on sarcoma research. Sarcomas are rare disorders, and conventional frequentist analysis falls short of providing meaningful results for clinical application. I am thus keen on applying Bayesian analysis to a lot of trials performed with small numbers in this […] The post Looking for Bayesian expertise in India, for the purpose of analysis of…

Two years ago, Wired breathlessly extolled the virtues of A/B testing (link). A lot of Web companies are in the forefront of running hundreds or thousands of tests daily. The reality is that most A/B tests fail. A/B tests fail for many reasons. Typically, business leaders consider a test to have failed when the analysis fails to support their hypothesis. "We ran all these tests varying the color of the…

The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as $\text{MAPE} = 100\,\text{mean}(|y_t - \hat{y}_t| / |y_t|)$, where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t$. Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE saying that “it has a bias favoring estimates that are below the actual values”. A few years later, Armstrong…
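One common way to see the asymmetry is sketched below, under the assumption that the MAPE is 100 times the mean of the absolute errors divided by the absolute actual values. The function name `mape` and the numbers are illustrative only, and this is not necessarily the argument the post goes on to make. With non-negative forecasts, an under-forecast can cost at most 100%, while an over-forecast has no upper limit.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (illustrative helper)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    # Each term is |y - f| / |y|; actual values must be non-zero.
    ratios = [abs(y - f) / abs(y) for y, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    actual = [100]
    # Under-forecasting all the way down to zero is capped at 100%.
    print(mape(actual, [0]))
    # Over-forecasting by the same absolute amount or more is not capped.
    print(mape(actual, [200]))
    print(mape(actual, [300]))
```

A forecaster who is scored only on the MAPE therefore risks less by guessing low than by guessing high.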

Last week I described the Hilbert matrix of size n, which is a famous square matrix in numerical linear algebra. It is famous partially because its inverse and its determinant have explicit formulas (that is, we know them exactly), but mainly because the matrix is ill-conditioned for moderate values of […]
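A minimal sketch of what ill-conditioning means in practice, assuming the usual definition in which entry (i, j) of the Hilbert matrix is 1/(i + j - 1). The helper names (`hilbert`, `solve`, `max_error`) are made up for this illustration. The same linear system, whose true solution is a vector of ones, is solved once in exact rational arithmetic and once in floating point.

```python
from fractions import Fraction


def hilbert(n, exact=True):
    """Hilbert matrix of size n; entry (i, j) is 1/(i + j - 1), 1-based."""
    one = Fraction(1) if exact else 1.0
    return [[one / (i + j + 1) for j in range(n)] for i in range(n)]


def solve(a, b):
    """Gaussian elimination with partial pivoting (Fraction or float)."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def max_error(n, exact):
    """Largest deviation from the true solution (all ones)."""
    h = hilbert(n, exact)
    b = [sum(row) for row in h]  # chosen so that the solution is all ones
    return max(abs(xi - 1) for xi in solve(h, b))


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, float(max_error(n, exact=True)), max_error(n, exact=False))
```

The exact solve recovers the ones vector at every size, whereas the floating-point error grows quickly with n even though the algorithm is identical.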

A Statistical Model as a Chance Mechanism, by Aris Spanos. Jerzy Neyman (April 16, 1894 – August 5, 1981) was a Polish/American statistician who spent most of his professional career at the University of California, Berkeley. Neyman is best known in statistics for his pioneering contributions in framing the Neyman-Pearson (N-P) optimal theory of hypothesis testing […]

According to the central limit theorem, if $n$ random variables, $X_1, \dots, X_n$, are independent and identically distributed and $n$ is sufficiently large, then the distribution of their sample mean, $\bar{X}_n$, is approximately normal, and this approximation improves as $n$ increases. One of the most remarkable aspects of the central limit theorem (CLT) is its validity for any parent distribution of […]
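The statement can be checked with a small simulation. The sketch below assumes an exponential parent distribution with rate 1, chosen only because it is strongly skewed; the function names, seed, and replicate counts are arbitrary. It uses the sample skewness of the simulated means as a rough measure of distance from normality, since a normal distribution has skewness zero.

```python
import random


def sample_means(n, reps, rng):
    """Simulate `reps` sample means, each from n exponential(1) draws."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(reps)]


def skewness(xs):
    """Sample skewness: third central moment over variance^(3/2)."""
    k = len(xs)
    mean = sum(xs) / k
    m2 = sum((x - mean) ** 2 for x in xs) / k
    m3 = sum((x - mean) ** 3 for x in xs) / k
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for n in (2, 10, 50):
        means = sample_means(n, 5000, rng)
        print(n, skewness(means))
```

The skewness of the sample mean should move toward zero as the sample size grows, which is the sense in which the normal approximation improves.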

Cancer research is sometimes criticized for being timid. Drug companies run enormous trials looking for small improvements. Critics say they should run smaller trials and more of them. Which side is correct depends on what’s out there waiting to be discovered, which of course we don’t know. We can only guess. Timid research is rational […]

This would make Karl Popper cry. And, at the very end: The present results indicate that under certain, theoretically predictable circumstances, female ovulation—long assumed to be hidden—is in fact associated with a distinct, objectively observable behavioral display. This statement is correct—if you interpret the word “predictable” to mean “predictable after looking at your data.” P.S. […] The post When you believe in things that you don’t understand appeared first on…

The exponential family of distributions is important in Statistics because all the common distributions are of this type, and by the Pitman-Koopman theorem, they are exactly the family of distributions which has useful sufficient statistics. By an amaz...
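For reference, a sketch of the standard textbook form of an exponential family, with the Poisson distribution as a worked instance. The symbols $h$, $\eta$, $T$, and $A$ follow common convention and are not taken from the post itself.

```latex
% General form: density (or mass function) with parameter \theta
f(x \mid \theta) = h(x)\,\exp\bigl(\eta(\theta)\,T(x) - A(\theta)\bigr)

% For an i.i.d. sample x_1, \dots, x_n the joint density depends on the
% data only through h and the sum of T, so \sum_i T(x_i) is sufficient:
\prod_{i=1}^{n} f(x_i \mid \theta)
  = \Bigl(\prod_{i=1}^{n} h(x_i)\Bigr)
    \exp\Bigl(\eta(\theta)\sum_{i=1}^{n} T(x_i) - n\,A(\theta)\Bigr)

% Poisson example with mean \lambda:
\frac{\lambda^{x} e^{-\lambda}}{x!}
  = \frac{1}{x!}\,\exp\bigl(x \log\lambda - \lambda\bigr),
\quad h(x) = \frac{1}{x!},\;
\eta(\lambda) = \log\lambda,\;
T(x) = x,\;
A(\lambda) = \lambda
```

In the Poisson case the sufficient statistic is simply the sum of the observations, whatever the sample size.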