A reader of my most recent post tried the R code I had written to download the data set of electoral disproportionality from the GitHub repository. However, it didn’t work for them. After entering disproportionality.data <- getURL(url) they go...

Aki writes: Here’s my version of the birthday frequency graph. I used a Gaussian process with two slowly varying components and a periodic component with decay, so that the periodic form can change in time. I used Student’s t-distribution as the observation model to allow exceptional dates to be outliers. I guess that the periodic component due to week effect [...]

According to this link, a study proved that "90 percent of a person's traits can be judged with their shoes." Without needing to look at the study, a reader can reason that this claim makes no sense. What does "90 percent of a person's traits" mean? How many traits does a person have? If the researcher defines 1000 traits, then the shoe predictor will need to predict…

Larry Wasserman writes: In this post I want to review an interesting result by David Freedman (Annals of Mathematical Statistics, Volume 36, Number 2 (1965), 454-456) available at projecteuclid.org. The result gets very little attention. Most researchers in statistics and machine learning seem to be unaware of the result. The result says that “almost all” Bayesian [...]

Catherine Rampell writes: On Monday the Nobel Foundation, which bestows the world’s most prestigious academic, literary and humanitarian prizes, said it was reducing the cash awarded with Nobel Prizes by about 20 percent. . . . Peter A. Diamond, a professor emeritus at the Massachusetts Institute of Technology who also received the Nobel in economic [...]

Jacob Oaknin asks: Akaike's selection criterion is often justified on the basis of the empirical risk of a ML estimate being a biased estimate of the true generalization error of a parametric family, say the family, S_m, of linear regressors on an m-dimensional variable x=(x_1,..,x_m) with Gaussian noise independent of x (for instance in “Unifying [...]

This summer, I am teaching an undergraduate stats class, a first course in statistics that covers three units: descriptive statistics, probability, and statistical inference. The course webpage is here. The following paragraph is from the thesis of Michael Phillip Lesnick. It explains the relationship among the three units: Recall first that in statistics, [...]

Larry Wasserman writes: Welcome to my blog, which will discuss topics in Statistics and Machine Learning. Some posts will be technical and others will be non-technical. Since this blog is about topics in both Statistics and Machine Learning, perhaps I should address the question: What is the difference between these two fields? The short answer is: [...]

Hi, our group of R users from INSEE, aka FLR, meets monthly in Paris. The next meeting is on Wed 13 (tomorrow), 1-2 pm, room 539 (an ID is needed to come in; map to access INSEE R), about ggplot2 and parallel computing. Since the first meeting in February, presentations have included hot topics like web scraping, C in R, RStudio, SQLite […]