How likelihoodists exaggerate evidence from statistical tests

November 25, 2014
Have you ever noticed that some leading advocates of a statistical account, say a testing account A, upon discovering account A is unable to handle a certain kind of important testing problem that a rival testing account, account B, has no trouble at all with, will mount an argument that being able to handle that kind of problem […]

In an earlier post I mentioned a paper that I co-authored with Xiao Ling. The paper is "Bias reduction for the maximum likelihood estimator of the parameters of the generalized Rayleigh family of distributions." Communications in Statistics - ...

The World Cup Problem Part 2: Germany v. Argentina

November 25, 2014
This is the second of two articles about Bayesian analysis applied to World Cup soccer. The previous article is here. Earlier this semester I posed this problem to my Bayesian statistics class at Olin College: In the final match of t...

Performing Logistic Regression in R and SAS

Introduction: My statistics education focused a lot on normal linear least-squares regression, and I was even told by a professor in an introductory statistics class that 95% of statistical consulting can be done with knowledge learned up to and including a course in linear regression. Unfortunately, that advice has turned out to vastly underestimate the […]

More on Big Data

November 24, 2014
An earlier post, "Big Data the Big Hassle," waxed negative. So let me now give credit where credit is due. What's true in time-series econometrics is that it's very hard to list the third-most-important, or even second-most-important, contribution of Bi...

Msc Kvetch: “You are a Medical Statistic”, or “How Medical Care Is Being Corrupted”

November 22, 2014
A NYT op-ed the other day, “How Medical Care Is Being Corrupted” (by Pamela Hartzband and Jerome Groopman, physicians on the faculty of Harvard Medical School), gives a good sum-up of what I fear is becoming the new normal, even under so-called “personalized medicine”. “It is obsolete for the doctor to approach each patient strictly as an individual; medical decisions should […]

Statistical computing languages at the RSS

November 22, 2014
On Friday the Royal Statistical Society hosted a meeting on Statistical computing languages, organised by my colleague Colin Gillespie. Four languages were presented at the meeting: Python, Scala, Matlab and Julia. I presented the talk on Scala. The slides I presented are available, in addition to the code examples and instructions on how to run […]

Statistics for Big Data

November 22, 2014
Doctoral programme in cloud computing for big data: I’ve spent much of this year working to establish our new EPSRC Centre for Doctoral Training in Cloud Computing for Big Data, which partly explains the lack of posts on this blog in recent months. The CDT is now established, with 11 students in the first cohort, […]

Factor Analysis vs Principal Component Analysis

November 22, 2014
Recently, some papers discussed in our journal club have focused on integrative clustering of multiple omics data sets. I found that they all originate from factor analysis and exploit its advantages over principal component analysis. Let’s recall the model for factor analysis: […]

Gelman explains why massive sample sizes to chase after tiny effects is silly

November 21, 2014
What a lucky day: I found time to catch up on some Gelman. He posted about the Facebook research ethics controversy, and I'm glad to see that he and I have pretty much the same attitude (my earlier post is here). It's a storm in a teacup. Gelman makes two other very important points about the Facebook study, unrelated to the ethics. First, he said: if we happen to see…