It is well known that dropping the constant in regression analysis may introduce bias. However, bias is really not the deeper issue. The deeper issue is that by omitting the constant, you are specifying a very specific form for the relationship b...
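
To make the point about the constant concrete, here is a minimal sketch in plain Python. The data and the function names `fit_with_intercept` and `fit_through_origin` are hypothetical and chosen only for illustration. The data are generated exactly from y = 10 + 2x, so any gap between the two slopes comes purely from forcing the line through the origin.

```python
def fit_with_intercept(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def fit_through_origin(x, y):
    """Least squares for y = b*x, i.e. the model with the constant dropped."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical, noise-free data from y = 10 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0 + 2.0 * xi for xi in x]

a, b = fit_with_intercept(x, y)      # recovers the generating line
b_origin = fit_through_origin(x, y)  # slope when the line must pass through (0, 0)

print(f"with constant:    a = {a:.3f}, b = {b:.3f}")
print(f"without constant: b = {b_origin:.3f}")
```

Without the constant, the fitted line has to pass through (0, 0), so the slope absorbs the missing intercept and no longer matches the slope that generated the data.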

One of my students complained that his slice sampler of a Poisson distribution was not working when following the instructions in Monte Carlo Statistical Methods (Exercise 8.5). This puzzled me during my early morning run and I checked on my way back, even before attacking the fresh baguette I had brought from the bakery… The […]
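
For readers who have not met a slice sampler on a discrete target, here is a minimal sketch of a generic one for a Poisson distribution. It is not the specific construction asked for in Exercise 8.5, which this excerpt does not reproduce. The function names, the rate `lam = 4.0`, and the number of draws are assumptions made for illustration. The sampler alternates between drawing a uniform level under the current pmf value and drawing the next state uniformly from the integers whose pmf lies at or above that level.

```python
import math
import random


def poisson_pmf(k, lam):
    """Poisson probability mass at k, computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def slice_sample_poisson(lam, n_draws, x0=0, seed=0):
    """Generic slice sampler targeting Poisson(lam); returns a list of draws."""
    rng = random.Random(seed)
    x = x0
    draws = []
    for _ in range(n_draws):
        # Step 1: u | x is uniform on (0, p(x)]; 1 - random() avoids u == 0.
        u = (1.0 - rng.random()) * poisson_pmf(x, lam)
        # Step 2: the slice {k : p(k) >= u} is a contiguous integer range
        # because the Poisson pmf is unimodal; grow it outward from x.
        lo = hi = x
        while lo > 0 and poisson_pmf(lo - 1, lam) >= u:
            lo -= 1
        while poisson_pmf(hi + 1, lam) >= u:
            hi += 1
        # Step 3: x | u is uniform on the slice.
        x = rng.randint(lo, hi)
        draws.append(x)
    return draws


draws = slice_sample_poisson(lam=4.0, n_draws=20000)
sample_mean = sum(draws) / len(draws)
print(f"sample mean = {sample_mean:.3f} (target mean is 4.0)")
```

A quick sanity check for a sampler like this is to compare the sample mean and variance of the draws with the rate, since both equal `lam` for a Poisson distribution.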

In an earlier video, I introduced the definition of the hazard function and broke it down into its mathematical components. Recall that the definition of the hazard function for events defined on a continuous time scale is $h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$. Did you know that the hazard function can be expressed as the probability density function (PDF) divided by the […]
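
As a numerical check, the standard identity h(t) = f(t) / S(t), where f is the PDF and S(t) = P(T > t) is the survival function, can be compared with the limit definition of the hazard, P(t ≤ T < t + dt | T ≥ t) / dt for small dt. The sketch below assumes a Weibull distribution with hypothetical shape and scale values. The function names are illustrative and not taken from the video.

```python
import math


def weibull_pdf(t, shape, scale):
    """Weibull probability density function f(t)."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_survival(t, shape, scale):
    """Weibull survival function S(t) = P(T > t)."""
    return math.exp(-((t / scale) ** shape))


def hazard_from_ratio(t, shape, scale):
    """Hazard computed as PDF divided by survival function."""
    return weibull_pdf(t, shape, scale) / weibull_survival(t, shape, scale)


def hazard_from_limit(t, shape, scale, dt=1e-6):
    """Finite-difference version of P(t <= T < t + dt | T >= t) / dt."""
    s_t = weibull_survival(t, shape, scale)
    s_next = weibull_survival(t + dt, shape, scale)
    return (s_t - s_next) / (dt * s_t)


# Hypothetical parameter values, chosen only for illustration.
t, shape, scale = 2.0, 1.5, 3.0
h_ratio = hazard_from_ratio(t, shape, scale)
h_limit = hazard_from_limit(t, shape, scale)
print(f"f(t)/S(t)         = {h_ratio:.6f}")
print(f"finite difference = {h_limit:.6f}")
```

The two numbers agree up to the finite-difference error, which shrinks as `dt` gets smaller.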

Stephen Senn, Head, Methodology and Statistics Group, Competence Center for Methodology and Statistics (CCMS), Luxembourg. Delta Force: To what extent is clinical relevance relevant? Inspiration: This note has been inspired by a Twitter exchange with respected scientist and famous blogger David Colquhoun. He queried whether a treatment that had 2/3 of an effect that would […]

This amusing-yet-so-true video directed by Eléonore Pourriat shows a sex-role-reversed world where women are in charge and men don’t get taken seriously. It’s convincing and affecting, but the twist that interests me comes at the end, when the real world returns. It’s really creepy. And this in turn reminds me of something we discussed here […] The post In the best alternative histories, the real world is what’s ultimately real appeared…

Just for fun I thought I’d run a week’s worth of old posts, just some things I came across when searching for various things. Of course I could just post the links right here but instead I’ll repost with my comments on how things have changed in the intervening years. Mon: In the best alternative […] The post On deck this week: Revisitings appeared first on Statistical Modeling, Causal Inference, and…

The article in Science about the failure of Google Flu Trends is important for many reasons. One is the inexplicable silence in the Big Data community about this little big problem: it's not as if this is breaking news -- it was known as early as 2009 that Flu Trends completely missed the swine flu pandemic, underestimating it by 50%, and then in 2013, Nature reported that Flu…

At 11:15 at the Centre de Mathématiques Appliquées: Can we use Bayesian methods to resolve the crisis of statistically significant research findings that do not hold up? It’s the usual story: the audience will be technical but with a varying mix of interests, and so what they most wanted to hear was something general and […] The post Ma conférence demain (mardi) à l’École Polytechnique appeared first on Statistical Modeling, Causal…

The leave-one-out cross-validation statistic is given by $\mathrm{CV} = \frac{1}{N}\sum_{i=1}^{N} e_{[i]}^2$, where $e_{[i]} = y_i - \hat{y}_{[i]}$, $y_1, \dots, y_N$ are the observations, and $\hat{y}_{[i]}$ is the predicted value obtained when the model is estimated with the $i$th case deleted. This is also sometimes known as the PRESS (Prediction Residual Sum of Squares) statistic. It turns out that for linear models, we do not actually have to estimate the model $N$ times, once for each omitted case. Instead, CV can be…
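
The excerpt is cut off before it states the shortcut. One well-known identity for ordinary least squares is that the leave-one-out residual equals e_i / (1 - h_i), where e_i is the residual from the single full fit and h_i is the i-th leverage (the i-th diagonal element of the hat matrix). The sketch below checks this for simple linear regression, where h_i = 1/N + (x_i - x̄)² / Σ(x_j - x̄)². The data and function names are hypothetical, and CV is taken here as the mean of the squared leave-one-out residuals.

```python
def ols_fit(x, y):
    """Simple linear regression y = a + b*x by least squares; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    return a, b


def loocv_brute_force(x, y):
    """Refit the model N times, leaving out one case each time."""
    n = len(x)
    total = 0.0
    for i in range(n):
        a, b = ols_fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (a + b * x[i])) ** 2
    return total / n


def loocv_shortcut(x, y):
    """Single fit: divide each ordinary residual by (1 - leverage)."""
    n = len(x)
    a, b = ols_fit(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    total = 0.0
    for xi, yi in zip(x, y):
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        residual = yi - (a + b * xi)
        total += (residual / (1.0 - leverage)) ** 2
    return total / n


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]

cv_brute = loocv_brute_force(x, y)
cv_fast = loocv_shortcut(x, y)
print(f"brute force: {cv_brute:.6f}")
print(f"shortcut:    {cv_fast:.6f}")
```

Both routes give the same value up to floating-point error, but the shortcut needs only one model fit.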

The SAS/IML language has several functions for finding the unions, intersections, and differences between sets. In fact, two of my favorite utility functions are the UNIQUE function, which returns the unique elements in a matrix, and the SETDIF function, which returns the elements that are in one vector and not [...]

The above is the running head of the arXived paper with full title “Implications of uniformly distributed, empirically informed priors for phylogeographical model selection: A reply to Hickerson et al.” by Oaks, Linkem and Sukumaran, which I (again) read on the plane to Montréal (third one in this series!, and last because I also watched […]

More efficiency and an additional function in the new version on CRAN. Variance estimation: the major functionality in the package is variance estimation, with Ledoit-Wolf shrinkage via var.shrink.eqcor and a statistical factor model (principal components) via factor.model.stat. There have been a number of previous blog posts on both factor models and Ledoit-Wolf shrinkage. Positive-definiteness: the default value of …

Best blog comment ever, following up on our post, How tall is Jon Lee Anderson?: Based on this picture: http://farm3.static.flickr.com/2235/1640569735_05337bb974.jpg he appears to be fairly tall. But the perspective makes it hard to judge. Based on this picture: http://www.catalinagarcia.com/cata/Libraries/BLOG_Images/Cata_w_Jon_Lee_Anderson.sflb.ashx he appears to be about 9-10 inches taller than Catalina Garcia. But how tall is Catalina […] The post “I have no idea who Catalina Garcia is, but she makes a decent…