This brief post is a "shout out" for Irene Botusaru (Economics, Simon Fraser University) who gave a great seminar in our department yesterday. The paper that she presented (co-authored with Federico Guitierrez) is titled "Diff...

We don’t get a lot of trolls on this blog. When people try, I typically respond with some mixture of directness and firmness, and the trolls either give up or perhaps they recognize that I am answering questions in sincerity, which does not serve their trollish purposes. But I’m pretty sure that my feeling is […] The post Why are trolls so bothersome? appeared first on Statistical Modeling, Causal Inference,…

David Mellor, from the Center for Open Science, emailed me asking if I’d announce his Preregistration Challenge on my blog, and I’m glad to do so. You win $1,000 if your properly preregistered paper is published. The recent replication effort in psychology showed, despite the common refrain – “it’s too easy to get low P-values” – that in […]

The news is out that Uber got fined by the New York Attorney General's office for data breaches and privacy concerns. The headline writer for ZDNet nailed this one: "Uber fined peanuts in God View surveillance" (link). And the sub-lead has the kicker: "For a company with a valuation of over $50 billion, a $20,000 fine over user data protection is laughable." This settlement tells us one of the following…

Dave Choi writes: A reviewer has pointed me to something that you wrote in your blog on inverting test statistics. Specifically, the reviewer is interested in what can happen if the test can reject the entire assumed family of models, and has asked me to consider discussing whether it applies to a paper that I am […] The post Read this to change your entire perspective on statistics: Why inversion of…

Presentation is often considered a part of visualization, but what does that mean for the kinds of techniques we use? Are they the same as used for analysis? What criteria should we use to pick them? In a new paper, I discuss a class of techniques I call presentation-oriented. The paper is accordingly titled Presentation-Oriented Visualization Techniques, and it just … Continue reading Paper: Presentation-Oriented Visualization Techniques

Statisticians use n to denote the number of subjects in a data set and p to denote nearly everything else. You’re supposed to know from context what each p means. In the phrase “big n, little p” the symbol p means the number of measurements per subject. Traditional data sets are “big n, little p” […]

Actually, they just want to look into the possibility. Alexander Berger of Givewell writes: In the past you’ve written a couple posts about GiveWell’s research, and we’ve recently posted something else that I thought might be of interest to your audience: an expression of interest in research on the impact of trace lithium on suicide […] The post Givewell wants to put lithium in your drinking water appeared first on…

I don’t often read the Iranian Journal of Cancer Prevention, but I like this quote: I was thinking more about the PACE trial. God is in every leaf of every tree. There’s been a lot of discussion about statistical problems with the PACE papers, and also about the research team’s depressing refusal to share their […] The post The PACE trial and the problems with discrete, yes/no thinking appeared first…

Weighted averages are all around us. Teachers use weighted averages to assign a test more weight than a quiz. Schools use weighted averages to compute grade-point averages. Financial companies compute the return on a portfolio as a weighted average of the component assets. Financial charts show (linearly) weighted moving averages […] The post Compute a weighted mean in SAS appeared first on The DO Loop.
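As a minimal sketch of the weighted average described above, written in Python rather than SAS: the function name `weighted_mean` and the quiz/test scores and weights are made-up illustration values, not taken from the original post.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical grades: a quiz (weight 1) and a test (weight 3).
scores = [80.0, 90.0]
weights = [1.0, 3.0]

course_grade = weighted_mean(scores, weights)
print(course_grade)  # (80*1 + 90*3) / 4 = 87.5
```

With equal weights the function reduces to the ordinary arithmetic mean, which is a quick sanity check on any weighted-mean routine.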

Kevin Tenenbaum writes: I wanted to let you know about a hackathon that we will be hosting at Camden Yards on February 5th, 2016. This event is a great opportunity for your students to use their statistics, data science and computer science expertise to find novel solutions to problems that Major League Baseball teams deal […] The post Baltimore Orioles Hackathon coming soon! appeared first on Statistical Modeling, Causal Inference,…

One thing that struck me about this PACE scandal: if this study was so bad as all that, how did it get taken so seriously by policymakers and the press? There’s been a lot of discussion about serious flaws in the published papers, and even more discussion about the unforgivable refusal of the research team to […] The post PACE study and the Lancet: Journal reputation is a two-way street appeared…

Following this morning’s course, I got a very interesting question from a student of mine. The question was about having non-significant components in a spline regression. Should we consider a model with a small number of knots and all components significant, or one with a (much) larger number of knots, many of them non-significant? My initial intuition was to prefer the second alternative, like in autoregressive models in R. When…

I’ll start off this blog on the first work day of the new year with an important post connecting some ideas we’ve lately been talking a lot about. Someone rolls a die four times, and he tells you he got the numbers 1, 4, 3, 6. Is this a plausible outcome? Sure. Is it probable? […] The post Plausibility vs. probability, prior distributions, and the garden of forking paths appeared…

