Project Tycho includes data from all weekly notifiable disease reports for the United States dating back to 1888. These data are freely available to anybody interested. I wanted to play around with the data a bit, so I registered. Measles a...

A student writes: I am new to Bayesian methods. While I am reading your book, I have some questions for you. I am interested in doing Bayesian hierarchical (multi-level) linear regression (e.g., random-intercept model) and Bayesian structural equation modeling (SEM)—for causality. Do you happen to know if I could find some articles, where authors could

The anonymous commenter puts it well: The problem is simple, the researchers are disproving always false null hypotheses and taking this disproof as near proof that their theory is correct.

Consider the problem of making statistical inferences, as opposed to predictions. The product of statistical inference is a probabilistic statement about a population quantity, for example a 100(1-α)% confidence interval for a population median. In this context, the principal reason for diagnostics is to comfort ourselves about the quality of such inferences. For example, we […]

I'm happy that Nate Silver and his FiveThirtyEight are back. Nate generally provides interesting and responsible data-based journalism for the educated layperson. (Of course he sometimes gets in over his head, but don't we all?) Now Krugman suddenl...

Mark Palko explains why a penalty for getting the wrong answer on a test (the SAT, which is used in college admissions and which is used in the famous 8 schools example) is not a "penalty for guessing." Then the very next day he catches this from Todd Balf in the New York Times Magazine:

I was unlocking my bike, with music turned on low, and a couple of high school kids were lounging around nearby. One of them walked over and asked, « Qui est-ce qui chante? ». I responded, “Stevie Wonder” (not trying any accent on this ...

In this post I will go through 5 reasons why I believe R is the best statistical programming language to learn: zero cost, crazy popularity, awesome power, dazzling flexibility, and mind-blowing support. As a blogger who has contributed over 150 posts in Stata and over 100 in R, I have extensive experience with both a proprietary statistical programming language and the open source alternative. In my graduate career I have…

Andrew Gelman and his coauthors, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin, have now published the latest edition of their book Bayesian Data Analysis. David and Aki are newcomers to the authors’ list, with an extended section on non-linear and non-parametric models. I have been asked by Sam Behseta to write […]

In a further discussion of the discussion about the discussion of a paper in Administrative Science Quarterly, Thomas Basbøll writes: I [Basbøll] feel "entitled", if that's the right word (actually, I'd say I feel privileged), to express my opinions to anyone who wants to listen, and while I think it does say something about an

Reader Daniel T. is unhappy about this analysis of the intraday Internet usage by OS and device types. He doesn't like their choice of index, which I'll get to in a second post. (Link appears here when ready.) There is something else wrong with this type of analysis. Let's do a thought experiment. If you are a marketer interested in the diurnal variability in Internet usage, what are some of…

We’re going to be discussing the philosophy of m-s testing today in our seminar, so I’m reblogging this from Feb. 2012. I’ve linked the 3 follow-ups below. Check the original posts for some good discussion. (Note visitor*) “This is the kind of cure that kills the patient!” is the line of Aris Spanos that I […]

As discussed in the MAT8181 course, there are – at least – two kinds of non-stationary time series: those with a trend, and those with a unit-root (they will be called integrated). Unit root tests cannot be used to assess whether a time ser...

Dylan Small writes: I am starting an observational studies journal that aims to publish papers on all aspects of observational studies, including study protocols for observational studies, methodologies for observational studies, descriptions of data sets for observational studies, software for observational studies and analyses of observational studies. One of the goals of the journal is

UCLA is having a big data conference on Thursday and Friday, March 27 and 28, 2014. The conference is organized by four computer science and genomic biology types. Speakers cluster [one of the rare appropriate uses of cluster analysis I know of] into three...

Theodore Vasiloudis writes: I'd like to bring your attention to this article by Benjamin Morris discussing the value of steals for the NBA. The author argues that a steal should be a highly sought after statistic as it equates to higher chances of victory and is very hard to replace when a player is injured.