The difference between “significant” and “non-significant” is not itself statistically significant

January 9, 2013

(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Commenter Rahul asked what I thought of this note by Scott Firestone (link from Tyler Cowen) criticizing a recent discussion by Kevin Drum suggesting that lead exposure causes violent crime. Firestone writes:

It turns out there was in fact a prospective study done—but its implications for Drum’s argument are mixed. The study was a cohort study done by researchers at the University of Cincinnati. Between 1979 and 1984, 376 infants were recruited. Their parents consented to have lead levels in their blood tested over time; this was matched with records over subsequent decades of the individuals’ arrest records, and specifically arrest for violent crime. Ultimately, some of these individuals were dropped from the study; by the end, 250 were selected for the results.

The researchers found that for each increase of 5 micrograms of lead per deciliter of blood, there was a higher risk for being arrested for a violent crime, but a further look at the numbers shows a more mixed picture than they let on. In prenatal blood lead, this effect was not significant. If these infants were to have no additional risk over the median exposure level among all prenatal infants, the ratio would be 1.0. They found that for their cohort, the risk ratio was 1.34. However, the sample size was small enough that the confidence interval dipped as low as 0.88 (paradoxically indicating that an additional 5 µg/dl during this period of development would actually be protective), and rose as high as 2.03. This is not very convincing data for the hypothesis.

For early childhood exposure, the risk is 1.30, but the sample size was higher, leading to a tighter confidence interval of 1.03-1.64. This range indicates it’s possible that the effect is as little as a 3% increase in violent crime arrests, but this is still statistically significant.

I have not followed this at all and have no comments on the substance of the matter. But based on Firestone’s piece linked above, I am not impressed by his statistical criticisms. He seemed to just be going around looking for subsets of the data with statistically insignificant results. With a small sample size, not every comparison is going to be statistically significant. That does not represent evidence against the hypothesis of an effect.
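The point in the title can be checked directly against the numbers Firestone quotes. A minimal sketch, assuming the reported intervals are standard 95% Wald intervals that are symmetric on the log risk-ratio scale (the usual convention, though the paper itself is not quoted on this): back out each standard error from its interval, then test the difference between the "non-significant" prenatal estimate (1.34, CI 0.88 to 2.03) and the "significant" early-childhood estimate (1.30, CI 1.03 to 1.64).

```python
import math

def se_from_ci(lo, hi, z=1.96):
    # Recover the standard error of a log risk ratio from a 95% CI,
    # assuming a symmetric Wald interval on the log scale.
    return (math.log(hi) - math.log(lo)) / (2 * z)

# Numbers as quoted by Firestone: prenatal RR 1.34 (0.88, 2.03);
# early-childhood RR 1.30 (1.03, 1.64).
log_rr_prenatal, se_prenatal = math.log(1.34), se_from_ci(0.88, 2.03)
log_rr_child, se_child = math.log(1.30), se_from_ci(1.03, 1.64)

# z-statistic for the *difference* between the two estimates
z_diff = (log_rr_prenatal - log_rr_child) / math.hypot(se_prenatal, se_child)
print(round(z_diff, 2))  # a tiny fraction of 1.96: nowhere near significant
```

On these numbers, the "significant" and "non-significant" estimates differ by only a small fraction of a standard error, which is exactly the sense in which their difference is not itself statistically significant.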

P.S. Firestone comments, explaining that his goal is not to shoot down the claim but rather to point out areas of uncertainty which should motivate further study.

