Selection bias in the reporting of shaky research: An example

September 9, 2017

(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

On 30 Dec 2016, a reporter wrote:

I was wondering if you’d have some time to look at an interesting embargoed study coming out next week in JAMA Internal Medicine, which seeks to show that gun violence is a social contagion. I know that a few years ago, social contagion studies were controversial and I’m wondering if this work has any significant flaws – and in particular whether it controls for homophily or shared environment. If you’d have any time to look at this before its embargo lifts on Tues at 11 am, and talk about this or offer a few thoughts by email, I’d greatly appreciate your time.

My response:

I don’t fully understand everything that’s going on in the paper.

For example, in the second column of page E3, they say, “We restricted our analysis to the network’s largest connected component, which contained 29.9% of all arrested individuals (n = 138 163) and 89.3% of all the co-offending edges (n = 417 635). Consistent with previous research on the concentration of gun violence within co-offending networks, the largest connected component contained 74.5% of gun-shot violence episodes of arrested individuals…” First, it’s not clear to me why they would want to throw out 70% of the people in their sample and 25% of their cases of gun-shot violence. That’s a lot of data to toss out, and I don’t see why they feel the need to do this. Second, it’s not clear to me how gun-shot violence episodes fit into the data. The network is defined by arrests, so what does it mean for an episode of gun-shot violence to be “contained” by a component? I’m not saying they did anything wrong here, I’m just not clear on what they are saying. Figure 1 doesn’t help because it only shows arrests, not violence episodes.

On page E4 they write, “For each gunshot subject who was influenced primarily by contagion, we identified which peer (the infector) was most responsible for causing him or her to become infected (ie, a subject of gun violence).” It’s hard for me to believe that you can do this using a statistical analysis. How could you know that someone is “influenced primarily by contagion”? I don’t even know what this means.

I’m suspicious of this: “The results of these experiments suggested that homophily and confounding were insufficient explanations for the data, leaving social contagion as a more likely explanation.” It should be all three! Just because homophily and confounding don’t explain everything, that doesn’t mean they’re nothing, right? (See page 962 of this paper for more on the general point that there’s no need to choose among explanations.)

They write that 63% of the violence episodes “were attributable to social contagion.” I don’t know what they mean by this. It sounds weird to me. But maybe it all makes sense, I don’t know. They refer to eMethods and eFigures but those have not been included here.

Also, this isn’t the whole story, but . . . it’s bad news that they approvingly cite the discredited study of Christakis and Fowler on the contagion of obesity (that’s #23 in their reference list). Oddly enough, they cite some critics of Christakis and Fowler (references 50 and 51) but it appears they didn’t internalize these criticisms or else they wouldn’t have, with a straight face, written earlier in their paper that “social networks are fundamental in diffusion processes related to . . . obesity.”

In any case, the topic is important. I don’t really buy statements such as 63% of episodes being attributable etc., as written. Somehow this all has to be untangled. If you’re connected with someone in this network, it means you’ve been arrested at the same time as that other person. People in this dataset who have been involved in gun violence are disproportionately likely to have been arrested at the same time as someone else who’s been involved with gun violence—I guess this makes sense, but I don’t know why it has to be called contagion.

Also I don’t see why it’s published in an internal medicine journal! No big deal, it just seems a bit off-topic!

You might also want to ask Andrew Thomas at CMU, a statistician who’s looked critically at some of these social contagion issues in the past.

The reporter then replied:

Thanks for your very detailed response. I spoke with another public health person who focuses on gun violence research who raised many similar questions – am going back and forth with the authors, but may skip writing about this since it seems mostly to be perplexing, even to experts.

Remember that selection bias we were talking about a while ago, that when shaky science gets published, credulous reporters are more likely to just run the equivalent of the press release, while skeptical reporters might just skip the story entirely? The result is that what does get published is more likely to be positive and uncritical.
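
To make the arithmetic of this selection effect concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function name uncritical_share, the split between credulous and skeptical reporters, and the publication probabilities are made-up numbers, not estimates from any data.

```python
def uncritical_share(frac_skeptical, p_publish_credulous, p_publish_skeptical):
    """Expected share of published stories that are uncritical.

    Assumed toy model (hypothetical, not from the post):
      - a fraction `frac_skeptical` of reporters are skeptical, the rest credulous;
      - a credulous reporter runs an uncritical story with probability
        `p_publish_credulous`;
      - a skeptical reporter runs a critical story with probability
        `p_publish_skeptical` and otherwise skips the study.
    """
    credulous_stories = (1 - frac_skeptical) * p_publish_credulous
    skeptical_stories = frac_skeptical * p_publish_skeptical
    total = credulous_stories + skeptical_stories
    if total == 0:
        raise ValueError("no stories are published under these assumptions")
    return credulous_stories / total


# Hypothetical numbers: reporters are split evenly, credulous reporters
# almost always publish, skeptical reporters usually skip the story.
share = uncritical_share(frac_skeptical=0.5,
                         p_publish_credulous=0.9,
                         p_publish_skeptical=0.2)
print(f"{share:.0%} of published stories are uncritical")  # prints "82% ..."
```

Under these invented numbers, half the reporters are skeptical but about 82% of the coverage is uncritical. The skew comes entirely from who chooses to publish, not from what reporters actually think of the study.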

