(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Annie Wang writes:

I’m a law student (and longtime reader of the blog), and I’m writing to flag a variant of the “All Else Equal” fallacy in ProPublica’s article on the COMPAS Risk Recidivism Algorithm. The article analyzes how statistical risk assessments, which are used in sentencing and bail hearings, are racially biased. (Although this article came out a while ago, it’s been recently linked to in this and this NYT op-ed.)

ProPublica posts a github repo with the data and replication code. I wanted to flag this part of the analysis:

- The analysis also showed that even when controlling for prior crimes, future recidivism, age, and gender, black defendants were 45 percent more likely to be assigned higher risk scores than white defendants.
- The violent recidivism analysis also showed that even when controlling for prior crimes, future recidivism, age, and gender, black defendants were 77 percent more likely to be assigned higher risk scores than white defendants.

The basic method is to build a logistic regression model with the score as the outcome and race and a few other demographic variables as the independent variables. (You can also reasonably argue that a logistic regression without any interaction terms is not the best way to analyze this data, but for the moment, I’ll just stick within the authors’ approach.)

Here’s the problem: to arrive at the numbers above, they compare an `intercept-only` model vs. an `intercept + African-American Indicator` model. (See Cell 16-17 of the original analysis.)

But since it’s a logistic regression, the marginal effect of being African-American isn’t captured by the coefficient alone. Instead, they calculate the marginal effect of being African-American with all the other factors set to 0, i.e., it’s a comparison among White and African-American males, between age 25-45, with zero priors, with zero recidivism within the last two years, and with a particular severity of crime.
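To make the baseline dependence concrete, here is a minimal sketch of how a "percent more likely" figure can be read off a logistic regression at the reference level. The coefficient values are hypothetical, chosen only for illustration; they are not the estimates from the ProPublica notebook, and the variable names are my own.

```python
import math

def sigmoid(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only (not ProPublica's estimates).
intercept = -1.5        # linear predictor when every covariate is 0
coef_indicator = 0.5    # coefficient on the 0/1 group indicator

# Predicted probability for the reference group (all covariates at 0).
p_reference = sigmoid(intercept)

# The coefficient alone gives an odds ratio, not a ratio of probabilities.
odds_ratio = math.exp(coef_indicator)

# Convert the odds ratio to a ratio of probabilities at this baseline.
risk_ratio = odds_ratio / (1.0 - p_reference + p_reference * odds_ratio)

# The same quantity computed directly from the two predicted probabilities.
direct_ratio = sigmoid(intercept + coef_indicator) / p_reference

print(f"odds ratio:             {odds_ratio:.3f}")
print(f"risk ratio at baseline: {risk_ratio:.3f}")
```

The conversion depends on `p_reference`, so the same coefficient produces a different ratio whenever the other covariates move the baseline probability.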

Fewer than 5% of the entire dataset meets these specifications in the first analysis and it’s only 7% in the second, so the statistical result reported is really only applicable for a small portion of the population.

If you calculate marginal effects over the entire dataset, taking into account men and women, all ages, and the full distribution of prior crimes, severity, and recidivism, those numbers are more modest:

- The analysis also showed that even when controlling for prior crimes, future recidivism, age, and gender, black defendants were ~~45~~ 20 percent more likely to be assigned higher risk scores than white defendants.
- The violent recidivism analysis also showed that even when controlling for prior crimes, future recidivism, age, and gender, black defendants were ~~77~~ 33 percent more likely to be assigned higher risk scores than white defendants.

This doesn’t change the piece’s overall argument, but some of these claims seem a little misleading in light of the actual comparison being made. My full analysis here (written for an undergraduate who’s taken a first course on statistics): https://github.com/anniejw6/compas-analysis/blob/master/01-regression-correction.ipynb

Curious to get your take here. I emailed the authors of this article, who responded with “Very interesting and informative. We were advised that our way of reporting is standard practice.”

My reply: Without getting into any of the specifics (not because I disagree with the above argument but just because I don’t have the energy to try to evaluate the details), I’ll say that this reminds me a lot of my paper with Iain Pardoe on average predictive comparisons for models with nonlinearity, interactions, and variance components. The key point is that predictive comparisons depend in general on the values of the other variables in the model, and if you want some sort of average number, you have to think a bit about what to average over. I hadn’t thought of the connection to the All Else Equal fallacy but that’s an interesting point.
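The point about what to average over can be sketched with a toy calculation. Everything here is assumed for illustration: the coefficients, the single `priors` covariate, and its made-up values. The averaging rule (mean predicted probability with the indicator switched on for every row, divided by the mean with it switched off) is one simple choice, not necessarily the computation in the linked notebook or in the paper with Pardoe.

```python
import math

def sigmoid(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only.
INTERCEPT = -1.5
B_INDICATOR = 0.5   # coefficient on the 0/1 group indicator
B_PRIORS = 0.3      # coefficient on the number of prior offenses

def predict(indicator, priors):
    """Predicted probability of a higher score for one covariate profile."""
    return sigmoid(INTERCEPT + B_INDICATOR * indicator + B_PRIORS * priors)

def ratio_at(priors):
    """Ratio of predicted probabilities at a single fixed value of priors."""
    return predict(1, priors) / predict(0, priors)

def averaged_ratio(priors_values):
    """Ratio of mean predictions, toggling the indicator for every row."""
    mean_on = sum(predict(1, x) for x in priors_values) / len(priors_values)
    mean_off = sum(predict(0, x) for x in priors_values) / len(priors_values)
    return mean_on / mean_off

# Made-up covariate values standing in for a dataset.
priors_values = [0, 0, 1, 2, 3, 5, 8, 12]

baseline = ratio_at(0)                    # "all else set to 0" comparison
averaged = averaged_ratio(priors_values)  # comparison averaged over the rows

print(f"ratio at zero priors:  {baseline:.3f}")
print(f"ratio averaged over x: {averaged:.3f}")
```

With these assumed numbers the averaged ratio is smaller than the ratio at zero priors, because rows with more priors already have a high predicted probability and the same coefficient moves them less in relative terms.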

The post Average predictive comparisons and the All Else Equal fallacy appeared first on Statistical Modeling, Causal Inference, and Social Science.
