Fan Li and Peng Ding write:
Difference-in-differences is a widely-used evaluation strategy that draws causal inference from observational panel data. Its causal identification relies on the assumption of parallel trend, which is scale dependent and may be questionable in some applications. A common alternative method is a regression model that adjusts for the lagged dependent variable, which rests on the assumption of ignorability conditional on past outcomes. In the context of linear models, Angrist and Pischke (2009) show that difference-in-differences and the lagged-dependent-variable regression estimates have a bracketing relationship. Namely, for a true positive effect, if ignorability is correct, then mistakenly assuming the parallel trend will overestimate the effect; in contrast, if the parallel trend is correct, then mistakenly assuming ignorability will underestimate the effect. We show that the same bracketing relationship holds in general nonparametric (model-free) settings without assuming either ignorability or parallel trend. We also extend the result to semiparametric estimation based on inverse probability weighting.
Li and Ding sent the paper to me because I wrote something on the topic a few years ago, under the title "Difference-in-difference estimators are a special case of lagged regression."
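To make the bracketing claim concrete, here is a minimal simulation sketch, not anything taken from the paper. Everything in it is an assumption chosen for illustration: the function name `simulate_bracketing`, the true effect `tau`, the lag coefficient `beta` (set between 0 and 1), and a selection rule in which units with low baseline outcomes are more likely to be treated. Ignorability given the past outcome holds by construction, so the lagged-dependent-variable regression should recover `tau`, while difference-in-differences, which amounts to fixing the lag coefficient at 1, should come out larger.

```python
import numpy as np


def simulate_bracketing(n=200_000, tau=1.0, beta=0.5, seed=0):
    """Return (did, ldv) estimates of the treatment effect on simulated data.

    Illustrative assumptions (not from the paper):
      * two periods, one pre-treatment outcome y_pre
      * treatment g depends only on y_pre, so ignorability given y_pre holds
      * units with low y_pre are more likely to be treated
      * y_post regresses toward the mean (0 < beta < 1), so parallel trends fails
    """
    rng = np.random.default_rng(seed)

    # Baseline outcome.
    y_pre = rng.normal(size=n)

    # Selection on the past outcome: lower y_pre -> higher treatment probability.
    p_treat = 1.0 / (1.0 + np.exp(y_pre))
    g = rng.random(n) < p_treat

    # Follow-up outcome with a constant additive effect tau.
    y_post = beta * y_pre + tau * g + rng.normal(size=n)

    # Difference-in-differences: compare average gains across groups.
    gain = y_post - y_pre
    did = gain[g].mean() - gain[~g].mean()

    # Lagged-dependent-variable regression: OLS of y_post on (1, g, y_pre).
    X = np.column_stack([np.ones(n), g.astype(float), y_pre])
    coef, *_ = np.linalg.lstsq(X, y_post, rcond=None)
    ldv = coef[1]

    return did, ldv


if __name__ == "__main__":
    did, ldv = simulate_bracketing()
    print(f"DID estimate: {did:.3f}")
    print(f"LDV estimate: {ldv:.3f}")
```

In this setup the lagged-regression estimate should land close to the assumed `tau`, with the difference-in-differences estimate above it. The direction of the gap depends on the assumptions: the lag coefficient is below 1 and the treated group starts out lower. Changing either one changes the picture.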