What if that regression-discontinuity paper had only reported local linear model results, and with no graph?

We had an interesting discussion the other day regarding a regression discontinuity disaster.

In my post I shone a light on this fitted model:

Most of the commenters seemed to understand the concern with these graphs: the upward slopes in the curves directly contribute to the estimated negative value at the discontinuity, leading to a model that doesn’t seem to make sense. But I did get an interesting push-back that is worth discussing further. Commenter Sam wrote:

You criticize the authors for using polynomials. Here is something you yourself wrote with Guido Imbens on the topic of using polynomials in RD designs:

“We argue that estimators for causal effects based on such methods can be misleading, and we recommend researchers do not use them, and instead use estimators based on local linear or quadratic polynomials or other smooth functions.”

From p.15 of the paper:

“We implement the RDD using two approaches: the global polynomial regression and the local linear regression”

They show that their results are similar in either specification.

The commenter made the seemingly reasonable point that, since the authors actually did use the model that Guido and I recommended, and it gave the same results as what they found under the controversial model, what was my problem?

What if?

To put it another way, what if the authors had done the exact same analyses but reported them differently, as follows:

– Instead of presenting the piecewise quadratic model as the main result and the local linear model as a side study, they could’ve reversed the order and presented the local linear model as their main result.

– Instead of graphing the fitted discontinuity curve, which looks so bad (see graphs above), they could’ve just presented their fitted model in tabular form. After all, if the method is solid, who needs the graph?

Here’s my reply.

First, I do think the local linear model is a better choice in this example than the global piecewise quadratic. There are cases where a global model makes a lot of sense (for example in pre/post-test situations such as predicting election outcomes given previous election outcomes), but not in this case, when there’s no clear connection at all between percentage vote for a union and some complicated measures of stock prices. So, yeah, I’d say ditch the global piecewise quadratic model, don’t even include it in a robustness check unless the damn referees make you do it and you don’t feel like struggling with the journal review process.
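To make the difference between the two specifications concrete, here is a minimal sketch in plain Python. It is not the paper's data or code: the data are simulated pure noise with no true jump, the running variable is assumed to be centered so the cutoff sits at 0, the local fit uses a simple uniform window, and the bandwidth of 0.1 and all function names are hypothetical choices made for illustration.

```python
import random

def ols_line(xs, ys):
    """Least-squares (intercept, slope) for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def ols_quadratic(xs, ys):
    """Least-squares (c0, c1, c2) for y = c0 + c1*x + c2*x^2."""
    # Normal equations, solved by Gaussian elimination with partial pivoting.
    p = [sum(x ** k for x in xs) for k in range(5)]
    m = [[p[i + j] for j in range(3)] + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    coef = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        rest = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - rest) / m[i][i]
    return tuple(coef)

def split_at_cutoff(x, y, lo, hi):
    """Points in [lo, 0) and in [0, hi], each as a list of (x, y) pairs."""
    left = [(xi, yi) for xi, yi in zip(x, y) if lo <= xi < 0]
    right = [(xi, yi) for xi, yi in zip(x, y) if 0 <= xi <= hi]
    return left, right

def rd_global_quadratic(x, y):
    """Jump at 0 from separate quadratics fit to ALL data on each side."""
    left, right = split_at_cutoff(x, y, float("-inf"), float("inf"))
    return ols_quadratic(*zip(*right))[0] - ols_quadratic(*zip(*left))[0]

def rd_local_linear(x, y, bandwidth):
    """Jump at 0 from separate lines fit only within +/- bandwidth."""
    left, right = split_at_cutoff(x, y, -bandwidth, bandwidth)
    return ols_line(*zip(*right))[0] - ols_line(*zip(*left))[0]

# Simulated example: pure noise, so the true discontinuity is zero.
rng = random.Random(0)
x = [rng.uniform(-0.5, 0.5) for _ in range(400)]
y = [rng.gauss(0.0, 1.0) for _ in x]
print(f"global piecewise quadratic: {rd_global_quadratic(x, y):+.3f}")
print(f"local linear (h = 0.1):     {rd_local_linear(x, y, 0.1):+.3f}")
```

Both estimators return a single number for the jump, and that number alone says nothing about whether the curves behind it look reasonable. Rerunning with different seeds gives a feel for how far either estimate can stray from zero by chance.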

Second, had the researchers simply fit the local linear model without the graph, I wouldn’t have trusted their results.

Not showing the graph doesn’t make the problem go away; it just hides the problem. It would be like turning off the oil light on your car so that there’s one less thing for you to be concerned about.

This is a point that the commenter didn’t seem to realize: The graph is not just a pleasant illustration of the fitted model, not just some sort of convention in displaying regression discontinuities. The graph is central to the modeling process.

One challenge with regression discontinuity modeling (indeed, applied statistical modeling more generally) as it is commonly practiced is that it is unregularized (with coefficients estimated using some variant of least squares) and uncontrolled (lots of researcher degrees of freedom in fitting the model). In a setting where there’s no compelling theoretical or empirical reason to trust the model, it’s absolutely essential to plot the fitted model against the data and see if it makes sense.
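As a rough illustration of what plotting the fitted model against the data can involve, the sketch below overlays raw points, binned averages, and the two fitted lines near the cutoff. It assumes the running variable is centered at 0 and that each side's fit is supplied as an (intercept, slope) pair; the function names, the bin count, and the use of matplotlib are illustrative assumptions, not anything taken from the paper under discussion.

```python
def binned_means(x, y, lo, hi, n_bins):
    """Average y within equal-width bins on [lo, hi); skips empty bins.

    Returns a list of (bin_center, mean_y, count) tuples.
    """
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, yi in zip(x, y):
        if lo <= xi < hi:
            # min() guards against floating-point rounding at the top edge
            k = min(int((xi - lo) / width), n_bins - 1)
            sums[k] += yi
            counts[k] += 1
    return [(lo + (k + 0.5) * width, sums[k] / counts[k], counts[k])
            for k in range(n_bins) if counts[k] > 0]

def plot_rd(x, y, left_fit, right_fit, bandwidth, n_bins=20):
    """Overlay raw data, binned means, and fitted lines around the cutoff.

    left_fit and right_fit are (intercept, slope) pairs for x < 0 and x >= 0.
    An even n_bins keeps the cutoff on a bin edge so no bin straddles it.
    """
    import matplotlib.pyplot as plt  # third-party; imported only when plotting

    fig, ax = plt.subplots()
    ax.scatter(x, y, s=6, alpha=0.25, color="gray", label="raw data")
    bins = binned_means(x, y, -bandwidth, bandwidth, n_bins)
    ax.plot([b[0] for b in bins], [b[1] for b in bins], "o",
            color="black", label="binned means")
    segments = ((left_fit, (-bandwidth, 0.0)), (right_fit, (0.0, bandwidth)))
    for (intercept, slope), (x0, x1) in segments:
        ax.plot([x0, x1], [intercept + slope * x0, intercept + slope * x1],
                color="tab:red")
    ax.axvline(0.0, linestyle="--", color="black", linewidth=0.8)
    ax.set_xlabel("running variable (centered at cutoff)")
    ax.set_ylabel("outcome")
    ax.legend()
    return fig, ax
```

Given data and fits in that form, something like `fig, ax = plot_rd(x, y, left_fit, right_fit, bandwidth=0.1)` followed by `fig.savefig("rd_check.png")` would produce the picture. The useful questions are then visual ones: whether the binned means track the lines, and whether the estimated jump is driven by slopes near the cutoff rather than by any visible break in the data.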

I have no idea what the data and fitted local linear model would look like, and that’s part of the problem here. (The research article in question has other problems, notably regarding data coding and exclusion, choice of outcome to study, and a lack of clarity regarding the theoretical model and its connection to the statistical model, but here we’re focusing on the particular issue of the regression being fit. These concerns do go together, though: if the data were cleaner and the theoretical structure were stronger, that would inspire more trust in a fitted statistical model.)

Taking the blame

Examples in statistics and econometrics textbooks (my own included) are too clean. The data come in, already tidy, and then the model is fit, and it works as expected, and some strong and clear conclusion comes out. You learn research methods in this way, and you can come to expect this to happen in real life, with some estimate or hypothesis test lining up with some substantive question, and all the statistical modeling just being a way to make that connection. And you can acquire the attitude that the methods simply work. In the above example, you can have the impression that if you do a local linear regression and a bunch of robustness tests, you’ll get the right answer.

Does following the statistical rules assure you (probabilistically) that you will get the right answer? Yes, in some very simple settings such as clean random sampling and clean randomized experiments, where effects are large and the things being measured are exactly what you want to know. More generally, no. More generally, there are lots of steps connecting data, measurement, substantive theory, and statistical model, and no statistical procedure blindly applied (even with robustness checks!) will be enough on its own. It’s necessary to directly engage with data, measurement, and substantive theory. Graphing the data and fitted model is one part of this engagement, often a necessary part.