(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Ryan Bain writes:

I came across your ‘Fitting Multilevel Models When Predictors and Group Effects Correlate’ paper that you co-authored with Dr. Bafumi and read it with great interest. I am a current postgraduate student at the University of Glasgow writing a dissertation examining explanations of Euroscepticism at the individual and country level since the onset of the economic crisis. I employ multilevel modeling with two levels: individuals within states. As I am examining predictors of Euroscepticism at the country level, I employ random effects as individuals are clustered within countries. My supervisor pointed me in the direction of your paper as a means of controlling for omitted variable bias by ensuring that my country-level predictors are not correlated with my random effect parameter.

I recently discovered an article by Jonathan Kelley, M. D. R. Evans, Jennifer Lowman and Valerie Lykes: ‘Group-mean-centering independent variables in multi-level models is dangerous’. After working through a series of examples, the paper suggests that the practice be abandoned. The authors demonstrate that, once the individual-level independent variables have been group mean centered, group mean centering country-level variables in regression models produces incorrect estimates of the coefficients for country-level (and individual-level) predictors. The authors summarise their doubts about the method on pg.15 in the ‘5 Summary’ section. However, all of their criticisms about the use of the method and the adverse consequences that group mean centering has on estimates of country-level predictors are based on models that also have the individual-level predictors group mean centered.

The authors of the article only briefly reference the purpose of group mean centering as a means of controlling for omitted variable bias at the contextual level, on pg.3 stating: “Raudenbush and Bryk (2002) also posit that group-mean centering can reduce bias in random component variance estimates”. That passing reference is the only mention the authors make of the use of group mean centering for this purpose.

They also cite other authors who criticise the method but, again, all of their issues with the method relate to models in which individual-level predictors are centered. In ‘Centering Predictor Variables in Cross-Sectional Multilevel Models: A New Look at an Old Issue’ by Craig K. Enders and Davood Tofighi (2007), for example, the authors state on pg.121 that: “the centering of Level 2 (e.g., organizational level) variables is far less complex than the centering decisions required at Level 1, as it is only necessary to choose between the raw metric and CGM [centered at the grand mean]; CWC [centering within cluster (the authors’ term for group mean centering)] is not an option because each member of a given cluster shares the same value on the Level 2 predictor. Centering decisions at Level 2 generally mimic prescribed practice from the OLS regression literature (Aiken & West, 1991), so the focus of this article is on centering at Level 1. Throughout the remainder of the article, we assume that all Level 2 predictors are centered at their grand mean.”

Could you please provide any guidance on this matter? The Kelley et al. (2016) article has made me doubt the use of group mean centering for controlling for omitted variable bias, yet I am not sure if that was its intention for models in which only the country-level predictors were group mean centered.

My reply:

Yes, rather than thinking about centering the group means, I prefer to think about it as adding new predictors at the group level. In sociology they sometimes talk about individual and contextual effects, but more generally we can just speak predictively and say that the individual predictor and its group-level average can both be predictive of the outcome.
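To make the "add a group-level predictor" idea concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data-generating process, the helper names `ols` and `simulate`, and the use of plain least squares in place of a full multilevel fit (chosen only to keep the example dependency-free). In the simulation the group effect is deliberately correlated with the individual predictor, which is the situation the question is about.

```python
import random
from statistics import fmean


def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = [[1.0, *row] for row in zip(*columns)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n_groups=200, n_per_group=20, seed=1):
    """Hypothetical data: the group effect u_j is correlated with x."""
    rng = random.Random(seed)
    group, x, y = [], [], []
    for j in range(n_groups):
        mu_j = rng.gauss(0, 1)                 # group-level location of x
        u_j = 3.0 * mu_j + rng.gauss(0, 0.5)   # group effect, correlated with x
        for _ in range(n_per_group):
            x_ij = mu_j + rng.gauss(0, 1)
            group.append(j)
            x.append(x_ij)
            # The true within-group slope on x is 2.0 in this simulation.
            y.append(1.0 + 2.0 * x_ij + u_j + rng.gauss(0, 1))
    return group, x, y


group, x, y = simulate()

# Group-level average of the individual predictor, repeated for each member.
by_group = {}
for g, xi in zip(group, x):
    by_group.setdefault(g, []).append(xi)
xbar = {g: fmean(values) for g, values in by_group.items()}
x_mean = [xbar[g] for g in group]

naive = ols([x], y)              # intercept, slope on x
augmented = ols([x, x_mean], y)  # intercept, slope on x, slope on group mean

print("slope on x, ignoring groups:       ", round(naive[1], 2))
print("slope on x, group mean included:   ", round(augmented[1], 2))
print("slope on group mean (contextual):  ", round(augmented[2], 2))
```

In this setup the slope on `x` that ignores groups absorbs part of the group effect, while the fit that also includes the group mean recovers a slope close to the simulated within-group value. A real analysis would fit a varying-intercept multilevel model with the group mean as an extra group-level predictor rather than rely on least squares.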

Bain adds:

What I believe has happened with this paper is that the authors assert that the group mean centered individual-level coefficients are inappropriate because the within effect introduces additional level 2 error. But the authors do not mean the within effect (they steer clear of this terminology, but it is what their argument is referring to). They are actually discussing the difference between the within and between effect. Throughout their article the authors examined the mean of the correlated random effects (cre) model in their analysis, which represents the between-within difference.

Essentially, because the authors examined the effect of the mean in the cre model, they compared and contrasted the coefficient of the mean of the individual-level variable of interest in the cre model with the original coefficient in a random effects model. Focusing on the mean (the difference between the within and between effect), they believed that this was the coefficient which represented the within effect. Hence they have (incorrectly) argued that the within effect is confounded with the level 2 error, because the mean, which is what they focused on, is obviously confounded with the level 2 error in the cre model.
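The point about the coefficient on the mean can be checked with one line of algebra. The notation here is an assumption, not taken from either paper: $i$ indexes individuals, $j$ indexes countries, $\bar{x}_j$ is the country mean of the individual-level predictor, $\beta_W$ is the within effect and $\beta_B$ the between effect.

```latex
\begin{align*}
y_{ij} &= \beta_0 + \beta_W\,(x_{ij} - \bar{x}_j) + \beta_B\,\bar{x}_j + u_j + \varepsilon_{ij} \\
       &= \beta_0 + \beta_W\,x_{ij} + (\beta_B - \beta_W)\,\bar{x}_j + u_j + \varepsilon_{ij}
\end{align*}
```

The first line is the group mean centered form and the second is the uncentered form with the mean added as a predictor. Both are the same model. The coefficient on the individual-level term is $\beta_W$ in each, while the coefficient on $\bar{x}_j$ in the second line is the between-within difference $\beta_B - \beta_W$, not the within effect.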

The post Fitting multilevel models when predictors and group effects correlate appeared first on Statistical Modeling, Causal Inference, and Social Science.
