(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Donald Williams points us to this new paper by Gang Chen, Yaqiong Xiao, Paul Taylor, Tracy Riggins, Fengji Geng, Elizabeth Redcay, and Robert Cox:

In neuroimaging, the multiplicity issue may sneak into data analysis through several channels . . . One widely recognized aspect of multiplicity, multiple testing, occurs when the investigator fits a separate model for each voxel in the brain. However, multiplicity also occurs when the investigator conducts multiple comparisons within a model, tests two tails of a t-test separately when prior information is unavailable about the directionality, and branches in the analytic pipelines. . . .

More fundamentally, the adoption of dichotomous decisions through sharp thresholding under NHST may not be appropriate when the null hypothesis itself is not pragmatically relevant because the effect of interest takes a continuum instead of discrete values and is not expected to be null in most brain regions. When the noise inundates the signal, two different types of error are more relevant than the concept of FPR: incorrect sign (type S) and incorrect magnitude (type M).

Excellent! Chen et al. continue:

In light of these considerations, we introduce a different strategy using Bayesian hierarchical modeling (BHM) to achieve two goals: 1) improving modeling efficiency via one integrative (instead of many separate) model and dissolving the multiple testing issue, and 2) turning the focus of conventional NHST on FPR into quality control by calibrating type S errors while maintaining a reasonable level of inference efficiency.

The performance and validity of this approach are demonstrated through an application at the region of interest (ROI) level, with all the regions on an equal footing: unlike the current approaches under NHST, small regions are not disadvantaged simply because of their physical size. In addition, compared to the massively univariate approach, BHM may simultaneously achieve increased spatial specificity and inference efficiency. The benefits of BHM are illustrated in model performance and quality checking using an experimental dataset. In addition, BHM offers an alternative, confirmatory, or complementary approach to the conventional whole brain analysis under NHST, and promotes results reporting in totality and transparency. The methodology also avoids sharp and arbitrary thresholding in the p-value funnel to which the multidimensional data are reduced.

I haven’t read this paper in detail, but all of this sounds great to me. I also noticed this:

The difficulty in passing a commonly accepted threshold with noisy data may elicit a hidden misconception: A statistical result that survives the strict screening with a small sample size seems to gain an extra layer of strong evidence, as evidenced by phrases in the literature such as “despite the small sample size” or “despite limited statistical power.” However, when the statistical power is low, the inference risks can be perilous . . .

They’re pointing out the “What does not kill my statistical significance makes it stronger” fallacy!
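To see why surviving a significance threshold with noisy data is not extra evidence, here is a minimal simulation sketch of type S and type M errors. It is not from the paper: the function name `type_s_m`, the true effect of 0.1, and the two standard errors are illustrative assumptions, and the estimate is assumed to be normally distributed around the true effect.

```python
import numpy as np

def type_s_m(true_effect, se, n_sims=100_000, z_crit=1.96, seed=0):
    """Simulate replications of a study and summarize the significant ones.

    Hypothetical helper: assumes estimate ~ Normal(true_effect, se) and a
    two-sided test at the conventional 5% level.
    """
    rng = np.random.default_rng(seed)
    est = rng.normal(true_effect, se, size=n_sims)
    sig = np.abs(est) > z_crit * se              # estimates that pass the threshold
    power = sig.mean()                           # share of replications that pass
    # Type S: among significant estimates, share with the wrong sign
    type_s = np.mean(np.sign(est[sig]) != np.sign(true_effect))
    # Type M: among significant estimates, average exaggeration factor
    type_m = np.mean(np.abs(est[sig])) / abs(true_effect)
    return power, type_s, type_m

# Same assumed true effect, measured noisily and then precisely
low = type_s_m(true_effect=0.1, se=0.5)
high = type_s_m(true_effect=0.1, se=0.03)

for label, (power, s, m) in [("noisy", low), ("precise", high)]:
    print(f"{label}: power={power:.3f}, type S={s:.3f}, type M={m:.2f}")
```

Under these assumptions, the noisy design rarely reaches significance, and when it does, the estimate is greatly exaggerated and often has the wrong sign. Passing the threshold "despite the small sample size" is a warning sign, not a bonus.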

And they fit their models in Stan. This is just wonderful. I really hope this project works out and is useful in imaging research. It feels so good to think that all this work we do can make a difference somewhere.
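For readers who want a feel for what "one integrative model" buys you, here is a rough sketch of partial pooling across regions. It is not the authors' Stan model: it is a simple empirical-Bayes approximation to a normal hierarchical model with known standard errors, and the function name `partial_pool`, the number of regions, and all simulated values are assumptions made for illustration.

```python
import numpy as np

def partial_pool(y, se):
    """Shrink per-region estimates toward their common mean.

    Hypothetical helper: assumes y_r ~ Normal(theta_r, se_r) and
    theta_r ~ Normal(mu, tau), with mu and tau estimated by moments.
    """
    y = np.asarray(y, dtype=float)
    se = np.asarray(se, dtype=float)
    mu_hat = y.mean()
    # Between-region variance: total spread minus average sampling variance
    tau2 = max(y.var(ddof=1) - np.mean(se**2), 0.0)
    # Weight on each region's own estimate; noisier regions are shrunk more
    w = tau2 / (tau2 + se**2)
    return mu_hat + w * (y - mu_hat), tau2

# Simulated data (assumed values): 50 regions, small true effects, noisy estimates
rng = np.random.default_rng(1)
n_regions = 50
theta = rng.normal(0.2, 0.1, size=n_regions)   # true region effects
se = np.full(n_regions, 0.3)                   # sampling noise per region
y = rng.normal(theta, se)                      # separate, unpooled estimates

pooled, tau2 = partial_pool(y, se)
rmse_raw = np.sqrt(np.mean((y - theta) ** 2))
rmse_pooled = np.sqrt(np.mean((pooled - theta) ** 2))
print(f"RMSE of separate estimates: {rmse_raw:.3f}")
print(f"RMSE of partially pooled estimates: {rmse_pooled:.3f}")
```

The point of the sketch is that all regions are estimated together and extreme noisy estimates get pulled toward the group mean, rather than being thresholded one at a time. A full Bayesian fit would also propagate the uncertainty in the group-level mean and scale, which this shortcut ignores.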

The post “Handling Multiplicity in Neuroimaging through Bayesian Lenses with Hierarchical Modeling” appeared first on Statistical Modeling, Causal Inference, and Social Science.
