Amy Orben and Andrew Przybylski write:
The widespread use of digital technologies by young people has spurred speculation that their regular use negatively impacts psychological well-being. Current empirical evidence supporting this idea is largely based on secondary analyses of large-scale social datasets. Though these datasets provide a valuable resource for highly powered investigations, their many variables and observations are often explored with an analytical flexibility that marks small effects as statistically significant . . . we address these methodological challenges by applying specification curve analysis (SCA) across three large-scale social datasets . . . to rigorously examine correlational evidence for the effects of digital technology on adolescents. The association we find between digital technology use and adolescent well-being is negative but small, explaining at most 0.4% of the variation in well-being. Taking the broader context of the data into account suggests that these effects are too small to warrant policy change.
SCA is a tool for mapping the sum of theory-driven analytical decisions that could justifiably have been taken when analysing quantitative data. Researchers demarcate every possible analytical pathway and then calculate the results of each. Rather than reporting a handful of analyses in their paper, they report all results of all theoretically defensible analyses . . .
Here’s the relevant methods paper on specification curve analysis, by Uri Simonsohn, Joseph Simmons, and Leif Nelson, which seems similar to what Sara Steegen, Francis Tuerlinckx, Wolf Vanpaemel and I called the multiverse analysis.
It makes sense that a good idea will come up in different settings with some differences in details. Forking paths in methodology as well as data coding and analysis, one might say.
Anyway, here’s what Orben and Przybylski report:
Three hundred and seventy-two justifiable specifications for the YRBS, 40,966 plausible specifications for the MTF and a total of 603,979,752 defensible specifications for the MCS were identified. Although more than 600 million specifications might seem high, this number is best understood in relation to the total possible iterations of dependent (six analysis options) and independent variables (2^24 + 2^25 – 2 analysis options) and whether co-variates are included (two analysis options). . . . The number rises even higher, to 2.5 trillion specifications, for the MCS if any combination of co-variates (2^12 analysis options) is included.
Given this, and to reduce computational time, we selected 20,004 specifications for the MCS.
I love it that their multiverse was so huge they needed to drastically prune it by including only about 20,000 analyses.
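For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the quoted option counts are meant to be read as 6 dependent-variable options, 2^24 + 2^25 - 2 independent-variable options, and 2 covariate options. The variable names are mine, and the last step (multiplying by 2^12) is only my guess at how the "2.5 trillion" figure arises, not something the quoted passage spells out.

```python
# Option counts as I read them from the quoted passage (an assumption).
dependent_options = 6                      # six ways to define well-being
independent_options = 2**24 + 2**25 - 2    # ways to define technology use
covariate_options = 2                      # covariates in or out

total = dependent_options * independent_options * covariate_options
print(f"{total:,}")

# Guess: allowing any subset of 12 covariates multiplies the count by 2**12.
expanded = total * 2**12
print(f"{expanded:,}")
```

The first product comes to exactly 603,979,752, which matches the reported count. The second comes to about 2.47 trillion, which is at least in the neighborhood of the quoted 2.5 trillion.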
How did they choose this particular subset?
We included specifications of all used measures per se, and any combinations of measures found in the previous literature, and then supplemented these with other randomly selected combinations. . . . After noting all specifications, the result of every possible combination of these specifications was computed for each dataset.
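To make that selection rule concrete, here is a rough sketch of one way to build such a subset: every measure on its own, then combinations taken from earlier papers, then randomly drawn combinations to fill out the list. This is my illustration, not the authors' code. The function name `choose_specifications`, the item labels, and the "literature" combinations are all hypothetical.

```python
import random

def choose_specifications(items, literature_combos, n_random, seed=0):
    """Return a pruned list of measure combinations, each a frozenset."""
    rng = random.Random(seed)

    # 1. Every measure used on its own.
    chosen = [frozenset([item]) for item in items]
    seen = set(chosen)

    # 2. Combinations that appeared in earlier papers.
    for combo in literature_combos:
        spec = frozenset(combo)
        if spec not in seen:
            seen.add(spec)
            chosen.append(spec)

    # 3. Random supplement: include each item with probability 1/2, which is
    #    uniform over subsets; reject duplicates and subsets smaller than 2.
    total_possible = 2 ** len(items) - 1   # all non-empty subsets
    target = min(len(chosen) + n_random, total_possible)
    while len(chosen) < target:
        spec = frozenset(item for item in items if rng.random() < 0.5)
        if len(spec) < 2 or spec in seen:
            continue
        seen.add(spec)
        chosen.append(spec)
    return chosen

# Hypothetical questionnaire items and previously published combinations.
items = ["tv", "games", "social_media", "phone", "internet"]
literature = [("tv", "games"), ("social_media", "phone", "internet")]
specs = choose_specifications(items, literature, n_random=10, seed=1)
print(len(specs))
```

With 5 items this yields 5 single-measure specifications, 2 from the made-up literature list, and 10 random ones. The real problem has far more items, which is why enumeration was off the table.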
I wonder if they could’ve found even more researcher degrees of freedom by considering rules for data coding and exclusion, which is what we focused on in our multiverse paper. (I’m also thinking of the article discussed the other day that excluded all but 687 out of 5342 observations.)
Ultimately I think the right way to analyze this sort of data is through a multilevel model, not a series of separate estimates and p-values.
But I do appreciate that they went to the trouble to count up 603,979,752 paths. This is important, because I think a lot of people don’t realize the weakness of many published claims based on p-values (an issue we discussed in a recent comment thread here, when Ethan wrote: “I think lots of what’s discussed on this blog and a cause of common lay errors in probability comes down to, ‘It’s tempting to believe that you can’t get all of this just by chance, but you can.’”).
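Ethan's point is easy to see in a toy simulation. The sketch below is mine and has nothing to do with the real datasets: it generates an outcome that is pure noise, then tries 1,000 unrelated noise predictors and counts how many clear the usual large-sample cutoff |r| > 1.96/sqrt(n). The function names, sample size, and seed are arbitrary choices, and the predictors are independent here, whereas real specifications overlap heavily.

```python
import math
import random

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def count_chance_hits(n_obs=100, n_specs=1000, seed=0):
    """Count noise predictors that look 'significant' against a noise outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    cutoff = 1.96 / math.sqrt(n_obs)   # approximate two-sided 5% threshold
    hits = 0
    for _ in range(n_specs):
        predictor = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson_r(predictor, outcome)) > cutoff:
            hits += 1
    return hits

hits = count_chance_hits()
print(f"{hits} of 1000 pure-noise specifications pass the cutoff")
```

Roughly 5% of the specifications should pass even though nothing is going on, so a researcher free to pick among them would have dozens of "findings" to choose from.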