(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

Above is a pair of graphs from a 2015 paper by Alison Gopnik, Thomas Griffiths, and Christopher Lucas. It takes up half a page in the journal, Current Directions in Psychological Science. I think we can do better.

First, what’s wrong with the above graphs?

We could start with the details: As a reader, I have to go back and forth between the legend and the bars to keep the color scheme fresh in my mind. The negative space in the middle of each plot looks a bit like a white bar, which is not directly confusing but makes the display that much harder to read. And I had to do a double-take to interpret the infinitesimal blue bar of zero height on the left plot. Also, it’s not clear how to interpret the y-axes: 25 participants out of how many? And whassup with the y-axis on the second graph: the 15 got written as a 1, which makes me suspect that the graph was re-drawn from an original, which then leads to concern that other mistakes may have been introduced in the re-drawing process.

But that’s not the direction I want to go. There *are* problems with the visual display, but going and fixing them one by one would not resolve the larger problem. To get there, we need to think about goals.

A graph is a set of comparisons, and the two goals of a graph are:

1. To understand communicate the size and directions of comparisons that you were already interested in, before you made the graph; and

2. To facilitate discovery of new patterns in data that go beyond what you expected to see.

Both these goals are important. Its important to understand and communicate what we think we know, and it’s also important to put ourselves in a position where we can learn more.

The question, when making any graphs, is: what comparisons does it make it easy to see? After all, if you just wanted the damn numbers you could put them in a table.

Now let’s turn to the graph above. It makes it easy to compare the heights of two lines right next to each other—for example, I see that the dark blue lines are all higher than the light blue lines, except for the pair on the left . . . hmmmm, which are light blue and which are dark blue, again? I can’t really do much with the absolute levels of the lines because I don’t know what “25” means.

Look. I’m not trying to rag on the authors here. This sort of Excel-style bar graph is standard in so many presentations. I just think they could do better.

So, how to do better? Let’s start with the goals.

1. What are the key comparisons that the authors want to emphasize? From the caption, it seems that the most important comparisons are between children and adults. We want a graph that shows the following patterns:

(i) In the Combination scenario, children tended to chose multiple objects (the correct response, it seems) and adults tended to choose single objects (the wrong response).

(ii) In the Individual scenario, both children and adults tended to choose a single object (the correct response).

Actually, I think I’d re-order these, and first look at the Individual scenario which seems to be some sort of baseline, and then go to Combination which is displaying something new.

2. What might we want to learn from a graph of these data, beyond the comparisons listed just above? This one’s not clear so I’ll guess:

Who were those kids and adults who got the wrong answer in the Combination scenario? Did they have other problems? What about the *adults* who got the wrong answer in the Individual scenario, which was presumably easier? Did they also get the answer wrong in the other case? There must be some other things to learn from these data too—it’s hard to get people to participate in a psychology experiment, and once you have them there, you’ll want to give them as many tasks as you can. But from this figure alone, I’m not sure what these other questions would be.

OK, now time to make the new graph. Given that I don’t have the raw data, and I’m just trying to redo the figure above, I’ll focus on task 1: displaying the key comparisons clearly.

Hey—I just realized something! The two outcomes in this study are “Single object” and “Multiple object”—that’s all there is! And, looking carefully, I see that the numbers in each graph add up to a constant: it’s 25 children and, ummm, let me read this carefully . . . 28 adults!

This simplifies our task considerably, as now we have only 4 numbers to display instead of 8.

We can easily display four numbers with a line plot. The outcome is % who give the “Single object” response, and the two predictors are Child/Adult and Individual Principle / Combination Principle.

One of these predictors will go on the x-axis, one will correspond to the two lines, and the outcome will go on the y-axis.

In this case, which of our two predictors goes on the x-axis?

Sometimes the choice is easy: if one predictor is binary (or discrete with only a few categories) and the other is continuous, it’s simplest to put the continuous predictor as x, and use the discrete predictor to label the lines. In this case, though, both predictors are binary, so what to do?

I think we should use logical or time order, as that’s easy to follow. There are two options:

(1) Time order in age, thus Children, then Adults; or

(2) Logical order in the experiment, thus Individual Principle, then Combination Principle, as Individual is in a sense the control case and Combination is the new condition.

I tried it both ways and I think option 2 was clearer. So I’ll show you this graph and the corresponding R code. Then I’ll show you option 1 so you can compare.

Here’s the graph:

I think this is better than the bar graphs from the original article, for two reasons. First, we can see everything in one place: Like the title sez, “Children did better than adults,\nespecially in the combination condition.” Second, we can directly make both sorts of comparisons: we can compare children to adults, and we can also make the secondary comparison of seeing that both groups performed worse under the combination condition than the individual condition.

Here’s the data file I made, gopnik.txt:

Adult Combination N_single N_multiple 0 0 25 0 0 1 4 21 1 0 23 5 1 1 18 10

And here’s the R code:

setwd("~/AndrewFiles/research/graphics") gopnik <- read.table("gopnik.txt", header=TRUE) N <- gopnik$N_single + gopnik$N_multiple p_multiple <- gopnik$N_multiple / N p_correct <- ifelse(gopnik$Combination==0, 1 - p_multiple, p_multiple) colors <- c("red", "blue") combination_labels <- c("Individual\ncondition", "Combination\ncondition") adult_labels <- c("Children", "Adults") pdf("gopnik_2.pdf", height=4, width=5) par(mar=c(3,3,3,2), mgp=c(1.7, .5, 0), tck=-.01, bg="gray90") plot(c(0,1), c(0,1), yaxs="i", xlab="", ylab="Percent who gave correct answer", xaxt="n", yaxt="n", type="n", main="Children did better than adults,\nespecially in the combination condition", cex.main=.9) axis(1, c(0, 1), combination_labels, mgp=c(1.5,1.5,0)) axis(2, c(0,.5,1), c("0", "50%", "100%")) for (i in 1:2){ ok <- gopnik$Adult==(i-1) x <- gopnik$Combination[ok] y <- p_correct[ok] se <- sqrt((N*p_correct + 2)*(N[ok]*(1-p_correct) + 2)/(N[ok] + 4)^3) lines(x, y, col=colors[i]) points(x, y, col=colors[i], pch=20) for (j in 1:2){ lines(rep(x[j], 2), y[j] + se[j]*c(-1,1), lwd=2, col=colors[i]) lines(rep(x[j], 2), y[j] + se[j]*c(-2,2), lwd=.5, col=colors[i]) } text(mean(x), mean(y) - .05, adult_labels[i], col=colors[i], cex=.9) } dev.off()

Yeah, yeah, I know the code is ugly. I'm pretty sure it could be done much more easily in ggplot2.

Also, just for fun I threw in +/- 1 and 2 standard error bars, using the Agresti-Coull formula based on (y+2)/(n+4) for binomial standard errors. Cos why not. The one thing this graph *doesn't* show is whether the adults who got it wrong on the individual condition were more likely to get it wrong in the combination condition, but that information wasn't in the original graph either.

On the whole, I'm satisfied that the replacement graph contains all the information in less space and is much clearer than the original.

Again, this is not a slam on the authors of the paper. They're not working within a tradition in which graphical display is important. I'm going through this example in order to provide a template for future researchers when summarizing their data.

And, just for comparison, here's the display going the other way:

(This looks so much like the earlier plot that it seems at first that we did something wrong. But, no, it just happened that way because we're only plotting four numbers, and it just happened that the two numbers whose positions changed had very similar values of 0.84 and 0.82.)

And here's the R code for this second graph:

pdf("gopnik_1.pdf", height=4, width=5) par(mar=c(3,3,3,2), mgp=c(1.7, .5, 0), tck=-.01, bg="gray90") plot(c(0,1), c(0,1), yaxs="i", xlab="", ylab="Percent who gave correct answer", xaxt="n", yaxt="n", type="n", main="Children did better than adults,\nespecially in the combination condition", cex.main=.9) axis(1, c(0, 1), adult_labels) axis(2, c(0,.5,1), c("0", "50%", "100%")) for (i in 1:2){ ok <- gopnik$Combination==(i-1) x <- gopnik$Adult[ok] y <- p_correct[ok] se <- sqrt((N*p_correct + 2)*(N[ok]*(1-p_correct) + 2)/(N[ok] + 4)^3) lines(x, y, col=colors[i]) points(x, y, col=colors[i], pch=20) for (j in 1:2){ lines(rep(x[j], 2), y[j] + se[j]*c(-1,1), lwd=2, col=colors[i]) lines(rep(x[j], 2), y[j] + se[j]*c(-2,2), lwd=.5, col=colors[i]) } text(mean(x), mean(y) - .1, combination_labels[i], col=colors[i], cex=.9) } dev.off()

The post Graphs as comparisons: A case study appeared first on Statistical Modeling, Causal Inference, and Social Science.

**Please comment on the article here:** **Statistical Modeling, Causal Inference, and Social Science**