What pieces do chess grandmasters move, and when?

Dan Goldstein posted a version of the above image (with R code!) which came from Ashton Anderson.

My graph above is slightly modified from the original, which looks like this:

The original was just fine, but I had a few changes to make. I thought the color scheme could be improved, and I wanted to change the order of the pieces on the graph: it didn’t seem quite right to start with the bishop. I’d use an order such as Pawn, Knight, Bishop, Queen, Rook, Castling, King, which is roughly the order in which pieces get moved (except that I’ve put Castling between Rook and King because it seems to make sense there).

I wonder how Anderson came up with the order in the above graph. Let me look at the code . . . OK, I see, it’s alphabetical! (B, K, N, O, P, Q, R). We don’t like alphabetical order.

But enuf complaining: I should be able to go to the code and clean things up. And, a half hour later, here it is, my (slightly) adapted code:


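library(dplyr)    # provides %>%, group_by, mutate, filter
library(ggplot2)

# Counts of moves by move number and piece type, from Ashton Anderson's gist
mt <- read.csv(url('https://gist.githubusercontent.com/ashtonanderson/cfbf51e08747f60472ee2132b0d35efb/raw/80acd2ad7c0fba4e85c053e61e9e5457137e00ee/moveno_piecetype_counts'))

# Order the pieces roughly by when they get moved, rather than alphabetically
mt$piece_type <- factor(mt$piece_type, levels=c("P","N","B","Q","R","O","K"))

# Fraction of all moves at each move number that were made by each piece type
mt <- mt %>%
  group_by(move_number) %>%
  mutate(tot = sum(count),frac = count/tot)

# Stacked area plot of the first 125 moves
p <- ggplot(mt %>% filter(move_number <= 125),aes(move_number,frac)) +
  geom_area(aes(fill = piece_type), position = 'stack') +
  scale_fill_brewer(type='qual',palette=3,name='Piece type', labels=c("Pawn","Knight","Bishop","Queen","Rook","Castling","King")) +
  theme(panel.border=element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  xlab('Move number') + ylab('') +
  scale_y_continuous(labels = scales::percent, breaks=seq(0,1,0.2))

# Display the plot (needed when running this as a script)
print(p)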


The only things I did were change the order of the pieces and cut down on the y-axis labeling (I'd also like to add tick marks and change the sizes and locations of the axis labels, but I don't know how to do that in ggplot2). Also, just for laffs, I extended the x-axis to 125 moves, cos why stop at 80?
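
For the axis tweaks mentioned above, here is a minimal sketch of one way it might go in ggplot2. It uses a made-up toy data frame (`toy`, with invented fractions) in place of the chess counts, and the particular sizes, tick positions, and the name `p_toy` are arbitrary choices for illustration. The `breaks` argument of the scale controls where ticks appear, and `theme()` controls how the ticks and axis titles look.

```r
library(ggplot2)

# Toy stand-in for the chess data (made-up numbers, for illustration only)
toy <- data.frame(
  move_number = rep(1:4, times = 2),
  piece_type  = factor(rep(c("P", "N"), each = 4), levels = c("P", "N")),
  frac        = c(0.8, 0.6, 0.5, 0.4, 0.2, 0.4, 0.5, 0.6)
)

p_toy <- ggplot(toy, aes(move_number, frac)) +
  geom_area(aes(fill = piece_type), position = "stack") +
  # Tick positions come from the scale's breaks
  scale_x_continuous(breaks = 1:4) +
  xlab("Move number") + ylab("") +
  theme(
    axis.ticks        = element_line(colour = "black"),  # tick appearance
    axis.ticks.length = unit(4, "pt"),                    # tick length
    axis.text         = element_text(size = 11),          # tick-label size
    # Axis-title size, horizontal position (hjust = 1 is flush right),
    # and distance from the axis (top margin, in points)
    axis.title.x      = element_text(size = 14, hjust = 1, margin = margin(t = 8))
  )
```

The same `scale_x_continuous()` and `theme()` calls could presumably be added onto the chess plot, with breaks such as `seq(0, 125, 25)`.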

The result is the graph I showed at the top of the page.

I prefer it to the original. To me, the generally monotone pattern allows me to see what's happening more clearly, whereas in the original, I had to spend a lot of time going back and forth between the legend and the curves. Even better would be to label the filled area directly; I don't know how to do that in ggplot2 either, but I'm sure it's easy enough for those who know the proper function call.
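
For direct labeling, here is a rough sketch of one approach: pick a move number, compute the vertical midpoint of each band there, and place the labels with `geom_text()`. The toy data, the chosen move number, and the names `labs_df`, `piece_names`, and `p_toy` are all made up for illustration; on the real plot the label positions would need some hand-tuning.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the chess data (made-up numbers, for illustration only)
toy <- data.frame(
  move_number = rep(1:4, times = 2),
  piece_type  = factor(rep(c("P", "N"), each = 4), levels = c("P", "N")),
  frac        = c(0.8, 0.6, 0.5, 0.4, 0.2, 0.4, 0.5, 0.6)
)

piece_names <- c(P = "Pawn", N = "Knight")

# Stacking puts the first factor level on top, so accumulate the fractions
# starting from the last level to find where each band sits.
labs_df <- toy %>%
  filter(move_number == 2) %>%
  arrange(desc(piece_type)) %>%
  mutate(ymid  = cumsum(frac) - frac / 2,
         label = unname(piece_names[as.character(piece_type)]))

p_toy <- ggplot(toy, aes(move_number, frac)) +
  geom_area(aes(fill = piece_type), position = "stack") +
  geom_text(data = labs_df, aes(x = move_number, y = ymid, label = label)) +
  guides(fill = "none")  # the legend is redundant once the bands are labeled
```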

Also there's a glitch: some white space shows up in some of the early moves. I don't know where that's coming from, but I see some of it in the original graph too.
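
One guess about the white space, not checked against the actual data: if some piece types have no row at all for certain early move numbers (nobody castles on move 1, say), the stacked areas can have gaps there. Under that assumption, a possible fix is to fill in the missing combinations with zero counts before computing the fractions. Here is a sketch on made-up toy data, using `complete()` from tidyr; the name `toy_full` is just for illustration.

```r
library(dplyr)
library(tidyr)

# Toy counts (made up): there is no "B" row at move 1
toy <- data.frame(
  move_number = c(1, 1, 2, 2, 2),
  piece_type  = c("P", "N", "P", "N", "B"),
  count       = c(90, 10, 70, 20, 10)
)

toy_full <- toy %>%
  # Add a row for every move_number x piece_type pair, with count 0 where missing
  complete(move_number, piece_type, fill = list(count = 0)) %>%
  group_by(move_number) %>%
  mutate(tot = sum(count), frac = count / tot) %>%
  ungroup()
```

In the chess code, the `complete()` step would go just before the `group_by(move_number)` step.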

In any case, hats off to Anderson for posting his data and code (and Goldstein for sharing) so that the rest of us can easily play with it all.