Graphical model diagrams in Doing Bayesian Data Analysis versus traditional convention

May 13, 2012

(This article was originally published at Doing Bayesian Data Analysis, and syndicated at StatsBlogs.)

In this post I contrast conventions for illustrating hierarchical models. On the one hand, there is the traditional convention as used, for example, by DoodleBUGS. On the other hand, there is the style used in Doing Bayesian Data Analysis (DBDA). I explain the advantages of the style in DBDA.

Consider a generic model for Bayesian linear regression. The graphical model diagram in DBDA looks like this:
[Figure: Graphical diagram in Doing Bayesian Data Analysis.]
A corresponding graphical model diagram in DoodleBUGS looks something like this:
[Figure: Graphical diagram from DoodleBUGS.]
The DoodleBUGS diagrams are much like conventional graphical diagrams used in computer science and statistics.

Which diagram is better for explaining the model? For me, it's the diagrams in DBDA.
  • The diagrams in DBDA show at a glance what the distribution is for each variable. By contrast, the diagrams in DoodleBUGS do not show the distributions at all. Instead, you have to cross-reference the equations, shown elsewhere.
  • The diagrams in DBDA show which parameters "live together" in the same distribution. For example, μi and τ are seen to be the mean and precision of the same normal distribution. By contrast, the diagrams in DoodleBUGS do not show which distributions the parameters "live in". For example, we do not know from the diagram whether μi and τ are in the same distribution or not. To find out, you have to look at the equations, shown elsewhere.
  • There are other explanatory advantages of the format in DBDA. In particular, the icons of the distributions show directly whether a variable is discrete or continuous, and what its range is. For example, the icon of the gamma distribution shows that the variable is continuous and has a lower bound. The icon of the Bernoulli distribution (not illustrated here, but used repeatedly in the book) shows that the variable has two discrete values. By contrast, conventional diagrams like DoodleBUGS indicate continuous versus discrete by the arbitrary shape of the figure that surrounds the variable: oval for continuous and square for discrete. Or was it oval for discrete and square for continuous? It's easy to get confused by arbitrary conventions.

Which diagram is better for understanding the corresponding JAGS/BUGS model specification? For me, it's the diagrams in DBDA. The key reason is that the diagrams in DBDA have a much more direct correspondence to lines of code in JAGS/BUGS: usually, each arrow in the DBDA diagram corresponds to a line of code in the JAGS/BUGS model specification. Notice that the DBDA diagram above has five arrows. Each arrow has a corresponding line of code in the model specification.
Notice that the DoodleBUGS diagram also has five arrows, but those arrows have no direct correspondence to the model specification! In particular, there is no line of code that says y is related to tau, and a separate line of code that says y is related to mu, and a separate line of code that says mu is related to alpha, and another line of code that says mu is related to beta.
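A specification of this kind might look like the following minimal sketch, written in JAGS/BUGS syntax. The names beta0, beta1, x, and N, the choice of normal priors on the coefficients, and all the prior constants are assumptions for illustration; they are not taken from the book's listing. The point is only that each of the five statements lines up with one arrow in a DBDA-style diagram.

```
model {
  for ( i in 1:N ) {
    # Arrow 1: y[i] comes from a normal distribution whose mean is mu[i]
    # and whose precision is tau (both parameters appear on the same line).
    y[i] ~ dnorm( mu[i] , tau )
    # Arrow 2: mu[i] is a deterministic linear function of the predictor.
    mu[i] <- beta0 + beta1 * x[i]
  }
  # Arrows 3 and 4: priors on the intercept and slope (assumed normal,
  # with a small placeholder precision).
  beta0 ~ dnorm( 0 , 1.0E-6 )
  beta1 ~ dnorm( 0 , 1.0E-6 )
  # Arrow 5: gamma prior on the precision (placeholder constants).
  tau ~ dgamma( 0.001 , 0.001 )
}
```

In JAGS/BUGS, dnorm takes a mean and a precision rather than a standard deviation, which is why mu[i] and tau sit together in the first statement, just as they sit together in one normal-distribution icon.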

The style of diagrams in DBDA is a direct expression of the conceptual distributions and dependencies in the model. And, if you can draw such a picture, it is relatively straightforward to express it in JAGS/BUGS. But the conventional diagrams like DoodleBUGS leave out a huge amount of important conceptual information, and provide little guidance for how to express the model in JAGS/BUGS. Thus, both pedagogically and practically, I prefer the diagrams in DBDA.

One thing that might be improved in the DBDA diagrams is the specification of iteration. In its current form, iteration is indicated ambiguously with an ellipsis that does not indicate explicitly which index is being iterated. In some hierarchical models it can be unclear which index is implied. This could be clarified by using some sort of "plate" notation like what is used in DoodleBUGS, but when plates are drawn in the DBDA diagrams, the overall effect gets visually messy. A simple fix is to indicate the index and its limits next to the ellipsis.
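To see why the index matters, consider a rough sketch of a hierarchical model with two levels of iteration, again in JAGS/BUGS syntax. Everything here is hypothetical and for illustration only: the names subj, Ntotal, Nsubj, muG, and tauG, and the prior constants, are assumptions. In the code, each for loop states its index and limits explicitly, which is exactly the information a bare ellipsis in the diagram leaves out.

```
model {
  # Observation level: index i runs over all data points, 1..Ntotal.
  for ( i in 1:Ntotal ) {
    # subj[i] is the (assumed) subject label of observation i.
    y[i] ~ dnorm( mu[ subj[i] ] , tau )
  }
  # Subject level: index j runs over subjects, 1..Nsubj.
  for ( j in 1:Nsubj ) {
    mu[j] ~ dnorm( muG , tauG )
  }
  # Group-level priors (placeholder constants).
  muG  ~ dnorm( 0 , 1.0E-6 )
  tauG ~ dgamma( 0.001 , 0.001 )
  tau  ~ dgamma( 0.001 , 0.001 )
}
```

In a diagram of this model, annotating one ellipsis with "i in 1:Ntotal" and the other with "j in 1:Nsubj" would make clear which arrow is repeated over which index.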
