(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)

In Bayesian Data Analysis, we write, “In general, we call a prior density p(θ) *proper* if it does not depend on data and integrates to 1.” This was a step forward from the usual understanding which is that a prior density is improper if an infinite integral.

But I’m not so thrilled with the term “proper” because it has different meanings for different people.

Then the other day I heard Dan Simpson and Mike Betancourt talking about “non-generative models,” and I thought, Yes! this is the perfect term! First, it’s unambiguous: a non-generative model is a model for which it is not possible to generate data. Second, it makes use of the existing term, “generative model,” hence no need to define a new concept of “proper prior.” Third, it’s a statement about the model as a whole, not just the prior.

I’ll explore the idea of a generative or non-generative model through some examples:

Classical iid model, y_i ~ normal(theta, 1), for i=1,…,n. This is not generative because there’s no rule for generating theta.

Bayesian model, y_i ~ normal(theta, 1), for i=1,…,n, with uniform prior density, p(theta) proportional to 1 on the real line. This is not generative because you can’t draw theta from a uniform on the real line.

Bayesian model, y_i ~ normal(theta, 1), for i=1,…,n, with data-based prior, theta ~ normal(y_bar, 10), where y_bar is the sample mean of y_1,…,y_n. This model is not generative because to generate theta, you need to know y, but you can’t generate y until you know theta.

In contrast, consider a Bayesian model, y_i ~ normal(theta, 1), for i=1,…,n, with non-data-based prior, theta ~ normal(0, 10). This is generative: you draw theta from the prior, then draw y given theta.

Some subtleties do arise. For example, we’re implicitly conditioning on n. For the model to be fully generative, we’d need a prior distribution for n as well.

Similarly, for a regression model to be fully generative, you need a prior distribution on x.

Non-generative models have their uses; we should just recognize when we’re using them. I think the traditional classification of prior, labeling them as improper if they have infinite integral, does not capture the key aspects of the problem.

**P.S.** Also relevant is this comment, regarding some discussion of models for the n:

As in many problems, I think we get some clarity by considering an existing problem as part of a larger hierarchical model or meta-analysis. So if we have a regression with outcomes y, predictors x, and sample size n, we can think of this as one of a larger class of problems, in which case it can make sense to think of n and x as varying across problems.

The issue is not so much whether n is a “random variable” in any particular study (although I will say that, in real studies, n typically is not precisely defined ahead of time, what with difficulties of recruitment, nonresponse, dropout, etc.) but rather that n can vary across the reference class of problems for which a model will be fit.

The post Don’t say “improper prior.” Say “non-generative model.” appeared first on Statistical Modeling, Causal Inference, and Social Science.

**Please comment on the article here:** **Statistical Modeling, Causal Inference, and Social Science**