When I’m giving talks explaining how multilevel modeling can resolve some aspects of the replication crisis, I mention this well-known saying in mathematics: “When a problem is hard, solve it by embedding it in a harder problem.” As applied to statistics, the idea is that it could be hard to analyze a single small study, as inferences can be sensitive to the prior, but if you consider this as one of a large population or long time series of studies, you can model the whole process, partially pool, etc.

In math, examples of embedding into a harder problem include using the theory of ideals to solve problems in prime numbers (ideals are a general class that includes primes as a special case, hence any theory on ideals is automatically true on primes but is more general), using complex numbers to solve problems with real numbers, and using generating functions to sum infinite series.

That last example goes like this. You want to compute

S = sum_{n=1}^{infinity} a_n, but you can’t figure out how to do it. So you write the generating function,

G(x) = sum_{n=1}^{infinity} a_n x^n,

you then do some analysis to figure out G(x) as a function of x, then your series is just S = G(1). And it really works. Cool.

Anyway, I thought that next time I mention this general idea, it would be fun to demonstrate with an example, so one day when I was sitting in a seminar with my notebook, I decided to try to work one out.

I thought I’d start with something simple, like this:

S = 1/1^2 + 1/2^2 + 1/3^2 + 1/4^2 + . . .

That is, S = sum_{n=1}^{infinity} n^{-2}

Then the generating function is,

G(x) = sum_{n=1}^{infinity} n^{-2} x^n.

To solve for G(x), we take some derivatives until we can get to something we can sum directly.

First one derivative:

dG/dx = sum_{n=1}^{infinity} n^{-1} x^{n-1}.

OK, taking the derivative again will be a mess, but we can do this:

x dG/dx = sum_{n=1}^{infinity} n^{-1} x^n.

And now we can differentiate again!

d/dx (x dG/dx) = sum_{n=1}^{infinity} x^{n-1}.

Hey, that one we know! It’s 1 + 1/x + 1/x^2 + . . . = 1/(1-x).

So now we have a differential equation:

xG”(x) + G'(x) = 1/(1-x).

Or maybe better to write as,

x(1-x) G”(x) + (1-x) G'(x) – 1 = 0.

Either way, it looks like we’re close to done. Just solve this second-order differential equation. Actually, even easier than that. Let h(x) = G'(x), then we just need to solve,

x(1-x) h'(x) + (1-x) h(x) – 1 = 0.

Hey, that’s just h(x) = -log(1-x) / x. I can’t remember how I figured that one out—it’s just there in my notes—but there must be some easy derivation. In any case, it works:

h'(x) = log(1-x)/x^2 + 1/(x(1-x)), so

x(1-x) h'(x) = log(1-x)*(1-x)/x + 1

(1-x) h(x) = -log(1-x)*(1-x)/x

So, yeah, x(1-x) h'(x) + (1-x) h(x) – 1 = 0. We’ve solved the differential equation!

And now we have the solution:

G(x) = integral dx (-log(1-x) / x).

This is an indefinite integral but that’s not a problem: we can see that, trivially, G(0) = 0, so we just have to do the integral starting from 0.

At this point, I was feeling pretty good about myself, like I’m some kind of baby Euler, racking up these sums using generating functions.

All I need to do is this little integral . . .

OK, I don’t remember integrals so well. It must be easy to do it using integration by parts . . . oh well, I’ll look it up when I come into the office, it’ll probably be an arcsecant or something like that. But then . . . it turns out there’s no closed-form solution!

Here it is in Wolfram alpha (OK, I take back all the things I said about them):

OK, what’s Li_2(x)? Here it is:

Hey—that’s no help at all, it’s just the infinite series again.

So my generating-function trick didn’t work. Next step is to sum the infinite series by integrating it in the complex plane and counting the poles. But I *really* don’t remember that! It’s something I learned . . . ummm, 35 years ago. And probably forgot about 34 years ago.

So, yeah, my math is rusty.

But I still like the general principle: When a problem is hard, solve it by embedding it in a harder problem.

**P.S.** We can use this example to teach a different principle of statistics: the combination of numerical and analytic methods.

How do you compute S = sum_{n=1}^{infinity} n^{-2}?

Simplest approach is to add a bunch of terms; for example, in R:

S_approx_1 <- sum((1:1000000)^(-2)).
This brute-force method works fine in this example but it would have trouble if the function to evaluate is expensive.

Another approach is to approximate the sum by an integral; thus:

S_approx_2 <- integral_{from x=0.5 to infinity} dx x^{-2} = 2. (The indefinite integral is just -1/x, so the definite integral is 1/infinity - (-1/0.5) = 2.) You have to start the integral at 0.5 because the sum starts at 1, so the little bars to sum are [0.5,1.5], [1.5,2.5], etc.
That second approximation isn't so great at the low end of x, though, where the curve 1/x^2 is far from locally linear.
So we can do an intermediate approximation:

S_approx_3 <- sum((1:N)^(-2)) + integral_{from x=(N+0.5) to infinity} dx x^{-2} = sum((1:N)^(-2)) + 1/(N+0.5).

That last approximation is fun because it combines numerical and analytic methods. And it works! Just try N=3:

S_approx = 1 + 1/4 + 1/9 + 1/3.5 = 1.647.

The exact value, to three decimal places, is 1.644. Not bad.

There are better approximation methods out there; the point is that even a simple approach of this sort can do pretty well. And I’ve seen a lot of simulation studies that are done using brute force where the answers just don’t make sense, and where just a bit of analytical work at the end could’ve made everything work out.

**P.P.S. Tomorrow’s post: Deterministic thinking (“dichotomania”): a problem in how we think, not just in how we act.**