What’s the origin of the term “chasing noise” as applying to overinterpreting noisy patterns in data?

Roy Mendelssohn writes:

In an internal discussion at work I used the term “chasing noise”, which really grabbed a number of the people involved. My memory is that I first saw the term (or something similar) on your blog, and it made me curious about who may have first used it. Did you hear it first from someone, or do you have any idea who may have first used the term, or something close to it?

My reply:

The term seems so natural. I don’t know if I heard it from somewhere. Here’s where I used it in 2013. I’ve also used the related term “noise mining.”

A quick Google search turned up this 2012 article, Chasing Noise, by Brock Mendel and Andrei Shleifer in the Journal of Financial Economics, but they’re using the term slightly differently, referring not to overfitting explanations of noisy statistical findings, but to random economic behavior.

Roy then gave some background:

The term came up because I am a firm believer that if you ignore spatial and temporal correlation in space-time data, as many analyses do, you end up uncovering patterns that are transitory in the dynamical sense. Either you have overestimated the effective sample size (the same ESS that the talks on Stan discuss when analyzing the chains), or you are just being fooled by the seeming patterns that noise produces when data are dependent. Actually, this happens even when the data are independent: when state lotteries started, I knew quite a few people who were positive they had found a pattern in the numbers, and sure enough they all lost a fair amount of money.
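To make Roy's point about effective sample size concrete, here is a minimal sketch. It is not from Roy's analysis. It assumes the simplest possible dependence structure, a stationary AR(1) series with lag-1 autocorrelation rho, for which the effective sample size for estimating the mean is approximately n(1 - rho)/(1 + rho). The function names, the choice of rho = 0.8, and the series length are all made up for illustration; real space-time data would need a richer model.

```python
import random


def simulate_ar1(n, rho, seed=0):
    """Simulate a stationary AR(1) series x[t] = rho * x[t-1] + e[t], |rho| < 1."""
    rng = random.Random(seed)
    # Draw the first value from the stationary distribution, variance 1 / (1 - rho^2).
    x = [rng.gauss(0.0, 1.0) / (1.0 - rho ** 2) ** 0.5]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def effective_sample_size(n, rho):
    """Approximate ESS for the mean of an AR(1) series (large-n approximation)."""
    return n * (1.0 - rho) / (1.0 + rho)


# Illustrative values only: 5000 points with strong positive dependence.
series = simulate_ar1(n=5000, rho=0.8, seed=1)
rho_hat = lag1_autocorr(series)
n_eff = effective_sample_size(len(series), rho_hat)
print(f"nominal n = {len(series)}, estimated rho = {rho_hat:.2f}, "
      f"approximate effective n = {n_eff:.0f}")
```

With rho near 0.8 the approximation gives an effective sample size of roughly a ninth of the nominal one, so treating the points as independent would make standard errors look about three times smaller than they should be. That is one way to end up chasing noise.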

Anyway, if any of you know further history on this use of the expression “chasing noise” as applying to overinterpretation of noisy patterns in data, please let us know in the comments.