A call for less automation, more transparency in digital advertising

August 8, 2017

(This article was originally published at Big Data, Plainly Spoken (aka Numbers Rule Your World), and syndicated at StatsBlogs.)

The Facebook Split Testing feature is an example of nice math failing to produce nice results when released into the wild. I wrote about my experience in a prior blog post here.

I should be quite upset about the absurd results of my tests, not just because of the wasted time but also because of the wasted money. During those botched tests, I paid Facebook dollars for each ad impression rendered. If I were a larger advertiser, I'd be far more irritated.

The testing platform can be made instantly more useful by offering users some degree of control:

(a) let users impose a waiting period before the optimization algorithm kicks in

(b) let users manage the "learning rate", i.e. the speed at which the algorithm responds to the response rate differential

[Note: One issue with letting users control the running of this type of algorithm is that many would fall prey to the narrative fallacy. It is fairly easy to convince oneself that any of-the-moment result conforming to one's beliefs is credible. This is why many experimenters discourage "peeking." What I am saying is that zero control is worse.]
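To make concrete what controls (a) and (b) could look like, here is a minimal sketch of one allocation-update step with a user-set waiting period and learning rate. Facebook's actual algorithm is undisclosed; the function name, update rule, and defaults below are all illustrative assumptions, not the platform's implementation.

```python
def next_allocation(current, impressions, clicks, warmup=1000, learning_rate=0.1):
    """One step of an illustrative traffic-allocation update with the two
    user controls proposed above. This is a sketch, not Facebook's
    (undisclosed) algorithm.

    current:     list of traffic shares per variant (sums to 1)
    impressions: impressions served so far per variant
    clicks:      responses observed so far per variant
    """
    n = len(current)
    # (a) waiting period: serve evenly until `warmup` total impressions
    if sum(impressions) < warmup:
        return [1.0 / n] * n
    # observed response rate per variant
    rates = [c / i if i else 0.0 for c, i in zip(clicks, impressions)]
    best = rates.index(max(rates))
    # target allocation: all traffic to the current leader
    target = [1.0 if j == best else 0.0 for j in range(n)]
    # (b) learning rate: 0 = never move off an even split,
    #     1 = jump all traffic to the leader in a single step
    return [(1 - learning_rate) * cur + learning_rate * tgt
            for cur, tgt in zip(current, target)]
```

With `learning_rate=0.2` and response rates of 10% vs 5%, a 50/50 split moves to 60/40 in one step rather than collapsing immediately; a longer `warmup` delays that shift until more data has accumulated.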

The fact that the Facebook testing platform doesn't allow any user control is symptomatic of a destructive bargain between mainstream software designers and average software users. Facebook is by no means the only company following this paradigm.

Many software designers chase full automation. The ultimate goal of this paradigm is to remove human beings from processes. But human beings are nosy and meddlesome. If users are allowed to monitor the details of algorithms, they can't help but offer suggestions and corrections. To really get rid of human interference, the designers make the system opaque.

Some users like opaqueness. It releases them from accountability. You take the reports coming out of the opaque system and regurgitate them to your supervisor. So long as the reports show good news, few questions are asked. If there are doubts, you just shrug: you don't have access to the data or any knowledge of the system's implementation.


The lack of user control is what frustrates me about the Facebook platform. I also have zero insight into which specific algorithm is being deployed, which allocation-shifting rules are implemented, or how the learning rate has been set. The documentation does not explain much. I am pretty sure the algorithm failed me because it wasn't tuned properly, but I have zero knowledge of how it was tuned, or with what data. The software designers dictated all of those things, and if you want to advertise on Facebook, that is what you must live with.

In effect, Facebook is asking users to give blind trust to an opaque process.

The pricing model is equally opaque. For example, an advertiser can run ads with the explicit goal of generating "leads". However, the advertiser is not allowed to pay Facebook based on the number of leads generated; the only setting allowed here is to pay by the number of impressions served. This model is not pay-for-performance but pay-for-"effort."

I won't get into the controversy surrounding the serving of ads - let's just say that many "impressions served" could never have been seen by a human being. A recent Google experiment (link, tip from Augustine Fou) confirms that impressions are being counted even when the "programmatic" advertising platform is turned off! This is an industry-wide crisis, and it's not my intention to single out Facebook.

If the advertiser accepts the pay-for-effort structure, then he or she runs into another wall. How much will the advertiser be charged per (thousand) impressions served? The answer is: the technocrats can't tell you - there is no list price, and actual prices vary widely. In my tests, the cost per thousand impressions differed every day even though everything else stayed the same.
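The only price an advertiser can pin down is the effective cost per thousand impressions (CPM), computed after the fact from the invoice; the function below is just that back-of-the-envelope arithmetic, with illustrative numbers.

```python
def effective_cpm(total_spend, impressions_served):
    """Effective cost per thousand impressions (CPM): the only rate the
    advertiser can compute after the fact, since the actual price is set
    by the platform's undisclosed auction and varies day to day."""
    return total_spend / impressions_served * 1000.0

# e.g. $12.50 spent on 5,000 impressions works out to a $2.50 CPM,
# but nothing guarantees the same rate tomorrow
```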

If pressed, Facebook and Google and pretty much everyone in the business will bring out their "weapon of math destruction" (to steal a term from Cathy "Mathbabe" O'Neil): the price is determined by a real-time auction.

Is there anything the advertiser can know about this auction? No. You can read theoretical papers about generalized second-price auctions (which, in the generalized case, do not reveal bidders' true values). But you shall not see any data on the specific auctions that determined the specific prices you are being invoiced. And of course, you shall not review the code used to implement such auctions. Under these conditions, the advertiser can be charged any price, with no way to contest it.
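The textbook mechanism those papers describe is easy to state: sort the bids, and the winner of each ad slot pays the next-ranked bid. The sketch below shows only that idealized rule; real platform auctions also fold in quality scores, reserve prices, and other undisclosed factors, which is exactly why the invoiced price cannot be verified from the outside.

```python
def gsp_prices(bids):
    """Textbook generalized second-price (GSP) auction for ad slots:
    rank bidders by bid; the winner of slot i pays the bid of the
    bidder ranked i+1. Real platform auctions add quality scores and
    reserve prices (omitted here, as they are undisclosed).

    Returns (bidder indexes in slot order, price paid per slot)."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    # each slot's price is the next-ranked bid; the lowest winner pays 0
    prices = [bids[order[i + 1]] if i + 1 < len(order) else 0.0
              for i in range(len(order))]
    return order, prices
```

Note that each price depends entirely on competitors' bids, which the advertiser never sees, so even under this clean mechanism the invoice is unverifiable without the auction data.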


The digital platforms (Facebook, Google, everyone else) all report on themselves. They hold on to the data for dear life. How many impressions were served? How many people clicked on the ads? How many people shared, liked, commented, forwarded, etc.? Lots of metrics are shown to users, none of which are verifiable. (If the ad is supposed to drive sales, and the sale transaction occurs on the advertiser's own website, and the advertiser's internal web operations team keeps track of such transactions, then those specific end-points can be verified, although the connection of any given transaction to an ad view or ad click is still tenuous.)

These platforms create their own virtual currencies (Facebook likes, Twitter follows/retweets, etc.), they sell these virtual coins in exchange for real dollars, and they determine the exchange rate via an obscure auction.


Digital advertising as currently practiced falls far short of its promises of measurability, accountability, and efficiency. If traditional (TV) advertising is considered wasteful, most of the money pouring into digital ads is wasted too.

The industry needs to reform by shifting the focus from automation to transparency.

For example, if one were to implement a real-time optimization platform for A/B testing, users ought to have access to the following:

  1. Some degree of control over the tuning parameters of the underlying algorithm
  2. Run-time statistics of the algorithm, covering not just outcomes but also operating metrics
  3. Detailed data, not just data summaries
  4. Reviewable code for the algorithm
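As a concrete illustration of the four items above, a transparent platform could return something like the following report object. Every field name here is hypothetical; no platform currently exposes anything like this.

```python
from dataclasses import dataclass

@dataclass
class TransparentTestReport:
    """Hypothetical report a transparent testing platform could return,
    matching the four requirements above. All fields are illustrative."""
    # 1. tuning parameters the user set (or accepted as defaults)
    tuning: dict            # e.g. {"warmup": 1000, "learning_rate": 0.1}
    # 2. run-time operating metrics, not just outcomes
    allocation_history: list  # traffic split after each update step
    # 3. detailed data, not just summaries
    raw_events: list        # per-impression records (variant, timestamp, response)
    # 4. pointer to reviewable code for the algorithm
    algorithm_source: str   # e.g. a repository URL or commit hash
```

With such a report, an advertiser could replay the allocation decisions against the raw events and check that the published code, run with the published tuning, actually produces the traffic splits observed.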


P.S. This rant by an early investor in Facebook and Google is very timely and a great complementary read.




