Visible and invisible errors, and the nefarious power of suggestion

December 20, 2017

(This article was originally published at Big Data, Plainly Spoken (aka Numbers Rule Your World), and syndicated at StatsBlogs.)

In Chapter 4 of Numbers Rule Your World, I explain how predictive models make errors. One of the key ideas in the chapter is the visibility of errors. Modelers are much more likely to care about visible errors than invisible errors. Most classic discussions of predictive accuracy ignore this important issue, and assume that all errors, and all types of errors, are visible. In real life, many errors are invisible, or can be deliberately hidden from users.

Consider the failed child-abuse prediction model used by Chicago and described in a previous post. The errors made by this model are highly visible, and frustrating to case workers. This is why this scam was exposed. Case workers have had to work through thousands of cases of false positives. False negatives are equally visible, in the form of at-risk children who were not flagged for investigation.

Errors by auto-correction software are visible and offensive like mosquito bites. But we mostly react to the false positive errors, when a perfectly fine word has been changed to the wrong word. It's less noticeable if the software misses a wrong word or a typo. Not surprisingly, such software gets a bad rep.


By contrast, lots of predictive algorithms generate errors of prediction that are invisible to users.

For example, many drivers use Google's Waze for navigation. Waze gives predictions of how much time the driver will save by following its route. There is no way for the driver to measure whether that prediction is accurate.

Another example, which I highlighted in my book, is anti-doping tests to catch cheating athletes. If the test commits a false negative error, you bet the cheating athlete will not call a press conference to taunt the testing labs for their big mistake. If the test gives a false positive, however, the pleas of innocence would be deafening.


Sometimes, it takes expertise to notice the errors. Take Google Translate. If the target language is completely unknown to the user, the user can't tell if the translation is good or not.

Rainfall forecasts usually come as probabilities, such as a 30 percent chance of rain. It takes quite a bit of effort to validate a prediction of a 30-percent chance; someone who's not a scientist is unlikely to have the tools to do it.
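To give a sense of what that validation involves, here is a minimal sketch in Python. A single 30-percent forecast cannot be judged right or wrong on its own; one has to collect many days with the same stated probability and check how often it actually rained. The function name calibration_table and the data below are made up for illustration and do not come from any real forecaster.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Group days by stated rain probability and compare each group
    with the fraction of those days on which it actually rained."""
    counts = defaultdict(lambda: [0, 0])  # probability -> [days, rainy days]
    for prob, rained in zip(forecasts, outcomes):
        counts[prob][0] += 1
        counts[prob][1] += int(rained)
    # probability -> (number of days, observed rain frequency)
    return {p: (n, rainy / n) for p, (n, rainy) in sorted(counts.items())}


# Hypothetical record: ten days forecast at 30%, ten days forecast at 70%.
forecasts = [0.3] * 10 + [0.7] * 10
outcomes = [True] * 3 + [False] * 7 + [True] * 6 + [False] * 4

table = calibration_table(forecasts, outcomes)
for prob, (days, freq) in table.items():
    print(f"forecast {prob:.0%}: {days} days, rained on {freq:.0%} of them")
```

In this made-up record, the 30-percent days match the forecast while the 70-percent days fall a little short. With only ten days per group, such a gap could easily be noise, which is part of why a casual user rarely has enough data to notice these errors.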

These errors can be visible but usually stay in the shadows.


Errors tend to feel worse if they are visible. Invisible errors present opportunities for mischief.

Take Waze for example. At the start of your journey, Waze will proudly announce "I have chosen the most optimal route for your trip. You will save 8.3 minutes by following this route." Drivers have no way of verifying whether (a) the route suggested is indeed the most optimal, or (b) this route saved 8.3 minutes relative to some other route, rather than 10.5 minutes, or 2.4 minutes, or any other number. But many drivers are convinced that they have saved time. The uncorroborated perception of time saving demonstrates the power of suggestion. By contrast, old-school GPS devices such as TomTom or Garmin do not make such claims, to their detriment.

Such suggestions only work if the errors are invisible. When the developer of the child-abuse prediction model for Chicago tried the same trick as the developer of Waze, they ran into trouble. The Chicago Tribune reported that the predictive model generated alerts to social workers of the form: 

Please note that the two youngest children, ages 1 year and 4 years have been assigned a 99% probability by the Eckerd Rapid Safety Feedback metrics of serious harm or death in the next two years.

It's rather easy to disprove this statement because the errors are easy to spot. Eckerd has agreed to tone down the language.


