Flaws in stupid horrible algorithm revealed because it made numerical predictions

Kaiser Fung points to this news article by David Jackson and Gary Marx:

The Illinois Department of Children and Family Services is ending a high-profile program that used computer data mining to identify children at risk for serious injury or death after the agency’s top official called the technology unreliable. . . .

Two Florida firms — the nonprofit Eckerd Connects and its for-profit partner, Mindshare Technology — mined electronic DCFS files and assigned a score of 1 to 100 to children who were the subject of an abuse allegation to the agency hotline. The algorithms rated the children’s risk of being killed or severely injured during the next two years, according to DCFS public statements.

OK, this could work. But then:

More than 4,100 Illinois children were assigned a 90 percent or greater probability of death or injury . . . 369 youngsters, all under age 9, got a 100 percent chance of death or serious injury in the next two years . . . At the same time, high-profile child deaths kept cropping up with little warning from the predictive analytics software . . . The DCFS automated case-tracking system was riddled with data entry errors . . .

Illinois child care agencies told the Tribune they were alarmed by computer-generated alerts like the one that said: “Please note that the two youngest children, ages 1 year and 4 years have been assigned a 99% probability by the Eckerd Rapid Safety Feedback metrics of serious harm or death in the next two years.”

Check out this response:

“We all agree that we could have done a better job with that language. I admit it is confusing,” said Eckerd spokesman Douglas Tobin.

Ummm . . . “confusing”? That’s all you can say? How about, “We screwed up. Our numbers were entirely wrong.”

“We could have done a better job with that language.” Jesus. Do these people have no shame? Did they go to the George Orwell Newspeak school of communication?

The statistics

We can draw two useful lessons from this fiasco:

1. Predictive model checking is important, indeed vital. Get testable, actionable predictions from your data, and look at these predictions carefully and critically. This is the same thing that I do when I read a scientific paper: I look at its specific numbers. For example, when I saw a paper with a graph saying that (a) in a certain area in China the life expectancy was 91 years, and (b) a certain policy had an effect of -5 years, implying an expected life expectancy of 96 years in that area, I was suspicious. The paper had a lot of problems, but the good news is that it put its data right out there so I and others were able to see that something had to be wrong.

Similarly, Eckerd really screwed up with its algorithm—but it’s good they gave numerical probabilities, as this supplies the loud siren that makes us realize that these numbers are bad.

What would be worse is if Eckerd had just reported the predictions as “high risk,” “mid risk,” and “low risk,” or something like that. The stupid probability numbers were a feature, not a bug, as they made the problems apparent.
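
To make the idea of checking predictions concrete, here is a minimal sketch in Python of two checks that numerical probabilities allow: counting how many cases get a near-certain prediction, and comparing predicted probabilities with observed outcome rates within probability bins. This is not Eckerd's method or anyone's production code. The function names, the 0.99 threshold, and the toy numbers are all made up for illustration.

```python
from collections import defaultdict


def count_extreme(predicted, threshold=0.99):
    """Count predictions at or above a near-certainty threshold (hypothetical cutoff)."""
    return sum(1 for p in predicted if p >= threshold)


def calibration_table(predicted, observed, n_bins=10):
    """Compare mean predicted probability with the observed event rate per bin.

    predicted: probabilities in [0, 1]
    observed:  0/1 outcomes, same length as predicted
    Returns a list of (bin_index, n, mean_predicted, observed_rate) tuples,
    one per non-empty bin, in increasing bin order.
    """
    bins = defaultdict(list)
    for p, y in zip(predicted, observed):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # Equal-width bins; p == 1.0 goes into the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        n = len(pairs)
        mean_pred = sum(p for p, _ in pairs) / n
        obs_rate = sum(y for _, y in pairs) / n
        table.append((idx, n, mean_pred, obs_rate))
    return table


# Toy numbers, invented purely for illustration (not real case data).
toy_predicted = [0.05, 0.10, 0.95, 0.99, 1.00, 1.00]
toy_observed = [0, 0, 0, 1, 0, 0]

print("near-certain predictions:", count_extreme(toy_predicted))
for idx, n, mean_pred, obs_rate in calibration_table(toy_predicted, toy_observed):
    print(f"bin {idx}: n={n}, mean predicted={mean_pred:.3f}, observed rate={obs_rate:.3f}")
```

If the top bin shows a mean predicted probability near 1 and an observed rate far below that, the siren is sounding. None of this is possible if all you have are labels like "high risk."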

2. Data quality is important. The best algorithm in the world won’t work if you’re not serious about your data. And it seems that these people were much more serious about making serious money via no-bid contracts (see below) than about their data.

The politics

From the news article:

The $366,000 Rapid Safety Feedback program was central to reforms promised by . . . [former Illinois Department of Children and Family Services Director] George Sheldon . . .

A May 2017 Tribune investigation found the arrangement with Eckerd was among a series of no-bid deals Sheldon gave to a circle of associates from his previous work in Florida as a child welfare official, lawyer and lobbyist. Sheldon left Illinois under a cloud a month later, and a July joint report by the Office of Executive Inspector General and the DCFS inspector general concluded that Sheldon and DCFS committed mismanagement by classifying the Eckerd/Mindshare arrangement as a grant, instead of as a no-bid contract. . . .

Eckerd Connects — which recently changed its name from Eckerd Kids — told the Tribune that variants of its Rapid Safety Feedback are used today by child welfare agencies in Ohio, Indiana, Maine, Louisiana, Tennessee, Connecticut and Oklahoma. . . .

Even before arriving in Illinois, Sheldon had professional ties to both Eckerd and Mindshare.

He is quoted on Mindshare’s website endorsing that company and its technology. And as head of Florida’s child welfare agency, he worked closely with Eckerd, which runs child welfare programs in Florida’s Hillsborough County under a $73 million state contract, using for-profit companies as subcontractors.

When Sheldon arrived in Illinois in 2015, he appointed Eckerd’s Chief External Relations Officer Jody Grutza to a $125,000 senior DCFS position. While Grutza did not supervise the Eckerd contract, Sheldon put her in charge of overseeing other deals with Sheldon’s Florida associates, including a Five Points Technology contract that paid $262,000 to Christopher Pantaleon, a longtime Sheldon aide with whom Sheldon owned Florida property, the Tribune revealed in a July report.

After a year in Illinois, Grutza returned to a top position with Eckerd in Florida.

The post Flaws in stupid horrible algorithm revealed because it made numerical predictions appeared first on Statistical Modeling, Causal Inference, and Social Science.