Use of statistics in setting insurance rates

November 9, 2012

(This article was originally published at Numbers Rule Your World, and syndicated at StatsBlogs.)

Reader Mark Johnstone, from across the pond, points me to some fascinating materials that are highly relevant to those who have read Chapter 3 of Numbers Rule Your World, where I explain the statistical underpinnings of insurance policies. It's unfortunate that our policy-makers do not understand the probabilistic nature of this business and create rules that are self-defeating.

The EU Court recently decreed that insurers are not allowed to use gender as a factor in determining how much one pays for insurance. (WSJ article here; also PDF of the ruling). For example, life insurers sell policies to women at lower average prices than to men because women, on average, live longer than men. The argument against such differential pricing is that it discriminates between people. That argument fundamentally undermines the entire concept of insurance.

As I said in the book, insurance is an ingenious and extremely human (and humane) concept. It turns what seems like a bad thing (cross-subsidies) into a good thing (lower average prices for everyone). It protects the unlucky few. It is always the case that some people subsidize other people. But we accept that because it is not known beforehand who will suffer an early death and who will have a long life.

The insurance scheme fails if some customers feel that the subsidy has been set up unfairly, that is, if they believe they are sure losers who subsidize others in the scheme. This is why insurers give women lower prices. Otherwise, women would drop out of the insurance pool if and when they realized they would always be subsidizing the men.


This ruling may have broad implications for all kinds of businesses because the predictive models used in marketing, credit scoring, online ad targeting, etc. commonly include gender as one of the factors. I have always understood that if gender is only one of many factors in a multi-factorial model, then we are acting in good faith.

In Chapter 2 of the book, I explained how a simple credit-scoring model works. If gender is only one of many factors in the model, then not all women receive the same credit score. Amongst women, other factors such as income, education, the kinds of magazines you read, etc. determine one's particular score.

That said, if someone were to remove all other factors from the model, that is to say, if you aggregate everything else and come up with the credit score for the average woman and that for the average man, they are certainly not identical. But that is a very poor use of statistical aggregation. That's when you throw out a lot of important data to arrive at an overly simplistic conclusion.
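To make the contrast concrete, here is a minimal sketch of a toy additive score in which gender is one small term among several. Every applicant, weight, and the `score` function itself are invented for illustration; none of it comes from a real credit model or from the book.

```python
from statistics import mean

# Hypothetical applicants; every number here is invented for illustration.
applicants = [
    {"gender": "F", "income": 80, "years_of_credit": 12},
    {"gender": "F", "income": 35, "years_of_credit": 3},
    {"gender": "F", "income": 55, "years_of_credit": 8},
    {"gender": "M", "income": 90, "years_of_credit": 15},
    {"gender": "M", "income": 30, "years_of_credit": 2},
    {"gender": "M", "income": 50, "years_of_credit": 6},
]

def score(a):
    """Toy additive score: gender is one small term among several."""
    points = 300 + 3 * a["income"] + 10 * a["years_of_credit"]
    if a["gender"] == "F":
        points += 15  # assumed weight, not taken from any real model
    return points

scores = [(a["gender"], score(a)) for a in applicants]

def by_gender(g):
    """Collect the individual scores for one gender."""
    return [s for gender, s in scores if gender == g]

women, men = by_gender("F"), by_gender("M")

# Individual scores vary widely within each gender...
print("women:", women)
print("men:  ", men)
# ...but collapsing to one number per gender keeps only the small gap.
print("average woman:", round(mean(women), 1))
print("average man:  ", round(mean(men), 1))
```

In this made-up data the two averages differ by 15 points, while the scores among the women alone span 225 points. Reporting only the averages discards almost everything the model knows about each individual.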

[Figure: an example of such aggregation]

In Chapter 2, I made the argument that for credit-scoring applications, statistical models based on correlations are sufficient. The same reasoning applies to insurance: it doesn't matter whether gender is what causes a driver to be less likely to be responsible for car accidents. If people who satisfy a set of criteria (one of which may be being female) consistently have much lower risk, then to me, that is enough to give such people lower rates. The Court rejects this argument and suggests that only causal models can be applied (which is as good as saying that no models can be used).


Mark told me one of the proposals out there is "to fit 'black boxes' into cars so more individual data can be collected, as opposed to relying heavily on aggregates". Presumably, this suggestion is made to automobile insurers.

You know what, the statistics cannot be suppressed, and the result of using this method will not be materially different. If you now take the rates set by these "black boxes" and aggregate them by gender, as I described above, it is again certain that the rate paid by the average man will differ from that paid by the average woman. The two averages would be equal only if male and female drivers caused the same proportion of accidents, which is demonstrably false.
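A rough sketch of that point, under loud assumptions: the telematics records, the `black_box_rate` formula, and the hard-braking metric below are all hypothetical. The pricing function never sees gender; gender is kept only so the rates can be averaged afterwards.

```python
from statistics import mean

# Hypothetical telematics records: (gender, hard_brakes_per_100_miles).
# Gender is stored only for the after-the-fact aggregation; the pricing
# function below never receives it. All numbers are invented.
drivers = [
    ("F", 1.0), ("F", 2.0), ("F", 4.0), ("F", 1.0),
    ("M", 2.0), ("M", 5.0), ("M", 3.0), ("M", 6.0),
]

def black_box_rate(hard_brakes):
    """Toy rate: a base premium plus a surcharge per risky event."""
    return 400 + 50 * hard_brakes

# Price each driver on individual behavior alone.
rates = [(gender, black_box_rate(x)) for gender, x in drivers]

# Aggregate the individually set rates by gender.
avg_rate = {
    g: mean(rate for gender, rate in rates if gender == g)
    for g in ("F", "M")
}
print(avg_rate)
```

With these invented records the average woman pays 500 and the average man pays 600, even though gender was never an input. If the underlying behavior differs between groups, so will the aggregated rates.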


Would insurance premiums go up as a result of this?

It would seem to me that women will pay more and men may pay slightly less. Eventually, some women will realize that their premiums essentially subsidize men. If they start dropping out of the pool, then everyone's premiums would rise.
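The arithmetic behind this can be sketched as follows. The pool sizes and expected claim costs are assumptions chosen for round numbers, and `break_even_premium` is a hypothetical helper, not an actuarial formula from the book.

```python
def break_even_premium(n_women, n_men, cost_women=400.0, cost_men=600.0):
    """Single unisex premium that exactly covers expected claims.

    cost_women and cost_men are assumed expected annual claim costs
    per person, invented for illustration.
    """
    total_cost = n_women * cost_women + n_men * cost_men
    return total_cost / (n_women + n_men)

# Start with 100 women and 100 men, then let women leave the pool.
for n_women in (100, 50, 0):
    premium = break_even_premium(n_women, 100)
    print(f"{n_women:3d} women in pool -> unisex premium {premium:.2f}")
```

Under these assumptions the unisex premium starts at 500, so each woman pays 100 more than her expected cost and each man pays 100 less. As women leave, the premium for everyone who remains climbs toward the men's cost of 600.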


Would the EU Court stop at gender? By the same logic, discriminating on age, education, income, the journals you subscribe to, etc. is arguably just as bad.

One other point I made in Chapter 2 is that automated models are just super-charged versions of manual rate-setting. If a human being is asked to set rates for different customers, he or she would end up setting lower prices for the average woman anyway. In fact, from the very beginning, actuarial tables have gender breakdowns because longevity clearly differs by gender.



