IBM recently announced its artificial intelligence technology is 95% accurate in predicting workers planning to leave their jobs.

The company says its “predictive attrition program”, developed using Watson Analytics, can predict which employees are ready to quit. It also says the program has bolstered retention rates and saved it $300 million.

At first glance, it seems pretty impressive. 95% is a big number.

But it’s nowhere near as impressive once you dig into the mathematics.

Assume your company has 1,000 employees and that 50 of them (5%) are going to quit. A perfect program would flag all 50.

But given the IBM program is 95% accurate, it correctly identifies only 48 of them (95% of 50 employees, rounded up from 47.5).

At the same time, however, the IBM program will incorrectly flag 48 of the remaining 950 employees (5% of 950, rounded up from 47.5) as also ready to quit. This is known as the false positive paradox.

In total, the program would flag 96 of 1,000 employees as people who could quit, but only 48 of them would actually quit. This is a false positive rate of 50% among the flagged employees: 48 divided by 96. (Statisticians call this share the false discovery rate.)

That is dramatically worse than the 5% error rate implied by IBM’s claim that the program is 95% accurate.
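The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and parameters are made up for illustration, and it assumes (as the example does) that 5% of employees will quit and that the same 95% accuracy applies both to employees who will quit and to those who will stay. It keeps the unrounded 47.5 where the text rounds up to 48.

```python
def flagged_breakdown(employees, quit_rate, accuracy):
    """Split the employees a program flags into true and false positives.

    Illustrative assumption: the same accuracy applies to employees who
    will quit and to employees who will stay.
    """
    quitters = employees * quit_rate
    stayers = employees - quitters
    true_positives = quitters * accuracy        # quitters correctly flagged
    false_positives = stayers * (1 - accuracy)  # stayers wrongly flagged
    flagged = true_positives + false_positives
    # Share of flagged employees who really are going to quit
    share_correct = true_positives / flagged
    return true_positives, false_positives, share_correct


tp, fp, share_correct = flagged_breakdown(employees=1000, quit_rate=0.05, accuracy=0.95)
print(f"true positives:  {tp:.1f}")
print(f"false positives: {fp:.1f}")
print(f"flagged employees who actually quit: {share_correct:.0%}")
```

Changing `quit_rate` or `accuracy` shows how quickly the share of correct flags moves away from the headline accuracy figure.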

First, let’s provide some context around false-positives.

By definition, a false positive is a positive test result when you should have received a negative one. It is sometimes known as a false alarm.

Some real-world examples include a positive pregnancy test when you aren’t pregnant, or antivirus software on your computer that incorrectly identifies a harmless program as malicious.

From a big picture perspective, there are some key things to consider about IBM’s 95% accuracy claim.

One of the most important is getting past the marketing hype and incomplete facts surrounding statistical accuracy (e.g. 95% accuracy rates). In a TED Talk, Peter Donnelly, an English statistician, pointed out how people are frequently fooled by statistics.

In IBM’s case, the cost of having a false-positive rate of 50% is probably not terrible because treating your employees well is a good thing, even for those who have no intention of leaving the company.

But what about a medical test?

If 50 people are inaccurately identified as having cancer or HIV, that’s a major problem.

This is why medical practitioners will often recommend a second opinion, just in case the first test gave you a false positive.

And in an autonomous braking system that relies on predictive models, 50 false positives mean 50 unnecessary braking events, each of which increases the likelihood of an accident.

The problem with false positives gets even worse when the thing you’re trying to predict is rare.

Getting a positive result for a rare disease that affects one in 1,000 people, for example, is alarming, but your odds of actually having the disease are far lower than the test’s accuracy suggests. Your risk before the test may even have been well below 0.1%, depending on your health, travel patterns, etc.
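To see how the base rate drives this, here is a rough sketch using Bayes’ rule. The 95% figure for the medical test is an assumption borrowed from the IBM example, not a property of any real test, and the function name is made up for illustration.

```python
def chance_positive_is_real(base_rate, sensitivity, specificity):
    """Bayes' rule: probability the condition is present given a positive result."""
    true_pos = base_rate * sensitivity               # has it and tests positive
    false_pos = (1 - base_rate) * (1 - specificity)  # does not have it but tests positive
    return true_pos / (true_pos + false_pos)


# Hypothetical test that is right 95% of the time for both sick and healthy people.
for base_rate in (0.05, 0.01, 0.001):
    p = chance_positive_is_real(base_rate, sensitivity=0.95, specificity=0.95)
    print(f"base rate {base_rate:.1%}: a positive result is real {p:.1%} of the time")
```

Under these assumptions, a positive result for the one-in-1,000 disease is real a little under 2% of the time, compared with 50% when the base rate is 5%.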

There are obviously many problems with false positives. They can scare people or make them think that something is better than it really is. In the case of autonomous vehicles, false positives can make technology unreliable and, as a result, deter adoption.

As well, you need to consider the costs of false positives. How much money, for example, do insurance companies waste investigating suspected fraud that turns out to be legitimate claims?

Here’s a question: if positive predictive accuracy is the wrong metric, why does everyone use it?

The simple answer is that it sounds good and sells products. But it leads to systems that generate mediocre, if not poor, results. Promoting positive predictive accuracy can, for example, persuade people to invest in a high-potential project, only to discover it has inherent flaws.

Now, if the cost of a false positive is negligible, carpet bomb away!

Bottom line: If someone claims something is 95% accurate, maybe you should ask them a few questions.
