Answer to Monday Brain Teaser

Bayes’ theorem is useful in evaluating the result of drug tests. Suppose a certain drug test is 99% sensitive and 99% specific, that is, the test will correctly identify a drug user as testing positive 99% of the time, and will correctly identify a non-user as testing negative 99% of the time. This would seem to be a relatively accurate test, but Bayes’ theorem will reveal a potential flaw. Let’s assume a corporation decides to test its employees for opium use, and 0.5% of the employees use the drug. We want to know the probability that, given a positive drug test, an employee is actually a drug user. Let “D” be the event of being a drug user and “N” indicate being a non-user. Let “+” be the event of a positive drug test. We need to know the following:

   * P(D), or the probability that the employee is a drug user, regardless of any other information. This is 0.005, since 0.5% of the employees are drug users. This is the prior probability of D.

   * P(N), or the probability that the employee is not a drug user. This is 1 - P(D), or 0.995.

   * P(+|D), or the probability that the test is positive, given that the employee is a drug user. This is 0.99, since the test is 99% sensitive.

   * P(+|N), or the probability that the test is positive, given that the employee is not a drug user. This is 0.01, since the test will produce a false positive for 1% of non-users.

   * P(+), or the probability of a positive test event, regardless of other information. This is 0.0149 or 1.49%, which is found by adding the probability that the test will produce a true positive result in the event of drug use (= 99% x 0.5% = 0.495%) plus the probability that the test will produce a false positive in the event of non-drug use (= 1% x 99.5% = 0.995%). This is the prior probability of +.
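As a quick check of the arithmetic in the last item, here is a minimal Python sketch of how P(+) is assembled from the other four quantities. The variable names are illustrative and do not come from the original puzzle.

```python
# Inputs taken from the list above.
p_d = 0.005           # P(D): prior probability of being a drug user
p_n = 1 - p_d         # P(N): probability of being a non-user
p_pos_given_d = 0.99  # P(+|D): sensitivity of the test
p_pos_given_n = 0.01  # P(+|N): false positive rate (1 - specificity)

# A positive test is either a true positive or a false positive.
true_positive_mass = p_pos_given_d * p_d    # 99% x 0.5%
false_positive_mass = p_pos_given_n * p_n   # 1% x 99.5%
p_pos = true_positive_mass + false_positive_mass

print(f"P(+) = {p_pos:.4f}")  # prints "P(+) = 0.0149"
```

Note that the false positive share (0.995%) is about twice the true positive share (0.495%), which already hints at the final answer.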

Given this information, we can compute the posterior probability P(D|+) of an employee who tested positive actually being a drug user:

P(D|+)  = P(+|D) x P(D) / [P(+|D) x P(D) + P(+|N) x P(N)]

P(D|+)  = (0.99 x 0.005) / [(0.99 x 0.005) + (0.01 x 0.995)]

P(D|+)  = 0.3322
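One way to sanity-check the 0.3322 figure is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical workforce of 100,000 employees (a number chosen only because it keeps every count a whole number) and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

employees = 100_000                        # hypothetical headcount, for illustration only
users = employees * 5 // 1000              # 0.5% are drug users -> 500
non_users = employees - users              # 99,500

true_positives = users * 99 // 100         # 99% of users test positive -> 495
false_positives = non_users * 1 // 100     # 1% of non-users test positive -> 995

# Of everyone who tests positive, what share actually uses the drug?
posterior = Fraction(true_positives, true_positives + false_positives)

print(f"{true_positives} true positives, {false_positives} false positives")
print(f"P(D|+) = {float(posterior):.4f}")  # prints "P(D|+) = 0.3322"
```

Out of 1,490 positive tests, only 495 belong to actual users, so roughly two of every three positives are false alarms.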

Despite the high accuracy of the test, the probability that an employee who tested positive actually did use drugs is only about 33%, so it is actually more likely that the employee is not a drug user. The rarer the condition for which we are testing, the greater the percentage of positive tests that will be false positives.
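The effect of rarity can be seen by holding the test fixed at 99% sensitivity and 99% specificity and varying only the share of users. The following is a rough sketch: the `posterior` helper and every prevalence value other than 0.5% are assumptions chosen for illustration, not figures from the puzzle.

```python
def posterior(prevalence, sensitivity=0.99, specificity=0.99):
    """Return P(D|+) via Bayes' theorem for a given prevalence P(D)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative prevalences: 0.1%, 0.5% (the puzzle's value), 5%, and 50%.
prevalences = [0.001, 0.005, 0.05, 0.5]
results = {p: posterior(p) for p in prevalences}

for p, post in results.items():
    print(f"prevalence {p:.1%}: P(D|+) = {post:.1%}, "
          f"false share of positives = {1 - post:.1%}")
```

With these inputs, a positive test is right only about 9% of the time when 0.1% of employees use the drug, about 33% of the time at 0.5%, and 99% of the time when half of them do.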

The purpose of this exercise was to show that, even when common sense suggests a highly accurate test should give highly reliable results, the underlying base rate of whatever we are testing for is the most important factor in how much a positive result can be trusted. In simple terms, we ought to be a lot less confident that prosecutors are charging the correct person, that drug testing works, or that statistics are showing us what we think they are. Highly unlikely events remain unlikely even when very accurate tests indicate they are taking place.

2 comments

  1. …this is simply another attempt on my part to demonstrate that all of us, including me, know less than we think we do.

  2. Bayes rules!  heh.
