In Rationality for Mortals, Gerd Gigerenzer explains a way to rephrase
the typical problem of applying Bayes' rule to determine the probability
of a cause C given a symptom S.
I quote from the book an instance of such a problem:
[Page 16] Assume that you screen women in a particular region for breast cancer with mammography. You know the following about women in this region:

(Prevalence) The probability that a woman has breast cancer is 1 percent
(Sensitivity) If a woman has breast cancer, the probability is 90 percent that she will have a positive mammogram
(False positive rate) If a woman does not have breast cancer, the probability is 9 percent that she will still have a positive mammogram.

A woman who tested positive asks if she really has breast cancer or what the probability is that she actually has breast cancer.
The mindless solution to this problem is as follows. Let C = {c, ¬c} and S = {s, ¬s} be random variables. Their values stand for

c     cause present          (cancer)
¬c    cause not present      (no cancer)
s     symptom present        (mammography test positive)
¬s    symptom not present    (mammography test negative)
Tabulate the data.

P(C)
c       0.01
¬c      0.99
Sum     1.00

P(S|C)    c       ¬c
s         0.9     0.09
¬s        0.1     0.91
Sum       1.0     1.00
Compute and tabulate the conjunction P(S,C) by means of the product rule behind Bayes' rule, i.e. P(S|C)P(C) = P(S,C).

P(S,C)        c         ¬c        P(S) (Sum)
s             0.0090    0.0891    0.0981
¬s            0.0010    0.9009    0.9019
P(C) (Sum)    0.0100    0.9900    1
Note that in this form one can obtain P(S) by summing each row
and P(C) by summing each column.
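The two steps above (cell-by-cell product, then row and column sums) can be sketched in a few lines of JavaScript. This is only an illustration under my own naming: the keys notC and notS stand for ¬c and ¬s, and the function names joint and marginals are assumptions, not something taken from the book.

```javascript
// Inputs taken from the mammography example in the text.
const pC = { c: 0.01, notC: 0.99 };            // P(C)
const pSgivenC = {                             // P(S|C), indexed [symptom][cause]
  s:    { c: 0.90, notC: 0.09 },
  notS: { c: 0.10, notC: 0.91 },
};

// Product rule: P(S,C) = P(S|C) * P(C), cell by cell.
function joint(conditional, prior) {
  const table = {};
  for (const s of Object.keys(conditional)) {
    table[s] = {};
    for (const c of Object.keys(prior)) {
      table[s][c] = conditional[s][c] * prior[c];
    }
  }
  return table;
}

// Marginals: summing a row gives P(S), summing a column gives P(C).
function marginals(table) {
  const pS = {};
  const pCsum = {};
  for (const [s, row] of Object.entries(table)) {
    pS[s] = 0;
    for (const [c, p] of Object.entries(row)) {
      pS[s] += p;
      pCsum[c] = (pCsum[c] || 0) + p;
    }
  }
  return { pS, pC: pCsum };
}

const pSC = joint(pSgivenC, pC);
const m = marginals(pSC);
console.log(pSC.s.c.toFixed(4), pSC.s.notC.toFixed(4), m.pS.s.toFixed(4));
// prints "0.0090 0.0891 0.0981"
```

The column sums recover the prior P(C) that went in, which is a cheap sanity check on the table.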
Compute and tabulate the conditional P(C|S) = P(S,C)/P(S).

P(C|S)    c         ¬c        Sum
s         0.0917    0.9083    1
¬s        0.0011    0.9989    1
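As a rough sketch of this last step, the snippet below divides every cell of the joint table by its row sum P(S). The joint values are copied from the table in the text; the key names (notC, notS) and the function name conditionOnSymptom are my own illustrative choices.

```javascript
// Joint table P(S,C) as tabulated in the text.
const pSC = {
  s:    { c: 0.0090, notC: 0.0891 },
  notS: { c: 0.0010, notC: 0.9009 },
};

// Divide every cell by its row sum P(S) to obtain P(C|S).
function conditionOnSymptom(jointTable) {
  const result = {};
  for (const [s, row] of Object.entries(jointTable)) {
    const rowSum = Object.values(row).reduce((a, b) => a + b, 0);
    if (rowSum === 0) {
      throw new RangeError(`P(${s}) is zero, cannot condition on it`);
    }
    result[s] = {};
    for (const [c, p] of Object.entries(row)) {
      result[s][c] = p / rowSum;
    }
  }
  return result;
}

const pCgivenS = conditionOnSymptom(pSC);
console.log(pCgivenS.s.c.toFixed(4));    // prints "0.0917"
console.log(pCgivenS.notS.c.toFixed(4)); // prints "0.0011"
```

Each row of the result sums to 1, because a woman with a given test outcome either has cancer or does not.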
This computation shows that there is a 9% chance that the woman who got
the positive mammogram has cancer. The whole process sounds obscure
to anyone not proficient with the technicalities presented above.
9% of what? That is, how can a woman who gets a positive test get a
feeling for how serious her situation is?
How can one explain the process that leads to the answer
without resorting to the mechanical algorithm, the abstractions
involved (tables, Bayes' rule) and the mathematical notation above?
As for myself, every time I am faced with such a problem I
spend half an hour scratching my head and wasting
an embarrassing amount of paper and pencil until, by
exhaustively scanning my memories as (at my best) a
mediocre rote learner, I can recall the mechanism to compute the answer.
Gigerenzer argues that such obscurities and insecurities can be
alleviated by stating
the problem (and solving it) through natural frequencies.
Choose a number of samples, say 1000. Paraphrasing a little from the book, the reasoning goes:

10 out of 1000 women have breast cancer.
    cancer = samples * prevalence
Of these 10 women, we expect that 9 will have a positive mammogram
    positives = cancer * sensitivity
and 1 will test negative (a false negative).
    falseNegatives = cancer - positives
Of the remaining 990 women without breast cancer,
    noCancer = samples - cancer
some 89 will still test positive
    falsePositives = noCancer * falsePositiveRate
and 901 will test negative.
    negatives = noCancer - falsePositives
Those who have positive mammograms are 89 + 9 = 98. 9 out of 98 will actually have breast cancer, or 9.2%.
    P(c|s) = positives / (positives + falsePositives)
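The reasoning above maps almost line by line onto code. The following is a minimal sketch, not the book's procedure verbatim: rounding each count to a whole person with Math.round is my assumption (the text only says "some 89"), and the function name naturalFrequencies and the convention of passing percentages as 1, 90, 9 are illustrative choices.

```javascript
// Translate percentages into natural frequencies for a chosen sample size.
// Percentages are passed as numbers such as 1, 90, 9 (not 0.01, 0.9, 0.09).
function naturalFrequencies(samples, prevalencePct, sensitivityPct, falsePositivePct) {
  if (!Number.isInteger(samples) || samples <= 0) {
    throw new RangeError("samples must be a positive integer");
  }
  // Assumption: counts are rounded to whole people.
  const cancer = Math.round(samples * prevalencePct / 100);
  const positives = Math.round(cancer * sensitivityPct / 100);
  const falseNegatives = cancer - positives;
  const noCancer = samples - cancer;
  const falsePositives = Math.round(noCancer * falsePositivePct / 100);
  const negatives = noCancer - falsePositives;
  const allPositives = positives + falsePositives;
  // P(c|s) as a plain ratio of counts; NaN if nobody tests positive.
  const chance = allPositives === 0 ? NaN : positives / allPositives;
  return {
    cancer, positives, falseNegatives,
    noCancer, falsePositives, negatives,
    allPositives, chance,
  };
}

const freq = naturalFrequencies(1000, 1, 90, 9);
console.log(`${freq.positives} out of ${freq.allPositives}`); // prints "9 out of 98"
console.log((100 * freq.chance).toFixed(1) + "%");            // prints "9.2%"
```

Because of the rounding, the result (9.2%) differs slightly from the exact table value (9.17%); a larger sample size narrows the gap.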
Here follows a JavaScript application that takes prevalence, sensitivity
and false positive rate as percentages and translates them into
natural frequencies for a given number of samples.
Cause ___ happens with ___ % probability. (Prevalence)
Symptom ___ shows with ___ % probability if ___ is present. (Sensitivity)
Symptom happens with ___ % probability if ___ is NOT present. (False positive rate)
Sample of ___ cases

___ out of ___ cases with ___ .
    ___ will exhibit ___ .
    The remaining ___ will not exhibit ___ , albeit with ___ present.
Of the remaining ___ samples without ___ ...
    ___ will anyway exhibit ___ : they are false positives.
    The remaining ___ will not exhibit ___ .
___ + ___ = ___ of ___ cases will exhibit ___ .
___ out of ___ cases showing ___ will have ___ , a 100 * ___ / ___ = ___ % chance.
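To give an idea of how such sentences can be generated, here is a small sketch that fills a similar template from a set of counts. It is not the code of the application itself: the function name describe, the field names of the counts object, and the example cause and symptom labels are all hypothetical, and the counts are the ones from the mammography example.

```javascript
// Counts for the mammography example, copied from the text.
const counts = {
  samples: 1000,
  withCause: 10, truePositives: 9, falseNegatives: 1,
  withoutCause: 990, falsePositives: 89, trueNegatives: 901,
};

// Build the natural-frequency narrative as an array of sentences.
function describe(k, cause, symptom) {
  const showing = k.truePositives + k.falsePositives;
  if (showing === 0) {
    throw new RangeError("no case exhibits the symptom");
  }
  const pct = (100 * k.truePositives / showing).toFixed(1);
  return [
    `${k.withCause} out of ${k.samples} cases with ${cause}.`,
    `${k.truePositives} will exhibit ${symptom}.`,
    `The remaining ${k.falseNegatives} will not exhibit ${symptom}, albeit with ${cause} present.`,
    `Of the remaining ${k.withoutCause} samples without ${cause}:`,
    `${k.falsePositives} will anyway exhibit ${symptom}: they are false positives.`,
    `The remaining ${k.trueNegatives} will not exhibit ${symptom}.`,
    `${k.truePositives}+${k.falsePositives}=${showing} of ${k.samples} cases will exhibit ${symptom}.`,
    `${k.truePositives} out of ${showing} cases showing ${symptom} will have ${cause}, a ${pct}% chance.`,
  ];
}

console.log(describe(counts, "breast cancer", "a positive mammogram").join("\n"));
```

Keeping the counts separate from the wording means the same template works for any cause and symptom pair.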
Some considerations about the above, placed in this section of
the website for simplicity, although they are non-technical in content.
This way of explaining and solving the problem of finding P(C|S) is
much easier to perform and remember, which satisfies the requirement above.
Assuming mastectomy is a way to get rid of cancer for good
(which to my knowledge it is not), Frau Beate might say:
I want to undergo mastectomy. What if I am in the unlucky ___ %?
or:
I don't want to undergo mastectomy. ___ % sounds so low a chance.
I think both choices are "rational". Yet I think it is unworldly to
make such irreversible choices based on this calculation.
Doesn't the above actually show that it is possible
to support two contradictory decisions with the same data?