Week 18: Bayes’ Rule

How to deal with conflicting information of different strengths and get the right answer on your statistics test.


What is Bayes’ rule?

Bayes’ rule is a method for updating the probability you assign to a hypothesis as evidence comes in. You start with a prior probability, receive evidence, and revise that probability in proportion to how strongly the evidence supports the hypothesis.

Philip Tetlock found that "superforecasters", people who were especially good at predicting future events, rely heavily on the core idea behind Bayes’ rule, even though they rarely apply the formula explicitly:

The superforecasters are a numerate bunch: many know about Bayes' theorem and could deploy it if they felt it was worth the trouble. But they rarely crunch the numbers so explicitly. What matters far more to the superforecasters than Bayes' theorem is Bayes' core insight of gradually getting closer to the truth by constantly updating in proportion to the weight of the evidence.

Superforecasting, Philip Tetlock and Dan Gardner

Bayes’ rule takes a while to explain and to grasp, but it’s worth taking the time. Arbital does a good job of explaining and applying it, so head over there and read it.

Examples of Bayes’ rule

Bayes’ rule is useful in several ways. For one, it forces us to think probabilistically. It lets us weigh competing evidence of different strengths (reflected in how big each ‘update’ is) and promotes a nuanced view, avoiding a simplistic black-and-white sorting of outcomes into ‘good’ and ‘bad’.

It forces us to account for the base rate, and to take an appropriate outside view. If the prior probability of cancer is low, even relatively strong evidence might not make the posterior probability of someone actually having cancer very high.
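To see how a low base rate caps the posterior, here is a minimal sketch in Python. The numbers are made up for illustration (a 1% prior, a test that flags 90% of true cases and 5% of healthy people) and do not describe any real screening test; the function name posterior is likewise just a label chosen here.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    # Total probability of seeing the evidence, whether or not H is true.
    p_evidence = (p_evidence_if_true * prior
                  + p_evidence_if_false * (1 - prior))
    return p_evidence_if_true * prior / p_evidence


# Hypothetical numbers, chosen only for illustration.
p_cancer = posterior(prior=0.01,
                     p_evidence_if_true=0.90,
                     p_evidence_if_false=0.05)
print(f"P(cancer | positive test) = {p_cancer:.3f}")
```

With these assumed inputs, a positive result lifts the probability from 1% to only about 15%, because the many false positives among healthy people outnumber the true positives.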

Say you’re trying to filter spam from your inbox. You could try to blacklist phrases like “grow your penis”, but that would be too restrictive on the occasions your friend wants to send you an email with genuine penis enlargement tips. Or you could blacklist emails from certain domains, but then the spammers would likely just change domains, and you’d be forever playing catch-up (see the Red Queen hypothesis). Bayesian spam filters instead assign a prior probability to each email (the base rate of spam) and update on relevant information: what domain is it being sent from? How often have we marked email from this sender as spam before? Does it contain the phrase “penis enlargement”?
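The updating step can be sketched in a few lines of Python. This is a toy version, not how any particular spam filter is implemented: it assumes the clues are independent of each other (the "naive Bayes" simplification), and the base rate, the feature names, and the likelihood ratios are all invented for the example.

```python
def update_odds(prior_prob, likelihood_ratios):
    """Convert a prior to odds, multiply in each clue, convert back."""
    odds = prior_prob / (1 - prior_prob)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


# Hypothetical values. Each ratio is
# P(feature | spam) / P(feature | not spam).
BASE_RATE_OF_SPAM = 0.5
FEATURE_RATIOS = {
    "mentions penis enlargement": 40.0,
    "sender previously marked as spam": 10.0,
    "sender is a known contact": 0.01,
}


def spam_probability(features):
    """Posterior probability of spam given the features observed."""
    ratios = [FEATURE_RATIOS[name] for name in features]
    return update_odds(BASE_RATE_OF_SPAM, ratios)


from_friend = spam_probability(
    ["mentions penis enlargement", "sender is a known contact"])
from_stranger = spam_probability(
    ["mentions penis enlargement", "sender previously marked as spam"])
print(f"friend: {from_friend:.2f}, stranger: {from_stranger:.2f}")
```

Under these assumed ratios the same suspicious phrase leaves your friend's email more likely genuine than not, while the email from a previously flagged sender ends up almost certainly spam. No single clue decides the outcome; each one only shifts the odds.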

Your friend comes to you claiming they’ve got a miracle cure for what ails you. The prior probability you assign to this being true should be very low (the base rate for successful medical interventions is at most one in tens of thousands), so you’d need very strong evidence of efficacy before believing it.
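How strong is "very strong"? One way to put a number on it is to work in odds, as in this rough Python sketch. It assumes a prior of 1 in 10,000, picked as a round figure from the range mentioned above, and asks how much likelier the evidence must be if the cure works than if it doesn't in order to reach even a 50% posterior. The function name is made up for this example.

```python
def required_likelihood_ratio(prior, target_posterior):
    """Likelihood ratio needed to move a prior to a target posterior.

    Uses the odds form of Bayes' rule:
    posterior odds = prior odds * likelihood ratio.
    """
    prior_odds = prior / (1 - prior)
    target_odds = target_posterior / (1 - target_posterior)
    return target_odds / prior_odds


# Assumed prior of 1 in 10,000, for illustration only.
ratio = required_likelihood_ratio(prior=1 / 10_000, target_posterior=0.5)
print(f"Required likelihood ratio: about {ratio:,.0f}")
```

Under that assumption the evidence has to be roughly ten thousand times more likely in a world where the cure works than in one where it doesn't, just to get you to a coin flip. An enthusiastic testimonial falls far short of that.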

Also check out

  1. A visual guide to Bayes’ theorem (video), Julia Galef
