# Bayes' theorem

The conditional probability of $A$ given $B$, $P(A|B)$, expresses the probability of event $A$ given that $B$ has occurred. For $P(B) > 0$, it is defined as:

$$P(A|B) = \frac{P(A,B)}{P(B)}$$

Here $(A,B)$ denotes the joint event of $A$ and $B$, that is, both of them occurring. Since the joint event is symmetric in $A$ and $B$, a direct consequence of this definition is:

$$P(A,B) = P(A|B)P(B) = P(B|A)P(A)$$

The second equality, with its factors reordered, is the product rule:

$$P(A,B) = P(A)P(B|A)$$

which can be interpreted as a way of decomposing the probability of the joint event $(A,B)$ into a product of probabilities. In other words, the probability of both $A$ and $B$ occurring is the probability of $A$ multiplied by the probability of $B$… given that $A$ has occurred, of course! The nicer formula $P(A,B) = P(A)P(B)$ holds only when $A$ and $B$ are independent, that is, when $P(B|A) = P(B)$.
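To make the difference between the product rule and the independence shortcut concrete, here is a minimal sketch using exact fractions. Both joint tables (`dependent` and `independent`) and the helper names are made up for illustration; they are assumptions, not part of the derivation.

```python
from fractions import Fraction

def marginals(joint):
    """Return (P(A), P(B)) from a table keyed by (a, b), where 1 means the event occurred."""
    p_a = joint[(1, 1)] + joint[(1, 0)]
    p_b = joint[(1, 1)] + joint[(0, 1)]
    return p_a, p_b

def conditional_b_given_a(joint):
    """P(B|A) = P(A,B) / P(A)."""
    p_a, _ = marginals(joint)
    return joint[(1, 1)] / p_a

# Hypothetical table in which A and B are dependent.
dependent = {
    (1, 1): Fraction(1, 10), (1, 0): Fraction(2, 10),
    (0, 1): Fraction(3, 10), (0, 0): Fraction(4, 10),
}

# Hypothetical table built as P(A)P(B) with P(A) = 3/10 and P(B) = 4/10.
independent = {
    (1, 1): Fraction(12, 100), (1, 0): Fraction(18, 100),
    (0, 1): Fraction(28, 100), (0, 0): Fraction(42, 100),
}

# The product rule P(A,B) = P(A)P(B|A) holds in both tables.
for joint in (dependent, independent):
    p_a, _ = marginals(joint)
    assert p_a * conditional_b_given_a(joint) == joint[(1, 1)]

# The shortcut P(A,B) = P(A)P(B) fails for the dependent table...
p_a, p_b = marginals(dependent)
assert p_a * p_b != dependent[(1, 1)]

# ...and holds for the independent one, where P(B|A) = P(B).
p_a, p_b = marginals(independent)
assert p_a * p_b == independent[(1, 1)]
assert conditional_b_given_a(independent) == p_b
```

Using `Fraction` instead of floats keeps the equality checks exact, so no tolerance is needed.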

Bayes' theorem can be derived directly from the previous formula $P(A,B) = P(A|B)P(B) = P(B|A)P(A)$ by dividing both sides of the second equality by $P(B)$:

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$
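As a quick numeric check of the theorem, the sketch below computes $P(A|B)$ twice: once from the definition and once from the right-hand side of Bayes' theorem. The joint table is a hypothetical example chosen only so the arithmetic is easy to follow.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary events A and B.
# Keys are (a, b); 1 means the event occurred, 0 means it did not.
joint = {
    (1, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (0, 1): Fraction(3, 10),
    (0, 0): Fraction(4, 10),
}

p_a = sum(p for (a, b), p in joint.items() if a == 1)  # P(A) = 3/10
p_b = sum(p for (a, b), p in joint.items() if b == 1)  # P(B) = 4/10
p_ab = joint[(1, 1)]                                   # P(A,B) = 1/10

# Conditional probabilities straight from the definition.
p_a_given_b = p_ab / p_b  # 1/4
p_b_given_a = p_ab / p_a  # 1/3

# Right-hand side of Bayes' theorem: P(B|A)P(A) / P(B).
bayes_rhs = p_b_given_a * p_a / p_b

assert p_a_given_b == bayes_rhs
print(p_a_given_b)  # 1/4
```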

## Some manipulations

### Decompose a conditional probability into a product

Given the expression $P(A,B|C)$, is it possible to decompose it as something that resembles a product of probabilities of $A$ and $B$? Yes, it is! And the derivation is straightforward:

$$P(A,B|C) = \frac{P(A,B,C)}{P(C)} = \frac{P(A,C)}{P(C)}\frac{P(A,B,C)}{P(A,C)} = P(A|C)P(B|A,C)$$

Conveniently, this has the same form as the product rule $P(A,B) = P(A)P(B|A)$, with $C$ added to the conditioning part of every term. The derivation assumes $P(C) > 0$ and $P(A,C) > 0$ so that every fraction is defined.
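The conditional product rule can be checked numerically as well. This is a sketch under stated assumptions: the three-variable joint table (weights 1 to 8 over 36) and the helper `prob` are hypothetical, and the variables are indexed as $A = 0$, $B = 1$, $C = 2$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint over three binary variables (A, B, C):
# the outcomes in lexicographic order get weights 1..8, normalised by 36.
outcomes = list(product([0, 1], repeat=3))
joint = {o: Fraction(w, 36) for o, w in zip(outcomes, range(1, 9))}

A, B, C = 0, 1, 2  # positions of the variables inside each outcome tuple

def prob(joint, fixed):
    """Probability that the variables at the given positions take the given values."""
    return sum(p for outcome, p in joint.items()
               if all(outcome[i] == v for i, v in fixed.items()))

p_c = prob(joint, {C: 1})                 # P(C)     = 20/36
p_ac = prob(joint, {A: 1, C: 1})          # P(A,C)   = 14/36
p_abc = prob(joint, {A: 1, B: 1, C: 1})   # P(A,B,C) = 8/36

lhs = p_abc / p_c                         # P(A,B|C)
rhs = (p_ac / p_c) * (p_abc / p_ac)       # P(A|C) P(B|A,C)

assert lhs == rhs
print(lhs)  # 2/5
```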

### Generalization to $n$ variables

Apply the above formulas as many times as you wish and you get the following:

$$
\begin{aligned}
P(A_1,\ldots,A_n) &= P(A_1)P(A_2,\ldots,A_n|A_1) \\
&= P(A_1)P(A_2|A_1)P(A_3,\ldots,A_n|A_1,A_2) \\
&= P(A_1)P(A_2|A_1)P(A_3|A_1,A_2)P(A_4,\ldots,A_n|A_1,A_2,A_3) \\
&= \ldots
\end{aligned}
$$
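The repeated application can be sketched in code: multiply the conditionals $P(A_i|A_1,\ldots,A_{i-1})$ one at a time and compare the result with the joint probability. The four-variable table and the function names `prefix_prob` and `chain_rule_product` are assumptions made for this illustration, and every outcome is given a positive weight so that no conditional divides by zero.

```python
from fractions import Fraction
from itertools import product

def prefix_prob(joint, prefix):
    """P(A_1 = prefix[0], ..., A_k = prefix[k-1]), summing over the remaining variables."""
    k = len(prefix)
    return sum(p for outcome, p in joint.items() if outcome[:k] == prefix)

def chain_rule_product(joint, outcome):
    """Multiply P(A_i | A_1, ..., A_{i-1}) for i = 1..n, evaluated at the given outcome."""
    result = Fraction(1)
    for i in range(1, len(outcome) + 1):
        # P(A_i | A_1..A_{i-1}) = P(A_1..A_i) / P(A_1..A_{i-1});
        # for i = 1 the denominator is the empty prefix, whose probability is 1.
        result *= prefix_prob(joint, outcome[:i]) / prefix_prob(joint, outcome[:i - 1])
    return result

# Hypothetical joint over four binary variables: weights 1..16, normalised by 136.
outcomes = list(product([0, 1], repeat=4))
joint = {o: Fraction(w, 136) for o, w in zip(outcomes, range(1, 17))}

# The product of conditionals recovers the joint probability of every outcome.
assert all(chain_rule_product(joint, o) == joint[o] for o in joint)
```

The check passes for any strictly positive table because the product telescopes: each numerator cancels the next denominator, leaving only the full joint probability.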

### Bayes' theorem is true for conditional probabilities

Bayes' theorem holds even when there is some event $C$ in the conditional part. Just write $C$ everywhere:

$$P(A|B,C) = \frac{P(B|A,C)P(A|C)}{P(B|C)}$$ 
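The conditional form of Bayes' theorem can be verified in the same spirit. The sketch below assumes a made-up three-variable joint table and a hypothetical helper `cond` that computes a conditional probability as a ratio of two marginals.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint over three binary variables (A, B, C):
# the outcomes in lexicographic order get weights 1..8, normalised by 36.
outcomes = list(product([0, 1], repeat=3))
joint = {o: Fraction(w, 36) for o, w in zip(outcomes, range(1, 9))}

A, B, C = 0, 1, 2  # positions of the variables inside each outcome tuple

def prob(joint, fixed):
    """Probability that the variables at the given positions take the given values."""
    return sum(p for outcome, p in joint.items()
               if all(outcome[i] == v for i, v in fixed.items()))

def cond(joint, target, given):
    """P(target | given) = P(target, given) / P(given)."""
    return prob(joint, {**target, **given}) / prob(joint, given)

lhs = cond(joint, {A: 1}, {B: 1, C: 1})            # P(A|B,C)
rhs = (cond(joint, {B: 1}, {A: 1, C: 1})           # P(B|A,C)
       * cond(joint, {A: 1}, {C: 1})               # P(A|C)
       / cond(joint, {B: 1}, {C: 1}))              # P(B|C)

assert lhs == rhs
print(lhs)  # 2/3
```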