A posterior probability is the updated probability of some event occurring after accounting for new information.
For example, we might be interested in finding the probability of some event “A” occurring after we account for some event “B” that has just occurred. We can calculate this posterior probability using the following formula, known as Bayes’ theorem:
P(A|B) = P(A) * P(B|A) / P(B)
P(A|B) = the probability of event A occurring, given that event B has occurred. Note that “|” means “given.”
P(A) = the probability that event A occurs.
P(B) = the probability that event B occurs.
P(B|A) = the probability of event B occurring, given that event A has occurred.
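As a rough illustration, the formula can be written as a small Python function. The function name posterior_probability and the input values below are hypothetical, chosen only to show the arithmetic; they do not come from any real dataset.

```python
def posterior_probability(prior, likelihood, evidence):
    """Return P(A|B) = P(A) * P(B|A) / P(B).

    prior      -- P(A)
    likelihood -- P(B|A)
    evidence   -- P(B), which must be greater than 0
    """
    if evidence <= 0:
        raise ValueError("P(B) must be greater than 0")
    return prior * likelihood / evidence


# Made-up inputs for illustration: P(A) = 0.3, P(B|A) = 0.5, P(B) = 0.25
result = posterior_probability(0.3, 0.5, 0.25)
print(round(result, 4))  # 0.6
```

The check on P(B) is there because the formula is undefined when event B has zero probability.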
Example: Calculating Posterior Probability
A forest is composed of 20% Oak trees and 80% Maple trees. Suppose it is known that 90% of the Oak trees are healthy while just 50% of the Maple trees are healthy. Suppose that from a distance you can tell that a particular tree is healthy. What is the probability that the tree is an Oak tree?
Recall that the probability of event A occurring given that event B has occurred is:
P(A|B) = P(A) * P(B|A) / P(B)
In this example, the probability that the tree is an Oak given that the tree is healthy is:
P(Oak|Healthy) = P(Oak) * P(Healthy|Oak) / P(Healthy)
P(Oak) = 0.2, because 20% of all trees in the forest are Oak.
P(Healthy) = 0.58. A healthy tree is either a healthy Oak or a healthy Maple, so by the law of total probability P(Healthy) = P(Oak) * P(Healthy|Oak) + P(Maple) * P(Healthy|Maple) = (0.2)*(0.9) + (0.8)*(0.5) = 0.58.
P(Healthy|Oak) = 0.9, because we were told that 90% of the Oak trees are healthy.
Using these three numbers, we can find the probability that the tree is an Oak tree given that it’s healthy:
P(Oak|Healthy) = P(Oak) * P(Healthy|Oak) / P(Healthy) = (0.2) * (0.9) / (0.58) = 0.3103.
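The same calculation can be checked with a few lines of Python. This is a minimal sketch that uses only the percentages given in the example; the variable names are our own.

```python
# Values taken from the forest example
p_oak = 0.20                   # P(Oak)
p_maple = 0.80                 # P(Maple)
p_healthy_given_oak = 0.90     # P(Healthy|Oak)
p_healthy_given_maple = 0.50   # P(Healthy|Maple)

# P(Healthy): a healthy tree is either a healthy Oak or a healthy Maple
p_healthy = p_oak * p_healthy_given_oak + p_maple * p_healthy_given_maple

# P(Oak|Healthy) = P(Oak) * P(Healthy|Oak) / P(Healthy)
p_oak_given_healthy = p_oak * p_healthy_given_oak / p_healthy

print(round(p_healthy, 2))            # 0.58
print(round(p_oak_given_healthy, 4))  # 0.3103
```

The results are rounded before printing because floating-point arithmetic can leave tiny errors in the last digits.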
For an intuitive understanding of this probability, picture this forest as a grid of 100 trees. Exactly 20 of the trees are Oak trees and 18 of those Oaks are healthy. The other 80 trees are Maple and 40 of them are healthy.
[Figure: grid of 100 trees. O = Oak, M = Maple, Green = Healthy, Red = Unhealthy]
Out of all the trees, exactly 58 of them are healthy and 18 of these healthy ones are Oak trees. Thus, if we know that we’ve selected a healthy tree then the probability that it’s an Oak tree is 18/58 = 0.3103.
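The counting argument can also be sketched in Python by listing all 100 trees and counting directly. The tuple layout (species, is_healthy) is just one possible representation, and the 2 unhealthy Oaks follow from 20 Oaks minus the 18 healthy ones.

```python
# Build the 100-tree forest from the counts in the example
forest = (
    [("Oak", True)] * 18        # healthy Oaks
    + [("Oak", False)] * 2      # unhealthy Oaks (20 - 18)
    + [("Maple", True)] * 40    # healthy Maples
    + [("Maple", False)] * 40   # unhealthy Maples
)

# Keep only the species of the healthy trees, then count the Oaks among them
healthy = [species for species, is_healthy in forest if is_healthy]
healthy_oaks = healthy.count("Oak")

print(len(forest))                            # 100
print(len(healthy))                           # 58
print(healthy_oaks)                           # 18
print(round(healthy_oaks / len(healthy), 4))  # 0.3103
```

Restricting attention to the healthy trees and then counting Oaks is exactly what conditioning on "Healthy" means, which is why the count-based answer matches the formula.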
When Should You Use Posterior Probability?
Posterior probability is used in a wide variety of domains including finance, medicine, economics, and weather forecasting.
The whole point of using posterior probabilities is to update a previous belief we had about something once we obtain new information.
Recall that in the previous example, the probability of a given tree in the forest being an Oak was 20%. This is known as a prior probability: if we simply picked a tree at random, with no other information, the probability of it being an Oak was 0.20.
However, once we obtained the new information that the tree we selected was healthy, we were able to use this new information to determine that the posterior probability of this tree being an Oak was instead 0.3103.
In the real world, we come across new information all the time, and it helps us update our prior beliefs. In statistical terms, this means we can calculate posterior probabilities of events, which gives us a more accurate understanding of the world and lets us make more accurate predictions about future events.