People are usually not good at estimating probabilities intuitively. If you do not agree with me, just think of the number of lottery tickets sold every single day. Indeed, the whole gambling industry is built on people being really bad at estimating probabilities. This gets even worse with conditional probabilities, where one has to calculate the probability of an event after getting some other related information. In this article, we discuss an important probability topic (Bayesian inference) and a famous example (the Monty Hall problem) as stepping-stones towards a better understanding of probability fundamentals.

Bayesian Inference

Chances are you have heard about Bayes' theorem and Bayesian inference at some point in your life. Many books and teaching resources start Bayesian reasoning by introducing Bayes' formula. However, our goal here is to understand Bayesian inference at a conceptual level using a concrete and insightful example. This will set the foundation for more advanced Bayesian concepts and terminology in the future.

Suppose we have a bag containing exactly 5 items, each of which is either a coin or a die (we do not know what exactly is in there). We draw 4 times from this bag with replacement and record the items in the order they were drawn. Let’s say we drew the sequence die, coin, die, die from this bag.

Now, with this observed sequence, we would like to know what is in the bag. We first need to consider all possible compositions of the bag, and then for each of these cases count the number of ways we can see the observed sequence. Because we draw with replacement, each draw contributes a factor equal to the number of items of the drawn type in the bag (the number of dice for a die, the number of coins for a coin):

Case 1: bag containing 5 coins:
Number of ways: 0 × 5 × 0 × 0 = 0

Case 2: bag containing 1 die and 4 coins:
Number of ways: 1 × 4 × 1 × 1 = 4

Case 3: bag containing 2 dice and 3 coins:
Number of ways: 2 × 3 × 2 × 2 = 24

Case 4: bag containing 3 dice and 2 coins:
Number of ways: 3 × 2 × 3 × 3 = 54

Case 5: bag containing 4 dice and 1 coin:
Number of ways: 4 × 1 × 4 × 4 = 64

Case 6: bag containing 5 dice:
Number of ways: 5 × 0 × 5 × 5 = 0

To calculate the probability of each case, we take the ratio of the number of ways for that case to the total number of ways across all cases (4 + 24 + 54 + 64 = 146). This treats all six cases as equally plausible before any draws, since we knew nothing about the bag beforehand. The probabilities of the above cases are then 0/146, 4/146, 24/146, 54/146, 64/146, 0/146, or 0, 0.027, 0.164, 0.370, 0.438, 0, respectively. Having calculated the probabilities for all possible cases, we can simply compare them: the bag most probably has 4 dice and 1 coin, but 3 dice and 2 coins is also quite plausible. In Bayesian terminology, this is called calculating the posterior distribution, and it is the fundamental idea behind Bayesian thinking. It is that simple.
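The counting argument above can be reproduced in a few lines of Python. This is only a minimal sketch: the names `count_ways`, `observed` and `posterior` are made up for illustration, and it assumes that every bag composition is equally likely before drawing, so the posterior is just each count divided by the total.

```python
from fractions import Fraction

BAG_SIZE = 5
observed = ["die", "coin", "die", "die"]  # the sequence drawn in the text


def count_ways(num_dice, draws, bag_size=BAG_SIZE):
    """Count the ways a bag with num_dice dice can produce draws (with replacement)."""
    num_coins = bag_size - num_dice
    ways = 1
    for item in draws:
        # Each draw multiplies by the number of items of the drawn type.
        ways *= num_dice if item == "die" else num_coins
    return ways


# One entry per case: 0 dice (all coins) up to 5 dice (no coins).
ways = {d: count_ways(d, observed) for d in range(BAG_SIZE + 1)}
total = sum(ways.values())

# Assumption: all cases equally likely beforehand, so posterior = ways / total.
posterior = {d: Fraction(w, total) for d, w in ways.items()}

for d, p in posterior.items():
    print(f"{d} dice, {BAG_SIZE - d} coins: ways={ways[d]:2d}  posterior={float(p):.3f}")
```

Changing `observed` (for example, adding one more draw) shows how the posterior shifts as more information arrives.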

Monty Hall Problem

The Monty Hall problem is famous for illustrating how intuitive reasoning can be wrong. There are three doors with a goat behind two of them and a car behind the third one (and of course you don’t know what is behind each door). Your goal is to choose the door that has a car behind it and win the expensive sports car.

First, you are asked to choose one door randomly. The host of the game, who knows what is behind each door, opens one of the other two doors to reveal a goat. He then asks you whether you want to switch to the remaining unopened door or stick to your original choice. What should you do in order to have a higher chance of winning the car?

This is one of the most counter-intuitive probability problems and here we look at different ways to solve it (before looking at the solution below, test your intuition and see if you can guess the correct answer).

One way of thinking about this is that your initial choice was completely random, but the host's choice was not: he knows where the car is, and whenever your first pick is wrong he is forced to leave the door with the car closed. Therefore, it makes sense to switch. This is somewhat enlightening but there is still room for better understanding.

When you choose the first door randomly, you are right 1/3 of the time, so the other two doors together hold the remaining 2/3. After the host opens one of those two doors and reveals a goat, that whole 2/3 rests on the single unopened door. So, you should switch. Well, this makes sense but I’m personally not quite convinced. What do you think?

A more systematic way is to think of all the possible scenarios (for reference, suppose the door you choose is door 1, one of the remaining doors is door 2 and the last one is door 3). There are 3 equally likely cases: 1) your first choice was a car, 2) your first choice was not a car and door number 2 was opened by the host, 3) your first choice was not a car and door number 3 was opened by the host. In the last two cases, you are going to win by switching (2 of 3 cases, making a probability of 2/3 for winning the car).

The 1st case is the only one in which you lose by switching.

In the 2nd case, you will win by switching.

In the 3rd case, you will still win if you switch.
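The three cases can also be enumerated exactly rather than by hand. The sketch below uses hypothetical names (`enumerate_games`, `FIRST_PICK`) and assumes the car is equally likely to be behind any door. It also assumes that when your first pick is the car, the host picks either goat door with equal probability; that assumption only splits the 1st case in two and does not change the totals.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
FIRST_PICK = 1  # as in the text, call the door you chose door 1


def enumerate_games(first_pick=FIRST_PICK):
    """List every (car, opened, switched_to, probability) combination."""
    games = []
    for car in DOORS:
        # The host may only open a door that is neither your pick nor the car.
        host_options = [d for d in DOORS if d != first_pick and d != car]
        for opened in host_options:
            # Switching means taking the one door that is still closed.
            switched_to = next(d for d in DOORS if d not in (first_pick, opened))
            # Assumption: car placement is uniform, and so is the host's choice
            # when he has two goat doors to pick from.
            weight = Fraction(1, len(DOORS)) / len(host_options)
            games.append((car, opened, switched_to, weight))
    return games


games = enumerate_games()
p_switch = sum(w for car, _, switched_to, w in games if switched_to == car)
p_stay = sum(w for car, _, _, w in games if car == FIRST_PICK)

for car, opened, switched_to, w in games:
    result = "win" if switched_to == car else "lose"
    print(f"car={car} host opens={opened} switch to={switched_to}: {result} (prob {w})")
print(f"P(win | switch) = {p_switch}, P(win | stay) = {p_stay}")
```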

This last approach is quite straightforward and helps us get to a higher level of understanding by rephrasing the problem in the following way:

How can I lose when I switch?
The only way I can lose by switching is when my initial choice was the door with a car behind it (in all other cases, I am going to win by switching because the host opens the only other door with a goat behind it). I choose the door with a car behind it by chance only 1/3 of the time. So, I can only lose 1/3 of the time when I switch and therefore win 2/3 of the time.
To put it another way, you will always win by switching if your initial guess was wrong. Your chance of guessing wrong on your first choice is 2/3 and therefore you will win 2/3 of the time if you switch.
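If the argument still feels slippery, a quick simulation is a reasonable sanity check. The following is a rough sketch, not a definitive implementation: `play` and `win_rate` are illustrative names, the doors are numbered 0 to 2, and the host is assumed to choose at random when both of the other doors hide goats.

```python
import random


def play(switch, rng):
    """Play one game and return True if the final pick is the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a goat door that is not the player's pick
    # (chosen at random if two such doors exist; this is an assumption).
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        final_pick = next(d for d in doors if d not in (first_pick, opened))
    else:
        final_pick = first_pick
    return final_pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the fraction of games won with the given strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


print(f"stay:   {win_rate(switch=False):.3f}")
print(f"switch: {win_rate(switch=True):.3f}")
```

With enough trials, the two estimates should settle close to 1/3 for staying and 2/3 for switching, in line with the reasoning above.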

There are a great many related topics we need to discuss in the future, and I hope these two discussions have laid the groundwork in that direction.