Elegant AI: Insights for Everyday Life
Mehdi Sadeqi, PhD
You do not need to be a statistician or an AI/ML expert to apply the rules of probability and statistics in your everyday life and benefit from them.
After all, probability theory originated in applications to games and betting. Here, I start by presenting three simple examples that I find useful and interesting. In future articles, I will discuss more advanced insights and show you how intuitive reasoning can be misleading.
Have you ever wondered how to compare ratios that are calculated from different total numbers? For example, suppose you look at the reviews of three similar products on a website: one product has 2 reviews with an average of 5 out of 5 stars, another has 8 reviews with an average of 5 out of 5 stars, and the last one has 50 reviews with an average of 4.8 out of 5 stars. Which one is more likely to be the better product?
Answer: To fully understand how to determine which of these three products is the better one, we need some background in Bayesian inference, which will be explained in a later article. In this article, we skip the math and simply present the final result.
To calculate the adjusted ratios, we just need to follow a simple rule. Suppose the original ratio is a/b, where a is the sum of the review scores and b is the total number of reviews (notice that in this example a should be expressed in units of 5 stars, so a 5-star review contributes 1 and a 4-star review contributes 0.8). We then calculate (a+1)/(b+2), and this is the adjusted ratio you can compare across products.
For example, in the case of 2 reviews with an average of 5 out of 5 stars, you have an original ratio of 2/2=1. In the case of 8 reviews with an average of 5 out of 5 stars, you have a ratio of 8/8=1. Finally, for 50 reviews with an average of 4.8 out of 5 stars, you have a ratio of 48/50=0.96. The first ratio is then adjusted to (2+1)/(2+2)=0.75, the second one to (8+1)/(8+2)=0.90, and the last one to (48+1)/(50+2)≈0.94. This means that the last product is the most likely to be the best one to buy. When you have a small number of reviews, the adjustment can change the ratio substantially, but with a larger number of reviews, its effect on the original ratio is much smaller.
Note: If we had access to the total review data, we could actually make a better adjustment. In the above scenario, we see reviews on a random web site and do not have access to such data. More on this when we discuss Bayesian inference in a later story.
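The adjustment rule from this example is easy to try out yourself. Here is a minimal Python sketch of it; the function name adjusted_ratio and the product labels A, B and C are made up for illustration, and the conversion from average stars to a assumes the "units of 5 stars" convention described above.

```python
def adjusted_ratio(score_sum, n_reviews):
    """Return (a + 1) / (b + 2), where a = score_sum and b = n_reviews."""
    return (score_sum + 1) / (n_reviews + 2)


# (average stars out of 5, number of reviews) for the three products in the text.
# The labels A, B, C are hypothetical names used only for this illustration.
products = {"A": (5.0, 2), "B": (5.0, 8), "C": (4.8, 50)}

for name, (avg_stars, n) in products.items():
    a = avg_stars / 5 * n  # sum of the scores, in units of 5 stars
    print(name, round(adjusted_ratio(a, n), 2))
```

Running this prints 0.75, 0.9 and 0.94 for the three products, matching the numbers worked out by hand above.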
The Birthday Problem
It is 2030. You have finally invited your friends to a post-COVID-19 party. How many friends should you invite so that the probability of at least two of them sharing a birthday is at least 0.5? Before looking at the answer, mull it over for a couple of minutes and use your intuition to make a guess.
Answer: This is one of those probability problems that is easier to solve through the complement: first find the probability p that no two of your friends share a birthday, and then 1-p is the desired probability. Let's say that one of your friends was born on day x. The probability that a second friend does not have the same birthday as the first one is 364/365. Similarly, the probability that a third friend was not born on the same day as either the first or the second one is 363/365. You can see where this is going as more friends are added. Assuming there are n people at your party (and ignoring leap years, with all 365 days equally likely), the probability p is (364/365)×(363/365)×…×((365-n+1)/365), and our desired probability is 1-(364/365)×(363/365)×…×((365-n+1)/365). If you calculate this for increasing values of n, you will see that at n=23 the probability of two friends sharing a birthday is slightly above 0.5.
Now, ask yourself: how far was your intuition from 23? I will leave it as an exercise to find out how many friends you need to invite to have a probability of at least 0.90. If you are usually surprised when you see coincidences, this might make you wonder whether they are really that unlikely. More on this in a later article.
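If you would rather let a computer do the multiplication, here is a small Python sketch of the calculation above. The function names are my own, and the code makes the same simplifying assumptions as the text: 365 equally likely birthdays, no leap years, and independent birthdays.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_different = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different


def smallest_group(threshold, days=365):
    """Smallest n for which the shared-birthday probability reaches threshold."""
    n = 1
    while prob_shared_birthday(n, days) < threshold:
        n += 1
    return n


print(smallest_group(0.5))                 # 23
print(round(prob_shared_birthday(23), 3))  # 0.507
```

Calling smallest_group with a different threshold is one way to check your answer to the exercise.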
Fair Coin Out of an Unfair One
We all know how to make a fair decision with an unbiased coin. Now, suppose you have a biased coin and you want to make a similar fair decision with this one. How could you do it?
Answer: Suppose our biased coin has 0.7 probability for heads and 0.3 probability for tails. We now toss the coin twice and calculate the probabilities of all possible outcomes, namely, two heads (HH), first heads then tails (HT), first tails then heads (TH), and two tails (TT). Since each coin toss is an independent event, the probability of each of these outcomes is the product of its constituent probabilities: HH has probability 0.7×0.7=0.49, HT has 0.7×0.3=0.21, TH has 0.3×0.7=0.21, and TT has 0.3×0.3=0.09.
Since a first tails and a second heads (TH) is as probable as a first heads and a second tails (HT), you can use these two outcomes like the two sides of a fair coin to make a fair decision, i.e., TH will be the new tails and HT will be the new heads. If you get HH or TT instead, simply discard the result and toss the coin twice again until the two tosses differ. The elegance of this approach is that we do not need to know in advance how unfair the coin is.
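This trick is often attributed to John von Neumann, and it is easy to check with a quick simulation. The Python sketch below is only an illustration: the function names are made up, the 0.7 bias is the example value from the text, and pairs that come up HH or TT are thrown away and tossed again.

```python
import random


def biased_toss(rng, p_heads=0.7):
    """One toss of a biased coin that lands heads with probability p_heads."""
    return "H" if rng.random() < p_heads else "T"


def fair_toss(rng, p_heads=0.7):
    """Toss the biased coin in pairs until the two results differ."""
    while True:
        first = biased_toss(rng, p_heads)
        second = biased_toss(rng, p_heads)
        if first != second:
            # HT counts as the new heads, TH as the new tails.
            return first


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
heads = sum(fair_toss(rng) == "H" for _ in range(n))
print(heads / n)  # should be close to 0.5
```

Notice that fair_toss never uses the value of p_heads in its decision, which is exactly why you do not need to know how unfair the coin is. The price is extra tosses: the more biased the coin, the more often a pair is discarded.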
There are several other interesting insights and famous paradoxes in statistics and probability, such as the Monty Hall Problem, Simpson’s Paradox and the Inspection Paradox, that will be the topic of a future article. We will also see what Bayesian inference is really about through some simple practical examples.
Mehdi Sadeqi, PhD
Machine Learning Researcher and Practitioner