From: 3blue1brown
When evaluating online sellers based on their positive ratings and total reviews, a common intuition is that more data provides greater confidence in a given rating [00:00:28]. For example, a 100% positive rating with only 10 reviews feels less reliable than a 96% positive rating with 50 reviews, or even a 93% positive rating with 200 reviews [00:00:06]. The challenge is to quantify this intuition and rationally assess the trade-off between confidence gained from more data and a lower absolute percentage [00:00:43].
This problem is a modification of an example by John Cook in his blog post, “A Bayesian Review of Amazon Resellers” [00:00:54]. It provides a practical context to delve into core topics in probability and statistics [00:01:05].
Modeling the Situation
To model this scenario, each seller is considered to produce random experiences that are either positive or negative [00:03:06]. Each seller is assumed to have a constant underlying “success rate” (S) for giving a good experience [00:03:11]. The core challenge lies in the fact that this true success rate (S) is unknown [00:03:21].
For instance, a 100% positive rating from 10 reviews does not guarantee a true success rate of 100%; it could be 95% [00:03:25]. Simulations demonstrate that even with a 95% true success rate, approximately 60% of 10-review sequences would result in 10 out of 10 positive reviews [00:04:05]. The objective is to maximize the probability of having a positive experience, despite this uncertainty about the true success rate [00:04:21]. This requires reasoning about a “probability of probabilities” for various possible success rates between 0 and 1 [00:04:39].
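The roughly 60% figure can be checked with a short simulation. This is a minimal sketch, assuming independent reviews; the function name `fraction_perfect`, the trial count and the seed are illustrative choices, not taken from the source.

```python
import random

def fraction_perfect(true_rate, n_reviews, n_trials, seed=0):
    """Simulate n_trials review sequences and return the fraction that are all positive."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    perfect = 0
    for _ in range(n_trials):
        # A review is positive with probability true_rate, independently of the others.
        if all(rng.random() < true_rate for _ in range(n_reviews)):
            perfect += 1
    return perfect / n_trials

simulated = fraction_perfect(0.95, 10, 100_000)
exact = 0.95 ** 10  # independence lets the ten probabilities multiply
print(f"simulated: {simulated:.3f}, exact: {exact:.3f}")
```

The exact value 0.95^10 is about 0.599, and the simulated fraction should land close to it.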
This setup is relevant to many real-world situations where judgments must be made about a random process based on limited data, such as estimating car defect rates in a factory based on initial tests [00:05:03].
Probability of Data Given Success Rate
A fundamental question is: If the true success rate for a seller were known (e.g., 95%), how would one compute the probability of observing specific review data (e.g., 10 positive and 0 negative, or 48 positive and 2 negative)? [00:05:40]
While simulations can give an empirical sense of this distribution [00:06:03], an exact formula exists [00:06:49].
Consider the example of 48 positive reviews out of 50, with an assumed success rate (S) of 95% [00:06:21]. The probability is calculated as:
P(48 positive out of 50 | S=0.95) = (50 choose 48) * S^48 * (1-S)^2 [00:06:52]
Where:
- (50 choose 48) represents the total number of ways to have 48 positive reviews out of 50 [00:06:57]. This calculates to 1225 [00:07:29].
- S^48 is the probability that 48 particular reviews are all positive.
- (1-S)^2 is the probability that the remaining 2 reviews are both negative.
This formula assumes each review is independent [00:07:45]. For S=0.95, this calculation yields approximately 0.261 (26.1%), matching empirical simulations [00:07:52].
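The calculation above can be reproduced in a few lines. This is a sketch using the standard library's `math.comb`; the helper name `binomial_pmf` is hypothetical and chosen only for illustration.

```python
from math import comb

def binomial_pmf(k, n, s):
    """Probability of exactly k positives in n independent reviews with success rate s."""
    return comb(n, k) * s**k * (1 - s)**(n - k)

ways = comb(50, 48)             # number of ways to place 48 positives among 50 reviews
p = binomial_pmf(48, 50, 0.95)
print(ways)                     # 1225
print(round(p, 3))              # 0.261
```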
The Binomial Distribution
The distribution described above is known as the binomial distribution [00:08:14]. It is a fundamental distribution in probability that applies when:
- There’s a random event with two possible outcomes (like a coin flip or a review being positive/negative) [00:08:23].
- The event is repeated a fixed number of times [00:08:26].
- The goal is to find the probability of various total counts for one of the outcomes [00:08:31].
As a function of the success rate S, the probability curve for k positive reviews out of n total reviews takes the form: Constant * S^k * (1-S)^(n-k) [00:10:14].
For 48 positive reviews out of 50, the probability of seeing this data peaks when the assumed success rate S is 0.96 (or 96%) [00:09:25]. This intuitively makes sense, as the observed rating is 96% [00:09:30]. If S approaches 1, the probability of observing two negative reviews goes to zero [00:09:41]. Similarly, if S is much lower (e.g., 0.8), getting 48 out of 50 positive reviews becomes exceedingly rare (about 1 in 1000 times) [00:09:56].
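These three observations can be checked by evaluating the binomial formula over a grid of candidate success rates. This is a rough sketch; the grid resolution of 0.001 and the function name `likelihood` are assumptions made for illustration.

```python
from math import comb

def likelihood(s, k=48, n=50):
    """Probability of k positives out of n, viewed as a function of the success rate s."""
    return comb(n, k) * s**k * (1 - s)**(n - k)

grid = [i / 1000 for i in range(1001)]  # candidate success rates 0.000, 0.001, ..., 1.000
best_s = max(grid, key=likelihood)      # grid point where the data is most probable

print(best_s)            # peak at the observed rating, 0.96
print(likelihood(1.0))   # 0.0: two negative reviews are impossible when s = 1
print(likelihood(0.8))   # roughly 0.001
```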
The Challenge of Interpretation
While this formula gives the probability of seeing the data given an assumed success rate, the ultimate goal is to determine the probability of a success rate given the fixed data observed [00:08:36]. These are related but distinct concepts [00:08:51].
For instance, with 10 out of 10 positive reviews, the binomial formula simplifies to S^10 [00:11:05]. This function continuously increases as S approaches 1 [00:11:21]. Even if S=1 maximizes this probability, it doesn’t mean one can confidently say there’s a 100% probability of a good experience [00:11:32].
To bridge this gap and get to the probability of S given the data, concepts like Bayes’ Rule and probability density functions are necessary [00:11:51].
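To give a feel for what such a calculation might look like, here is a minimal numerical sketch of Bayes’ Rule on a grid of candidate success rates. It assumes a uniform prior over S, meaning every success rate is treated as equally plausible before seeing any reviews. That prior is an assumption made here for illustration and is not stated in the text above, and the grid size is arbitrary.

```python
# Assumption (not from the source): a uniform prior over S on [0, 1].
n_grid = 10_000
grid = [(i + 0.5) / n_grid for i in range(n_grid)]  # midpoints of equal-width bins

k, n = 10, 10  # 10 positive reviews out of 10
likelihood = [s**k * (1 - s)**(n - k) for s in grid]

# Bayes' rule on the grid: posterior weight is proportional to prior * likelihood.
# The uniform prior is a constant, so it cancels when normalizing.
total = sum(likelihood)
posterior = [w / total for w in likelihood]

posterior_mean = sum(s * w for s, w in zip(grid, posterior))
prob_above_95 = sum(w for s, w in zip(grid, posterior) if s > 0.95)
print(round(posterior_mean, 3), round(prob_above_95, 3))
```

Under this assumed prior, the average success rate given 10 out of 10 positive reviews comes out near 11/12 rather than 1, and a sizeable share of the weight sits below 0.95.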
Laplace’s Rule of Succession
A simple practical rule, known as Laplace’s rule of succession and dating back to the 18th century, is to estimate the probability of a good experience by pretending there were two more reviews: one positive and one negative [00:01:46].
For the initial examples:
- Seller 1 (10 positive out of 10 reviews): Pretend it was 11 out of 12, yielding 91.7% [00:01:55].
- Seller 2 (48 positive out of 50 reviews): Pretend it was 49 out of 52, yielding 94.2% [00:02:08].
- Seller 3 (186 positive out of 200 reviews): Pretend it was 187 out of 202, yielding 92.6% [00:02:25].
According to this rule, Seller 2 would be the best choice [00:02:34]. Understanding the underlying assumptions and how different goals can change this choice requires further mathematical exploration [00:02:42].
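The rule is easy to apply in code. This is a small sketch using the three sellers above; the function name `laplace_estimate` and the dictionary layout are illustrative.

```python
def laplace_estimate(positive, total):
    """Rule of succession: add one positive and one negative review, then take the ratio."""
    return (positive + 1) / (total + 2)

# (positive reviews, total reviews) for the three example sellers
sellers = {"Seller 1": (10, 10), "Seller 2": (48, 50), "Seller 3": (186, 200)}
estimates = {name: laplace_estimate(p, t) for name, (p, t) in sellers.items()}

for name, est in estimates.items():
    print(f"{name}: {est:.1%}")   # 91.7%, 94.2%, 92.6%

best = max(estimates, key=estimates.get)
print(best)                        # Seller 2
```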