From: 3blue1brown

When evaluating online sellers based on their positive ratings and total reviews, a common intuition is that more data provides greater confidence in a given rating [00:00:28]. For example, a 100% positive rating with only 10 reviews feels less reliable than a 96% positive rating with 50 reviews, or even a 93% positive rating with 200 reviews [00:00:06]. The challenge is to quantify this intuition and rationally assess the trade-off between confidence gained from more data and a lower absolute percentage [00:00:43].

This problem is a modification of an example by John Cook in his blog post, “A Bayesian view of Amazon Resellers” [00:00:54]. It provides a practical context for exploring core topics in probability and statistics [00:01:05].

Modeling the Situation

To model this scenario, each seller is considered to produce random experiences that are either positive or negative [00:03:06]. Each seller is assumed to have a constant underlying “success rate” (S) for giving a good experience [00:03:11]. The core challenge lies in the fact that this true success rate (S) is unknown [00:03:21].

For instance, a 100% positive rating from 10 reviews does not guarantee a true success rate of 100%; the true rate could just as well be 95% [00:03:25]. Simulations demonstrate that even with a 95% true success rate, approximately 60% of 10-review sequences come out with 10 out of 10 positive reviews, in line with the exact value 0.95^10 ≈ 0.60 [00:04:05]. The objective is to maximize the probability of having a positive experience, despite this uncertainty about the true success rate [00:04:21]. This requires reasoning about a “probability of probabilities” over the possible success rates between 0 and 1 [00:04:39].
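The 60% figure can be checked with a short simulation. This is a minimal sketch, not the code used in the video: the function name `fraction_perfect`, the trial count, and the fixed seed are all illustrative assumptions.

```python
import random

def fraction_perfect(true_rate, n_reviews, n_trials, seed=0):
    """Fraction of simulated review sequences that come out all positive."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    perfect = 0
    for _ in range(n_trials):
        # Each review is positive with probability true_rate, independently.
        if all(rng.random() < true_rate for _ in range(n_reviews)):
            perfect += 1
    return perfect / n_trials

simulated = fraction_perfect(0.95, 10, 100_000)
exact = 0.95 ** 10  # probability that ten independent reviews are all positive
print(f"simulated: {simulated:.3f}, exact: {exact:.3f}")
```

The simulated fraction should land close to the exact value of about 0.599, so a perfect 10-for-10 record is quite compatible with a seller whose true success rate is 95%.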

This setup is relevant to many real-world situations where judgments must be made about a random process based on limited data, such as estimating car defect rates in a factory based on initial tests [00:05:03].

Probability of Data Given Success Rate

A fundamental question is: If the true success rate for a seller were known (e.g., 95%), how would one compute the probability of observing specific review data (e.g., 10 positive and 0 negative, or 48 positive and 2 negative)? [00:05:40]

While simulations can give an empirical sense of this distribution [00:06:03], an exact formula exists [00:06:49].

Consider the example of 48 positive reviews out of 50, with an assumed success rate (S) of 95% [00:06:21]. The probability is calculated as:

P(48 positive out of 50 | S=0.95) = (50 choose 48) * S^48 * (1-S)^2 [00:06:52]

Where:

  • (50 choose 48) counts the number of ways to choose which 48 of the 50 reviews are positive [00:06:57]. It equals 1225 [00:07:29].
  • S^48 is the probability that those 48 particular reviews are all positive.
  • (1-S)^2 is the probability that the remaining 2 reviews are both negative.

This formula assumes each review is independent [00:07:45]. For S=0.95, this calculation yields approximately 0.261 (26.1%), matching empirical simulations [00:07:52].
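The arithmetic above can be reproduced directly. The following is a minimal sketch using Python's standard library; the function name `binomial_probability` is an illustrative choice, not something from the source.

```python
from math import comb

def binomial_probability(k, n, s):
    """Probability of exactly k positive reviews out of n, assuming each
    review is independently positive with probability s."""
    return comb(n, k) * s**k * (1 - s)**(n - k)

print(comb(50, 48))                                   # 1225
print(round(binomial_probability(48, 50, 0.95), 3))   # 0.261
print(round(binomial_probability(10, 10, 0.95), 3))   # 0.599
```

The last line applies the same formula to the 10-out-of-10 case, where the binomial coefficient is 1 and the expression reduces to S^10.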

The Binomial Distribution

The distribution described above is known as the binomial distribution [00:08:14]. It is a fundamental distribution in probability that applies when:

  • There’s a random event with two possible outcomes (like a coin flip or a review being positive/negative) [00:08:23].
  • The event is repeated a fixed number of times [00:08:26].
  • The goal is to find the probability of various total counts for one of the outcomes [00:08:31].

As a function of the success rate S, the probability curve for k positive reviews out of n total reviews takes the form: Constant * S^k * (1-S)^(n-k) [00:10:14].

For 48 positive reviews out of 50, the probability of seeing this data peaks when the assumed success rate S is 0.96 (or 96%) [00:09:25]. This makes intuitive sense, as the observed rating is 96% [00:09:30]. As S approaches 1, the probability of observing two negative reviews goes to zero, and so does the probability of the data [00:09:41]. Similarly, if S is considerably lower (e.g., 0.8), getting 48 out of 50 positive reviews becomes rare, happening roughly 1 time in 1000 [00:09:56].
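One way to see this behavior is to hold the data fixed at 48 out of 50 and evaluate the formula over a grid of candidate success rates. This is a sketch under assumptions: the name `likelihood` and the grid spacing of 0.001 are illustrative and do not come from the source.

```python
from math import comb

def likelihood(s, k=48, n=50):
    """Probability of k positive reviews out of n, viewed as a function of s."""
    return comb(n, k) * s**k * (1 - s)**(n - k)

# Candidate success rates 0.000, 0.001, ..., 1.000
grid = [i / 1000 for i in range(1001)]
peak = max(grid, key=likelihood)

print(peak)             # 0.96
print(likelihood(0.8))  # roughly 0.0011, about 1 in 1000
print(likelihood(1.0))  # 0.0, since two negative reviews would be impossible
```

The values printed here are probabilities of the data for each assumed S, not probabilities of S itself.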

The Challenge of Interpretation

While this formula gives the probability of seeing the data given an assumed success rate, the ultimate goal is to determine the probability of a success rate given the fixed data observed [00:08:36]. These are related but distinct concepts [00:08:51].

For instance, with 10 out of 10 positive reviews, the binomial formula simplifies to S^10 [00:11:05]. This function keeps increasing all the way up to S=1 [00:11:21]. Although S=1 maximizes the probability of seeing the data, that doesn’t mean one can confidently say there’s a 100% probability of a good experience [00:11:32].

To bridge this gap and get to the probability of S given the data, concepts like Bayes’ Rule and probability density functions are necessary [00:11:51].

Laplace’s Rule of Succession

A simple practical rule, known as Laplace’s rule of succession and dating back to the 18th century, estimates the probability of a good experience by pretending there were two more reviews, one positive and one negative, and taking the resulting fraction of positive reviews [00:01:46].

For the initial examples:

  • Seller 1 (10 positive out of 10 reviews): pretend it was 11 out of 12, yielding 91.7% [00:01:55].
  • Seller 2 (48 positive out of 50 reviews): pretend it was 49 out of 52, yielding 94.2% [00:02:08].
  • Seller 3 (186 positive out of 200 reviews): pretend it was 187 out of 202, yielding 92.6% [00:02:25].

According to this rule, Seller 2 would be the best choice [00:02:34]. Understanding the underlying assumptions and how different goals can change this choice requires further mathematical exploration [00:02:42].
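The rule is simple enough to apply in a few lines. A minimal sketch for the three sellers, where the name `laplace_estimate` and the dictionary layout are illustrative assumptions:

```python
def laplace_estimate(positive, total):
    """Laplace's rule of succession: add one positive and one negative review."""
    return (positive + 1) / (total + 2)

# (positive reviews, total reviews) for each seller in the example
sellers = {"Seller 1": (10, 10), "Seller 2": (48, 50), "Seller 3": (186, 200)}

for name, (positive, total) in sellers.items():
    print(f"{name}: {laplace_estimate(positive, total):.1%}")
# Seller 1: 91.7%
# Seller 2: 94.2%
# Seller 3: 92.6%

best = max(sellers, key=lambda name: laplace_estimate(*sellers[name]))
print(best)  # Seller 2
```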