Probability Distributions — Diagnostic Tests

Unit Tests

Tests edge cases, boundary conditions, and common misconceptions for probability distributions.

UT-1: Binomial Distribution — Verification of Conditions

Question:

A student claims that the number of heads in 20 coin tosses follows a binomial distribution with $n = 20$ and $p = 0.5$.

A second student claims that the number of sixes in 60 rolls of a fair die follows a binomial distribution with $n = 60$ and $p = \frac{1}{6}$.

(a) Verify the conditions for the binomial distribution for each scenario.

(b) For the coin toss scenario, find $P(X = 10)$ and explain why $X = 10$ is the mode.

(c) A third student claims the number of hearts drawn from a standard deck (with replacement, 20 draws) follows a binomial distribution with $n = 20$ and $p = \frac{1}{4}$. Is this correct?

[Difficulty: hard. Tests verification of binomial conditions, mode identification, and a tricky case with card draws.]

Solution:

(a) For the coin tosses:

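  1. Fixed number of trials: $n = 20$. (Yes)
  2. Independent trials: coin tosses are independent. (Yes)
  3. Two outcomes per trial: heads or tails. (Yes)
  4. Constant probability of success: $P(\text{heads}) = 0.5$ on every toss, assuming a fair coin. (Yes)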

The claim is correct.

For the die rolls:

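  1. Fixed number of trials: $n = 60$. (Yes)
  2. Independent trials: die rolls are independent. (Yes)
  3. Two outcomes per trial: six or not-six. (Yes)
  4. Constant probability of success: $P(\text{six}) = \frac{1}{6}$ on every roll, since the die is fair. (Yes)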

The claim is correct.

(b) For $X \sim \mathrm{Bin}(20, 0.5)$:

$$P(X = 10) = \binom{20}{10}\left(\frac{1}{2}\right)^{20} = \frac{184756}{1048576} \approx 0.176$$

When $(n+1)p$ is not an integer, the mode of a binomial distribution is $\lfloor (n+1)p \rfloor$. Here $\lfloor 21 \times 0.5 \rfloor = \lfloor 10.5 \rfloor = 10$, confirming that $X = 10$ is the mode.
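The value of $P(X = 10)$ and the mode can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `binom_pmf` is an illustrative choice and does not come from the original material.

```python
from math import comb, floor

def binom_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.5

# Full probability mass function for k = 0, 1, ..., n.
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mode found by brute-force search, and by the floor((n + 1) p) formula.
mode_by_search = max(range(n + 1), key=lambda k: pmf[k])
mode_by_formula = floor((n + 1) * p)

print(round(pmf[10], 4))                # 0.1762
print(mode_by_search, mode_by_formula)  # 10 10
```

The brute-force search is included only as an independent check on the formula; for large $n$ the formula alone is enough.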

(c) The conditions are:

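  1. Fixed number of trials: $n = 20$ draws. (Yes)
  2. Independent trials: draws are made with replacement. (Yes)
  3. Two outcomes per trial: heart or not-heart. (Yes)
  4. Constant probability of success: the card is replaced after each draw, so $P(\text{heart}) = \frac{13}{52} = \frac{1}{4}$ on every draw. (Yes)

The claim is correct: $X \sim \mathrm{Bin}(20, \frac{1}{4})$. Without replacement the draws would be dependent and the binomial model would not apply.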


UT-2: Normal Distribution — Sign Error in Standardisation

Question:

The heights of men in a population follow $N(175, 8)$, where the second parameter is the standard deviation in cm.

(a) Find the probability that a randomly selected man is taller than $185\,\mathrm{cm}$.

(b) A student computes $P(X \gt 185) = P\!\left(Z \gt \frac{185 - 175}{8}\right) = P(Z \gt 1.25)$ and looks up $P(Z \gt 1.25) = 0.8944$. A second student claims the answer should be $1 - 0.8944 = 0.1056$. Who is correct?

[Difficulty: hard. Tests the common sign error in normal distribution problems.]

Solution:

(a) Standardising:

$$Z = \frac{X - 175}{8} = \frac{185 - 175}{8} = 1.25$$

$$P(X \gt 185) = P(Z \gt 1.25) = 1 - \Phi(1.25)$$

From standard normal tables: $\Phi(1.25) = 0.8944$.

$$P(X \gt 185) = 1 - 0.8944 = 0.1056$$

(b) The second student is correct. The student who got $0.8944$ looked up $P(Z \lt 1.25)$, which gives the probability of being shorter than $185\,\mathrm{cm}$, not taller. The question asks for $P(X \gt 185)$, so the answer is $1 - 0.8944 = 0.1056$.
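The two tail probabilities can be compared directly in code. This is a small sketch using `statistics.NormalDist` from the Python standard library, assuming (as the solution does) that 8 is the standard deviation; the variable names are illustrative.

```python
from statistics import NormalDist

# Heights with mean 175 cm and standard deviation 8 cm.
heights = NormalDist(mu=175, sigma=8)

# Standardised value for 185 cm.
z = (185 - heights.mean) / heights.stdev

# The cdf gives the lower tail P(X < 185); the upper tail is its complement.
p_shorter = heights.cdf(185)
p_taller = 1 - p_shorter

print(z)                                        # 1.25
print(round(p_shorter, 4), round(p_taller, 4))  # 0.8944 0.1056
```

The table value $0.8944$ is the lower tail, which is why it must be subtracted from 1 to answer a "taller than" question.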


UT-3: Poisson Approximation to Binomial

Question:

A call centre receives an average of 2 calls per minute. Find the probability of receiving exactly 5 calls in a one-minute period using the Poisson approximation to the binomial distribution.

A student claims this follows $\mathrm{Poi}(2)$ directly without justification.

(a) Explain what assumptions must be verified.

(b) Compute the probability and compare it with the exact binomial probability if $n = 200$, $p = 0.01$.

[Difficulty: hard. Tests the conditions for Poisson approximation and comparison with exact binomial.]

Solution:

(a) For the Poisson approximation to the binomial, we need:

  1. $n$ is large (typically $n \geq 50$).
  2. $p$ is small (typically $p \leq 0.1$).
  3. $np$ is moderate (typically $np \leq 15$).

For the call centre, if we model each second as a Bernoulli trial with $p = \frac{2}{60} = \frac{1}{30}$ and $n = 60$, then $np = 2$. The conditions are satisfied since $n = 60 \geq 50$, $p = \frac{1}{30} \lt 0.1$, and $np = 2 \leq 15$.

However, the student's claim that this is "directly $\mathrm{Poi}(2)$" is incomplete: the Poisson is an approximation that must be justified.
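The rule-of-thumb conditions above can be written as a simple check. This is only a sketch: the function name `poisson_approx_reasonable` is hypothetical, and the thresholds are the typical values quoted in this solution, not a formal test.

```python
def poisson_approx_reasonable(n: int, p: float) -> bool:
    """Rule-of-thumb check: n >= 50, p <= 0.1 and np <= 15."""
    return n >= 50 and p <= 0.1 and n * p <= 15

# Call centre modelled as 60 one-second trials with p = 1/30.
call_centre = poisson_approx_reasonable(60, 1 / 30)

# Parameters used in part (b).
part_b = poisson_approx_reasonable(200, 0.01)

# A counter-example: 20 fair coin tosses (n too small, p too large).
coin_tosses = poisson_approx_reasonable(20, 0.5)

print(call_centre, part_b, coin_tosses)  # True True False
```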

(b) With $\lambda = np = 200 \times 0.01 = 2$:

Poisson: $P(X = 5) = \dfrac{e^{-2} \cdot 2^5}{5!} = \dfrac{32e^{-2}}{120} = \dfrac{4e^{-2}}{15} \approx \dfrac{4 \times 0.1353}{15} \approx 0.0361$.

Exact binomial: $P(X = 5) = \binom{200}{5}(0.01)^5(0.99)^{195} \approx 0.0357$.

The approximation is close: the relative error is about $1\%$.
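The comparison in part (b) can be reproduced numerically. The sketch below uses only the Python standard library and illustrative variable names; it prints both probabilities and their relative error so the closeness of the approximation can be checked directly.

```python
from math import comb, exp, factorial

n, p, k = 200, 0.01, 5
lam = n * p  # Poisson parameter, lambda = np = 2

# Exact binomial probability P(X = 5) for X ~ Bin(200, 0.01).
binom_exact = comb(n, k) * p**k * (1 - p) ** (n - k)

# Poisson approximation P(X = 5) for X ~ Poi(2).
poisson_approx = exp(-lam) * lam**k / factorial(k)

# Relative error of the approximation, measured against the exact value.
rel_error = abs(poisson_approx - binom_exact) / binom_exact

print(f"{binom_exact:.4f} {poisson_approx:.4f}")  # 0.0357 0.0361
print(f"{rel_error:.2%}")
```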


Integration Tests

Tests synthesis of probability distributions with other topics.

IT-1: Linear Combination of Normal Random Variables (with Number and Algebra)

Question:

The weights (in grams) of apples from orchard $A$ follow $N(150, 12)$ and from orchard $B$ follow $N(140, 15)$, where the second parameter is the standard deviation. A bag contains 3 apples from orchard $A$ and 2 apples from orchard $B$.

(a) Find the probability that the total weight of the bag exceeds $750\,\mathrm{g}$.

(b) A student claims that since the apples are independent, the total weight is simply $3 \times 150 + 2 \times 140 = 730\,\mathrm{g}$ and the probability of exceeding $750\,\mathrm{g}$ is $50\%$ since $750$ is close to the mean. Explain why this reasoning is wrong.

[Difficulty: hard. Combines linear combinations of normal distributions with probability calculations.]

Solution:

(a) Let $A_i \sim N(150, 12)$ for $i = 1, 2, 3$ and $B_j \sim N(140, 15)$ for $j = 1, 2$.

Total weight: $T = A_1 + A_2 + A_3 + B_1 + B_2$.

Since the apples are independent:

$$E(T) = 3(150) + 2(140) = 450 + 280 = 730$$

$$\mathrm{Var}(T) = 3(12^2) + 2(15^2) = 3(144) + 2(225) = 432 + 450 = 882$$

$$T \sim N(730, \sqrt{882}) \approx N(730, 29.7)$$

$$P(T \gt 750) = P\!\left(Z \gt \frac{750 - 730}{\sqrt{882}}\right) = P\!\left(Z \gt \frac{20}{29.7}\right) = P(Z \gt 0.673)$$

$$= 1 - \Phi(0.673) \approx 1 - 0.7495 = 0.2505$$
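The calculation in part (a) can be checked with a short sketch using `statistics.NormalDist` from the Python standard library, assuming independent apples and treating 12 and 15 as standard deviations, as the solution does. The variable names are illustrative. Because the code does not round $z$ before evaluating the distribution function, the result may differ from the table-based value in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Means add; for independent variables, variances add (not standard deviations).
mean_total = 3 * 150 + 2 * 140
var_total = 3 * 12**2 + 2 * 15**2

# Distribution of the total weight T of the bag.
total = NormalDist(mu=mean_total, sigma=sqrt(var_total))

# Upper-tail probability P(T > 750).
p_exceeds = 1 - total.cdf(750)

print(mean_total, var_total)  # 730 882
print(round(p_exceeds, 2))    # 0.25
```

Note that the variance of three independent apples is $3 \times 12^2$, not $(3 \times 12)^2$; the latter would apply only to three times the weight of a single apple.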

(b) The student confuses the mean of the total with its distribution. The mean total weight is indeed $730\,\mathrm{g}$, but the total is a random variable with standard deviation $\approx 29.7\,\mathrm{g}$. For a normal distribution, $P(T \gt t) = 50\%$ only when $t$ equals the mean, $730\,\mathrm{g}$. The threshold $750\,\mathrm{g}$ lies about $0.67$ standard deviations above the mean, so the probability of exceeding it is approximately $25\%$, not $50\%$.