
Probability Distributions

Discrete Random Variables

Definition

A discrete random variable XX is a function that assigns a numerical value to each outcome in a countable sample space. The set of possible values is finite or countably infinite: for example, {0,1,2,,n}\{0, 1, 2, \ldots, n\} or {0,1,2,}\{0, 1, 2, \ldots\}.

Probability Mass Function (PMF)

The probability mass function of XX is p(x)=P(X=x)p(x) = P(X = x), assigning a probability to each possible value. It must satisfy:

  1. p(x)0p(x) \ge 0 for all xx
  2. allxp(x)=1\displaystyle\sum_{\mathrm{all } x} p(x) = 1

Cumulative Distribution Function (CDF)

F(x)=P(Xx)=txp(t)F(x) = P(X \le x) = \sum_{t \le x} p(t)

The CDF is non-decreasing and right-continuous, with F()=0F(-\infty) = 0 and F()=1F(\infty) = 1. For a discrete variable it is a step function with jumps at each value in the range of XX. The size of each jump at x=ax = a equals P(X=a)P(X = a).

Expected Value

The expected value (mean) of XX is the probability-weighted average of all possible values:

E(X)=μ=allxxp(x)E(X) = \mu = \sum_{\mathrm{all } x} x \cdot p(x)

This represents the long-run average if the experiment is repeated many times. For a function g(X)g(X):

E(g(X))=allxg(x)p(x)E(g(X)) = \sum_{\mathrm{all } x} g(x) \cdot p(x)

A critical special case is E(X2)=x2p(x)E(X^2) = \sum x^2 p(x).

Variance and Standard Deviation

Var(X)=σ2=E ⁣[(Xμ)2]=allx(xμ)2p(x)\mathrm{Var}(X) = \sigma^2 = E\!\left[(X - \mu)^2\right] = \sum_{\mathrm{all } x} (x - \mu)^2 \cdot p(x)

The computational formula is almost always more convenient:

Var(X)=E(X2)[E(X)]2\mathrm{Var}(X) = E(X^2) - [E(X)]^2

The standard deviation is σ=Var(X)\sigma = \sqrt{\mathrm{Var}(X)}. It has the same units as XX and measures the typical distance of values from the mean.

Properties of Expectation and Variance

For any constant aa and random variable XX:

E(a) = a, \quad E(aX) = aE(X), \quad E(X + a) = E(X) + a

\mathrm{Var}(a) = 0, \quad \mathrm{Var}(aX) = a^2 \mathrm{Var}(X), \quad \mathrm{Var}(X + a) = \mathrm{Var}(X)

Adding a constant shifts the distribution but does not change its spread. Multiplying by aa scales the spread by a|a|.

Example

A discrete random variable XX has PMF:

x          0     1     2     3
P(X = x)   0.1   0.4   0.3   0.2

E(X) = 0(0.1) + 1(0.4) + 2(0.3) + 3(0.2) = 1.6

E(X^2) = 0(0.1) + 1(0.4) + 4(0.3) + 9(0.2) = 3.4

\mathrm{Var}(X) = 3.4 - 1.6^2 = 3.4 - 2.56 = 0.84, \quad \sigma = \sqrt{0.84} \approx 0.917

Example: Finding an unknown parameter

P(X=x)=kxP(X = x) = kx for x=1,2,3,4x = 1, 2, 3, 4. Find kk and E(X)E(X).

k(1+2+3+4)=1    10k=1    k=0.1k(1 + 2 + 3 + 4) = 1 \implies 10k = 1 \implies k = 0.1

E(X)=1(0.1)+2(0.2)+3(0.3)+4(0.4)=0.1+0.4+0.9+1.6=3.0E(X) = 1(0.1) + 2(0.2) + 3(0.3) + 4(0.4) = 0.1 + 0.4 + 0.9 + 1.6 = 3.0

Worked Example: E(X) and Var(X) from a Table

A random variable XX has the following PMF:

x          1     2     3     4     5
P(X = x)   0.1   0.2   0.3   0.25  0.15

E(X)=1(0.1)+2(0.2)+3(0.3)+4(0.25)+5(0.15)=0.1+0.4+0.9+1.0+0.75=3.15E(X) = 1(0.1) + 2(0.2) + 3(0.3) + 4(0.25) + 5(0.15) = 0.1 + 0.4 + 0.9 + 1.0 + 0.75 = 3.15

E(X2)=1(0.1)+4(0.2)+9(0.3)+16(0.25)+25(0.15)=0.1+0.8+2.7+4.0+3.75=11.35E(X^2) = 1(0.1) + 4(0.2) + 9(0.3) + 16(0.25) + 25(0.15) = 0.1 + 0.8 + 2.7 + 4.0 + 3.75 = 11.35

Var(X)=11.353.152=11.359.9225=1.4275\mathrm{Var}(X) = 11.35 - 3.15^2 = 11.35 - 9.9225 = 1.4275

σ=1.42751.195\sigma = \sqrt{1.4275} \approx 1.195
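The table calculation above is easy to verify numerically. A minimal Python sketch (the variable names are illustrative, not from the source):

```python
# Verify the worked example: E(X), E(X^2), Var(X), and sigma from the PMF table.
xs = [1, 2, 3, 4, 5]
ps = [0.1, 0.2, 0.3, 0.25, 0.15]

assert abs(sum(ps) - 1) < 1e-12                  # valid PMF: probabilities sum to 1

mean = sum(x * p for x, p in zip(xs, ps))        # E(X)
mean_sq = sum(x**2 * p for x, p in zip(xs, ps))  # E(X^2)
var = mean_sq - mean**2                          # computational formula
sd = var ** 0.5

print(round(mean, 2), round(mean_sq, 2), round(var, 4), round(sd, 3))
# 3.15 11.35 1.4275 1.195
```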


Binomial Distribution

Conditions

A random variable XX follows a binomial distribution, XB(n,p)X \sim B(n, p), when all four conditions hold:

  1. Fixed number of trials: exactly nn identical trials.
  2. Independent trials: each trial's outcome does not affect any other.
  3. Two outcomes: each trial yields success (probability pp) or failure (probability q=1pq = 1-p).
  4. Constant probability: pp is the same for every trial.

XX counts the number of successes in nn trials.

Probability Mass Function

P(X=x)=(nx)px(1p)nx,x=0,1,2,,nP(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n

where (nx)=n!x!(nx)!\dbinom{n}{x} = \dfrac{n!}{x!(n-x)!} counts the arrangements of xx successes among nn trials.

Mean and Variance

E(X)=np,Var(X)=np(1p),σ=np(1p)E(X) = np, \quad \mathrm{Var}(X) = np(1-p), \quad \sigma = \sqrt{np(1-p)}
Derivation of E(X)=npE(X) = np and Var(X)=np(1p)\mathrm{Var}(X) = np(1-p)

Let X1,,XnX_1, \ldots, X_n be indicator variables: Xi=1X_i = 1 if trial ii succeeds, Xi=0X_i = 0 otherwise. Then X=X1++XnX = X_1 + \cdots + X_n.

E(Xi)=1p+0(1p)=pE(X_i) = 1 \cdot p + 0 \cdot (1-p) = p, so E(X)=npE(X) = np by linearity of expectation.

Var(Xi)=E(Xi2)[E(Xi)]2=pp2=p(1p)\mathrm{Var}(X_i) = E(X_i^2) - [E(X_i)]^2 = p - p^2 = p(1-p), so Var(X)=np(1p)\mathrm{Var}(X) = np(1-p) by independence.

Shape

  • p=0.5p = 0.5: symmetric about npnp.
  • p<0.5p \lt 0.5: positively skewed (right tail longer).
  • p>0.5p \gt 0.5: negatively skewed (left tail longer).

As nn increases the distribution approaches a bell shape (by the Central Limit Theorem). The mode of B(n,p)B(n, p) is at (n+1)p\lfloor (n+1)p \rfloor.

Cumulative Probabilities

On a GDC, P(Xk)P(X \le k) is computed directly. For "at least" problems, use the complement:

P(Xk)=1P(Xk1)P(X \ge k) = 1 - P(X \le k - 1)

Normal Approximation to the Binomial

When nn is large and pp is not too close to 0 or 1 (rule of thumb: np5np \ge 5 and n(1p)5n(1-p) \ge 5), the binomial can be approximated by the normal with matching mean and variance:

B(n,p)N(np,np(1p))B(n, p) \approx N(np, np(1-p))

A continuity correction is required. For example:

P(X \le k) \approx P\!\left(Z \le \frac{k + 0.5 - np}{\sqrt{np(1-p)}}\right)

P(X = k) \approx P\!\left(\frac{k - 0.5 - np}{\sqrt{np(1-p)}} \lt Z \lt \frac{k + 0.5 - np}{\sqrt{np(1-p)}}\right)
Example

A factory produces bulbs with 3% defect rate. XB(20,0.03)X \sim B(20, 0.03) is the number of defects in a sample of 20.

P(X=2)=(202)(0.03)2(0.97)18=190×0.0009×0.57810.0988P(X = 2) = \binom{20}{2}(0.03)^2(0.97)^{18} = 190 \times 0.0009 \times 0.5781 \approx 0.0988

P(X1)=(0.97)20+20(0.03)(0.97)190.5438+0.33640.8802P(X \le 1) = (0.97)^{20} + 20(0.03)(0.97)^{19} \approx 0.5438 + 0.3364 \approx 0.8802

P(X \ge 3) = 1 - P(X \le 2) = 1 - (0.8802 + 0.0988) = 1 - 0.9790 = 0.0210

E(X)=20(0.03)=0.6E(X) = 20(0.03) = 0.6, σ=20(0.03)(0.97)=0.5820.763\sigma = \sqrt{20(0.03)(0.97)} = \sqrt{0.582} \approx 0.763
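The binomial probabilities in this example can be checked directly with `math.comb`. A short sketch (the helper function is illustrative, not from the source):

```python
# Binomial probabilities for the defect example, X ~ B(20, 0.03).
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.03
p2 = binom_pmf(n, p, 2)                                 # P(X = 2)  ~ 0.0988
cdf1 = sum(binom_pmf(n, p, x) for x in range(2))        # P(X <= 1) ~ 0.8802
p_ge3 = 1 - sum(binom_pmf(n, p, x) for x in range(3))   # complement for "at least 3"

print(round(p2, 4), round(cdf1, 4), round(p_ge3, 4))
```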

Example: IB Paper 2 style

A multiple choice test has 15 questions with 5 options each. A student guesses all answers.

XB(15,0.2)X \sim B(15, 0.2).

P(X=4)=(154)(0.2)4(0.8)110.1876P(X = 4) = \binom{15}{4}(0.2)^4(0.8)^{11} \approx 0.1876

P(X8)=1P(X7)0.0042P(X \ge 8) = 1 - P(X \le 7) \approx 0.0042

To set a pass mark so that guessing gives at most 1% chance of passing:

P(X7)0.0181P(X \ge 7) \approx 0.0181 and P(X8)0.0042P(X \ge 8) \approx 0.0042, so the minimum pass mark is 8 correct.

Worked Example: Binomial Probability with Normal Approximation

A company manufactures light bulbs. On average, 8% are defective. A random sample of 100 bulbs is selected. Find the probability that more than 12 are defective.

Let XB(100,0.08)X \sim B(100, 0.08).

Check conditions for the normal approximation: np = 8 \ge 5 and n(1-p) = 92 \ge 5.

μ=100(0.08)=8,σ2=100(0.08)(0.92)=7.36,σ=2.713\mu = 100(0.08) = 8, \quad \sigma^2 = 100(0.08)(0.92) = 7.36, \quad \sigma = 2.713

With continuity correction:

P(X>12)=P(X13)P ⁣(Z>12.582.713)=P(Z>1.659)P(X \gt 12) = P(X \ge 13) \approx P\!\left(Z \gt \frac{12.5 - 8}{2.713}\right) = P(Z \gt 1.659)

1Φ(1.659)10.9515=0.0485\approx 1 - \Phi(1.659) \approx 1 - 0.9515 = 0.0485

There is approximately a 4.85% chance that more than 12 bulbs are defective.
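It is worth seeing how close the approximation actually is. The following sketch (assuming only the Python standard library; `phi` is a hypothetical helper built on `math.erf`) compares the continuity-corrected normal tail with the exact binomial sum:

```python
# Compare the normal approximation (with continuity correction) to the exact
# binomial tail for X ~ B(100, 0.08), P(X > 12).
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 100, 0.08
mu, sigma = n * p, sqrt(n * p * (1 - p))

approx = 1 - phi((12.5 - mu) / sigma)   # continuity correction: X >= 13 -> 12.5
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(13, n + 1))

print(round(approx, 4), round(exact, 4))
```

Because p = 0.08 makes the distribution noticeably skewed, the exact tail is a little larger than the normal approximation suggests.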


Poisson Distribution

Conditions

XPo(λ)X \sim \mathrm{Po}(\lambda) models the number of events in a fixed interval of time or space when:

  1. Events occur singly: no simultaneous events.
  2. Independence: events in non-overlapping intervals are independent.
  3. Constant rate: events occur at average rate λ\lambda per unit interval.
  4. Proportionality: the expected number of events is proportional to the length of the interval.

Probability Mass Function

P(X=x)=eλλxx!,x=0,1,2,P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots

where λ>0\lambda \gt 0 is the mean number of events, and e2.71828e \approx 2.71828.

Mean and Variance

E(X)=λ,Var(X)=λE(X) = \lambda, \quad \mathrm{Var}(X) = \lambda

That E(X)=Var(X)E(X) = \mathrm{Var}(X) is a distinguishing feature. If observed data has mean approximately equal to variance, a Poisson model may be appropriate.

Derivation of E(X)=λE(X) = \lambda and Var(X)=λ\mathrm{Var}(X) = \lambda

E(X)=x=0xeλλxx!=eλx=1λx(x1)!E(X) = \displaystyle\sum_{x=0}^{\infty} x \cdot \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^x}{(x-1)!}

Substituting k=x1k = x-1: =eλk=0λk+1k!=λeλeλ=λ= e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^{k+1}}{k!} = \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda.

For variance, use x2=x(x1)+xx^2 = x(x-1) + x: E(X2)=E[X(X1)]+E(X)=λ2+λE(X^2) = E[X(X-1)] + E(X) = \lambda^2 + \lambda, so Var(X)=λ2+λλ2=λ\mathrm{Var}(X) = \lambda^2 + \lambda - \lambda^2 = \lambda.

Poisson as a Limit of the Binomial

If nn \to \infty, p0p \to 0, while np=λnp = \lambda stays constant, then B(n,p)Po(λ)B(n, p) \to \mathrm{Po}(\lambda). The Poisson approximates the binomial when nn is large, pp is small, and npnp is moderate (typically n50n \ge 50, p0.1p \le 0.1).

Additivity

If XPo(λ1)X \sim \mathrm{Po}(\lambda_1) and YPo(λ2)Y \sim \mathrm{Po}(\lambda_2) are independent, then:

X+YPo(λ1+λ2)X + Y \sim \mathrm{Po}(\lambda_1 + \lambda_2)

If the rate is λ\lambda per unit interval, then over tt intervals the count is Po(tλ)\mathrm{Po}(t\lambda).

Example

A helpdesk receives λ=3.5\lambda = 3.5 calls per hour. XPo(3.5)X \sim \mathrm{Po}(3.5).

P(X = 5) = \dfrac{e^{-3.5} \cdot 3.5^5}{5!} \approx 0.132

P(X2)=e3.5 ⁣(1+3.5+12.252)=10.625e3.50.3208P(X \le 2) = e^{-3.5}\!\left(1 + 3.5 + \dfrac{12.25}{2}\right) = 10.625 \, e^{-3.5} \approx 0.3208

Over 2 hours: YPo(7)Y \sim \mathrm{Po}(7), P(Y>7)=1P(Y7)0.4013P(Y \gt 7) = 1 - P(Y \le 7) \approx 0.4013.

Example: Poisson approximation to Binomial

A typesetter makes errors at a rate of 1 per 500 characters. In a passage of 2000 characters, find the probability of at most 2 errors.

Exact: XB(2000,1/500)X \sim B(2000, 1/500), with λ=2000/500=4\lambda = 2000/500 = 4.

Approximate: XPo(4)X \approx \mathrm{Po}(4).

P(X2)=e4 ⁣(1+4+162)=13e40.2381P(X \le 2) = e^{-4}\!\left(1 + 4 + \dfrac{16}{2}\right) = 13e^{-4} \approx 0.2381

Using exact binomial: P(X2)=(499/500)2000+2000(1/500)(499/500)1999+(20002)(1/500)2(499/500)1998P(X \le 2) = (499/500)^{2000} + 2000(1/500)(499/500)^{1999} + \binom{2000}{2}(1/500)^2(499/500)^{1998}

This is computationally intensive but gives a result extremely close to 0.2381.
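The "computationally intensive" exact sum is no obstacle for a computer. A quick Python check of both values (names are illustrative):

```python
# Poisson approximation to the binomial for the typesetting example:
# X ~ B(2000, 1/500) approximated by Po(4); find P(X <= 2).
from math import comb, exp, factorial

n, p = 2000, 1 / 500
lam = n * p                    # lambda = 4

poisson = sum(exp(-lam) * lam**k / factorial(k) for k in range(3))
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))

print(round(poisson, 4), round(exact, 4))  # 0.2381 0.2378
```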

Worked Example: Poisson Distribution

A call centre receives calls at a rate of λ=4.2\lambda = 4.2 per 10-minute interval.

(a) Find the probability of receiving exactly 6 calls in a 10-minute interval.

P(X=6)=e4.24.266!=e4.2×5489.0720P(X = 6) = \frac{e^{-4.2} \cdot 4.2^6}{6!} = \frac{e^{-4.2} \times 5489.0}{720}

= 7.624 \times e^{-4.2} \approx 0.1143

(b) Find the probability of receiving at most 3 calls.

P(X3)=e4.2 ⁣(1+4.2+17.642+74.0886)=e4.2(1+4.2+8.82+12.348)P(X \le 3) = e^{-4.2}\!\left(1 + 4.2 + \frac{17.64}{2} + \frac{74.088}{6}\right) = e^{-4.2}(1 + 4.2 + 8.82 + 12.348)

=26.368×e4.20.3954= 26.368 \times e^{-4.2} \approx 0.3954

(c) Over a full hour (six intervals), find the probability of more than 30 calls.

Over one hour: YPo(6×4.2)=Po(25.2)Y \sim \mathrm{Po}(6 \times 4.2) = \mathrm{Po}(25.2).

Using the normal approximation (since λ\lambda is large):

μ=25.2,σ=25.2=5.020\mu = 25.2, \quad \sigma = \sqrt{25.2} = 5.020

P(Y>30)P ⁣(Z>30.525.25.020)=P(Z>1.056)0.1455P(Y \gt 30) \approx P\!\left(Z \gt \frac{30.5 - 25.2}{5.020}\right) = P(Z \gt 1.056) \approx 0.1455


Normal Distribution

Definition and Properties

XN(μ,σ2)X \sim N(\mu, \sigma^2) has probability density function:

f(x)=1σ2πe(xμ)22σ2,<x<f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty \lt x \lt \infty

Key properties: bell-shaped, symmetric about x=μx = \mu, asymptotic to the xx-axis, total area = 1, inflection points at x=μ±σx = \mu \pm \sigma. The mean, median, and mode all equal μ\mu.

E(X)=μE(X) = \mu and Var(X)=σ2\mathrm{Var}(X) = \sigma^2.

For any normal variable, P(X=a)=0P(X = a) = 0 for any specific value aa (continuous distribution).

The Empirical Rule (68-95-99.7)

P(\mu - \sigma \lt X \lt \mu + \sigma) \approx 68.27\%

P(\mu - 2\sigma \lt X \lt \mu + 2\sigma) \approx 95.45\%

P(\mu - 3\sigma \lt X \lt \mu + 3\sigma) \approx 99.73\%

Standard Normal Distribution

The standard normal is ZN(0,1)Z \sim N(0, 1). Any normal variable standardises via:

Z=XμσZ = \frac{X - \mu}{\sigma}

The CDF is Φ(z)=P(Zz)\Phi(z) = P(Z \le z). Key properties:

Φ(z)=1Φ(z),P(Z>z)=1Φ(z),P(z<Z<z)=2Φ(z)1\Phi(-z) = 1 - \Phi(z), \quad P(Z \gt z) = 1 - \Phi(z), \quad P(-z \lt Z \lt z) = 2\Phi(z) - 1

Probability Calculations

For XN(μ,σ2)X \sim N(\mu, \sigma^2), to find P(a<X<b)P(a \lt X \lt b), convert to zz-scores:

P(a<X<b)=Φ ⁣(bμσ)Φ ⁣(aμσ)P(a \lt X \lt b) = \Phi\!\left(\frac{b - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - \mu}{\sigma}\right)

On the GDC these are computed directly without manual standardisation.

Inverse Normal

Given probability pp, the inverse normal finds xx such that P(Xx)=pP(X \le x) = p. For the standard normal, z=Φ1(p)z = \Phi^{-1}(p). For a general normal: x=μ+zσx = \mu + z\sigma.

Finding Unknown Parameters

When μ\mu or σ\sigma is unknown, use standardisation with a known probability to set up simultaneous equations. Each known probability gives one equation in two unknowns; two probabilities are needed.

Example

Bags of flour: XN(1000,225)X \sim N(1000, 225) (mean 1000 g, σ=15\sigma = 15 g).

P(985<X<1020)=P(1<Z<1.333)=Φ(1.333)Φ(1)0.90880.1587=0.7501P(985 \lt X \lt 1020) = P(-1 \lt Z \lt 1.333) = \Phi(1.333) - \Phi(-1) \approx 0.9088 - 0.1587 = 0.7501

P(X<970)=P(Z<2)=0.0228P(X \lt 970) = P(Z \lt -2) = 0.0228, so about 2.28% are rejected.

For the mass exceeded by only 5%: P(Xx)=0.95P(X \le x) = 0.95, x=1000+1.645(15)=1024.67x = 1000 + 1.645(15) = 1024.67 g.
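These normal probabilities can be reproduced without tables using `math.erf`. A sketch under that assumption (`phi` is an illustrative helper, not a library function):

```python
# Normal probabilities for the flour-bag example, X ~ N(1000, 15^2).
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 1000, 15

p_mid = phi((1020 - mu) / sigma) - phi((985 - mu) / sigma)   # P(985 < X < 1020)
p_reject = phi((970 - mu) / sigma)                           # P(X < 970)
cutoff = mu + 1.645 * sigma                                  # mass exceeded by only 5%

print(round(p_mid, 4), round(p_reject, 4))  # 0.7501 0.0228
```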

Example: Unknown parameters

Test scores are normal. 15% score above 80, 10% score below 45. Find μ\mu and σ\sigma.

80μσ=1.036\dfrac{80 - \mu}{\sigma} = 1.036 and 45μσ=1.282\dfrac{45 - \mu}{\sigma} = -1.282.

Subtracting: 35=2.318σ35 = 2.318\sigma, so σ15.1\sigma \approx 15.1 and μ=801.036(15.1)64.4\mu = 80 - 1.036(15.1) \approx 64.4.

Example: Normal approximation to Binomial

XB(80,0.4)X \sim B(80, 0.4). Find P(X30)P(X \le 30) using a normal approximation.

μ=80(0.4)=32\mu = 80(0.4) = 32, σ2=80(0.4)(0.6)=19.2\sigma^2 = 80(0.4)(0.6) = 19.2, σ=4.382\sigma = 4.382.

With continuity correction: P(X30)P ⁣(Z30.5324.382)=P(Z0.342)P(X \le 30) \approx P\!\left(Z \le \dfrac{30.5 - 32}{4.382}\right) = P(Z \le -0.342).

0.3665\approx 0.3665

Exact binomial: P(X30)0.3642P(X \le 30) \approx 0.3642. The approximation is very close.

Worked Example: Normal Distribution with Unknown Parameters

Heights of a population are normally distributed. The 90th percentile is 182cm182\,\mathrm{cm} and the 30th percentile is 164cm164\,\mathrm{cm}. Find the mean and standard deviation.

P(X182)=0.90    182μσ=1.282P(X \le 182) = 0.90 \implies \frac{182 - \mu}{\sigma} = 1.282

P(X164)=0.30    164μσ=0.524P(X \le 164) = 0.30 \implies \frac{164 - \mu}{\sigma} = -0.524

Subtracting the second equation from the first:

18σ=1.806    σ=181.806=9.97cm\frac{18}{\sigma} = 1.806 \implies \sigma = \frac{18}{1.806} = 9.97\,\mathrm{cm}

From the first equation: μ=1821.282(9.97)=18212.78=169.2cm\mu = 182 - 1.282(9.97) = 182 - 12.78 = 169.2\,\mathrm{cm}.

So μ169cm\mu \approx 169\,\mathrm{cm} and σ10cm\sigma \approx 10\,\mathrm{cm}.
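The simultaneous-equation step above reduces to two lines of arithmetic, which a short Python check confirms:

```python
# Solve the simultaneous equations from the height example:
# (182 - mu)/sigma = 1.282 and (164 - mu)/sigma = -0.524.
z1, z2 = 1.282, -0.524
x1, x2 = 182, 164

sigma = (x1 - x2) / (z1 - z2)   # subtract the equations to eliminate mu
mu = x1 - z1 * sigma            # back-substitute into the first equation

print(round(sigma, 2), round(mu, 1))  # 9.97 169.2
```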


Continuous Uniform Distribution (AHL)

Definition

XU(a,b)X \sim U(a, b) has PDF:

f(x)=1ba,axbf(x) = \frac{1}{b - a}, \quad a \le x \le b

and f(x)=0f(x) = 0 otherwise. The PDF is constant over [a,b][a, b], meaning all values in the interval are equally likely.

Mean and Variance

E(X)=a+b2,Var(X)=(ba)212,σ=ba23E(X) = \frac{a + b}{2}, \quad \mathrm{Var}(X) = \frac{(b - a)^2}{12}, \quad \sigma = \frac{b - a}{2\sqrt{3}}
Derivation

E(X)=abxbadx=b2a22(ba)=a+b2E(X) = \displaystyle\int_a^b \frac{x}{b-a}\,dx = \frac{b^2-a^2}{2(b-a)} = \frac{a+b}{2}

E(X2)=abx2badx=b3a33(ba)=a2+ab+b23E(X^2) = \displaystyle\int_a^b \frac{x^2}{b-a}\,dx = \frac{b^3-a^3}{3(b-a)} = \frac{a^2+ab+b^2}{3}

Var(X)=a2+ab+b23(a+b)24=4(a2+ab+b2)3(a2+2ab+b2)12=(ba)212\mathrm{Var}(X) = \dfrac{a^2+ab+b^2}{3} - \dfrac{(a+b)^2}{4} = \dfrac{4(a^2+ab+b^2) - 3(a^2+2ab+b^2)}{12} = \dfrac{(b-a)^2}{12}

CDF

F(x) = \begin{cases} 0 & x \lt a \\ \dfrac{x - a}{b - a} & a \le x \le b \\ 1 & x \gt b \end{cases}

For any [c,d][a,b][c, d] \subseteq [a, b]: P(cXd)=dcbaP(c \le X \le d) = \dfrac{d - c}{b - a}.

Example

A bus arrives every 15 minutes. XU(0,15)X \sim U(0, 15) is the waiting time.

P(X>10)=5/15=1/3P(X \gt 10) = 5/15 = 1/3

E(X)=7.5E(X) = 7.5 minutes, σ=1523=5324.33\sigma = \dfrac{15}{2\sqrt{3}} = \dfrac{5\sqrt{3}}{2} \approx 4.33 minutes.

Given 5 minutes already waited, the remaining wait is U(0,10)U(0, 10): P(wait8)=2/10=1/5P(\mathrm{wait} \ge 8) = 2/10 = 1/5.


Geometric Distribution (AHL)

Definition

XGeo(p)X \sim \mathrm{Geo}(p) models the number of trials needed for the first success in independent Bernoulli trials with success probability pp.

Probability Mass Function

P(X=x)=(1p)x1p,x=1,2,3,P(X = x) = (1-p)^{x-1} p, \quad x = 1, 2, 3, \ldots

The first x1x-1 trials must be failures, and trial xx must succeed. This is the probability of exactly x1x-1 consecutive failures followed by one success.

Mean and Variance

E(X)=1p,Var(X)=1pp2E(X) = \frac{1}{p}, \quad \mathrm{Var}(X) = \frac{1-p}{p^2}
Derivation of E(X)=1/pE(X) = 1/p and Var(X)=(1p)/p2\mathrm{Var}(X) = (1-p)/p^2

E(X)=px=1x(1p)x1E(X) = p\displaystyle\sum_{x=1}^{\infty} x(1-p)^{x-1}

Using x=1xrx1=1(1r)2\displaystyle\sum_{x=1}^{\infty} xr^{x-1} = \frac{1}{(1-r)^2} for r<1|r| \lt 1, with r=1pr = 1-p:

E(X)=p1p2=1pE(X) = p \cdot \dfrac{1}{p^2} = \dfrac{1}{p}

For variance: E(X2)=E[X(X1)]+E(X)=2(1p)p2+1p=2pp2E(X^2) = E[X(X-1)] + E(X) = \dfrac{2(1-p)}{p^2} + \dfrac{1}{p} = \dfrac{2-p}{p^2},

so Var(X)=2pp21p2=1pp2\mathrm{Var}(X) = \dfrac{2-p}{p^2} - \dfrac{1}{p^2} = \dfrac{1-p}{p^2}.

Useful shortcut

P(X>n)=(1p)nP(X \gt n) = (1-p)^n

The first nn trials must all be failures. Similarly P(Xn)=(1p)n1P(X \ge n) = (1-p)^{n-1}.

Example

A basketball player has free-throw success rate 72%. XGeo(0.72)X \sim \mathrm{Geo}(0.72).

P(X=3)=(0.28)2(0.72)=0.0784×0.720.05645P(X = 3) = (0.28)^2(0.72) = 0.0784 \times 0.72 \approx 0.05645

P(X>5)=(0.28)50.00172P(X \gt 5) = (0.28)^5 \approx 0.00172

E(X)=1/0.721.389E(X) = 1/0.72 \approx 1.389 attempts.

Worked Example: Geometric Distribution

A die is rolled repeatedly until a 6 appears.

XGeo(1/6)X \sim \mathrm{Geo}(1/6).

(a) Find the probability that the first 6 appears on the 4th roll.

P(X=4)=(56)3×16=12512960.0965P(X = 4) = \left(\frac{5}{6}\right)^3 \times \frac{1}{6} = \frac{125}{1296} \approx 0.0965

(b) Find the probability that at least 10 rolls are needed.

P(X \ge 10) = (1 - p)^{10-1} = \left(\frac{5}{6}\right)^9 \approx 0.1938

(c) Find the expected number of rolls.

E(X)=1p=11/6=6E(X) = \frac{1}{p} = \frac{1}{1/6} = 6

On average, 6 rolls are needed to get the first 6.
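A minimal Python sketch confirming parts (a)–(c) (the helper name is illustrative):

```python
# Geometric probabilities for the die example, X ~ Geo(1/6).
p = 1 / 6

def geo_pmf(p, x):
    """P(X = x): x - 1 failures, then a success on trial x."""
    return (1 - p) ** (x - 1) * p

first_on_4 = geo_pmf(p, 4)     # P(X = 4) = 125/1296
at_least_10 = (1 - p) ** 9     # P(X >= 10) = (1 - p)^(n - 1) shortcut
mean = 1 / p                   # E(X)

print(round(first_on_4, 4), round(at_least_10, 4), round(mean, 1))
# 0.0965 0.1938 6.0
```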


Negative Binomial Distribution (AHL)

Definition

XNB(r,p)X \sim \mathrm{NB}(r, p) models the number of trials needed to obtain exactly rr successes. The geometric distribution is the special case NB(1,p)\mathrm{NB}(1, p).

Probability Mass Function

P(X=x)=(x1r1)pr(1p)xr,x=r,r+1,r+2,P(X = x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}, \quad x = r, r+1, r+2, \ldots

In the first x1x-1 trials there are r1r-1 successes (in (x1r1)\dbinom{x-1}{r-1} ways), and trial xx is the rr-th success.

Mean and Variance

E(X)=rp,Var(X)=r(1p)p2E(X) = \frac{r}{p}, \quad \mathrm{Var}(X) = \frac{r(1-p)}{p^2}

Note the parallel with geometric: multiplying rr by a factor scales both E(X)E(X) and Var(X)\mathrm{Var}(X) by the same factor.

Example

A coin has P(heads)=0.4P(\mathrm{heads}) = 0.4. XNB(3,0.4)X \sim \mathrm{NB}(3, 0.4) counts flips for 3 heads.

P(X=7)=(62)(0.4)3(0.6)4=15×0.064×0.12960.1244P(X = 7) = \dbinom{6}{2}(0.4)^3(0.6)^4 = 15 \times 0.064 \times 0.1296 \approx 0.1244

E(X)=3/0.4=7.5E(X) = 3/0.4 = 7.5, Var(X)=3(0.6)/0.16=11.25\mathrm{Var}(X) = 3(0.6)/0.16 = 11.25, σ=11.253.354\sigma = \sqrt{11.25} \approx 3.354.
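The negative binomial calculation can be checked the same way as the binomial one, since the PMF only needs `math.comb` (the helper function is illustrative):

```python
# Negative binomial check for the coin example, X ~ NB(3, 0.4).
from math import comb

def nb_pmf(r, p, x):
    """P(X = x): the r-th success occurs on trial x."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

r, p = 3, 0.4
p7 = nb_pmf(r, p, 7)           # P(X = 7)
mean = r / p                   # E(X) = r/p
var = r * (1 - p) / p**2       # Var(X) = r(1-p)/p^2

print(round(p7, 4), round(mean, 2), round(var, 2))  # 0.1244 7.5 11.25
```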


Central Limit Theorem (AHL)

Statement

If X1,X2,,XnX_1, X_2, \ldots, X_n are independent and identically distributed with mean μ\mu and variance σ2\sigma^2, then for large nn:

XˉnN ⁣(μ,σ2n)\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)

This holds regardless of the shape of the original distribution. The rule of thumb is n30n \ge 30.

Distribution of the Sum

The sum Sn=X1++XnS_n = X_1 + \cdots + X_n is approximately SnN(nμ,nσ2)S_n \sim N(n\mu, n\sigma^2) for large nn.

Standard Error

SE(Xˉ)=σn\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}

As nn increases, the standard error decreases: larger samples give more precise estimates of the population mean.

Example

Apple masses: mean 150 g, σ=20\sigma = 20 g. Sample of 36. Find P(Xˉ>155)P(\bar{X} \gt 155).

XˉN(150,400/36)\bar{X} \sim N(150, 400/36). P ⁣(Z>520/6)=P(Z>1.5)=0.0668P\!\left(Z \gt \dfrac{5}{20/6}\right) = P(Z \gt 1.5) = 0.0668.

Example: Sum of uniform variables

XU(2,10)X \sim U(2, 10). Sample of 50 observations. Find P(sum>310)P(\mathrm{sum} \gt 310).

μ=6\mu = 6, σ2=64/12=16/3\sigma^2 = 64/12 = 16/3. Sum has mean 300300 and variance 50(16/3)=800/350(16/3) = 800/3.

P ⁣(Z>10800/3)=P(Z>0.612)0.2704P\!\left(Z \gt \dfrac{10}{\sqrt{800/3}}\right) = P(Z \gt 0.612) \approx 0.2704.


Confidence Intervals (AHL)

Concept

A C%C\% confidence interval gives a range of plausible values for an unknown population parameter. If the sampling process were repeated many times, approximately C%C\% of constructed intervals would contain the true parameter. The confidence level does not mean there is a C%C\% probability that the parameter lies in any particular interval.

Confidence Interval for the Mean (σ\sigma known)

xˉ±zα/2σn\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}

where zα/2z_{\alpha/2} satisfies P(Z>zα/2)=α/2P(Z \gt z_{\alpha/2}) = \alpha/2 and α=1C/100\alpha = 1 - C/100.

Confidence Level    z_{\alpha/2}
90%                 1.645
95%                 1.960
99%                 2.576

When σ\sigma is unknown and nn is large (n30n \ge 30), replace σ\sigma with the sample standard deviation ss.

Margin of Error and Sample Size

Margin of error: E=zα/2σnE = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}. To halve EE, quadruple nn.

Required sample size for margin EE: n=(zα/2σE)2n = \left(\dfrac{z_{\alpha/2} \cdot \sigma}{E}\right)^2 (round up to the next integer).

Example

Bottle volumes: N(μ,25)N(\mu, 25), σ=5\sigma = 5 ml. Sample of 25 gives xˉ=498\bar{x} = 498 ml.

95% CI: 498±1.960×5/25=498±1.96498 \pm 1.960 \times 5/\sqrt{25} = 498 \pm 1.96, so (496.04,499.96)(496.04, 499.96) ml.

For margin 1 ml at 95%: n=(1.960×5/1)2=96.04n = (1.960 \times 5/1)^2 = 96.04, round up to 97.
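Both calculations follow directly from the margin-of-error formula. A sketch in Python (variable names are illustrative):

```python
# 95% CI for the bottle example, and the sample size for a 1 ml margin.
from math import sqrt, ceil

z = 1.960
sigma, n, xbar = 5, 25, 498

margin = z * sigma / sqrt(n)           # margin of error
ci = (xbar - margin, xbar + margin)    # confidence interval

E = 1                                  # required margin, in ml
n_required = ceil((z * sigma / E) ** 2)  # round UP to the next integer

print(round(ci[0], 2), round(ci[1], 2), n_required)  # 496.04 499.96 97
```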


Combining Random Variables

Linear Combinations

For any random variables XX, YY and constants aa, bb:

E(aX+bY)=aE(X)+bE(Y)E(aX + bY) = aE(X) + bE(Y)

This is the linearity of expectation and holds always, even without independence.

Variance of Sums

For independent XX and YY:

\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)

\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y), \quad \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)

Note the plus sign even for differences: subtracting a variable still adds variability.

The general formula (not necessarily independent):

Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y)\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\mathrm{Cov}(X, Y)

where Cov(X,Y)=E(XY)E(X)E(Y)=0\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 0 when XX and YY are independent.

Important

Linearity of expectation always holds. The simple variance formula Var(X+Y)=Var(X)+Var(Y)\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) requires independence.

Independent Copies

If X1,,XnX_1, \ldots, X_n are iid with mean μ\mu and variance σ2\sigma^2:

E(X_1 + \cdots + X_n) = n\mu, \quad \mathrm{Var}(X_1 + \cdots + X_n) = n\sigma^2

E(\bar{X}) = \mu, \quad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}

Combining Normal Variables

If XN(μX,σX2)X \sim N(\mu_X, \sigma_X^2) and YN(μY,σY2)Y \sim N(\mu_Y, \sigma_Y^2) are independent, then:

aX+bYN(aμX+bμY,a2σX2+b2σY2)aX + bY \sim N(a\mu_X + b\mu_Y, a^2\sigma_X^2 + b^2\sigma_Y^2)

This is exact (not an approximation) for normal variables, and requires no CLT.

Example

XB(10,0.3)X \sim B(10, 0.3), YB(15,0.4)Y \sim B(15, 0.4), independent.

E(X+Y)=3+6=9E(X + Y) = 3 + 6 = 9

Var(X+Y)=10(0.3)(0.7)+15(0.4)(0.6)=2.1+3.6=5.7\mathrm{Var}(X + Y) = 10(0.3)(0.7) + 15(0.4)(0.6) = 2.1 + 3.6 = 5.7

Var(2X3Y)=4(2.1)+9(3.6)=8.4+32.4=40.8\mathrm{Var}(2X - 3Y) = 4(2.1) + 9(3.6) = 8.4 + 32.4 = 40.8

Example: Normal combinations

Bus ride XN(25,16)X \sim N(25, 16), walk YN(10,9)Y \sim N(10, 9), independent.

X+YN(35,25)X + Y \sim N(35, 25). P(X+Y>40)=P(Z>1)=0.1587P(X + Y \gt 40) = P(Z \gt 1) = 0.1587.

Machine A produces rods: XN(50.0,0.04)X \sim N(50.0, 0.04). Machine B: YN(50.2,0.09)Y \sim N(50.2, 0.09).

XYN(0.2,0.13)X - Y \sim N(-0.2, 0.13). P(XY>0)=P ⁣(Z>0.20.13)=P(Z>0.555)0.2894P(X - Y \gt 0) = P\!\left(Z \gt \dfrac{0.2}{\sqrt{0.13}}\right) = P(Z \gt 0.555) \approx 0.2894.

Worked Example: Combining Random Variables

XB(12,0.3)X \sim B(12, 0.3) and YPo(5)Y \sim \mathrm{Po}(5) are independent. Find:

(a) E(3X2Y)E(3X - 2Y)

E(3X2Y)=3E(X)2E(Y)=3(12×0.3)2(5)=3(3.6)10=10.810=0.8E(3X - 2Y) = 3E(X) - 2E(Y) = 3(12 \times 0.3) - 2(5) = 3(3.6) - 10 = 10.8 - 10 = 0.8

(b) Var(3X2Y)\mathrm{Var}(3X - 2Y)

Var(3X2Y)=9Var(X)+4Var(Y)=9(12×0.3×0.7)+4(5)\mathrm{Var}(3X - 2Y) = 9\mathrm{Var}(X) + 4\mathrm{Var}(Y) = 9(12 \times 0.3 \times 0.7) + 4(5)

=9(2.52)+20=22.68+20=42.68= 9(2.52) + 20 = 22.68 + 20 = 42.68

Note: the variance of the difference uses addition (plus signs for both terms), and the constants are squared.
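The two rules in this worked example translate directly to code (a sketch; variable names are illustrative):

```python
# E and Var of 3X - 2Y for X ~ B(12, 0.3), Y ~ Po(5), assuming independence.
n, p = 12, 0.3
lam = 5

ex, varx = n * p, n * p * (1 - p)   # binomial mean and variance
ey, vary = lam, lam                 # Poisson: mean = variance = lambda

e_comb = 3 * ex - 2 * ey            # linearity of expectation (signs kept)
var_comb = 9 * varx + 4 * vary      # constants squared, both signs become +

print(round(e_comb, 1), round(var_comb, 2))  # 0.8 42.68
```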


IB Exam-Style Questions

Question 1 (Paper 1)

XB(20,0.35)X \sim B(20, 0.35). Find P(5X8)P(5 \le X \le 8).

P(5 \le X \le 8) = P(X \le 8) - P(X \le 4) \approx 0.7625 - 0.1182 = 0.6443

Question 2 (Paper 1)

XPo(4.2)X \sim \mathrm{Po}(4.2). Find P(X3)P(X \ge 3).

P(X3)=1P(X2)=1e4.2(1+4.2+8.82)10.2103=0.7897P(X \ge 3) = 1 - P(X \le 2) = 1 - e^{-4.2}(1 + 4.2 + 8.82) \approx 1 - 0.2103 = 0.7897

Question 3 (Paper 2)

Daily rainfall: XN(2.8,1.44)X \sim N(2.8, 1.44) (mean 2.8 mm, σ=1.2\sigma = 1.2 mm).

P(X>4)=P(Z>1)=0.1587P(X \gt 4) = P(Z \gt 1) = 0.1587

Expected days per year exceeding 4 mm: 365×0.158758365 \times 0.1587 \approx 58 days.

Rainfall exceeded on only 5% of days: x=2.8+1.645(1.2)=4.774x = 2.8 + 1.645(1.2) = 4.774 mm.

Question 4 (Paper 2, AHL)

XU(0,a)X \sim U(0, a) has P(X>3)=0.4P(X \gt 3) = 0.4. Find aa and Var(X)\mathrm{Var}(X).

a3a=0.4    0.6a=3    a=5\dfrac{a-3}{a} = 0.4 \implies 0.6a = 3 \implies a = 5

Var(X)=25/122.083\mathrm{Var}(X) = 25/12 \approx 2.083

Question 5 (Paper 2, AHL)

XGeo(0.15)X \sim \mathrm{Geo}(0.15). Find the smallest nn with P(Xn)0.8P(X \le n) \ge 0.8.

P(Xn)=10.85n0.8    0.85n0.2P(X \le n) = 1 - 0.85^n \ge 0.8 \implies 0.85^n \le 0.2

nln(0.2)/ln(0.85)9.90n \ge \ln(0.2)/\ln(0.85) \approx 9.90, so n=10n = 10.
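The logarithm step can be double-checked by direct search, since P(X \le n) = 1 - 0.85^n is easy to iterate:

```python
# Smallest n with P(X <= n) >= 0.8 for X ~ Geo(0.15), by direct search.
p = 0.15

n = 1
while 1 - (1 - p) ** n < 0.8:   # P(X <= n) = 1 - (1-p)^n
    n += 1

print(n)  # 10
```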

Question 6 (Paper 2, AHL)

Component lengths: N(μ,0.25)N(\mu, 0.25), σ=0.5\sigma = 0.5 mm. Sample of 30 gives xˉ=100.2\bar{x} = 100.2 mm.

90% CI: 100.2±1.645×0.5/30=100.2±0.150100.2 \pm 1.645 \times 0.5/\sqrt{30} = 100.2 \pm 0.150, so (100.05,100.35)(100.05, 100.35) mm.

The claim μ=100\mu = 100 mm is not supported at 90% confidence, since 100 falls below the interval.

Question 7 (Paper 2, AHL)

XNB(4,0.25)X \sim \mathrm{NB}(4, 0.25). Find P(X=10)P(X = 10) and E(X)E(X).

P(X=10)=(93)(0.25)4(0.75)6=84×0.003906×0.17800.0584P(X = 10) = \dbinom{9}{3}(0.25)^4(0.75)^6 = 84 \times 0.003906 \times 0.1780 \approx 0.0584

E(X)=4/0.25=16E(X) = 4/0.25 = 16

Question 8 (Paper 2, AHL)

The masses of male students are N(72,36)N(72, 36) and female students are N(58,25)N(58, 25), independent. Find the probability that a randomly chosen male is heavier than a randomly chosen female.

Let MN(72,36)M \sim N(72, 36) and FN(58,25)F \sim N(58, 25). Then D=MFN(7258,36+25)=N(14,61)D = M - F \sim N(72-58, 36+25) = N(14, 61).

P(D>0)=P ⁣(Z>01461)=P(Z>1.793)=Φ(1.793)0.9636P(D \gt 0) = P\!\left(Z \gt \dfrac{0 - 14}{\sqrt{61}}\right) = P(Z \gt -1.793) = \Phi(1.793) \approx 0.9636


Summary of Distributions

Discrete Distributions

Distribution           Notation          PMF                                  E(X)       Var(X)        Support
Binomial               B(n, p)           \binom{n}{x}p^x(1-p)^{n-x}           np         np(1-p)       0, 1, \ldots, n
Poisson                \mathrm{Po}(\lambda)   \dfrac{e^{-\lambda}\lambda^x}{x!}    \lambda    \lambda       0, 1, 2, \ldots
Geometric (AHL)        \mathrm{Geo}(p)   (1-p)^{x-1}p                         1/p        (1-p)/p^2     1, 2, 3, \ldots
Neg. Binomial (AHL)    \mathrm{NB}(r, p)   \binom{x-1}{r-1}p^r(1-p)^{x-r}       r/p        r(1-p)/p^2    r, r+1, \ldots

Continuous Distributions

Distribution     Notation           PDF                                                      E(X)         Var(X)         Support
Normal           N(\mu, \sigma^2)   \dfrac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}   \mu          \sigma^2       (-\infty, \infty)
Uniform (AHL)    U(a, b)            \dfrac{1}{b-a}                                           (a+b)/2      (b-a)^2/12     [a, b]

Key Relationships

Relationship                                                            Condition
B(n, p) \approx \mathrm{Po}(np)                                         n large, p small, np moderate
B(n, p) \approx N(np, np(1-p))                                          np \ge 5, n(1-p) \ge 5, with continuity correction
\mathrm{Geo}(p) = \mathrm{NB}(1, p)                                     Special case
X + Y \sim \mathrm{Po}(\lambda_1 + \lambda_2)                           Independent Poisson variables
aX + bY \sim N(a\mu_X + b\mu_Y, a^2\sigma_X^2 + b^2\sigma_Y^2)          Independent normal variables
\bar{X}_n \approx N(\mu, \sigma^2/n)                                    CLT, large n
E(aX + bY) = aE(X) + bE(Y)                                              Always
\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)         X, Y independent

Common Pitfalls

  1. Confusing $p$ and $\lambda$: For Poisson, $\lambda$ is a rate, not a probability. Unlike binomial $p$, there is no upper bound of 1 on $\lambda$.

  2. Forgetting conditions: Before applying a distribution, verify all conditions. For binomial: fixed $n$, independence, two outcomes, constant $p$.

  3. Variance of differences: $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ (plus, not minus) for independent variables.

  4. Continuity correction: When approximating a discrete distribution with a continuous one, apply a continuity correction. For example, $P(X \le 5)$ becomes $P(X \lt 5.5)$ under the normal approximation.

  5. Standardisation direction: $\Phi(z)$ goes from $z$-score to probability; $\Phi^{-1}(p)$ goes from probability to $z$-score.

  6. Geometric support: $X \sim \mathrm{Geo}(p)$ counts trials starting from 1 (IB convention).

  7. Poisson additivity: Requires independence. If events are correlated, the sum is not Poisson.

  8. Confidence interval interpretation: A 95% CI does not mean there is a 95% probability that $\mu$ lies in the interval. It means 95% of similarly constructed intervals contain $\mu$.

  9. Squaring constants in variance: $\mathrm{Var}(3X) = 9\mathrm{Var}(X)$, not $3\mathrm{Var}(X)$.

Exam Strategy

Always define your random variable and state the distribution with parameters at the start. For normal problems, sketch the bell curve and shade the relevant area. When combining variables, clearly state whether independence is assumed. For confidence intervals, state the level and interpret in context.


Problem Set

Problem 1

A discrete random variable $X$ has PMF $P(X = x) = \frac{x + 1}{15}$ for $x = 0, 1, 2, 3, 4$. Find $E(X)$, $\mathrm{Var}(X)$, and $P(X \ge 2)$.

Solution

Verify: $\sum_{x=0}^{4}\frac{x+1}{15} = \frac{1+2+3+4+5}{15} = \frac{15}{15} = 1$.

$$E(X) = 0\left(\frac{1}{15}\right) + 1\left(\frac{2}{15}\right) + 2\left(\frac{3}{15}\right) + 3\left(\frac{4}{15}\right) + 4\left(\frac{5}{15}\right)$$

$$= \frac{0 + 2 + 6 + 12 + 20}{15} = \frac{40}{15} = \frac{8}{3} \approx 2.667$$

$$E(X^2) = 0 + 1\left(\frac{2}{15}\right) + 4\left(\frac{3}{15}\right) + 9\left(\frac{4}{15}\right) + 16\left(\frac{5}{15}\right) = \frac{0 + 2 + 12 + 36 + 80}{15} = \frac{130}{15} = \frac{26}{3}$$

$$\mathrm{Var}(X) = \frac{26}{3} - \left(\frac{8}{3}\right)^2 = \frac{26}{3} - \frac{64}{9} = \frac{78 - 64}{9} = \frac{14}{9} \approx 1.556$$

$$P(X \ge 2) = \frac{3}{15} + \frac{4}{15} + \frac{5}{15} = \frac{12}{15} = \frac{4}{5} = 0.8$$
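The tabular calculation is easy to cross-check by machine; a short sketch using the PMF from the problem:

```python
from math import isclose

# PMF p(x) = (x + 1)/15 on x = 0, 1, 2, 3, 4
pmf = {x: (x + 1) / 15 for x in range(5)}

total = sum(pmf.values())                            # must equal 1
mean = sum(x * px for x, px in pmf.items())          # E(X) = 8/3
ex2 = sum(x**2 * px for x, px in pmf.items())        # E(X^2) = 26/3
var = ex2 - mean**2                                  # Var(X) = 14/9
p_ge_2 = sum(px for x, px in pmf.items() if x >= 2)  # = 0.8
```

The same dictionary-of-probabilities pattern works for any finite discrete PMF.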

If you get this wrong, revise: Discrete Random Variables section.

Problem 2

$X \sim B(25, 0.35)$. Find $P(X = 10)$, $P(X \le 5)$, and $P(X \ge 15)$.

Solution

$$P(X = 10) = \binom{25}{10}(0.35)^{10}(0.65)^{15} \approx 0.1409$$

$$P(X \le 5) = \sum_{x=0}^{5}\binom{25}{x}(0.35)^x(0.65)^{25-x} \approx 0.0826$$

$$P(X \ge 15) = 1 - P(X \le 14) \approx 1 - 0.9907 = 0.0093$$
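On paper these values come from a GDC; an equivalent check with exact integer arithmetic (a sketch, not an IB requirement):

```python
from math import comb

def binom_pmf(n, p, k):
    """Exact binomial probability P(X = k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 25, 0.35
p_eq_10 = binom_pmf(n, p, 10)                                # ~ 0.1409
p_le_5 = sum(binom_pmf(n, p, k) for k in range(6))           # ~ 0.0826
p_ge_15 = sum(binom_pmf(n, p, k) for k in range(15, n + 1))  # ~ 0.0093
```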

If you get this wrong, revise: Binomial Distribution section.

Problem 3

A bookshop sells an average of 3.2 rare books per week. $X \sim \mathrm{Po}(3.2)$ is the number sold in a week. Find $P(X = 4)$, $P(X = 0)$, and $P(X \gt 5)$.

Solution

$$P(X = 4) = \frac{e^{-3.2} \cdot 3.2^4}{4!} = \frac{104.858 \times e^{-3.2}}{24} \approx 0.1781$$

$$P(X = 0) = e^{-3.2} \approx 0.0408$$

$$P(X \gt 5) = 1 - P(X \le 5) = 1 - e^{-3.2}\left(1 + 3.2 + \frac{10.24}{2} + \frac{32.768}{6} + \frac{104.858}{24} + \frac{335.544}{120}\right)$$

$$= 1 - e^{-3.2}(1 + 3.2 + 5.12 + 5.461 + 4.369 + 2.796) = 1 - e^{-3.2}(21.946) \approx 1 - 0.8946 = 0.1054$$
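The same Poisson arithmetic, sketched in Python as a cross-check of the hand calculation:

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    """Poisson probability P(X = k) for rate lam."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3.2
p_eq_4 = poisson_pmf(lam, 4)                             # ~ 0.1781
p_eq_0 = poisson_pmf(lam, 0)                             # ~ 0.0408
p_gt_5 = 1 - sum(poisson_pmf(lam, k) for k in range(6))  # ~ 0.1054
```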

If you get this wrong, revise: Poisson Distribution section.

Problem 4

Exam scores follow $N(65, 64)$ (mean 65, variance 64). Find the probability that a randomly chosen student scores above 75, and the score that is exceeded by only 10% of students.

Solution

$\mu = 65$, $\sigma = \sqrt{64} = 8$.

$$P(X \gt 75) = P\left(Z \gt \frac{75 - 65}{8}\right) = P(Z \gt 1.25) = 1 - \Phi(1.25) \approx 1 - 0.8944 = 0.1056$$

For the 90th percentile (exceeded by only 10%):

$$P(X \le x) = 0.90 \implies \frac{x - 65}{8} = 1.282 \implies x = 65 + 1.282(8) = 75.26$$

A score of approximately 75.3 is exceeded by only 10% of students.
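Both lookups (forward $\Phi$ and inverse $\Phi^{-1}$) are available in Python's standard library via `statistics.NormalDist`, mirroring the table/GDC workflow; a sketch:

```python
from statistics import NormalDist

scores = NormalDist(mu=65, sigma=8)   # N(65, 64): sigma is the SD, not variance
p_above_75 = 1 - scores.cdf(75)       # ~ 0.1056
x90 = scores.inv_cdf(0.90)            # 90th percentile, ~ 75.25
```

Note that `NormalDist` takes the standard deviation, a common source of the variance-vs-SD slip.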

If you get this wrong, revise: Normal Distribution section.

Problem 5

The waiting time for a train is uniformly distributed between 0 and 12 minutes. Find the probability that the waiting time is (a) less than 5 minutes, (b) between 7 and 10 minutes, (c) more than 8 minutes given that it has already been 3 minutes.

Solution

$X \sim U(0, 12)$.

(a) $P(X \lt 5) = 5/12 \approx 0.4167$

(b) $P(7 \lt X \lt 10) = (10 - 7)/12 = 3/12 = 0.25$

(c) Given that 3 minutes have already passed, the remaining wait $X - 3$ is distributed $U(0, 9)$, because a uniform variable conditioned on lying in a sub-interval of its support is uniform on that sub-interval. (The uniform distribution is not memoryless; among continuous distributions only the exponential has that property.)

$$P(\mathrm{remaining} \gt 5) = 4/9 \approx 0.4444$$

Alternatively: $P(X \gt 8 \mid X \gt 3) = P(X \gt 8)/P(X \gt 3) = (4/12)/(9/12) = 4/9$.
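The interval-clipping view of uniform probabilities can be sketched in a few lines (function name is illustrative):

```python
from math import isclose

def p_uniform(lo, hi, a=0.0, b=12.0):
    """P(lo < X < hi) for X ~ U(a, b), clipping the interval to the support."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

p_a = p_uniform(0, 5)                       # 5/12
p_b = p_uniform(7, 10)                      # 1/4
p_c = p_uniform(8, 12) / p_uniform(3, 12)   # conditional probability, = 4/9
```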

If you get this wrong, revise: Continuous Uniform Distribution section.

Problem 6

$X \sim \mathrm{Geo}(0.25)$. Find the smallest $n$ such that $P(X \le n) \ge 0.95$.

Solution

$$P(X \le n) = 1 - (1 - p)^n = 1 - 0.75^n \ge 0.95$$

$$0.75^n \le 0.05$$

$$n \ln(0.75) \le \ln(0.05)$$

Dividing by $\ln(0.75) \lt 0$ reverses the inequality:

$$n \ge \frac{\ln(0.05)}{\ln(0.75)} = \frac{-2.996}{-0.2877} = 10.41$$

So $n = 11$ trials are needed.
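The inequality can also be settled by direct search, a useful log-free check; a sketch:

```python
# Smallest n with P(X <= n) >= 0.95 for X ~ Geo(0.25),
# where X counts trials from 1 (IB convention)
p, target = 0.25, 0.95
n = 1
while 1 - (1 - p) ** n < target:
    n += 1
# the loop exits at the first n meeting the target
```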

If you get this wrong, revise: Geometric Distribution section.

Problem 7

$X \sim \mathrm{NB}(3, 0.2)$. Find $P(X = 8)$ and $\mathrm{Var}(X)$.

Solution

$$P(X = 8) = \binom{7}{2}(0.2)^3(0.8)^5 = 21 \times 0.008 \times 0.32768 = 0.05505$$

$$E(X) = \frac{3}{0.2} = 15$$

$$\mathrm{Var}(X) = \frac{3(0.8)}{0.04} = \frac{2.4}{0.04} = 60$$

$$\sigma = \sqrt{60} \approx 7.75$$
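The negative binomial pmf used above, sketched as code under the trial-counting convention ($X$ is the trial on which the $r$-th success occurs):

```python
from math import comb

def nb_pmf(r, p, x):
    """P(X = x) for X ~ NB(r, p): r-th success on trial x (x >= r)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

r, p = 3, 0.2
p_eq_8 = nb_pmf(r, p, 8)      # ~ 0.0551
mean = r / p                  # 15
var = r * (1 - p) / p**2      # 60
```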

If you get this wrong, revise: Negative Binomial Distribution section.

Problem 8

The masses of packets of sugar are normally distributed with mean $500\,\mathrm{g}$ and standard deviation $5\,\mathrm{g}$. A sample of 36 packets is selected. Find the probability that the sample mean is between $498\,\mathrm{g}$ and $503\,\mathrm{g}$.

Solution

Since the population is itself normal, the sample mean is exactly normal (the CLT is not needed here; it would only be invoked to justify approximate normality for a non-normal population):

$$\bar{X} \sim N\left(500, \frac{25}{36}\right)$$

$$\sigma_{\bar{X}} = \frac{5}{6} \approx 0.833$$

$$P(498 \lt \bar{X} \lt 503) = P\left(\frac{498 - 500}{5/6} \lt Z \lt \frac{503 - 500}{5/6}\right) = P(-2.4 \lt Z \lt 3.6)$$

$$= \Phi(3.6) - \Phi(-2.4) = 0.9998 - 0.0082 = 0.9916$$
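The sampling-distribution computation, sketched with `statistics.NormalDist`:

```python
from statistics import NormalDist

# Sample mean of 36 packets: standard error 5/sqrt(36) = 5/6
xbar = NormalDist(mu=500, sigma=5 / 6)
p_between = xbar.cdf(503) - xbar.cdf(498)   # ~ 0.9916
```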

If you get this wrong, revise: Central Limit Theorem section.

Problem 9

A 95% confidence interval for the mean diameter of bolts is $(10.02\,\mathrm{mm}, 10.18\,\mathrm{mm})$ based on a sample of size 50. The population standard deviation is known to be $\sigma = 0.4\,\mathrm{mm}$. Find the sample mean and verify the confidence interval.

Solution

The sample mean is the midpoint of the interval:

$$\bar{x} = \frac{10.02 + 10.18}{2} = 10.10\,\mathrm{mm}$$

The margin of error is half the width:

$$E = \frac{10.18 - 10.02}{2} = 0.08\,\mathrm{mm}$$

Verify: $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} = 1.960 \times \frac{0.4}{\sqrt{50}} = 1.960 \times 0.0566 = 0.1109$

The calculated margin of error ($0.1109$) exceeds the stated margin ($0.08$). This suggests the confidence interval was constructed with a different confidence level or the stated $\sigma$ does not match the data. Solving for the confidence level that gives $E = 0.08$:

$$z_{\alpha/2} = \frac{0.08}{0.4/\sqrt{50}} = \frac{0.08}{0.05657} = 1.414$$

This corresponds to approximately 84% confidence, not 95%.
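The back-solving step generalises; a sketch that recovers the implied confidence level from the stated interval:

```python
from statistics import NormalDist

lo, hi, n, sigma = 10.02, 10.18, 50, 0.4
xbar = (lo + hi) / 2                         # midpoint = sample mean
margin = (hi - lo) / 2                       # stated margin of error
se = sigma / n**0.5                          # standard error of the mean
z_implied = margin / se                      # ~ 1.414
level = 2 * NormalDist().cdf(z_implied) - 1  # ~ 0.84, not 0.95
```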

If you get this wrong, revise: Confidence Intervals section.

Problem 10

$X \sim B(15, 0.4)$ and $Y \sim B(20, 0.3)$ are independent. Find $E(X + Y)$, $\mathrm{Var}(X - Y)$, and $P(X + Y = 10)$.

Solution

$$E(X) = 15(0.4) = 6, \quad E(Y) = 20(0.3) = 6$$

$$E(X + Y) = 6 + 6 = 12$$

$$\mathrm{Var}(X) = 15(0.4)(0.6) = 3.6, \quad \mathrm{Var}(Y) = 20(0.3)(0.7) = 4.2$$

$$\mathrm{Var}(X - Y) = 3.6 + 4.2 = 7.8$$

For $P(X + Y = 10)$, sum over the pairs $(x, y)$ with $x + y = 10$, $0 \le x \le 15$, $0 \le y \le 20$, which means summing over $x = 0$ to $x = 10$:

$$P(X + Y = 10) = \sum_{x=0}^{10} P(X = x)P(Y = 10 - x)$$

This is computationally intensive without a GDC, but the key principle is clear: although $X$ and $Y$ are independent binomial variables, their success probabilities differ ($p = 0.4$ versus $p = 0.3$), so $X + Y$ is not binomial and its distribution must be found by convolution.
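The convolution sum above is mechanical for a computer; a sketch (the numeric answer is left to the code rather than quoted):

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p), zero outside the support."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def conv_pmf(s):
    """P(X + Y = s) by convolution of B(15, 0.4) and B(20, 0.3)."""
    return sum(binom_pmf(15, 0.4, x) * binom_pmf(20, 0.3, s - x)
               for x in range(0, min(s, 15) + 1))

p_sum_10 = conv_pmf(10)
total = sum(conv_pmf(s) for s in range(36))   # sanity check: pmf sums to 1
```

Summing `conv_pmf` over the full support $0, \ldots, 35$ returning 1 confirms the convolution is set up correctly.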

If you get this wrong, revise: Combining Random Variables section.

Problem 11

The lifetimes of batteries are normally distributed with mean $500\,\mathrm{hours}$ and standard deviation $50\,\mathrm{hours}$. Find the probability that a randomly selected battery lasts more than $550\,\mathrm{hours}$. If four batteries are selected independently, find the probability that at least three last more than $550\,\mathrm{hours}$.

Solution

$$P(X \gt 550) = P\left(Z \gt \frac{550 - 500}{50}\right) = P(Z \gt 1) = 1 - 0.8413 = 0.1587$$

Let $Y$ be the number (out of 4) lasting more than 550 hours. $Y \sim B(4, 0.1587)$.

$$P(Y \ge 3) = P(Y = 3) + P(Y = 4)$$

$$= \binom{4}{3}(0.1587)^3(0.8413) + (0.1587)^4$$

$$= 4(0.003997)(0.8413) + 0.000635 = 0.01345 + 0.000635 = 0.01409$$

Approximately 1.4% chance that at least three out of four batteries last more than 550 hours.

If you get this wrong, revise: Normal Distribution and Binomial Distribution sections.

Problem 12

Use the Poisson approximation to the binomial to estimate the probability of getting 3 or more sixes when rolling a fair die 60 times.

Solution

$X \sim B(60, 1/6)$. $\lambda = np = 60/6 = 10$.

Approximate: $X \approx \mathrm{Po}(10)$.

Check conditions: $n = 60 \ge 50$, but $p = 1/6 \approx 0.167 \gt 0.1$, so the Poisson approximation is less accurate here, though still usable as an estimate.

$$P(X \ge 3) = 1 - P(X \le 2) = 1 - e^{-10}\left(1 + 10 + \frac{100}{2}\right)$$

$$= 1 - 61e^{-10} = 1 - 61(0.0000454) = 1 - 0.00277 = 0.9972$$

Exact binomial: $P(X \le 2) = \binom{60}{0}(5/6)^{60} + \binom{60}{1}(1/6)(5/6)^{59} + \binom{60}{2}(1/6)^2(5/6)^{58}$

This gives $P(X \le 2) \approx 0.00149$, so $P(X \ge 3) \approx 0.9985$. The Poisson estimate is within about $0.001$ in absolute terms, although its relative error on the small tail $P(X \le 2)$ is sizeable, precisely because $p \gt 0.1$.
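Comparing the approximation with the exact binomial tail directly; a sketch:

```python
from math import comb, exp, factorial

n, p = 60, 1 / 6
lam = n * p   # np = 10

# P(X <= 2) under the exact binomial and under the Po(10) approximation
exact_le_2 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(3))
approx_le_2 = sum(exp(-lam) * lam**k / factorial(k) for k in range(3))

exact_ge_3 = 1 - exact_le_2     # ~ 0.9985
approx_ge_3 = 1 - approx_le_2   # ~ 0.9972
```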

If you get this wrong, revise: Poisson as a Limit of the Binomial section.



tip

Diagnostic Test: Ready to test your understanding of Probability Distributions? The diagnostic test contains the hardest questions within the IB specification for this topic, each with a full worked solution.

Unit tests probe edge cases and common misconceptions. Integration tests combine Probability Distributions with other IB mathematics topics to test synthesis under exam conditions.

See Diagnostic Guide for instructions on self-marking and building a personal test matrix.