Probability Fundamentals
Sample Space and Events
The sample space $\Omega$ (or $S$) is the set of all possible outcomes of an experiment. An
event $A$ is a subset of the sample space.
Probability Axioms (Kolmogorov)
$P(A) \ge 0$ for every event $A$.
$P(\Omega) = 1$.
If $A_1, A_2, \ldots$ are mutually exclusive events, then:
$$P\!\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
Complementary Events
$$P(A') = 1 - P(A)$$
where $A'$ (or $\bar{A}$) is the complement of $A$.
Venn Diagrams
Venn diagrams visually represent events and their relationships:
Union: $A \cup B$ (elements in $A$ or $B$ or both)
Intersection: $A \cap B$ (elements in both $A$ and $B$)
Disjoint (mutually exclusive): $A \cap B = \emptyset$
Addition Rule
For any two events $A$ and $B$:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
For mutually exclusive events ($A \cap B = \emptyset$):
$$P(A \cup B) = P(A) + P(B)$$
In a class of 40 students, 25 play football, 18 play basketball, and 8 play both. Find the
probability that a randomly selected student plays at least one sport.
$$P(F \cup B) = P(F) + P(B) - P(F \cap B) = \frac{25}{40} + \frac{18}{40} - \frac{8}{40} = \frac{35}{40} = \frac{7}{8}$$
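The same result can be reached by counting students directly. The sketch below is a minimal check using Python's standard `fractions` module; the variable names are illustrative and not part of the original notes.

```python
from fractions import Fraction

total = 40
football, basketball, both = 25, 18, 8

# Inclusion-exclusion: students who play both sports are counted twice
# in football + basketball, so they are subtracted once.
at_least_one = football + basketball - both
p_at_least_one = Fraction(at_least_one, total)

print(p_at_least_one)  # 7/8
```

Working with `Fraction` keeps the answer exact, which matches the way such answers are usually left in non-calculator questions.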
Multiplication Rule
Independent Events
Two events $A$ and $B$ are independent if and only if:
$$P(A \cap B) = P(A) \cdot P(B)$$
This identity is also the multiplication rule for independent events: the probability that both occur is the product of the individual probabilities.
Dependent Events
For dependent events, the conditional probability is needed:
$$P(A \cap B) = P(A) \cdot P(B|A) = P(B) \cdot P(A|B)$$
Independence is NOT the same as mutual exclusivity. In fact, if two events are mutually
exclusive and both have non-zero probability, they CANNOT be independent (since
$P(A \cap B) = 0 \neq P(A) \cdot P(B)$).
Conditional Probability
Definition
The probability of $A$ given that $B$ has occurred, defined when $P(B) > 0$:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Events $A$ and $B$ are independent if and only if:
$$P(A|B) = P(A)$$
A bag contains 5 red and 3 blue marbles. Two marbles are drawn without replacement. Find the
probability that both are red.
$$P(\text{both red}) = P(\text{first red}) \cdot P(\text{second red} \mid \text{first red}) = \frac{5}{8} \cdot \frac{4}{7} = \frac{20}{56} = \frac{5}{14}$$
Tree Diagrams
Tree diagrams are useful for multi-stage experiments. Multiply along branches, add between branches.
A box contains 4 defective and 6 non-defective items. Two items are drawn without replacement. Find
the probability that exactly one is defective.
Paths giving exactly one defective:
First defective, second non-defective: $\dfrac{4}{10} \times \dfrac{6}{9} = \dfrac{24}{90}$
First non-defective, second defective: $\dfrac{6}{10} \times \dfrac{4}{9} = \dfrac{24}{90}$
$$P(\text{exactly one defective}) = \frac{24}{90} + \frac{24}{90} = \frac{48}{90} = \frac{8}{15}$$
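As a cross-check on the tree diagram, the following sketch enumerates every equally likely ordered pair of draws rather than multiplying along branches. The item labels `"D"` and `"N"` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

items = ["D"] * 4 + ["N"] * 6  # 4 defective, 6 non-defective

# Drawing two items without replacement: every ordered pair of
# distinct positions is equally likely (10 * 9 = 90 pairs).
pairs = list(permutations(range(len(items)), 2))

# Exactly one defective means the two drawn items differ in type.
favourable = sum(1 for i, j in pairs if items[i] != items[j])

p_exactly_one = Fraction(favourable, len(pairs))
print(favourable, len(pairs), p_exactly_one)  # 48 90 8/15
```

Brute-force enumeration like this is only practical for small boxes, but it is a useful way to confirm that no branch of the tree has been missed.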
Bayes' Theorem
Theorem
For events $A$ and $B$ with $P(B) \neq 0$:
$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$
If $A_1, A_2, \ldots, A_n$ form a partition of $\Omega$:
$$P(A_k|B) = \frac{P(B|A_k) \cdot P(A_k)}{\displaystyle\sum_{i=1}^{n} P(B|A_i) \cdot P(A_i)}$$
The denominator $P(B) = \displaystyle\sum_{i=1}^{n} P(B|A_i) \cdot P(A_i)$ is the law of total
probability.
A factory has three machines producing items. Machine A produces 50% of items with 2% defect rate.
Machine B produces 30% with 3% defect rate. Machine C produces 20% with 1% defect rate. An item
is found to be defective. What is the probability it came from machine B?
$$P(\text{defective}) = 0.5 \times 0.02 + 0.3 \times 0.03 + 0.2 \times 0.01 = 0.01 + 0.009 + 0.002 = 0.021$$
$$P(B|\text{defective}) = \frac{0.3 \times 0.03}{0.021} = \frac{0.009}{0.021} = \frac{3}{7}$$
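The partition form of Bayes' theorem translates directly into a few lines of code. The sketch below is one possible way to organise the calculation; the helper name `posterior` and the dictionary layout are hypothetical choices, not something the notes prescribe.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return P(A_k | B) for each part A_k of a partition.

    priors[k] is P(A_k); likelihoods[k] is P(B | A_k).
    """
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())  # law of total probability: P(B)
    return {k: joint[k] / total for k in joint}

machine_share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
defect_rate = {"A": Fraction(2, 100), "B": Fraction(3, 100), "C": Fraction(1, 100)}

post = posterior(machine_share, defect_rate)
print(post["B"])  # 3/7
```

The three posterior probabilities returned by the helper always sum to 1, which is a quick sanity check on any Bayes calculation.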
Medical Testing Example
A disease affects 1% of a population. A test has 99% sensitivity
($P(\text{positive}|\text{disease}) = 0.99$) and 95% specificity
($P(\text{negative}|\text{no disease}) = 0.95$). What is
$P(\text{disease}|\text{positive})$?
$$P(\text{positive}) = 0.01 \times 0.99 + 0.99 \times 0.05 = 0.0099 + 0.0495 = 0.0594$$
$$P(\text{disease}|\text{positive}) = \frac{0.01 \times 0.99}{0.0594} = \frac{0.0099}{0.0594} \approx 0.167$$
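This result (approximately 16.7%) is counterintuitively low. Because only 1% of the population has
the disease, false positives from the healthy 99% (probability 0.0495) outnumber true positives
(probability 0.0099) five to one. Always work through Bayes' theorem carefully rather than relying
on intuition for conditional probability questions.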
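To see how strongly the answer depends on how common the disease is, the calculation can be wrapped in a small function. This is a sketch only: the function name `positive_predictive_value` is an assumption, and the 10% prevalence in the second call is a hypothetical figure chosen for comparison, not data from the example.

```python
from fractions import Fraction

def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive) via Bayes' theorem."""
    true_pos = prevalence * sensitivity               # P(disease and positive)
    false_pos = (1 - prevalence) * (1 - specificity)  # P(no disease and positive)
    return true_pos / (true_pos + false_pos)

sens, spec = Fraction(99, 100), Fraction(95, 100)

ppv_rare = positive_predictive_value(Fraction(1, 100), sens, spec)
ppv_common = positive_predictive_value(Fraction(10, 100), sens, spec)  # hypothetical prevalence

print(ppv_rare)    # 1/6, about 0.167
print(ppv_common)  # 11/16, about 0.688
```

With the same test, raising the prevalence from 1% to 10% lifts the probability from about 17% to about 69%, which shows that the base rate, not just the test accuracy, drives the result.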
Discrete Random Variables
Definition
A discrete random variable $X$ takes a countable set of values $x_1, x_2, \ldots$ with
probabilities $P(X = x_i) = p_i$.
Probability Distribution
A probability distribution satisfies:
$p_i \ge 0$ for all $i$
$\displaystyle\sum_{\text{all } i} p_i = 1$
Expectation (Mean)
$$E(X) = \mu = \sum x_i \cdot p_i$$
Variance
$$\mathrm{Var}(X) = \sigma^2 = E(X^2) - [E(X)]^2 = \sum x_i^2 p_i - \mu^2$$
Standard Deviation
$$\sigma = \sqrt{\mathrm{Var}(X)}$$
Properties of Expectation and Variance
For any constants $a$ and $b$:
$$E(aX + b) = aE(X) + b$$
$$\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$$
For any random variables $X$ and $Y$ (independent or not):
$$E(X + Y) = E(X) + E(Y)$$
For independent random variables $X$ and $Y$:
$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
A random variable X X X has the following probability distribution:

| $x$ | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| $P(X = x)$ | 0.1 | 0.4 | 0.3 | 0.2 |

$$E(X) = 0(0.1) + 1(0.4) + 2(0.3) + 3(0.2) = 0 + 0.4 + 0.6 + 0.6 = 1.6$$
$$E(X^2) = 0(0.1) + 1(0.4) + 4(0.3) + 9(0.2) = 0 + 0.4 + 1.2 + 1.8 = 3.4$$
$$\mathrm{Var}(X) = 3.4 - 1.6^2 = 3.4 - 2.56 = 0.84$$
$$\sigma = \sqrt{0.84} \approx 0.917$$
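The mean, variance, and standard deviation of any finite discrete distribution follow the same three steps, so they are easy to script. The sketch below stores the table above as a dictionary (an illustrative choice) and uses exact fractions to avoid rounding.

```python
from fractions import Fraction
from math import sqrt

# value -> probability, taken from the table above
dist = {0: Fraction(1, 10), 1: Fraction(4, 10), 2: Fraction(3, 10), 3: Fraction(2, 10)}

total_probability = sum(dist.values())            # must equal 1
mean = sum(x * p for x, p in dist.items())        # E(X)
mean_sq = sum(x**2 * p for x, p in dist.items())  # E(X^2)
variance = mean_sq - mean**2                      # E(X^2) - [E(X)]^2
std_dev = sqrt(variance)

print(float(mean), float(variance), round(std_dev, 3))  # 1.6 0.84 0.917
```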
The Binomial Distribution
Conditions
A random variable $X$ follows a binomial distribution $X \sim B(n, p)$ if:
There is a fixed number $n$ of trials.
Each trial has exactly two outcomes (success/failure).
The probability of success $p$ is constant for each trial.
Trials are independent.
Probability Mass Function
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$$
for $x = 0, 1, 2, \ldots, n$.
Mean and Variance
$$E(X) = np$$
$$\mathrm{Var}(X) = np(1-p)$$
$$\sigma = \sqrt{np(1-p)}$$
A fair coin is tossed 10 times. Find the probability of getting exactly 6 heads.
$X \sim B(10, 0.5)$.
$$P(X = 6) = \binom{10}{6}(0.5)^6(0.5)^4 = 210 \times (0.5)^{10} = \frac{210}{1024} \approx 0.205$$
A multiple-choice test has 20 questions, each with 5 options. A student guesses all answers. Find
the probability of getting at least 10 correct.
$X \sim B(20, 0.2)$.
$$P(X \ge 10) = 1 - P(X \le 9) = 1 - \sum_{x=0}^{9}\binom{20}{x}(0.2)^x(0.8)^{20-x}$$
This is best computed using a GDC (calculator). The result is approximately $0.0026$.
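Without a GDC to hand, the binomial formulas can be evaluated with Python's `math.comb`. The helper names `binom_pmf` and `binom_cdf` below are illustrative, and the sketch simply sums the probability mass function term by term.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(binom_pmf(x, n, p) for x in range(k + 1))

p_six_heads = binom_pmf(6, 10, 0.5)       # coin example
p_at_least_10 = 1 - binom_cdf(9, 20, 0.2)  # guessing example

print(round(p_six_heads, 3))    # 0.205
print(round(p_at_least_10, 4))  # 0.0026
```

Using the complement $1 - P(X \le 9)$ mirrors what a cumulative binomial function on a calculator returns.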
Cumulative Binomial Probabilities
$$P(X \le k) = \sum_{x=0}^{k}\binom{n}{x}p^x(1-p)^{n-x}$$
Most questions require using the cumulative binomial function on a GDC.
For binomial probability questions, always state the distribution clearly: "$X \sim B(n, p)$
where...". Use your GDC for calculations involving large $n$ or cumulative probabilities.
The Normal Distribution
Properties
A continuous random variable $X$ follows a normal distribution $X \sim N(\mu, \sigma^2)$.
The curve is bell-shaped and symmetric about $\mu$.
The mean, median, and mode are all equal to $\mu$.
Approximately 68% of data lies within $\mu \pm \sigma$.
Approximately 95% of data lies within $\mu \pm 2\sigma$.
Approximately 99.7% of data lies within $\mu \pm 3\sigma$.
Standardisation
To find probabilities, convert to the standard normal $Z \sim N(0, 1)$:
$$Z = \frac{X - \mu}{\sigma}$$
$$P(X < a) = P\!\left(Z < \frac{a - \mu}{\sigma}\right)$$
Given $X \sim N(50, 16)$, find $P(45 < X < 55)$.
$\mu = 50$, $\sigma = 4$.
$$P(45 < X < 55) = P\!\left(\frac{45-50}{4} < Z < \frac{55-50}{4}\right) = P(-1.25 < Z < 1.25)$$
$$= \Phi(1.25) - \Phi(-1.25) = 0.8944 - 0.1056 = 0.7888$$
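Python's standard library offers `statistics.NormalDist`, which can play the role of the normal CDF function on a calculator. The sketch below repeats the example; note that `NormalDist` takes the standard deviation, whereas the notation $N(50, 16)$ gives the variance.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=4)  # N(50, 16): sigma is the square root of 16
Z = NormalDist()                # standard normal N(0, 1)

p_direct = X.cdf(55) - X.cdf(45)              # work with X directly
p_standardised = Z.cdf(1.25) - Z.cdf(-1.25)   # after standardising

print(round(p_direct, 4))  # 0.7887
```

The small difference from the table-based answer 0.7888 comes from rounding $\Phi(1.25)$ to four decimal places before subtracting.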
Inverse Normal
Given a probability, find the corresponding value of $X$:
$$P(X < x) = p \implies x = \mu + z_p \cdot \sigma$$
where $z_p$ is the value with $P(Z < z_p) = p$, that is, the $100p$-th percentile of the standard normal.
Heights of a population follow $N(170, 64)$ (in cm). Find the height that is at the 90th percentile.
$\mu = 170$, $\sigma = 8$.
$$P(X < x) = 0.90 \implies \frac{x - 170}{8} = z_{0.90} = 1.282$$
$$x = 170 + 1.282 \times 8 = 170 + 10.26 = 180.26 \text{ cm}$$
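The inverse normal calculation corresponds to the `inv_cdf` method of `statistics.NormalDist`. The following sketch, with illustrative variable names, finds both the standard normal value $z_{0.90}$ and the height itself.

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=8)  # N(170, 64), so sigma = 8

z_90 = NormalDist().inv_cdf(0.90)  # 90th percentile of Z, about 1.2816
x_90 = heights.inv_cdf(0.90)       # 90th percentile of X, about 180.25 cm

print(round(z_90, 4), round(x_90, 2))
```

The answer differs from 180.26 in the second decimal place only because the worked solution rounds $z_{0.90}$ to 1.282 before multiplying.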
Combining Normal Variables
For independent normal variables $X \sim N(\mu_X, \sigma_X^2)$ and
$Y \sim N(\mu_Y, \sigma_Y^2)$:
$$X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$$
$$X - Y \sim N(\mu_X - \mu_Y, \sigma_X^2 + \sigma_Y^2)$$
$$aX + b \sim N(a\mu_X + b, a^2\sigma_X^2)$$
The weight of a parcel is $X \sim N(2, 0.04)$ kg. The packaging adds $Y \sim N(0.3, 0.01)$ kg. Find
the probability that the total exceeds 2.5 kg.
Assuming $X$ and $Y$ are independent:
$$X + Y \sim N(2.3, 0.05)$$
$$P(X + Y > 2.5) = P\!\left(Z > \frac{2.5 - 2.3}{\sqrt{0.05}}\right) = P(Z > 0.894) = 1 - 0.814 = 0.186$$
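`statistics.NormalDist` objects can be added together, and the result is the distribution of the sum of two independent normal variables, so the rule above can be checked directly. The sketch assumes, as the rule requires, that the parcel and packaging weights are independent; the variable names are illustrative.

```python
from statistics import NormalDist

parcel = NormalDist(mu=2.0, sigma=0.04 ** 0.5)     # variance 0.04
packaging = NormalDist(mu=0.3, sigma=0.01 ** 0.5)  # variance 0.01

# Adding two NormalDist objects treats them as independent:
# means add and variances add.
total = parcel + packaging

p_over = 1 - total.cdf(2.5)
print(round(total.mean, 2), round(total.variance, 4), round(p_over, 3))
```

Note that the standard deviations do not add: $0.2 + 0.1 = 0.3$, whereas the standard deviation of the total is $\sqrt{0.05} \approx 0.224$.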
Continuous Random Variables
Probability Density Functions (PDF)
A function $f(x)$ is a PDF if:
$f(x) \ge 0$ for all $x$.
$\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Probabilities from a PDF
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
Mean and Variance
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$
The median $m$ satisfies:
$$\int_{-\infty}^{m} f(x)\,dx = 0.5$$
A continuous random variable $X$ has PDF $f(x) = 2x$ for $0 \le x \le 1$.
Verify it is a valid PDF:
$$\int_0^1 2x\,dx = [x^2]_0^1 = 1$$
Find $P(X < 0.5)$:
$$P(X < 0.5) = \int_0^{0.5} 2x\,dx = [x^2]_0^{0.5} = 0.25$$
Find $E(X)$:
$$E(X) = \int_0^1 x \cdot 2x\,dx = \int_0^1 2x^2\,dx = \left[\frac{2x^3}{3}\right]_0^1 = \frac{2}{3}$$
Find the median:
$$\int_0^m 2x\,dx = 0.5 \implies m^2 = 0.5 \implies m = \frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{2}$$
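When a PDF cannot be integrated by hand, the same quantities can be approximated numerically. The sketch below uses a simple midpoint rule and a bisection search for the median; both the `integrate` helper and the choice of method are assumptions made for illustration, standing in for the numerical integration and solver features of a GDC.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pdf(x):
    return 2 * x if 0 <= x <= 1 else 0.0

total_area = integrate(pdf, 0, 1)             # should be 1 for a valid PDF
p_below_half = integrate(pdf, 0, 0.5)         # P(X < 0.5)
mean = integrate(lambda x: x * pdf(x), 0, 1)  # E(X)

# Bisection: find m in [0, 1] with area 0.5 to the left of m.
lo, hi = 0.0, 1.0
for _ in range(40):
    mid = (lo + hi) / 2
    if integrate(pdf, 0, mid, n=1000) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2

print(round(total_area, 6), round(p_below_half, 6), round(mean, 6), round(median, 6))
```

The numerical values agree with the exact answers $1$, $0.25$, $\tfrac{2}{3}$, and $\tfrac{\sqrt{2}}{2} \approx 0.7071$.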
IB Exam-Style Questions
Question 1 (Paper 1 style)
Events A and B are such that $P(A) = 0.6$, $P(B) = 0.4$, and $P(A|B) = 0.3$.
(a) Find $P(A \cap B)$.
$$P(A \cap B) = P(A|B) \cdot P(B) = 0.3 \times 0.4 = 0.12$$
(b) Determine whether A and B are independent.
$P(A) \cdot P(B) = 0.6 \times 0.4 = 0.24 \neq 0.12 = P(A \cap B)$.
Not independent.
(c) Find $P(A \cup B)$.
$$P(A \cup B) = 0.6 + 0.4 - 0.12 = 0.88$$
Question 2 (Paper 2 style)
A bag contains 7 red and 5 blue marbles. Three marbles are drawn without replacement.
(a) Find the probability that all three are red.
$$P = \frac{7}{12} \times \frac{6}{11} \times \frac{5}{10} = \frac{210}{1320} = \frac{7}{44}$$
(b) Find the probability that exactly two are red.
Exactly two red marbles can occur in $\binom{3}{2} = 3$ orders (RRB, RBR, BRR), and each order has the same probability:
$$P = \binom{3}{2} \times \frac{7}{12} \times \frac{6}{11} \times \frac{5}{10} = 3 \times \frac{210}{1320} = \frac{21}{44}$$
Check using a tree diagram:
RRB: $\dfrac{7}{12} \times \dfrac{6}{11} \times \dfrac{5}{10} = \dfrac{7}{44}$
RBR: $\dfrac{7}{12} \times \dfrac{5}{11} \times \dfrac{6}{10} = \dfrac{7}{44}$
BRR: $\dfrac{5}{12} \times \dfrac{7}{11} \times \dfrac{6}{10} = \dfrac{7}{44}$
$$P = \frac{7}{44} + \frac{7}{44} + \frac{7}{44} = \frac{21}{44}$$
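An alternative to tracking the order of draws is to count unordered selections with combinations. The sketch below applies this counting argument to the bag in this question; the function name `p_red` and its default arguments are hypothetical.

```python
from fractions import Fraction
from math import comb

def p_red(k, red=7, blue=5, draws=3):
    """P(exactly k red) when drawing without replacement.

    Counts the selections with k red and (draws - k) blue marbles
    out of all equally likely selections of `draws` marbles.
    """
    return Fraction(comb(red, k) * comb(blue, draws - k), comb(red + blue, draws))

print(p_red(3))  # 7/44
print(p_red(2))  # 21/44
```

Summing `p_red(k)` over $k = 0, 1, 2, 3$ gives 1, which confirms that all cases have been covered.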
Question 3 (Paper 2 style)
The time taken to complete a task follows a normal distribution with mean 45 minutes and standard
deviation 8 minutes.
(a) Find the probability that a randomly selected person takes between 40 and 50 minutes.
$$P(40 < X < 50) = P\!\left(\frac{40-45}{8} < Z < \frac{50-45}{8}\right) = P(-0.625 < Z < 0.625)$$
$$= 2\Phi(0.625) - 1 \approx 2(0.734) - 1 = 0.468$$
(b) The fastest 10% of people receive a certificate. Find the maximum time to qualify.
$$P(X < x) = 0.10 \implies \frac{x - 45}{8} = -1.282$$
$$x = 45 - 1.282 \times 8 = 45 - 10.26 = 34.74 \text{ minutes}$$
Question 4 (Paper 1 style)
$X \sim B(15, 0.3)$. Find $P(X = 5)$.
$$P(X = 5) = \binom{15}{5}(0.3)^5(0.7)^{10}$$
Using a GDC: $P(X = 5) \approx 0.206$.
Question 5 (Paper 2 style)
A continuous random variable $X$ has PDF $f(x) = \dfrac{3x^2}{8}$ for $0 \le x \le 2$.
(a) Verify that $f(x)$ is a valid PDF.
$$\int_0^2 \frac{3x^2}{8}\,dx = \frac{3}{8}\left[\frac{x^3}{3}\right]_0^2 = \frac{3}{8} \cdot \frac{8}{3} = 1$$
(b) Find $E(X)$.
$$E(X) = \int_0^2 x \cdot \frac{3x^2}{8}\,dx = \int_0^2 \frac{3x^3}{8}\,dx = \frac{3}{8}\left[\frac{x^4}{4}\right]_0^2 = \frac{3}{8} \times 4 = \frac{3}{2}$$
(c) Find the mode.
Since $f(x) = \dfrac{3x^2}{8}$ is increasing on $[0, 2]$, the mode is $x = 2$.
Summary
| Distribution | Notation | $E(X)$ | $\mathrm{Var}(X)$ |
| --- | --- | --- | --- |
| Binomial | $B(n, p)$ | $np$ | $np(1-p)$ |
| Normal | $N(\mu, \sigma^2)$ | $\mu$ | $\sigma^2$ |

| Key Formula | Expression |
| --- | --- |
| Addition rule | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ |
| Conditional probability | $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$ |
| Bayes' theorem | $P(A \mid B) = \dfrac{P(B \mid A)P(A)}{P(B)}$ |
| Standardisation | $Z = \dfrac{X - \mu}{\sigma}$ |

Always define your random variable clearly at the start of probability questions. For normal
distribution problems, draw a sketch of the bell curve and shade the relevant area. For binomial
problems, verify the four conditions before applying the formula.
Probability Distributions: Additional Topics
Poisson Distribution
The Poisson distribution models the number of events occurring in a fixed interval of time or space.
$X \sim \mathrm{Po}(\lambda)$ where $\lambda$ is the mean number of events.
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
$$E(X) = \lambda, \quad \mathrm{Var}(X) = \lambda$$
A call centre receives an average of 4 calls per minute. Find the probability of receiving exactly 6
calls in a minute.
$$P(X = 6) = \frac{e^{-4} \cdot 4^6}{6!} = \frac{e^{-4} \cdot 4096}{720} = \frac{4096}{720 \times 54.60} \approx 0.104$$
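The Poisson formula is short enough to evaluate directly. The sketch below defines an illustrative helper `poisson_pmf` and also approximates the mean by summing $x \cdot P(X = x)$ over the first 60 values, a truncation chosen here because the remaining terms are negligible for $\lambda = 4$.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

p_six_calls = poisson_pmf(6, 4)

# Truncated sum for E(X); terms beyond x = 59 are negligible when lam = 4.
approx_mean = sum(x * poisson_pmf(x, 4) for x in range(60))

print(round(p_six_calls, 3))  # 0.104
print(round(approx_mean, 6))  # 4.0
```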
Geometric Distribution
Models the number of trials up to and including the first success.
$X \sim \mathrm{Geo}(p)$ where $p$ is the probability of success on each trial.
$$P(X = x) = (1-p)^{x-1}p, \quad x = 1, 2, 3, \ldots$$
$$E(X) = \frac{1}{p}, \quad \mathrm{Var}(X) = \frac{1-p}{p^2}$$
A die is rolled until a 6 appears. Find the probability that it takes exactly 4 rolls.
$$P(X = 4) = \left(\frac{5}{6}\right)^3 \times \frac{1}{6} = \frac{125}{1296} \approx 0.0965$$
Combinations and Permutations
Factorial
$$n! = n \times (n-1) \times \cdots \times 2 \times 1, \quad 0! = 1$$
Permutations
The number of ways to arrange $r$ objects from $n$ distinct objects (order matters):
$${}^nP_r = \frac{n!}{(n-r)!}$$
Combinations
The number of ways to choose $r$ objects from $n$ distinct objects (order does not matter):
$${}^nC_r = \binom{n}{r} = \frac{n!}{r!(n-r)!}$$
A committee of 4 is to be chosen from 7 men and 5 women. How many committees have at least 2 women?
Total ways $= \dbinom{12}{4} = 495$.
Ways with 0 women: $\dbinom{7}{4} = 35$.
Ways with 1 woman: $\dbinom{5}{1}\dbinom{7}{3} = 5 \times 35 = 175$.
Ways with at least 2 women $= 495 - 35 - 175 = 285$.
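The complement method above can be compared with a direct count over the number of women on the committee. This is a minimal sketch using `math.comb`; the variable names are illustrative.

```python
from math import comb

men, women, size = 7, 5, 4

# Complement method: all committees minus those with 0 or 1 woman.
by_complement = comb(men + women, size) - comb(men, 4) - comb(women, 1) * comb(men, 3)

# Direct method: add the cases with exactly 2, 3, and 4 women.
by_cases = sum(comb(women, w) * comb(men, size - w) for w in range(2, size + 1))

print(by_complement, by_cases)  # 285 285
```

Both routes give the same count; the complement is usually quicker when "at least" covers more cases than it excludes.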
Additional Exam-Style Questions
Question 6 (Paper 2 style)
A bag contains 4 red and 6 blue marbles. Marbles are drawn one at a time without replacement until a
red marble is drawn.
(a) Find the probability that exactly 3 draws are needed.
$$P = \frac{6}{10} \times \frac{5}{9} \times \frac{4}{8} = \frac{120}{720} = \frac{1}{6}$$
(b) Find the expected number of draws.
Let $X$ be the number of draws. We need $E(X)$.
$P(X=1) = \dfrac{4}{10} = 0.4$
$P(X=2) = \dfrac{6}{10} \times \dfrac{4}{9} = \dfrac{24}{90} = \dfrac{4}{15}$
$P(X=3) = \dfrac{6}{10} \times \dfrac{5}{9} \times \dfrac{4}{8} = \dfrac{1}{6}$
$P(X=4) = \dfrac{6}{10} \times \dfrac{5}{9} \times \dfrac{4}{8} \times \dfrac{4}{7} = \dfrac{2}{21}$
$P(X=5) = \dfrac{6}{10} \times \dfrac{5}{9} \times \dfrac{4}{8} \times \dfrac{3}{7} \times \dfrac{4}{6} = \dfrac{1}{21}$
$P(X=6) = \dfrac{6}{10} \times \dfrac{5}{9} \times \dfrac{4}{8} \times \dfrac{3}{7} \times \dfrac{2}{6} \times \dfrac{4}{5} = \dfrac{2}{105}$
$P(X=7) = \dfrac{6}{10} \times \dfrac{5}{9} \times \dfrac{4}{8} \times \dfrac{3}{7} \times \dfrac{2}{6} \times \dfrac{1}{5} \times \dfrac{4}{4} = \dfrac{1}{210}$
These seven probabilities sum to 1, as they must.
$$E(X) = 1(0.4) + 2\!\left(\frac{4}{15}\right) + 3\!\left(\frac{1}{6}\right) + 4\!\left(\frac{2}{21}\right) + 5\!\left(\frac{1}{21}\right) + 6\!\left(\frac{2}{105}\right) + 7\!\left(\frac{1}{210}\right)$$
$$\approx 0.4 + 0.5333 + 0.5 + 0.3810 + 0.2381 + 0.1143 + 0.0333 = 2.2 \quad \left(\text{exactly } \tfrac{11}{5}\right)$$
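Long products like these are easy to get wrong by hand, so it helps to recompute the whole distribution exactly. The sketch below builds $P(X = k)$ for each possible number of draws with `fractions.Fraction`; the function name `p_first_red_on` is a hypothetical helper.

```python
from fractions import Fraction

red, blue = 4, 6

def p_first_red_on(k):
    """P(the first red marble appears on draw k), without replacement."""
    p = Fraction(1)
    for i in range(k - 1):  # the first k - 1 draws are all blue
        p *= Fraction(blue - i, red + blue - i)
    return p * Fraction(red, red + blue - (k - 1))  # draw k is red

# At most 6 blue marbles can come first, so X ranges over 1..7.
pmf = {k: p_first_red_on(k) for k in range(1, blue + 2)}
expected_draws = sum(k * p for k, p in pmf.items())

print(sum(pmf.values()))  # 1
print(expected_draws)     # 11/5
```

Checking that the probabilities sum to 1 before computing the expectation catches most arithmetic slips in questions of this type.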
Question 7 (Paper 1 style)
$X \sim B(12, 0.25)$. Find $P(X \le 2)$.
$$P(X \le 2) = P(X=0) + P(X=1) + P(X=2)$$
$$= \binom{12}{0}(0.25)^0(0.75)^{12} + \binom{12}{1}(0.25)^1(0.75)^{11} + \binom{12}{2}(0.25)^2(0.75)^{10}$$
$$= 0.0317 + 0.1267 + 0.2323 = 0.3907$$
Question 8 (Paper 2 style)
The heights of Year 12 students follow a normal distribution with mean $165 \text{ cm}$ and
standard deviation $8 \text{ cm}$.
(a) What percentage of students are taller than $180 \text{ cm}$?
$$P(X > 180) = P\!\left(Z > \frac{180-165}{8}\right) = P(Z > 1.875) = 1 - 0.9696 = 0.0304$$
About $3.0\%$.
(b) The school needs to order desks for the middle 90% of students. What height range should the
desks accommodate?
Middle 90% means 5th to 95th percentile.
5th percentile: $\dfrac{h - 165}{8} = -1.645 \implies h = 165 - 13.16 = 151.8 \text{ cm}$.
95th percentile: $\dfrac{h - 165}{8} = 1.645 \implies h = 165 + 13.16 = 178.2 \text{ cm}$.
Desks should accommodate heights from about $152 \text{ cm}$ to $178 \text{ cm}$.
Diagnostic Test
Ready to test your understanding of Probability? The diagnostic test contains the hardest questions within the IB specification for this topic, each with a full worked solution.
Unit tests probe edge cases and common misconceptions. Integration tests combine Probability with other IB mathematics topics to test synthesis under exam conditions.
See Diagnostic Guide for instructions on self-marking and building a personal test matrix.