Skip to main content

Proof and Logic

This note covers IB Mathematics AA — Topic 1.4: Proof in full depth. Everything here is examinable at both Standard Level and Higher Level, with HL extensions marked where relevant.

1. Logic Foundations

1.1 Propositions and Truth Values

A proposition is a declarative sentence that is either true or false — never both, never neither.

"7 is prime" is a proposition (true). "Solve for x" is not a proposition. "This sentence is false" is not a proposition (it is paradoxical).

The truth value of a proposition PP is denoted {T}(P){T,F}\mathcal{'\{'}T{'\}'}(P) \in \{T, F\}.

1.2 Logical Connectives

There are five fundamental connectives. Let PP and QQ be propositions.

ConnectiveSymbolRead asMeaning
Negation¬P\neg P"not P"True when P is false
ConjunctionPQP \wedge Q"P and Q"True when both are true
DisjunctionPQP \vee Q"P or Q"True when at least one is true
ImplicationP    QP \implies Q"P implies Q"False only when P true and Q false
BiconditionalP    QP \iff Q"P if and only if Q"True when P and Q have same truth value

Truth Tables

Negation:

PP¬P\neg P
TF
FT

Conjunction:

PPQQPQP \wedge Q
TTT
TFF
FTF
FFF

Disjunction (inclusive OR):

PPQQPQP \vee Q
TTT
TFT
FTT
FFF

Implication:

PPQQP    QP \implies Q
TTT
TFF
FTT
FFT

The implication P    QP \implies Q is only false when a true premise leads to a false conclusion. This is the single most misunderstood truth table entry in all of mathematics. When PP is false, the implication is vacuously true — there is no counterexample to "whenever P holds, Q holds."

Biconditional:

PPQQP    QP \iff Q
TTT
TFF
FTF
FFT

1.3 Logical Equivalence

Two compound propositions are logically equivalent (written \equiv) if they have identical truth values for every possible assignment of truth values to their component propositions.

Key equivalences (Laws of Logic):

  1. Commutative: PQQPP \wedge Q \equiv Q \wedge P, and PQQPP \vee Q \equiv Q \vee P
  2. Associative: (PQ)RP(QR)(P \wedge Q) \wedge R \equiv P \wedge (Q \wedge R), and similarly for \vee
  3. Distributive: P(QR)(PQ)(PR)P \wedge (Q \vee R) \equiv (P \wedge Q) \vee (P \wedge R), and P(QR)(PQ)(PR)P \vee (Q \wedge R) \equiv (P \vee Q) \wedge (P \vee R)
  4. Identity: PTPP \wedge T \equiv P, and PFPP \vee F \equiv P
  5. Domination: PTTP \vee T \equiv T, and PFFP \wedge F \equiv F
  6. Idempotent: PPPP \wedge P \equiv P, and PPPP \vee P \equiv P
  7. Double Negation: ¬(¬P)P\neg(\neg P) \equiv P
  8. Absorption: P(PQ)PP \vee (P \wedge Q) \equiv P, and P(PQ)PP \wedge (P \vee Q) \equiv P

1.4 De Morgan's Laws

These are essential for negating compound statements.

¬(PQ)¬P¬Q\neg(P \wedge Q) \equiv \neg P \vee \neg Q

¬(PQ)¬P¬Q\neg(P \vee Q) \equiv \neg P \wedge \neg Q

Intuition: To negate an "and," negate each part and switch to "or." To negate an "or," negate each part and switch to "and."

Worked Example — Negating a compound statement:

Negate: "The function ff is continuous and differentiable."

Let CC: "ff is continuous." Let DD: "ff is differentiable." The statement is CDC \wedge D. Its negation is ¬C¬D\neg C \vee \neg D, i.e., "ff is either not continuous or not differentiable."

info

A common mistake is to negate "P and Q" as "not P and not Q." That is wrong. The negation of "ff is continuous and differentiable" is NOT "ff is not continuous and not differentiable" — that is too strong.

1.5 The Contrapositive

For any implication P    QP \implies Q, the contrapositive is ¬Q    ¬P\neg Q \implies \neg P.

Theorem: An implication and its contrapositive are logically equivalent.

P    Q¬Q    ¬PP \implies Q \equiv \neg Q \implies \neg P

This is proved by comparing truth tables. Both are false only when PP is true and QQ is false.

Compare with two other related statements that are NOT equivalent:

  • The converse of P    QP \implies Q is Q    PQ \implies P (not equivalent in general)
  • The inverse of P    QP \implies Q is ¬P    ¬Q\neg P \implies \neg Q (not equivalent in general, but equivalent to the converse)
danger

danger implication. Never confuse the contrapositive with the converse.

Worked Example:

Statement: "If nn is even, then n2n^2 is even."

Contrapositive: "If n2n^2 is odd, then nn is odd."

Converse: "If n2n^2 is even, then nn is even." (True in this case, but not logically guaranteed.)

Inverse: "If nn is odd, then n2n^2 is odd." (Also true here, but again not guaranteed.)

1.6 Tautologies and Contradictions

A tautology is a compound proposition that is true for every possible truth assignment. A contradiction is a compound proposition that is false for every possible truth assignment.

Important tautology: P¬PP \vee \neg P (Law of Excluded Middle)

Important contradiction: P¬PP \wedge \neg P

Another key tautology — Modus Ponens: ((P    Q)P)    Q((P \implies Q) \wedge P) \implies Q

This says: if you know P    QP \implies Q and you know PP, then you can conclude QQ. This is the fundamental rule of direct proof.

Modus Tollens: ((P    Q)¬Q)    ¬P((P \implies Q) \wedge \neg Q) \implies \neg P

This says: if you know P    QP \implies Q and you know ¬Q\neg Q, then you can conclude ¬P\neg P. This is the fundamental rule of proof by contrapositive.

1.7 Quantifiers

Quantifiers let us express statements about collections of objects.

Universal quantifier (\forall): "For all" — the statement must hold for every element in the domain.

xS,  P(x)\forall x \in S, \; P(x)

Existential quantifier (\exists): "There exists" — the statement must hold for at least one element.

xS,  P(x)\exists x \in S, \; P(x)

Negating quantified statements:

¬(x,  P(x))x,  ¬P(x)\neg(\forall x, \; P(x)) \equiv \exists x, \; \neg P(x)

¬(x,  P(x))x,  ¬P(x)\neg(\exists x, \; P(x)) \equiv \forall x, \; \neg P(x)

To negate a universal statement, you get an existential counterexample. To negate an existential statement, you must show it fails for every case.

Nested quantifiers require careful handling. The order matters:

xy,  P(x,y)isNOTequivalenttoyx,  P(x,y)\forall x \, \exists y, \; P(x,y) \quad \mathrm{is NOT equivalent to} \quad \exists y \, \forall x, \; P(x,y)

Worked Example — Negating a nested quantifier statement:

Negate: "For every positive real number ε\varepsilon, there exists a positive real number δ\delta such that..."

¬(ε>0,  δ>0,  P(ε,δ))ε>0,  δ>0,  ¬P(ε,δ)\neg\left(\forall \varepsilon \gt 0, \; \exists \delta \gt 0, \; P(\varepsilon, \delta)\right) \equiv \exists \varepsilon \gt 0, \; \forall \delta \gt 0, \; \neg P(\varepsilon, \delta)

The existential becomes universal and the universal becomes existential. This is the logical backbone of epsilon-delta definitions in analysis.

warning

warning negate the predicate. The order of quantifiers does NOT change — it remains the same sequence but with each quantifier flipped.

Worked Example: Negate n{Z}+,  pprime,  p>n\forall n \in \mathbb{'\{'}Z{'\}'}^+, \; \exists p \mathrm{ prime}, \; p \gt n

Step 1: Identify the structure — it is n,  p,  P(n,p)\forall n, \; \exists p, \; P(n,p) where P(n,p)P(n,p) is "pp is prime and p>np \gt n."

Step 2: Negate — n{Z}+,  pprime,  ¬(p>n)\exists n \in \mathbb{'\{'}Z{'\}'}^+, \; \forall p \mathrm{ prime}, \; \neg(p \gt n)

Step 3: Simplify — n{Z}+,  pprime,  pn\exists n \in \mathbb{'\{'}Z{'\}'}^+, \; \forall p \mathrm{ prime}, \; p \le n

Reading: "There exists a positive integer nn such that every prime pp satisfies pnp \le n." This is false (there are infinitely many primes), so the original statement is true.


2. Methods of Proof

2.1 Direct Proof

Strategy: To prove P    QP \implies Q, assume PP is true and use logical deduction to arrive at QQ.

This is the most straightforward method. You chain together known results, definitions, and algebraic manipulations.

Template:

  1. Assume PP.
  2. [Logical steps...]
  3. Therefore QQ.
  4. Hence P    QP \implies Q.

Worked Example — If nn is odd, then n2n^2 is odd:

Assume nn is odd. By definition, n=2k+1n = 2k + 1 for some k{Z}k \in \mathbb{'\{'}Z{'\}'}.

n2=(2k+1)2=4k2+4k+1=2(2k2+2k)+1n^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1

Since 2k2+2k2k^2 + 2k is an integer (sum and product of integers), let m=2k2+2k{Z}m = 2k^2 + 2k \in \mathbb{'\{'}Z{'\}'}. Then n2=2m+1n^2 = 2m + 1, which is odd by definition.

Therefore, if nn is odd, then n2n^2 is odd.

Worked Example — The sum of two even numbers is even:

Let a=2ma = 2m and b=2nb = 2n for some m,n{Z}m, n \in \mathbb{'\{'}Z{'\}'}.

a+b=2m+2n=2(m+n)a + b = 2m + 2n = 2(m + n)

Since m+n{Z}m + n \in \mathbb{'\{'}Z{'\}'}, we have a+b=2(m+n)a + b = 2(m+n) is even.

Exercise: Prove that the product of two odd numbers is odd.

Let a=2m+1a = 2m + 1 and b=2n+1b = 2n + 1 for some m,n{Z}m, n \in \mathbb{'\{'}Z{'\}'}.

ab=(2m+1)(2n+1)=4mn+2m+2n+1=2(2mn+m+n)+1a \cdot b = (2m+1)(2n+1) = 4mn + 2m + 2n + 1 = 2(2mn + m + n) + 1

Since 2mn+m+n{Z}2mn + m + n \in \mathbb{'\{'}Z{'\}'}, the product is of the form 2k+12k + 1, hence odd.

2.2 Proof by Contradiction

Strategy: To prove a statement PP, assume ¬P\neg P and derive a logical contradiction (something that is always false, like 1=01 = 0 or 0<00 \lt 0).

Template:

  1. Assume ¬P\neg P (the negation of what you want to prove).
  2. [Logical steps leading to a contradiction...]
  3. This contradicts [known fact].
  4. Therefore ¬P\neg P is false, so PP is true.

This method is especially powerful when the statement you want to prove is a negation itself ("there does not exist..." or "there are no...").

info

info which means ¬P\neg P must be false, hence PP is true. It relies on the Law of Excluded Middle (P¬PP \vee \neg P must be true).

Worked Example — 2\sqrt{2} is irrational:

Assume 2\sqrt{2} is rational. Then 2=ab\sqrt{2} = \frac{a}{b} where a,b{Z}+a, b \in \mathbb{'\{'}Z{'\}'}^+, gcd(a,b)=1\gcd(a, b) = 1 (the fraction is in lowest terms).

Squaring both sides: 2=a2b22 = \frac{a^2}{b^2}, so a2=2b2a^2 = 2b^2.

Since a2=2b2a^2 = 2b^2, we have a2a^2 is even. Therefore aa is even (by the lemma: if a2a^2 is even, then aa is even — proved below).

Write a=2ka = 2k for some k{Z}+k \in \mathbb{'\{'}Z{'\}'}^+. Substituting:

(2k)2=2b2(2k)^2 = 2b^2, so 4k2=2b24k^2 = 2b^2, so b2=2k2b^2 = 2k^2.

Therefore b2b^2 is even, so bb is even.

But now both aa and bb are even, contradicting gcd(a,b)=1\gcd(a, b) = 1.

Therefore our assumption is false, and 2\sqrt{2} is irrational.

note

note odd, then a=2k+1a = 2k+1, so a2=4k2+4k+1=2(2k2+2k)+1a^2 = 4k^2 + 4k + 1 = 2(2k^2+2k) + 1, which is odd. Hence if a2a^2 is even, aa cannot be odd, so aa is even.

Exercise: Prove that 3\sqrt{3} is irrational.

Assume 3=ab\sqrt{3} = \frac{a}{b} in lowest terms. Then a2=3b2a^2 = 3b^2, so a2a^2 is divisible by 3. By the lemma (if 3a23 \mid a^2, then 3a3 \mid a), write a=3ka = 3k. Then 9k2=3b29k^2 = 3b^2, so b2=3k2b^2 = 3k^2, hence 3b23 \mid b^2, so 3b3 \mid b. Both aa and bb are divisible by 3, contradicting lowest terms.

The supporting lemma "if pa2p \mid a^2 for prime pp, then pap \mid a" follows from the Fundamental Theorem of Arithmetic: if pp appears in the prime factorization of a2a^2, it must appear in the factorization of aa.

2.3 Proof by Contrapositive

Strategy: To prove P    QP \implies Q, instead prove ¬Q    ¬P\neg Q \implies \neg P. Since an implication is logically equivalent to its contrapositive, this proves the original statement.

When to use this: When the hypothesis PP feels "too big" or "too loose" to work with directly, but the negation of the conclusion ¬Q\neg Q gives you something concrete to grab.

Template:

  1. We prove the contrapositive: if ¬Q\neg Q, then ¬P\neg P.
  2. Assume ¬Q\neg Q.
  3. [Logical steps...]
  4. Therefore ¬P\neg P.
  5. Hence P    QP \implies Q.

Worked Example — If n2n^2 is even, then nn is even:

We prove the contrapositive: if nn is odd, then n2n^2 is odd.

Assume nn is odd. Then n=2k+1n = 2k + 1 for some k{Z}k \in \mathbb{'\{'}Z{'\}'}.

n2=(2k+1)2=4k2+4k+1=2(2k2+2k)+1n^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1

This is odd. Therefore, if nn is odd, n2n^2 is odd. By contrapositive, if n2n^2 is even, then nn is even.

tip

tip "n2n^2 even implies nn even" by contradiction (assume n2n^2 even and nn odd, derive that n2n^2 is both even and odd). But the contrapositive proof is cleaner — it is a direct proof of the equivalent statement.

Exercise: Prove that if 3n+23n + 2 is odd, then nn is odd (by contrapositive).

Contrapositive: If nn is even, then 3n+23n + 2 is even.

Assume nn is even, so n=2kn = 2k for some k{Z}k \in \mathbb{'\{'}Z{'\}'}.

3n+2=3(2k)+2=6k+2=2(3k+1)3n + 2 = 3(2k) + 2 = 6k + 2 = 2(3k + 1)

Since 3k+1{Z}3k + 1 \in \mathbb{'\{'}Z{'\}'}, 3n+23n + 2 is even. By contrapositive, if 3n+23n + 2 is odd, then nn is odd.

2.4 Mathematical Induction

Induction proves statements of the form nn0,  P(n)\forall n \ge n_0, \; P(n), where n0{Z}n_0 \in \mathbb{'\{'}Z{'\}'}.

2.4.1 Standard (Weak) Induction

Template:

  1. Base case: Verify P(n0)P(n_0) is true.
  2. Inductive hypothesis: Assume P(k)P(k) is true for some arbitrary kn0k \ge n_0.
  3. Inductive step: Using the hypothesis that P(k)P(k) holds, prove that P(k+1)P(k+1) holds.
  4. Conclusion: By the Principle of Mathematical Induction, P(n)P(n) is true for all nn0n \ge n_0.

Worked Example — Sum formula 1+2++n=n(n+1)21 + 2 + \cdots + n = \frac{n(n+1)}{2}:

Let P(n)P(n): 1+2++n=n(n+1)21 + 2 + \cdots + n = \frac{n(n+1)}{2}.

Base case (n=1n = 1): LHS =1= 1, RHS =122=1= \frac{1 \cdot 2}{2} = 1. So P(1)P(1) is true.

Inductive hypothesis: Assume P(k)P(k) is true for some k1k \ge 1, i.e., 1+2++k=k(k+1)21 + 2 + \cdots + k = \frac{k(k+1)}{2}.

Inductive step: We must show P(k+1)P(k+1): 1+2++k+(k+1)=(k+1)(k+2)21 + 2 + \cdots + k + (k+1) = \frac{(k+1)(k+2)}{2}.

Starting from the LHS:

1+2++k+(k+1)=k(k+1)2+(k+1)1 + 2 + \cdots + k + (k+1) = \frac{k(k+1)}{2} + (k+1)

(by the inductive hypothesis)

=k(k+1)+2(k+1)2=(k+1)(k+2)2= \frac{k(k+1) + 2(k+1)}{2} = \frac{(k+1)(k + 2)}{2}

This is exactly the RHS of P(k+1)P(k+1). Therefore P(k+1)P(k+1) holds.

Conclusion: By induction, P(n)P(n) is true for all n1n \ge 1.

Worked Example — Sum of squares 12+22++n2=n(n+1)(2n+1)61^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}:

Let P(n)P(n): i=1ni2=n(n+1)(2n+1)6\displaystyle\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}.

Base case (n=1n = 1): LHS =1= 1, RHS =1236=1= \frac{1 \cdot 2 \cdot 3}{6} = 1. True.

Inductive hypothesis: Assume P(k)P(k) holds for some k1k \ge 1.

Inductive step:

i=1k+1i2=i=1ki2+(k+1)2=k(k+1)(2k+1)6+(k+1)2\sum_{i=1}^{k+1} i^2 = \sum_{i=1}^{k} i^2 + (k+1)^2 = \frac{k(k+1)(2k+1)}{6} + (k+1)^2

=k(k+1)(2k+1)+6(k+1)26= \frac{k(k+1)(2k+1) + 6(k+1)^2}{6}

=(k+1)[k(2k+1)+6(k+1)]6= \frac{(k+1)[k(2k+1) + 6(k+1)]}{6}

=(k+1)[2k2+k+6k+6]6= \frac{(k+1)[2k^2 + k + 6k + 6]}{6}

=(k+1)(2k2+7k+6)6= \frac{(k+1)(2k^2 + 7k + 6)}{6}

=(k+1)(k+2)(2k+3)6= \frac{(k+1)(k+2)(2k+3)}{6}

=(k+1)((k+1)+1)(2(k+1)+1)6= \frac{(k+1)((k+1)+1)(2(k+1)+1)}{6}

This is P(k+1)P(k+1). By induction, the formula holds for all n1n \ge 1.

Exercise: Prove by induction that 1+3+5++(2n1)=n21 + 3 + 5 + \cdots + (2n-1) = n^2 for all n1n \ge 1.

Let P(n)P(n): i=1n(2i1)=n2\displaystyle\sum_{i=1}^{n} (2i - 1) = n^2.

Base case (n=1n = 1): LHS =1= 1, RHS =1= 1. True.

Inductive hypothesis: Assume P(k)P(k): 1+3++(2k1)=k21 + 3 + \cdots + (2k-1) = k^2.

Inductive step: 1+3++(2k1)+(2(k+1)1)=k2+(2k+1)=k2+2k+1=(k+1)21 + 3 + \cdots + (2k-1) + (2(k+1)-1) = k^2 + (2k + 1) = k^2 + 2k + 1 = (k+1)^2.

By induction, P(n)P(n) holds for all n1n \ge 1.

Exercise: Prove by induction that 2n>n2^n \gt n for all n1n \ge 1.

Let P(n)P(n): 2n>n2^n \gt n.

Base case (n=1n = 1): 21=2>12^1 = 2 \gt 1. True.

Inductive hypothesis: Assume 2k>k2^k \gt k for some k1k \ge 1.

Inductive step: 2k+1=22k>2k2^{k+1} = 2 \cdot 2^k \gt 2k (by IH, since 2k>k2^k \gt k and multiplying by 2).

Now, 2k=k+kk+12k = k + k \ge k + 1 for k1k \ge 1. Therefore 2k+1>k+12^{k+1} \gt k + 1.

By induction, P(n)P(n) holds for all n1n \ge 1.

2.4.2 Strong Induction

In strong induction, the inductive hypothesis assumes P(n0),P(n0+1),,P(k)P(n_0), P(n_0+1), \ldots, P(k) are ALL true, not just P(k)P(k).

Template:

  1. Base case(s): Verify P(n0)P(n_0) (and possibly more base cases).
  2. Inductive hypothesis: Assume P(j)P(j) is true for all n0jkn_0 \le j \le k.
  3. Inductive step: Using that P(j)P(j) holds for all jkj \le k, prove P(k+1)P(k+1).
  4. Conclusion: By the Principle of Strong Induction, P(n)P(n) is true for all nn0n \ge n_0.

When to use strong induction: When proving P(k+1)P(k+1) requires not just P(k)P(k) but some earlier case P(j)P(j) where j<kj \lt k.

warning

Weak and strong induction are logically equivalent — anything provable by one is provable by the other. But strong induction can make certain proofs much more natural. Use it when the inductive step needs to reference cases earlier than just kk.

Worked Example — Every integer n2n \ge 2 is a product of primes:

Let P(n)P(n): "nn is a product of primes (possibly a single prime)."

Base case (n=2n = 2): 2 is prime, hence a product of primes. True.

Strong inductive hypothesis: Assume P(j)P(j) is true for all 2jk2 \le j \le k.

Inductive step for n=k+1n = k + 1:

Case 1: k+1k+1 is prime. Then it is trivially a product of primes.

Case 2: k+1k+1 is composite. Then k+1=abk+1 = ab where 2abk2 \le a \le b \le k. By the strong inductive hypothesis, both aa and bb are products of primes. Therefore k+1=abk+1 = ab is a product of primes.

In either case, P(k+1)P(k+1) holds. By strong induction, every integer n2n \ge 2 is a product of primes.

Exercise: Prove that every amount of postage 4\ge 4 cents can be formed using 4-cent and 5-cent stamps.

Let P(n)P(n): "Postage of nn cents can be formed using 4-cent and 5-cent stamps."

Base cases: P(4)P(4): one 4-cent stamp. P(5)P(5): one 5-cent stamp. P(6)P(6): cannot — wait, this needs checking. P(7)P(7): cannot either. Let me reconsider.

Actually, we need P(8)P(8): two 4-cent stamps. P(9)P(9): one 4-cent + one 5-cent. P(10)P(10): two 5-cent stamps. P(11)P(11): ... Hmm, P(12)P(12): three 4-cent stamps. P(13)P(13): two 4-cent + one 5-cent. P(14)P(14): one 4-cent + two 5-cent. P(15)P(15): three 5-cent stamps.

Let us use base cases P(12)P(12), P(13)P(13), P(14)P(14), P(15)P(15). Strong inductive hypothesis: assume P(j)P(j) for all 12jk12 \le j \le k, where k15k \ge 15.

For P(k+1)P(k+1): note that (k+1)4=k312(k+1) - 4 = k - 3 \ge 12 (since k15k \ge 15). By strong IH, P(k3)P(k-3) holds, meaning we can form (k3)(k-3) cents. Adding one 4-cent stamp gives (k3)+4=k+1(k-3) + 4 = k+1 cents.

By strong induction, P(n)P(n) holds for all n12n \ge 12. But we should verify the smaller cases: P(4)P(4) through P(11)P(11). We find P(4)P(4), P(5)P(5), P(8)P(8), P(9)P(9), P(10)P(10) hold, but P(6)P(6), P(7)P(7), P(11)P(11) do not. So the correct statement is: all postage n8n \ge 8 except n=11n = 11. Or: all n12n \ge 12.

The cleanest formulation: all n12n \ge 12 can be formed. Base cases P(12)P(12) through P(15)P(15). Inductive step as above.

2.5 Proof by Exhaustion

Strategy: When the domain of possible cases is finite and small, prove the statement by verifying each case individually.

This is brute-force but rigorous. It is only practical when the number of cases is manageable.

Worked Example — Show that n2+n+17n^2 + n + 17 is prime for n=0,1,2,3,4n = 0, 1, 2, 3, 4:

n=0n = 0: 0+0+17=170 + 0 + 17 = 17 (prime) n=1n = 1: 1+1+17=191 + 1 + 17 = 19 (prime) n=2n = 2: 4+2+17=234 + 2 + 17 = 23 (prime) n=3n = 3: 9+3+17=299 + 3 + 17 = 29 (prime) n=4n = 4: 16+4+17=3716 + 4 + 17 = 37 (prime)

All five cases verified. (Note: this polynomial produces primes for n=0n = 0 through n=15n = 15, but fails at n=16n = 16 since 162+16+17=289=17216^2 + 16 + 17 = 289 = 17^2.)

2.6 Counterexamples

A single counterexample is sufficient to disprove a universal statement.

Strategy: To disprove xS,  P(x)\forall x \in S, \; P(x), find one aSa \in S such that P(a)P(a) is false.

Worked Example — Disprove: "All prime numbers are odd."

Counterexample: 22 is prime and 22 is even.

Worked Example — Disprove: "n2n+41n^2 - n + 41 is prime for all n{N}n \in \mathbb{'\{'}N{'\}'}."

When n=41n = 41: 41241+41=412=168141^2 - 41 + 41 = 41^2 = 1681, which is composite (41×4141 \times 41).

tip

To prove a universal claim requires a general proof. To disprove it requires only one counterexample. This asymmetry is fundamental to mathematical logic.

Exercise: Disprove: "For all positive integers aa and bb, gcd(a+b,ab)=gcd(a,b)\gcd(a+b, a-b) = \gcd(a, b)."

Take a=3a = 3, b=1b = 1. Then gcd(3+1,31)=gcd(4,2)=2\gcd(3+1, 3-1) = \gcd(4, 2) = 2, but gcd(3,1)=1\gcd(3, 1) = 1. Since 212 \ne 1, the statement is false.


3. Classical Proof Examples

3.1 2\sqrt{2} is Irrational (Detailed)

This proof appeared in Section 2.2. Here we present it with full commentary on every logical step.

Theorem: 2{Q}\sqrt{2} \notin \mathbb{'\{'}Q{'\}'}.

Proof:

We proceed by contradiction.

Assume 2{Q}\sqrt{2} \in \mathbb{'\{'}Q{'\}'}. Then there exist coprime integers a,ba, b with b>0b \gt 0 such that 2=ab\sqrt{2} = \frac{a}{b}.

The requirement that gcd(a,b)=1\gcd(a, b) = 1 is without loss of generality: any rational number can be expressed as a fraction in lowest terms.

Squaring: 2=a2b22 = \frac{a^2}{b^2}, so a2=2b2a^2 = 2b^2. Equation (1).

From (1): a2a^2 is divisible by 2, so a2a^2 is even. By the contrapositive of "odd implies square is odd," aa is even.

Since aa is even, write a=2ma = 2m for some integer mm. Substituting into (1):

(2m)2=2b2(2m)^2 = 2b^2, i.e., 4m2=2b24m^2 = 2b^2, i.e., b2=2m2b^2 = 2m^2. Equation (2).

From (2): b2b^2 is even, so bb is even (same reasoning as above).

But now aa and bb are both even, meaning gcd(a,b)2\gcd(a, b) \ge 2, contradicting gcd(a,b)=1\gcd(a, b) = 1.

This contradiction shows our assumption is false. Hence 2{Q}\sqrt{2} \notin \mathbb{'\{'}Q{'\}'}. \blacksquare

3.2 There Are Infinitely Many Primes

Theorem (Euclid): The set of prime numbers is infinite.

Proof:

We proceed by contradiction.

Assume there are only finitely many primes: p1,p2,,pnp_1, p_2, \ldots, p_n.

Consider the number N=p1p2pn+1N = p_1 p_2 \cdots p_n + 1.

Since N>1N \gt 1, by the Fundamental Theorem of Arithmetic, NN has a prime factor pp.

This prime pp must be one of p1,p2,,pnp_1, p_2, \ldots, p_n (since we assumed these are all the primes).

But N=p1p2pn+1N = p_1 p_2 \cdots p_n + 1, and for each pip_i:

N0+11(modpi)N \equiv 0 + 1 \equiv 1 \pmod{p_i}

So piNp_i \nmid N for all ii. This contradicts that some pip_i divides NN.

Therefore, there are infinitely many primes. \blacksquare

info

info example, 23571113+1=30031=59×5092 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 + 1 = 30031 = 59 \times 509. The proof only requires that NN has SOME prime factor not in the list.

3.3 log23\log_2{3} is Irrational

Theorem: log23\log_2{3} is irrational.

Proof:

We proceed by contradiction. Assume log23\log_2{3} is rational.

Then log23=pq\log_2{3} = \frac{p}{q} for some coprime positive integers p,qp, q with q1q \ge 1.

This means 2p/q=32^{p/q} = 3, so 2p=3q2^p = 3^q.

Since p1p \ge 1, 2p2^p is a power of 2. Its only prime factor is 2.

Since q1q \ge 1, 3q3^q is a power of 3. Its only prime factor is 3.

By the Fundamental Theorem of Arithmetic, prime factorizations are unique. The number 2p=3q2^p = 3^q would need to have prime factorization consisting of only 2's AND only 3's simultaneously. This is impossible unless p=q=0p = q = 0, but p1p \ge 1.

Contradiction. Hence log23\log_2{3} is irrational. \blacksquare

Exercise: Prove that log35\log_3{5} is irrational.

Assume log35=pq\log_3{5} = \frac{p}{q} in lowest terms, so 3p=5q3^p = 5^q. The LHS has only prime factor 3; the RHS has only prime factor 5. By uniqueness of prime factorization, this is impossible unless p=q=0p = q = 0, contradicting p1p \ge 1. Hence log35\log_3{5} is irrational.

3.4 Sum Formulas

We proved these by induction in Section 2.4. Here is the complete reference:

i=1ni=n(n+1)2\sum_{i=1}^{n} i = \frac{n(n+1)}{2}

i=1ni2=n(n+1)(2n+1)6\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}

i=1ni3=n2(n+1)24=(n(n+1)2)2\sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4} = \left(\frac{n(n+1)}{2}\right)^2

The last identity is notable: the sum of cubes equals the square of the sum of the first nn integers.

Exercise: Prove i=1ni3=(n(n+1)2)2\sum_{i=1}^{n} i^3 = \left(\frac{n(n+1)}{2}\right)^2 by induction.

Let P(n)P(n): i=1ni3=(n(n+1)2)2\displaystyle\sum_{i=1}^{n} i^3 = \left(\frac{n(n+1)}{2}\right)^2.

Base case (n=1n = 1): LHS =1= 1, RHS =(122)2=1= \left(\frac{1 \cdot 2}{2}\right)^2 = 1. True.

Inductive hypothesis: Assume P(k)P(k) holds.

Inductive step:

i=1k+1i3=(k(k+1)2)2+(k+1)3\sum_{i=1}^{k+1} i^3 = \left(\frac{k(k+1)}{2}\right)^2 + (k+1)^3

=k2(k+1)24+(k+1)3=(k+1)2(k24+(k+1))= \frac{k^2(k+1)^2}{4} + (k+1)^3 = (k+1)^2 \left(\frac{k^2}{4} + (k+1)\right)

=(k+1)2k2+4k+44=(k+1)2(k+2)24= (k+1)^2 \cdot \frac{k^2 + 4k + 4}{4} = (k+1)^2 \cdot \frac{(k+2)^2}{4}

=((k+1)(k+2)2)2= \left(\frac{(k+1)(k+2)}{2}\right)^2

This is P(k+1)P(k+1). By induction, P(n)P(n) holds for all n1n \ge 1.

3.5 Divisibility Proofs

Theorem: If n2n^2 is even, then nn is even.

We proved this by contrapositive in Section 2.3. Here are additional divisibility results.

Theorem: If aba \mid b and bcb \mid c, then aca \mid c (transitivity of divisibility).

Proof:

Since aba \mid b, there exists m{Z}m \in \mathbb{'\{'}Z{'\}'} such that b=amb = am. Since bcb \mid c, there exists n{Z}n \in \mathbb{'\{'}Z{'\}'} such that c=bnc = bn.

Therefore c=bn=(am)n=a(mn)c = bn = (am)n = a(mn).

Since mn{Z}mn \in \mathbb{'\{'}Z{'\}'}, we have aca \mid c. \blacksquare

Theorem: If dad \mid a and dbd \mid b, then d(ax+by)d \mid (ax + by) for all x,y{Z}x, y \in \mathbb{'\{'}Z{'\}'}.

Proof:

Since dad \mid a, write a=dma = dm for some m{Z}m \in \mathbb{'\{'}Z{'\}'}. Since dbd \mid b, write b=dnb = dn for some n{Z}n \in \mathbb{'\{'}Z{'\}'}.

ax+by=(dm)x+(dn)y=d(mx+ny)ax + by = (dm)x + (dn)y = d(mx + ny).

Since mx+ny{Z}mx + ny \in \mathbb{'\{'}Z{'\}'}, d(ax+by)d \mid (ax + by). \blacksquare

note

This theorem is the foundation of the Euclidean algorithm. The expression ax+byax + by is called a linear combination of aa and bb. The greatest common divisor gcd(a,b)\gcd(a, b) can always be expressed as a linear combination of aa and bb (Bezout's identity).

3.6 Inequality Proofs

AM-GM Inequality (two variables): For a,b0a, b \ge 0:

a+b2ab\frac{a + b}{2} \ge \sqrt{ab}

with equality if and only if a=ba = b.

Proof:

Note that (ab)20(\sqrt{a} - \sqrt{b})^2 \ge 0 for all a,b0a, b \ge 0.

Expanding: a2ab+b0a - 2\sqrt{ab} + b \ge 0, so a+b2aba + b \ge 2\sqrt{ab}.

Dividing by 2: a+b2ab\frac{a+b}{2} \ge \sqrt{ab}. Equality holds when a=b\sqrt{a} = \sqrt{b}, i.e., a=ba = b. \blacksquare

Cauchy-Schwarz Inequality (statement): For real numbers a1,,ana_1, \ldots, a_n and b1,,bnb_1, \ldots, b_n:

(i=1naibi)2(i=1nai2)(i=1nbi2)\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right) \left(\sum_{i=1}^{n} b_i^2\right)

with equality when the vectors are proportional (ai=λbia_i = \lambda b_i for all ii and some scalar λ\lambda).

The proof is beyond the scope of this note but uses the fact that (aix+bi)20\sum (a_i x + b_i)^2 \ge 0 for all x{R}x \in \mathbb{'\{'}R{'\}'}, and a quadratic in xx that is always non-negative must have a non-positive discriminant.

Exercise: Prove the AM-GM inequality for three variables: a+b+c3abc3\frac{a+b+c}{3} \ge \sqrt[3]{abc} for a,b,c0a, b, c \ge 0.

Apply two-variable AM-GM twice. First, a+b2ab\frac{a+b}{2} \ge \sqrt{ab}. Let m=a+b2m = \frac{a+b}{2}. Then:

m+c2mcabc=(abc)1/4\frac{m + c}{2} \ge \sqrt{mc} \ge \sqrt{\sqrt{ab} \cdot c} = (abc)^{1/4}

Wait, that gives us a+b2+c2(abc)1/4\frac{\frac{a+b}{2} + c}{2} \ge (abc)^{1/4}, i.e., a+b+2c4(abc)1/4\frac{a+b+2c}{4} \ge (abc)^{1/4}.

That is not quite right for three-variable AM-GM. A cleaner approach: let a=x3a = x^3, b=y3b = y^3, c=z3c = z^3 with x,y,z0x, y, z \ge 0. We need x3+y3+z33xyz\frac{x^3 + y^3 + z^3}{3} \ge xyz.

Consider x3+y3+z33xyz=12(x+y+z)[(xy)2+(yz)2+(zx)2]0x^3 + y^3 + z^3 - 3xyz = \frac{1}{2}(x+y+z)[(x-y)^2 + (y-z)^2 + (z-x)^2] \ge 0.

Since x+y+z0x+y+z \ge 0 and squares are non-negative, x3+y3+z33xyzx^3 + y^3 + z^3 \ge 3xyz. Dividing by 3 and substituting back gives the result.


4. Number Theory Proofs (HL Extension)

4.1 Divisibility Properties

For integers a,b,ca, b, c with a0a \ne 0:

Definition: aba \mid b (read "aa divides bb") means there exists k{Z}k \in \mathbb{'\{'}Z{'\}'} such that b=akb = ak.

Key properties:

  1. a0a \mid 0 for all a0a \ne 0 (since 0=a00 = a \cdot 0)
  2. 1b1 \mid b for all bb (since b=1bb = 1 \cdot b)
  3. If aba \mid b and bab \mid a, then a=±ba = \pm b
  4. If aba \mid b and aca \mid c, then a(b+c)a \mid (b + c) and a(bc)a \mid (b - c)

Proof of property 3:

aba \mid b implies b=mab = ma for some m{Z}m \in \mathbb{'\{'}Z{'\}'}. bab \mid a implies a=nba = nb for some n{Z}n \in \mathbb{'\{'}Z{'\}'}.

Substituting: a=n(ma)=(nm)aa = n(ma) = (nm)a, so (nm1)a=0(nm - 1)a = 0. Since a0a \ne 0, we have nm=1nm = 1.

In integers, nm=1nm = 1 implies (n,m)=(1,1)(n, m) = (1, 1) or (n,m)=(1,1)(n, m) = (-1, -1).

If m=1m = 1: b=ab = a. If m=1m = -1: b=ab = -a. So a=±ba = \pm b. \blacksquare

4.2 Congruences and Modular Arithmetic

Definition: ab(modn)a \equiv b \pmod{n} means n(ab)n \mid (a - b), i.e., ab=kna - b = kn for some k{Z}k \in \mathbb{'\{'}Z{'\}'}.

Key properties of congruences:

If ab(modn)a \equiv b \pmod{n} and cd(modn)c \equiv d \pmod{n}, then:

  1. a+cb+d(modn)a + c \equiv b + d \pmod{n}
  2. acbd(modn)a - c \equiv b - d \pmod{n}
  3. acbd(modn)ac \equiv bd \pmod{n}
  4. ambm(modn)a^m \equiv b^m \pmod{n} for any m{Z}+m \in \mathbb{'\{'}Z{'\}'}^+

Proof of property 1:

ab(modn)a \equiv b \pmod{n} means ab=kna - b = kn. cd(modn)c \equiv d \pmod{n} means cd=lnc - d = ln.

(a+c)(b+d)=(ab)+(cd)=kn+ln=(k+l)n(a+c) - (b+d) = (a-b) + (c-d) = kn + ln = (k+l)n.

Therefore n[(a+c)(b+d)]n \mid [(a+c) - (b+d)], so a+cb+d(modn)a+c \equiv b+d \pmod{n}. \blacksquare

Proof of property 3:

a=b+kna = b + kn and c=d+lnc = d + ln for some k,l{Z}k, l \in \mathbb{'\{'}Z{'\}'}.

ac=(b+kn)(d+ln)=bd+bln+dkn+kln2=bd+n(bl+dk+kln)ac = (b + kn)(d + ln) = bd + bln + dkn + kln^2 = bd + n(bl + dk + kln).

Therefore n(acbd)n \mid (ac - bd), so acbd(modn)ac \equiv bd \pmod{n}. \blacksquare

warning

Division does NOT work with congruences in general. From acbc(modn)ac \equiv bc \pmod{n}, you can only conclude ab(modn)a \equiv b \pmod{n} if gcd(c,n)=1\gcd(c, n) = 1. For example, 60(mod3)6 \equiv 0 \pmod{3} and 30(mod3)3 \equiv 0 \pmod{3}, but 63=2≢00\frac{6}{3} = 2 \not\equiv \frac{0}{0} (undefined).

Worked Example: Find the last two digits of 71007^{100}.

We need 7100(mod100)7^{100} \pmod{100}.

First, find 7100(mod4)7^{100} \pmod{4}: 731(mod4)7 \equiv 3 \equiv -1 \pmod{4}, so 7100(1)100=1(mod4)7^{100} \equiv (-1)^{100} = 1 \pmod{4}.

Next, find 7100(mod25)7^{100} \pmod{25}: By Euler's theorem (or Fermat's Little Theorem since 25 is a prime power and gcd(7,25)=1\gcd(7, 25) = 1), ϕ(25)=20\phi(25) = 20, so 7201(mod25)7^{20} \equiv 1 \pmod{25}, hence 7100=(720)515=1(mod25)7^{100} = (7^{20})^5 \equiv 1^5 = 1 \pmod{25}.

We need xx such that x1(mod4)x \equiv 1 \pmod{4} and x1(mod25)x \equiv 1 \pmod{25}. By CRT, x1(mod100)x \equiv 1 \pmod{100}.

The last two digits are 01.

4.3 Fermat's Little Theorem

Theorem (Fermat's Little Theorem): If pp is prime and gcd(a,p)=1\gcd(a, p) = 1, then:

ap11(modp)a^{p-1} \equiv 1 \pmod{p}

Equivalently, for any integer aa and prime pp: apa(modp)a^p \equiv a \pmod{p}.

Proof sketch (using the fact that {a,2a,3a,,(p1)a}\{a, 2a, 3a, \ldots, (p-1)a\} is a permutation of {1,2,,p1}\{1, 2, \ldots, p-1\} modulo pp):

Consider the product 123(p1)=(p1)!1 \cdot 2 \cdot 3 \cdots (p-1) = (p-1)!.

Modulo pp, the numbers a,2a,3a,,(p1)aa, 2a, 3a, \ldots, (p-1)a are all nonzero and pairwise non-congruent modulo pp (since jaka(modp)ja \equiv ka \pmod{p} implies p(jk)ap \mid (j-k)a, and since pap \nmid a, we get p(jk)p \mid (j-k), which means jk(modp)j \equiv k \pmod{p}, and since 1j,kp11 \le j, k \le p-1, we get j=kj = k).

Therefore {a,2a,,(p1)a}\{a, 2a, \ldots, (p-1)a\} is a complete residue system modulo pp excluding 0, so:

a2a3a(p1)a(p1)!(modp)a \cdot 2a \cdot 3a \cdots (p-1)a \equiv (p-1)! \pmod{p}

ap1(p1)!(p1)!(modp)a^{p-1} \cdot (p-1)! \equiv (p-1)! \pmod{p}

Since p(p1)!p \nmid (p-1)! (Wilson's theorem tells us (p1)!1(modp)(p-1)! \equiv -1 \pmod{p}), we can divide both sides by (p1)!(p-1)! modulo pp:

ap11(modp)a^{p-1} \equiv 1 \pmod{p} \quad \blacksquare

Worked Example — Find 2100(mod13)2^{100} \pmod{13}:

Since 13 is prime and gcd(2,13)=1\gcd(2, 13) = 1, by Fermat: 2121(mod13)2^{12} \equiv 1 \pmod{13}.

100=812+4100 = 8 \cdot 12 + 4, so 2100=(212)8241816163(mod13)2^{100} = (2^{12})^8 \cdot 2^4 \equiv 1^8 \cdot 16 \equiv 16 \equiv 3 \pmod{13}.

Exercise: Find 350(mod11)3^{50} \pmod{11}.

By Fermat: 3101(mod11)3^{10} \equiv 1 \pmod{11}.

50=51050 = 5 \cdot 10, so 350=(310)515=1(mod11)3^{50} = (3^{10})^5 \equiv 1^5 = 1 \pmod{11}.

Answer: 3501(mod11)3^{50} \equiv 1 \pmod{11}.

4.4 Fundamental Theorem of Arithmetic

Theorem: Every integer n2n \ge 2 can be expressed uniquely (up to order of factors) as a product of primes:

n=p1a1p2a2pkakn = p_1^{a_1} \cdot p_2^{a_2} \cdots p_k^{a_k}

where p1<p2<<pkp_1 \lt p_2 \lt \cdots \lt p_k are primes and ai1a_i \ge 1.

Existence was proved in Section 2.4.2 using strong induction. Here we sketch uniqueness.

Uniqueness proof (sketch):

Suppose nn has two prime factorizations:

n=p1p2pr=q1q2qsn = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s

where all pi,qjp_i, q_j are primes. Since p1n=q1q2qsp_1 \mid n = q_1 q_2 \cdots q_s, by Euclid's lemma (pabp \mid ab and pp prime implies pap \mid a or pbp \mid b), p1p_1 divides some qjq_j.

Since qjq_j is prime and p1p_1 is prime with p1qjp_1 \mid q_j, we must have p1=qjp_1 = q_j. Cancel this factor and repeat. By induction, the factorizations are identical. \blacksquare

4.5 GCD and LCM Properties

Definition: gcd(a,b)\gcd(a, b) is the greatest common divisor of aa and bb — the largest positive integer dividing both.

Definition: lcm(a,b)\mathrm{lcm}(a, b) is the least common multiple — the smallest positive integer that both aa and bb divide.

Key relationship:

gcd(a,b)lcm(a,b)=ab\gcd(a, b) \cdot \mathrm{lcm}(a, b) = |ab|

Proof (using prime factorizations):

Let a=piαia = \prod p_i^{\alpha_i} and b=piβib = \prod p_i^{\beta_i} (with αi,βi0\alpha_i, \beta_i \ge 0, padding with zeros where needed).

gcd(a,b)=pimin(αi,βi)\gcd(a, b) = \prod p_i^{\min(\alpha_i, \beta_i)}

lcm(a,b)=pimax(αi,βi)\mathrm{lcm}(a, b) = \prod p_i^{\max(\alpha_i, \beta_i)}

gcd(a,b)lcm(a,b)=pimin(αi,βi)+max(αi,βi)=piαi+βi=ab\gcd(a,b) \cdot \mathrm{lcm}(a,b) = \prod p_i^{\min(\alpha_i, \beta_i) + \max(\alpha_i, \beta_i)} = \prod p_i^{\alpha_i + \beta_i} = ab. \blacksquare

Bezout's Identity: For integers a,ba, b (not both zero), there exist integers x,yx, y such that:

gcd(a,b)=ax+by\gcd(a, b) = ax + by

This is proved constructively by the Euclidean algorithm (reversing the steps).

4.6 Euclidean Algorithm Correctness

Theorem: The Euclidean algorithm correctly computes gcd(a,b)\gcd(a, b).

Algorithm: For ab0a \ge b \ge 0:

a=bq1+r1,0r1<ba = bq_1 + r_1, \quad 0 \le r_1 \lt b b=r1q2+r2,0r2<r1b = r_1 q_2 + r_2, \quad 0 \le r_2 \lt r_1 r1=r2q3+r3,0r3<r2r_1 = r_2 q_3 + r_3, \quad 0 \le r_3 \lt r_2

\vdots

rn2=rn1qn+rn,0rn<rn1r_{n-2} = r_{n-1} q_n + r_n, \quad 0 \le r_n \lt r_{n-1} rn1=rnqn+1+0r_{n-1} = r_n q_{n+1} + 0

The algorithm terminates with gcd(a,b)=rn\gcd(a, b) = r_n.

Proof of correctness:

Claim 1: The remainders form a strictly decreasing sequence of non-negative integers: b>r1>r2>0b \gt r_1 \gt r_2 \gt \cdots \ge 0. This follows from the division algorithm: each remainder is strictly less than the previous divisor.

Claim 2: The algorithm terminates. Since the remainders are non-negative integers and strictly decrease, the process must eventually reach remainder 0 (by the well-ordering principle).

Claim 3: gcd(a,b)=gcd(b,r1)=gcd(r1,r2)==gcd(rn1,rn)\gcd(a, b) = \gcd(b, r_1) = \gcd(r_1, r_2) = \cdots = \gcd(r_{n-1}, r_n). This follows from the key lemma: gcd(a,b)=gcd(b,abq)\gcd(a, b) = \gcd(b, a - bq) for any integer qq.

Proof of the key lemma: if dad \mid a and dbd \mid b, then d(abq)d \mid (a - bq). Conversely, if dbd \mid b and d(abq)d \mid (a - bq), then d(abq+bq)=ad \mid (a - bq + bq) = a. So dd divides both aa and bb if and only if dd divides both bb and abqa - bq. Therefore gcd(a,b)=gcd(b,abq)\gcd(a, b) = \gcd(b, a - bq).

Claim 4: gcd(rn1,rn)=rn\gcd(r_{n-1}, r_n) = r_n. Since rnrn1r_n \mid r_{n-1} (the last step is rn1=rnqn+1r_{n-1} = r_n q_{n+1}), the GCD is rnr_n.

Combining claims: gcd(a,b)=rn\gcd(a, b) = r_n. \blacksquare

Worked Example: Compute gcd(252,105)\gcd(252, 105) and express it as a linear combination.

252=1052+42252 = 105 \cdot 2 + 42 105=422+21105 = 42 \cdot 2 + 21 42=212+042 = 21 \cdot 2 + 0

So gcd(252,105)=21\gcd(252, 105) = 21.

Back-substitution to find Bezout coefficients:

21=10542221 = 105 - 42 \cdot 2

42=252105242 = 252 - 105 \cdot 2

21=105(2521052)2=1052522+1054=1055252221 = 105 - (252 - 105 \cdot 2) \cdot 2 = 105 - 252 \cdot 2 + 105 \cdot 4 = 105 \cdot 5 - 252 \cdot 2

So gcd(252,105)=21=5105+(2)252\gcd(252, 105) = 21 = 5 \cdot 105 + (-2) \cdot 252.


5. Common Pitfalls

5.1 Circular Reasoning

Circular reasoning occurs when you implicitly assume what you are trying to prove.

Example of circular reasoning:

"Prove that 2\sqrt{2} is irrational."

Bad proof: "Since 2\sqrt{2} is irrational, it cannot be written as ab\frac{a}{b}. Therefore 2\sqrt{2} is irrational."

The conclusion (2\sqrt{2} is irrational) appears in the hypothesis. This proves nothing.

How to avoid: Check that every step of your proof relies only on axioms, definitions, and previously established theorems — never on the statement you are proving.

danger

Circular reasoning is the most dangerous logical fallacy in mathematics because it can look convincing. Always verify that your proof does not contain the conclusion as an unstated assumption.

5.2 Assuming the Conclusion

Related to circular reasoning, this occurs when you "work backwards" from the conclusion without checking reversibility.

Example:

"Prove: If x2=4x^2 = 4, then x=2x = 2."

Bad proof: x2=4x^2 = 4, so x=4=2x = \sqrt{4} = 2. Done.

This is wrong because x=2x = -2 also satisfies x2=4x^2 = 4. The step x=4x = \sqrt{4} implicitly assumes x0x \ge 0.

The statement itself is false: x2=4x^2 = 4 does not imply x=2x = 2. The correct implication is x2=4    x=2x^2 = 4 \implies x = 2 or x=2x = -2.

5.3 Incorrect Negation of Implications

The negation of P    QP \implies Q is NOT P    ¬QP \implies \neg Q.

¬(P    Q)P¬Q\neg(P \implies Q) \equiv P \wedge \neg Q

Reasoning: P    QP \implies Q is logically equivalent to ¬PQ\neg P \vee Q (check the truth table). So:

¬(P    Q)¬(¬PQ)P¬Q\neg(P \implies Q) \equiv \neg(\neg P \vee Q) \equiv P \wedge \neg Q

Worked Example:

Negate: "If it rains, the ground gets wet."

Correct negation: "It rains AND the ground does NOT get wet." (The premise holds but the conclusion fails.)

Incorrect negation: "If it rains, the ground does not get wet." (This is a different implication entirely.)

warning

This is one of the most common errors on IB exams. Memorize: the negation of "PP implies QQ" is "PP AND not QQ."

5.4 Induction Base Case Errors

The pitfall: Skipping the base case or proving the wrong base case.

Example:

"Prove: n=n+1n = n + 1 for all n1n \ge 1."

Bad proof: Assume k=k+1k = k + 1. Then k+1=(k+1)+1=k+2k + 1 = (k + 1) + 1 = k + 2. So k+1=k+2k + 1 = k + 2, i.e., k+1=(k+1)+1k + 1 = (k+1) + 1. QED.

This "proof" never checks the base case. The statement 1=1+1=21 = 1 + 1 = 2 is false, so the base case fails.

Another example: When using strong induction with a recursive definition like the Fibonacci sequence, you may need MULTIPLE base cases. Proving the induction step from P(k)P(k) and P(k1)P(k-1) to P(k+1)P(k+1) requires both P(1)P(1) and P(2)P(2) as base cases.

Worked Example: Prove Fn2nF_n \le 2^n for all n1n \ge 1, where FnF_n is the nn-th Fibonacci number.

Let P(n)P(n): Fn2nF_n \le 2^n.

Base cases: F1=12=21F_1 = 1 \le 2 = 2^1. True. F2=14=22F_2 = 1 \le 4 = 2^2. True. We need TWO base cases because the recurrence Fk+1=Fk+Fk1F_{k+1} = F_k + F_{k-1} references two previous terms.

Strong inductive hypothesis: Assume Fj2jF_j \le 2^j for all 1jk1 \le j \le k, where k2k \ge 2.

Inductive step:

Fk+1=Fk+Fk12k+2k1=2k1(2+1)=32k1F_{k+1} = F_k + F_{k-1} \le 2^k + 2^{k-1} = 2^{k-1}(2 + 1) = 3 \cdot 2^{k-1}

Now, 32k142k1=2k+13 \cdot 2^{k-1} \le 4 \cdot 2^{k-1} = 2^{k+1} since 343 \le 4.

Therefore Fk+12k+1F_{k+1} \le 2^{k+1}. By strong induction, P(n)P(n) holds for all n1n \ge 1.

If we had only checked P(1)P(1) and tried to use weak induction, the inductive step from P(k)P(k) alone would not suffice because Fk+1F_{k+1} depends on Fk1F_{k-1} as well.

5.5 Weak vs. Strong Induction Confusion

Weak induction assumes P(k)P(k) and proves P(k+1)P(k+1).

Strong induction assumes P(1),P(2),,P(k)P(1), P(2), \ldots, P(k) and proves P(k+1)P(k+1).

Common mistake: Using weak induction when you need strong induction. If P(k+1)P(k+1) depends on P(k1)P(k-1) or earlier, weak induction's hypothesis is insufficient.

Common mistake (reverse): Always using strong induction "to be safe." While not logically wrong, this is usually unnecessary and makes proofs harder to read. Use weak induction unless the structure of the problem clearly requires strong induction.

tip

Rule of thumb: If the statement for nn depends only on the statement for n1n-1 (like n!=n(n1)!n! = n \cdot (n-1)!), use weak induction. If it depends on earlier terms (like Fn=Fn1+Fn2F_n = F_{n-1} + F_{n-2}), use strong induction.

5.6 Confusing Converse with Contrapositive

OriginalP    QP \implies Q
Contrapositive (equivalent)¬Q    ¬P\neg Q \implies \neg P
Converse (NOT equivalent)Q    PQ \implies P
Inverse (NOT equivalent)¬P    ¬Q\neg P \implies \neg Q

A true implication does NOT guarantee a true converse.

Example: "If a number is prime, then it is positive." True (for integers). Converse: "If a number is positive, then it is prime." False (44 is positive but not prime).

5.7 Forgetting Vacuous Truth

The implication P    QP \implies Q is true whenever PP is false, regardless of QQ.

Example: "If 1=01 = 0, then the moon is made of cheese." This is TRUE (vacuously), because the premise "1=01 = 0" is false.

Practical consequence: To disprove P    QP \implies Q, you must show PP is true AND QQ is false. Showing PP is false does NOT disprove the implication.

5.8 Incorrect Quantifier Negation

StatementCorrect Negation
x,  P(x)\forall x, \; P(x)x,  ¬P(x)\exists x, \; \neg P(x)
x,  P(x)\exists x, \; P(x)x,  ¬P(x)\forall x, \; \neg P(x)
x,  y,  P(x,y)\forall x, \; \exists y, \; P(x,y)x,  y,  ¬P(x,y)\exists x, \; \forall y, \; \neg P(x,y)
x,  y,  P(x,y)\exists x, \; \forall y, \; P(x,y)x,  y,  ¬P(x,y)\forall x, \; \exists y, \; \neg P(x,y)
danger

The order of quantifiers matters. "For every person, there exists a mother" is very different from "There exists a person who is the mother of everyone." The negation of "for every x there exists y" is "there exists x such that for every y" — quantifiers flip but their order is preserved.


6. Proof Techniques Summary

MethodWhen to UseKey Idea
DirectStraightforward chain from P to QAssume P, derive Q
ContradictionStatement is a negation, or no obvious direct pathAssume not-P, find contradiction
ContrapositiveNot-Q gives more to work with than PProve not-Q implies not-P
Induction (weak)Statement depends on previous caseBase case + inductive step
Induction (strong)Statement depends on multiple previous casesBase cases + strong inductive step
ExhaustionFinite, small number of casesCheck each case
CounterexampleDisproving a universal claimFind one exception
Comprehensive Exercise Set
  1. Prove that 6\sqrt{6} is irrational.

  2. Prove that the product of any three consecutive integers is divisible by 6.

  3. Prove by induction that i=0n2i=2n+11\sum_{i=0}^{n} 2^i = 2^{n+1} - 1 for all n0n \ge 0.

  4. Prove that if n2n^2 is divisible by 3, then nn is divisible by 3.

  5. Disprove: "For all real numbers xx, x2>xx^2 \gt x."

  6. Negate: "For every real number xx, there exists a real number yy such that x+y=0x + y = 0."

  7. Prove that there is no largest prime number.

  8. Prove that log57\log_5{7} is irrational.

  9. Find gcd(372,84)\gcd(372, 84) using the Euclidean algorithm, and express it as a linear combination.

  10. Prove that if ab(modm)a \equiv b \pmod{m} and nmn \mid m, then ab(modn)a \equiv b \pmod{n}.

Answers to Selected Exercises

Exercise 1: Assume 6=ab\sqrt{6} = \frac{a}{b} in lowest terms. Then a2=6b2a^2 = 6b^2, so a2a^2 is even (divisible by 2), hence aa is even. Write a=2ka = 2k. Then 4k2=6b24k^2 = 6b^2, so 2k2=3b22k^2 = 3b^2. This means 3b23b^2 is even, so b2b^2 is even, so bb is even. Both aa and bb are even, contradicting lowest terms.

Exercise 2: Among any three consecutive integers n,n+1,n+2n, n+1, n+2: one is divisible by 3 (since one of n,n+1,n+2n, n+1, n+2 is a multiple of 3), and at least one is even (among any two consecutive integers, one is even, and we have three). Since 2 and 3 are coprime, the product is divisible by 23=62 \cdot 3 = 6.

Exercise 4: By contrapositive. If 3n3 \nmid n, then n1(mod3)n \equiv 1 \pmod{3} or n2(mod3)n \equiv 2 \pmod{3}. If n1n \equiv 1, then n21(mod3)n^2 \equiv 1 \pmod{3}. If n2n \equiv 2, then n241(mod3)n^2 \equiv 4 \equiv 1 \pmod{3}. In either case n2≢0(mod3)n^2 \not\equiv 0 \pmod{3}, so 3n23 \nmid n^2.

Exercise 5: Counterexample: x=0.5x = 0.5. Then x2=0.250.5x^2 = 0.25 \not\gt 0.5. Also x=0x = 0: 02=000^2 = 0 \not\gt 0.

Exercise 6: x{R},  y{R},  x+y0\exists x \in \mathbb{'\{'}R{'\}'}, \; \forall y \in \mathbb{'\{'}R{'\}'}, \; x + y \ne 0.

Exercise 10: ab(modm)a \equiv b \pmod{m} means m(ab)m \mid (a-b). Since nmn \mid m and m(ab)m \mid (a-b), by transitivity of divisibility, n(ab)n \mid (a-b). Therefore ab(modn)a \equiv b \pmod{n}.


tip

Diagnostic Test Ready to test your understanding of Proof and Logic? The diagnostic test contains the hardest questions within the IB specification for this topic, each with a full worked solution.

Unit tests probe edge cases and common misconceptions. Integration tests combine Proof and Logic with other IB mathematics topics to test synthesis under exam conditions.

See Diagnostic Guide for instructions on self-marking and building a personal test matrix.