NDA Maths · Probability

Conditional Probability, Total Probability & Bayes'

Updating a probability once you know something has happened — conditional probability, the multiplication rule, total probability, and Bayes' flip.

Why this matters

This is the conceptual peak of the chapter and its second-largest subtopic (29 questions), with most rated MODERATE — the marks that separate scorers. Everything here flows from one idea: knowing that B happened shrinks the sample space to B. From that come the multiplication rule, total probability over a partition, and Bayes' theorem for reversing a conditional. Master the bag/machine/factory archetypes and you cover almost every question in this subtopic.

Concept 1 of 4

Conditional probability

Intuition

Once you know event $B$ has happened, the only outcomes still possible are those inside $B$: the sample space shrinks to $B$. The conditional probability of $A$ given $B$ is the share of that shrunken world in which $A$ also holds.

Definition

The conditional probability of $A$ given $B$ (with $P(B) > 0$) is $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$. It re-normalises the joint probability $P(A \cap B)$ by the probability of the condition $B$. If $A$ and $B$ are independent, conditioning changes nothing: $P(A \mid B) = P(A)$.

Conditional probability

$$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}, \qquad P(B) > 0$$
  • $P(A \cap B)$: probability both occur
  • $P(B)$: probability of the condition, the new "total"

Diagram · P(A | B) restricts the world to B

[Venn diagram: sample space S containing overlapping events A and B; the overlap is A∩B.]

Once B is given, only the amber region counts — it's the new whole. P(A | B) is the slice of B that also lies in A: P(A | B) = P(A∩B) / P(B). Dividing by P(B) is exactly "rescale B to be the new 100%".

Worked example

For two events, $P(A \cap B) = 0.2$ and $P(B) = 0.5$. Find $P(A \mid B)$.
  1. Apply the definition: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{0.2}{0.5}$.
  2. $= 0.4$.
Answer: $0.4$
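The same calculation can be checked in a few lines of Python. This is only a minimal sketch: the helper name `conditional` is an illustrative choice, not something from the notes, and exact fractions are used so the result is not blurred by floating-point rounding.

```python
from fractions import Fraction

def conditional(p_joint, p_condition):
    """Return P(A | B) = P(A and B) / P(B). Undefined when P(B) = 0."""
    if p_condition == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_joint / p_condition

# Worked example: P(A and B) = 0.2 = 1/5, P(B) = 0.5 = 1/2
p = conditional(Fraction(1, 5), Fraction(1, 2))
print(p, float(p))  # 2/5 0.4
```

The guard mirrors the requirement $P(B) > 0$ in the definition: with a zero-probability condition there is nothing to rescale.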
Practice this concept · self-check · 4 quick reps

Try it yourself

$P(A \cap B) = \dfrac{1}{6}$ and $P(B) = \dfrac{1}{3}$. Find $P(A \mid B)$.

Practice — Level 1 (4 reps)

Quick reps to lock in the method. Try each, then check.

  1. $P(A\cap B)=0.3$, $P(B)=0.6$. $P(A\mid B)$?
  2. $P(A\cap B)=\tfrac{1}{4}$, $P(B)=\tfrac{1}{2}$. $P(A\mid B)$?
  3. If $A$ and $B$ are independent, $P(A\mid B)=\ ?$
  4. $P(A\cap B)=0.1$, $P(A\mid B)=0.5$. $P(B)$?

From the bank · past-year question

Example 1 · Probability · EASY
If $P(A)=0.3$, $P(B)=0.4$ and $P(A \mid B)=0.5$, then what is the value of $P(B \mid A)$?

[Q118 · Sep · 2025]

Mind which event is the condition: $P(A\mid B) \ne P(B\mid A)$ in general

The denominator is the probability of the GIVEN event. $P(A\mid B)$ divides by $P(B)$; $P(B\mid A)$ divides by $P(A)$. They are equal only when $P(A) = P(B)$ or when $P(A \cap B) = 0$ (both are then zero).

The condition must have positive probability

$P(A\mid B)$ is undefined when $P(B) = 0$. Check the condition is possible before dividing.

Concept 2 of 4

Multiplication rule & restricted sample space

Intuition

Rearranging the conditional definition gives the general multiplication rule: the chance of both is the chance of one times the chance of the other given the first. And when a question says "given that …", the fastest route is often to just list the outcomes inside the condition and count within them.

Definition

General multiplication rule: $P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$ (no independence needed). Restricted sample space: for equally-likely outcomes, $P(A \mid B) = \dfrac{n(A \cap B)}{n(B)}$, that is, count favourable outcomes among the outcomes in $B$ only. This is the quickest method for dice/card "given that" problems.

Multiplication rule / restricted counting

$$P(A \cap B) = P(B)\,P(A \mid B), \qquad P(A \mid B) = \dfrac{n(A \cap B)}{n(B)}$$
  • $n(B)$: number of outcomes in the condition, the restricted total
  • $n(A \cap B)$: favourable outcomes within the condition

Worked example

Two fair dice are thrown. Given that the sum is 6, what is the probability that one of the dice shows a 2?
  1. Restrict to sum $= 6$: outcomes $(1,5),(2,4),(3,3),(4,2),(5,1)$, so $n(B) = 5$.
  2. Among these, those showing a 2: $(2,4)$ and $(4,2)$, so $n(A \cap B) = 2$.
  3. Conditional probability: $\dfrac{2}{5}$.
Answer: $\dfrac{2}{5}$
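The restricted-counting steps above can be reproduced by brute-force enumeration. The sketch below assumes ordered pairs (first die, second die) as the equally likely outcomes; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # all 36 ordered rolls
given = [o for o in outcomes if sum(o) == 6]      # condition B: sum is 6
favourable = [o for o in given if 2 in o]         # A and B: a 2 shows

p_a_given_b = Fraction(len(favourable), len(given))
print(len(given), len(favourable), p_a_given_b)   # 5 2 2/5
```

Note that the division is by `len(given)`, the size of the restricted sample space, and never by 36.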
Practice this concept · self-check · 4 quick reps

Try it yourself

A fair die is rolled. Given that the number is even, what is the probability that it is a 4?

Practice — Level 1 (4 reps)

Quick reps to lock in the method. Try each, then check.

  1. Two dice, given sum $=8$: how many outcomes?
  2. Die rolled, given odd: $P(\text{it is } 3)$?
  3. $P(B)=0.4$, $P(A\mid B)=0.5$. $P(A\cap B)$?
  4. Two dice, given sum $=6$: $P(\text{a doublet})$?

From the bank · past-year question

Example 2 · Probability · MODERATE
If two fair dice are rolled then what is the conditional probability that the first dice lands on 6 given that the sum of numbers on the dice is 8?

[Q110 · Apr · 2019]

Under a condition, the TOTAL changes to $n(B)$, not 36 (or 6)

"Given that …" shrinks the sample space. Divide by the number of outcomes in the condition, not by the original total. Forgetting to restrict is the classic conditional-probability error.

The general multiplication rule needs $P(A \mid B)$, not $P(A)$

$P(A \cap B) = P(A)P(B)$ is only the independent case. In general use $P(A \cap B) = P(B)P(A \mid B)$.
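To see the difference numerically, here is a small sketch on two dice. The events are an illustrative assumption, not taken from the notes: $A$ = "first die shows 2" and $B$ = "sum is 6", which are dependent. The general rule reproduces $P(A \cap B)$, while the independence shortcut does not.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls

def is_a(o):
    return o[0] == 2       # A: first die shows 2 (illustrative choice)

def is_b(o):
    return sum(o) == 6     # B: sum is 6 (illustrative choice)

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_a, p_b = prob(is_a), prob(is_b)
p_ab = prob(lambda o: is_a(o) and is_b(o))

# P(A | B) by restricted counting: look only inside B
inside_b = [o for o in outcomes if is_b(o)]
p_a_given_b = Fraction(sum(1 for o in inside_b if is_a(o)), len(inside_b))

print(p_ab)                # 1/36  (direct count)
print(p_b * p_a_given_b)   # 1/36  (general multiplication rule)
print(p_a * p_b)           # 5/216 (independence shortcut, wrong here)
```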

Concept 3 of 4

Total probability (over a partition)

Intuition

When an outcome can arrive through several mutually exclusive routes — pick a bag, then draw a ball — its overall probability is the weighted sum over the routes: probability of each route times the probability of the outcome along that route.

Definition

If $B_1, \dots, B_n$ partition the sample space (mutually exclusive and exhaustive), then for any event $A$, $P(A) = \sum_{i} P(B_i)\,P(A \mid B_i)$. Each term is "probability of route $i$" times "probability of $A$ given route $i$".

Total probability

$$P(A) = \sum_{i=1}^{n} P(B_i)\,P(A \mid B_i)$$
  • $B_i$: the mutually exclusive, exhaustive routes (partition)
  • $P(A \mid B_i)$: probability of $A$ along route $i$

Visualization · total probability & Bayes tree

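[Tree diagram: first-level branches P(B₁) = 0.6 and P(B₂) = 0.4; second-level branches P(A|B₁) = 0.2 and P(A|B₂) = 0.5. Leaves ending in A: 0.12 and 0.2; the other two leaves are "not A".]
P(A) = 0.12 + 0.2 = 0.32, and P(B₁|A) = 0.12/0.32 = 0.375.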

Each leaf is a route product P(Bᵢ)·P(A|Bᵢ). Total probability adds the two leaves that end in A; Bayes' theorem divides one of those leaves by that total to flip the conditioning.

Worked example

Bag I has 3 red and 2 white balls; Bag II has 1 red and 4 white. A bag is chosen at random and one ball is drawn. What is the probability it is red?
  1. Each bag is chosen with probability $\dfrac{1}{2}$.
  2. Red given Bag I: $\dfrac{3}{5}$; red given Bag II: $\dfrac{1}{5}$.
  3. Total probability: $\dfrac{1}{2}\cdot\dfrac{3}{5} + \dfrac{1}{2}\cdot\dfrac{1}{5} = \dfrac{3}{10} + \dfrac{1}{10} = \dfrac{4}{10} = \dfrac{2}{5}$.
Answer: $\dfrac{2}{5}$
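The weighted sum in step 3 is easy to express as a small function. This is a sketch under stated assumptions: the name `total_probability` and the list-of-pairs layout are illustrative, and exact fractions are used so that the "routes sum to 1" check is reliable.

```python
from fractions import Fraction

def total_probability(routes):
    """routes: list of (P(B_i), P(A | B_i)) pairs over a partition."""
    if sum(p for p, _ in routes) != 1:
        raise ValueError("route probabilities must sum to 1")
    return sum(p * q for p, q in routes)

bags = [
    (Fraction(1, 2), Fraction(3, 5)),  # Bag I: 3 red out of 5
    (Fraction(1, 2), Fraction(1, 5)),  # Bag II: 1 red out of 5
]
print(total_probability(bags))  # 2/5
```

The check only catches routes that fail to be exhaustive; making sure they do not overlap is still up to whoever sets up the problem.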
Practice this concept · self-check · 4 quick reps

Try it yourself

Box A has 2 white and 3 black balls; Box B has 4 white and 1 black. A box is chosen at random and one ball is drawn. Find the probability it is white.

Practice — Level 1 (4 reps)

Quick reps to lock in the method. Try each, then check.

  1. Routes $\tfrac{1}{2},\tfrac{1}{2}$; $P(A\mid\cdot)=\tfrac{1}{4},\tfrac{3}{4}$. $P(A)$?
  2. Two equally-likely bags, $P(\text{red})=0.6, 0.2$. $P(\text{red})$?
  3. Total probability needs the routes to be …?
  4. $P(B_1)=0.3$, $P(A\mid B_1)=0.5$; $P(B_2)=0.7$, $P(A\mid B_2)=0.1$. $P(A)$?

From the bank · past-year question

Example 3 · Probability · MODERATE
One bag: 3W 2B; another: 5W 3B. Bag chosen randomly, ball drawn. P(white)?

[Q117 · Apr · 2018]

Weight each route by its own probability

$P(A)$ is not the average of the conditional probabilities unless the routes are equally likely. Always multiply each $P(A \mid B_i)$ by $P(B_i)$ before summing.

The routes must be a partition

Total probability requires the $B_i$ to be mutually exclusive and to cover every possibility. Missing a route (or overlapping routes) breaks the sum.

Concept 4 of 4

Bayes' theorem (reversing the conditional)

Intuition

Total probability goes forward — route to outcome. Bayes' theorem goes backward: given the outcome, which route did it most likely come from? It rescales each route's forward contribution by the total probability of the outcome.

Definition

For a partition $B_1, \dots, B_n$ and an observed event $A$, $P(B_k \mid A) = \dfrac{P(B_k)\,P(A \mid B_k)}{\sum_i P(B_i)\,P(A \mid B_i)}$. The numerator is route $k$'s forward contribution; the denominator is the total probability of $A$ from the previous concept.

Bayes' theorem

$$P(B_k \mid A) = \dfrac{P(B_k)\,P(A \mid B_k)}{\displaystyle\sum_{i} P(B_i)\,P(A \mid B_i)}$$
  • numerator: the chosen route's forward contribution $P(B_k)P(A\mid B_k)$
  • denominator: total probability of $A$ over all routes

Worked example

A factory makes 60% of its items on machine A (2% defective) and 40% on machine B (5% defective). An item is found defective. What is the probability it was made on machine A?
  1. Forward contributions: A: $0.6 \times 0.02 = 0.012$; B: $0.4 \times 0.05 = 0.020$.
  2. Total probability of a defective: $0.012 + 0.020 = 0.032$.
  3. Bayes: $P(A \mid \text{defective}) = \dfrac{0.012}{0.032} = \dfrac{3}{8} = 0.375$.
Answer: $\dfrac{3}{8}$ (0.375)
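The three steps translate directly into code. The sketch below is one possible way to write it; the function name `posterior` and the input layout are illustrative assumptions, and fractions keep the arithmetic exact.

```python
from fractions import Fraction

def posterior(routes, k):
    """routes: list of (P(B_i), P(A | B_i)) pairs. Returns P(B_k | A)."""
    contributions = [p * q for p, q in routes]  # forward contributions
    total = sum(contributions)                  # total probability of A
    if total == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return contributions[k] / total

machines = [
    (Fraction(60, 100), Fraction(2, 100)),  # machine A: 60% of output, 2% defective
    (Fraction(40, 100), Fraction(5, 100)),  # machine B: 40% of output, 5% defective
]
print(posterior(machines, 0))  # 3/8
print(posterior(machines, 1))  # 5/8
```

The two posteriors add up to 1, which is a quick sanity check that the denominator covered every route.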
Practice this concept · self-check · 4 quick reps

Try it yourself

1% of a population has a disease. A test is positive 90% of the time for the diseased and 10% of the time for the healthy. A person tests positive. What is the probability they have the disease?

Practice — Level 1 (4 reps)

Quick reps to lock in the method. Try each, then check.

  1. Forward contributions $0.012$ and $0.020$. $P(\text{first}\mid A)$?
  2. Bayes' denominator is computed by which rule?
  3. Routes $\tfrac{1}{2},\tfrac{1}{2}$; $P(A\mid\cdot)=0.2,0.6$. $P(B_2\mid A)$?
  4. If both routes give the same $P(A\mid B_i)$, $P(B_k\mid A)$ equals?

From the bank · past-year question

Example 4 · Probability · MODERATE
A building may collapse due to faulty design (prob 0.1) or not (prob 0.9). Collapse prob: 0.95 if faulty, 0.45 if not. If building collapsed, what is probability it was due to faulty design?

[Q110 · Sep · 2023]

Do not confuse $P(B_k \mid A)$ with $P(A \mid B_k)$

The question gives you the forward conditionals $P(A \mid B_i)$ (defect rate per machine) and asks for the reverse $P(B_k \mid A)$ (which machine, given a defect). Bayes is exactly the tool that flips them, so don't report the forward number.

The denominator is the FULL total probability, not just route $k$'s own term $P(B_k)\,P(A \mid B_k)$

Divide route kk's contribution by the sum over ALL routes. Using only the chosen route's term gives 1 every time — a sure sign the denominator is wrong.
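A counting view makes the denominator concrete. The sketch below imagines a hypothetical batch of 10,000 items from the factory worked example (the batch size is an assumption chosen only so that every count is a whole number) and divides machine A's defectives by all defectives.

```python
items = 10_000                                       # hypothetical batch size
made = {"A": items * 60 // 100, "B": items * 40 // 100}
defect_rate_percent = {"A": 2, "B": 5}

# Expected number of defective items from each machine
defective = {m: made[m] * defect_rate_percent[m] // 100 for m in made}
all_defective = sum(defective.values())

print(defective)                          # {'A': 120, 'B': 200}
print(defective["A"] / all_defective)     # 0.375
print(defective["A"] / defective["A"])    # 1.0
```

The last line shows the mistake: dividing the chosen route's count by itself, rather than by every defective item, always returns 1.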

Summary — formulas & gotchas at a glance

A revision cheat-sheet for the formulas and gotchas above.

Formulas (4)

  • Conditional probability

    $$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}, \qquad P(B) > 0$$
  • Multiplication rule & restricted sample space

    $$P(A \cap B) = P(B)\,P(A \mid B), \qquad P(A \mid B) = \dfrac{n(A \cap B)}{n(B)}$$
  • Total probability (over a partition)

    $$P(A) = \sum_{i=1}^{n} P(B_i)\,P(A \mid B_i)$$
  • Bayes' theorem (reversing the conditional)

    $$P(B_k \mid A) = \dfrac{P(B_k)\,P(A \mid B_k)}{\displaystyle\sum_{i} P(B_i)\,P(A \mid B_i)}$$

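Watch out for (8)

  • Mind which event is the condition: $P(A\mid B) \ne P(B\mid A)$ in general
  • The condition must have positive probability
  • Under a condition, the TOTAL changes to $n(B)$, not 36 (or 6)
  • The general multiplication rule needs $P(A \mid B)$, not $P(A)$
  • Weight each route by its own probability
  • The routes must be a partition
  • Do not confuse $P(B_k \mid A)$ with $P(A \mid B_k)$
  • The denominator is the FULL total probability, not just route $k$'s own term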

Mastery check — 5 interleaved questions

Try each one before checking the answer. Questions are interleaved across the concepts above, not grouped, because interleaving sharpens transfer.

Example 1 · Probability · EASY
Consider the following data for the items that follow: There are 90 applicants for a job. Some of them are graduates. Some of them have less than three years experience.

| | Number of graduates | Number of non-graduates |
|---|---|---|
| At least 3 years experience | 18 | 9 |
| Less than 3 years experience | 36 | 27 |

Let G be the event that the first applicant interviewed is a graduate and T be the event that first applicant interviewed has at least 3 years experience.
What is $P\left(G\cap\overline{T}\right)$ equal to?

[Q112 · Apr · 2022]

Example 2 · Probability · MODERATE
Three perfect dice are rolled. Under the condition that no two show the same face, what is the probability that one of the faces shown is an ace (one)?

[Q118 · Sep · 2024]

Example 3 · Probability · HARD
One bag contains 3 white and 2 black balls, another bag contains 2 white and 3 black balls. Two balls are drawn from the first bag and put into the second bag and then a ball is drawn from the second bag. What is the probability that it is white?

[Q117 · Apr · 2023]

Example 4 · Probability · MODERATE
In a bolt factory, machines X, Y, Z manufacture bolts that are respectively 25%, 35% and 40% of the factory's total output. The machines X, Y, Z respectively produce 2%, 4% and 5% defective bolts. A bolt is drawn at random from the product and is found to be defective. What is the probability that it was manufactured by machine X?

[Q106 · Sep · 2018]

Example 5 · Probability · EASY
Consider the following data for the items that follow: There are 90 applicants for a job. Some of them are graduates. Some of them have less than three years experience.

| | Number of graduates | Number of non-graduates |
|---|---|---|
| At least 3 years experience | 18 | 9 |
| Less than 3 years experience | 36 | 27 |

Let G be the event that the first applicant interviewed is a graduate and T be the event that first applicant interviewed has at least 3 years experience.
What is $P\left(G \mid \overline{T}\right)$ equal to?

[Q113 · Apr · 2022]
