NDA Maths · Statistics
Regression and Correlation
How two variables move together — correlation measures the strength of the link, regression draws the best-fit line.
Why this matters
27 PYQs across 2017–2026, with 6 HARD: the highest hard-rate of any Statistics subtopic. Almost every recent paper asks one of four shapes: properties of the correlation coefficient r under linear transformation, finding regression lines, identifying which equation is which, or computing the angle between them. Five tight concepts cover the entire surface.
Concept 1 of 5
Correlation Coefficient and Its Properties
Intuition
Definition
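For paired observations (xᵢ, yᵢ), r = Cov(X, Y) / (σx · σy), bounded by −1 ≤ r ≤ 1. If U = aX + b and V = cY + d with a, c ≠ 0, then r(U, V) = (ac / |ac|) · r(X, Y): magnitude is preserved, and the sign flips when exactly one of a, c is negative.
Correlation Coefficient and Invariance Rule
r = Cov(X, Y) / (σx · σy)
- Cov(X, Y): covariance of X and Y
- σx, σy: standard deviations of X and Y
- r(aX + b, cY + d) = r(X, Y) if a and c have the same sign, −r(X, Y) otherwise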
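As a rough illustration of the invariance rule, the Python sketch below computes r for a small made-up data set, then recomputes it after shifting and scaling the variables. The data values, the scale factors and the helper name `pearson_r` are assumptions chosen for this example only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / sqrt(var_x * var_y)

# Hypothetical paired observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_xy = pearson_r(xs, ys)

# U = 2X + 7 (positive scale), V = -3Y + 1 (negative scale):
# exactly one scale factor is negative, so the sign of r should flip.
us = [2 * x + 7 for x in xs]
vs = [-3 * y + 1 for y in ys]
r_uv = pearson_r(us, vs)

# W = -4X - 2: now both scale factors are negative, so r should be restored.
ws = [-4 * x - 2 for x in xs]
r_wv = pearson_r(ws, vs)

print(r_xy, r_uv, r_wv)
```

The shifts (+7, +1, −2) play no part in the result; only the signs of the multipliers matter.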
Visualization · slide r, watch the cloud tighten
At r = ±1 the points fall exactly on a line; toward r = 0 the cloud loses any linear shape. Positive r slopes up, negative r slopes down. r is unitless and always lies in [−1, 1] — it captures tightness and direction, not how steep the line is.
Worked example
- Identify . Shifts do not affect .
- Compute .
- Apply the rule: .
- Result: . Magnitude preserved, sign flipped.
Practice this concept · self-check · 4 quick reps
Try it yourself
Practice — Level 1 (4 reps)
Quick reps to lock in the method. Try each, then check.
- 1.between is . between and ?
- 2.is . between and ?
- 3.is . between and ?
- 4.A computation gives . Possible?
From the bank · past-year question
[Q111 · Sep · 2023]
r is bounded by −1 and +1, always
Shift does not change r; only scaling by a negative factor flips its sign
Concept 2 of 5
Lines of Regression
Intuition
Definition
The regression line of y on x has slope byx and passes through (x̄, ȳ). The regression line of x on y has slope bxy (measured as change in x per unit change in y) and also passes through (x̄, ȳ).
Lines of Regression (point-slope form)
y on x: y − ȳ = byx (x − x̄)
x on y: x − x̄ = bxy (y − ȳ)
- byx: slope of the y-on-x line
- bxy: slope of the x-on-y line
- (x̄, ȳ): the only point on BOTH regression lines
Visualization · drag the line, watch the error
Red dashes are residuals (vertical distance from each point to the line). SSE is the sum of their squares. The least-squares regression line is the one that makes SSE as small as possible.
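The least-squares idea can be checked numerically. The sketch below fits the y-on-x line using the usual closed form (slope = Sxy / Sxx, line through the mean point) and confirms that nudging the slope or intercept only increases SSE. The data set and the function names `fit_y_on_x` and `sse` are assumptions made for this illustration.

```python
def fit_y_on_x(xs, ys):
    """Least-squares line of y on x: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The line passes through (mean_x, mean_y), which fixes the intercept.
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared vertical residuals for the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

slope, intercept = fit_y_on_x(xs, ys)
best = sse(xs, ys, slope, intercept)

# SSE for every nearby line obtained by nudging the slope and/or intercept.
nudged = [
    sse(xs, ys, slope + ds, intercept + di)
    for ds in (-0.1, 0.0, 0.1)
    for di in (-0.2, 0.0, 0.2)
    if (ds, di) != (0.0, 0.0)
]

print(slope, intercept, best, min(nudged))
```

Every entry in `nudged` is larger than `best`, which is the numerical counterpart of dragging the line away from the optimum in the visualization.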
Worked example
- With only two points, the regression line is the line joining them — correlation is perfect ().
- Compute slope: .
- Use point-slope through : .
- Simplify: .
Practice this concept · self-check · 4 quick reps
Try it yourself
Practice — Level 1 (4 reps)
Quick reps to lock in the method. Try each, then check.
- 1.Both regression lines always pass through which point?
- 2.Regression lines are and . Find .
- 3.. If and , slope?
- 4.Slope of the -on- line through and ?
From the bank · past-year question
[Q109 · Apr · 2023]
Both regression lines always pass through (x̄, ȳ)
From raw bivariate data, compute via the Pearson form
Concept 3 of 5
Regression Coefficients and Their Link to r
Intuition
Definition
byx · bxy = r², with |r| ≤ 1. Therefore byx · bxy ≤ 1 always. Also byx, bxy and r all carry the same sign: the two slopes can never have opposite signs.
Product Identity
byx · bxy = r²
- Sign of r: same as the common sign of byx and bxy
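A quick numerical check of the product identity, as a sketch: compute both regression coefficients from raw data, multiply them to get r², and recover the sign of r from the common sign of the slopes. The data and the function name `regression_coefficients` are hypothetical.

```python
from math import sqrt, copysign

def regression_coefficients(xs, ys):
    """Return (byx, bxy) computed from raw paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    byx = sxy / sxx  # slope of y on x
    bxy = sxy / syy  # slope of x on y
    return byx, bxy

# Hypothetical data with a downward trend, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [9, 7, 8, 4, 2]

byx, bxy = regression_coefficients(xs, ys)
r_squared = byx * bxy
# r takes the common sign of the two regression coefficients.
r = copysign(sqrt(r_squared), byx)

print(byx, bxy, r_squared, r)
```

Here both coefficients come out negative and their product stays below 1, so r is the negative square root of the product.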
Worked example
- Pairing A — first line as on : , so . Second line as on : , so .
- Check the product: — valid.
- Pairing B (the swap) would give both slopes , product — impossible, so pairing A is correct.
- Hence . Both slopes are positive, so .
Practice this concept · self-check · 4 quick reps
Try it yourself
Practice — Level 1 (4 reps)
Quick reps to lock in the method. Try each, then check.
- 1., . Find .
- 2., . Find .
- 3., . Find .
- 4.Can and hold for one dataset?
From the bank · past-year question
[Q111 · Apr · 2026]
byx · bxy ≤ 1 is non-negotiable
Both slopes share the sign of r
Concept 4 of 5
Identifying Which Regression Line is Which
Intuition
Definition
Two regression lines, call them L₁ and L₂, can be paired in two ways: either one may be the y-on-x line, with the other as x-on-y. The correct pairing is the one for which the product of the slopes (interpreted as byx · bxy) is at most 1. The wrong pairing always gives a product greater than 1 (provided the lines are distinct).
Sieve Inequality
byx · bxy = r² ≤ 1
Diagram · which regression line is which
Both lines pass through the mean point (x̄, ȳ). The y-on-x line is the flatter one (it minimises vertical gaps); x-on-y is steeper. Their slopes are byx and 1/bxy, with byx·bxy = r².
Worked example
- Pairing A: first as on gives , so . Second as on gives , so . Product = — valid.
- Pairing B (swap): first as on gives ; second as on gives . Product = — rejected.
- Conclusion: the first line is on (); the second is on ().
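The pairing test can be sketched in a few lines of Python. The function below takes two lines written as a·x + b·y + c = 0 (integer coefficients, a and b nonzero), tries both pairings, and keeps the one whose product byx · bxy lies between 0 and 1. The function name `identify_regression_lines` and the two example equations are assumptions for illustration, not taken from a specific past-year question.

```python
from fractions import Fraction

def identify_regression_lines(line1, line2):
    """Decide which of two lines a*x + b*y + c = 0 is the y-on-x line.

    Each line is a tuple (a, b, c) of integers with a and b nonzero.
    Returns (label, byx, bxy) for the pairing whose product lies in [0, 1].
    """
    a1, b1, _ = line1
    a2, b2, _ = line2

    # Pairing A: line1 solved for y (y on x), line2 solved for x (x on y).
    byx_a, bxy_a = Fraction(-a1, b1), Fraction(-b2, a2)
    # Pairing B: the swap.
    byx_b, bxy_b = Fraction(-a2, b2), Fraction(-b1, a1)

    if 0 <= byx_a * bxy_a <= 1:
        return "line1 is y on x", byx_a, bxy_a
    if 0 <= byx_b * bxy_b <= 1:
        return "line2 is y on x", byx_b, bxy_b
    raise ValueError("neither pairing satisfies 0 <= byx * bxy <= 1")

# Hypothetical pair of lines: 3x + 2y - 26 = 0 and 6x + y - 31 = 0.
label, byx, bxy = identify_regression_lines((3, 2, -26), (6, 1, -31))
r_squared = byx * bxy

print(label, byx, bxy, r_squared)
```

Exact fractions are used so that the comparison with 1 is not affected by floating-point rounding. If the two lines coincide, both pairings give a product of exactly 1 and the sketch simply reports the first.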
Practice this concept · self-check · 4 quick reps
Try it yourself
Practice — Level 1 (4 reps)
Quick reps to lock in the method. Try each, then check.
- 1.A pairing gives slope product . Valid ?
- 2.Pairing A product , pairing B product . Which is correct?
- 3.The product of the two regression slopes equals?
- 4.Why can the wrong pairing exceed 1?
From the bank · past-year question
[Q101 · Sep · 2024]
Try the inequality before doing anything else
Concept 5 of 5
Angle Between the Two Regression Lines
Intuition
Definition
Treat the two regression lines as ordinary straight lines in the xy-plane with slopes m₁ and m₂ (read directly from each equation after solving for y). The acute angle θ between them satisfies the standard formula below. When r = ±1 the slopes coincide and θ = 0; when r = 0, θ = 90° and the lines are perpendicular.
Angle between two lines (applied to regression)
tan θ = |(m₂ − m₁) / (1 + m₁m₂)|
- m₁, m₂: slopes of the two regression lines in the xy-plane
- θ: acute angle between the lines
Diagram · angle between the regression lines
The lines meet at (x̄, ȳ) at angle θ, where tan θ = |(m₂ − m₁) / (1 + m₁m₂)|. As correlation strengthens (r → ±1) the two lines rotate together and θ → 0 — they coincide at perfect correlation. As r → 0 they splay apart, signalling no linear relationship.
Worked example
- Solve each line for . Line 1: , so .
- Line 2: , so .
- Apply the formula: .
- Simplify: .
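The same steps can be sketched in Python: convert the regression coefficients to xy-plane slopes (the x-on-y line has slope 1/bxy, not bxy), then apply tan θ = |(m₂ − m₁) / (1 + m₁m₂)|. The coefficient values and the function name `acute_angle_deg` are hypothetical, picked only to keep the arithmetic clean.

```python
from math import atan, degrees

def acute_angle_deg(m1, m2):
    """Acute angle, in degrees, between two lines with slopes m1 and m2."""
    denom = 1 + m1 * m2
    if denom == 0:
        # m1 * m2 = -1 means the lines are perpendicular.
        return 90.0
    return degrees(atan(abs((m2 - m1) / denom)))

# Hypothetical regression coefficients with byx * bxy = 0.25 (so |r| = 0.5).
byx = 0.5
bxy = 0.5

m1 = byx      # slope of the y-on-x line in the xy-plane
m2 = 1 / bxy  # slope of the x-on-y line in the xy-plane
theta = acute_angle_deg(m1, m2)

print(theta)
```

With these values tan θ = 1.5 / 2 = 0.75. Passing equal slopes returns 0, matching the case of perfect correlation where the two lines coincide.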
Practice this concept · self-check · 4 quick reps
Try it yourself
Practice — Level 1 (4 reps)
Quick reps to lock in the method. Try each, then check.
- 1.If , the angle between the two regression lines?
- 2.If , the angle between the regression lines?
- 3.Slopes , . Find .
- 4.Slopes , . The lines are?
From the bank · past-year question
[Q104 · Apr · 2024]
Slope of the x-on-y line is NOT bxy in the xy-plane; it is 1/bxy
Acute angle only — take absolute value
Summary — formulas & gotchas at a glance
A revision cheat-sheet for the formulas and gotchas above.
Formulas (5)
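- Correlation Coefficient and Its Properties
Correlation Coefficient and Invariance Rule: r = Cov(X, Y) / (σx · σy); shifts leave r unchanged, and its sign flips only when exactly one scale factor is negative
- Lines of Regression
Lines of Regression (point-slope form): y − ȳ = byx (x − x̄) and x − x̄ = bxy (y − ȳ)
- Regression Coefficients and Their Link to r
Product Identity: byx · bxy = r²
- Identifying Which Regression Line is Which
Sieve Inequality: byx · bxy ≤ 1
- Angle Between the Two Regression Lines
Angle between two lines (applied to regression): tan θ = |(m₂ − m₁) / (1 + m₁m₂)|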
Watch out for (9)
- r is bounded by −1 and +1, always → Correlation Coefficient and Its Properties
- Shift does not change r; only scaling by a negative factor flips its sign → Correlation Coefficient and Its Properties
- Both regression lines always pass through (x̄, ȳ) → Lines of Regression
- From raw bivariate data, compute via the Pearson form → Lines of Regression
- byx · bxy ≤ 1 is non-negotiable → Regression Coefficients and Their Link to r
- Both slopes share the sign of r → Regression Coefficients and Their Link to r
- Try the inequality before doing anything else → Identifying Which Regression Line is Which
- Slope of the x-on-y line is NOT bxy in the xy-plane; it is 1/bxy → Angle Between the Two Regression Lines
- Acute angle only: take absolute value → Angle Between the Two Regression Lines
Mastery check — 5 interleaved questions
Try each one before clicking. Questions are interleaved across the concepts above, not grouped — interleaving sharpens transfer.
[Q114 · Sep · 2022]
[Q117 · Apr · 2020]
[Q106 · Sep · 2021]
[Q104 · Apr · 2024]
[Q106 · Apr · 2018]