How can catastrophic cancellation be avoided?

Main techniques include: (1) rationalization to eliminate the problematic subtraction, (2) Taylor expansion approximation, (3) algebraic identities such as Vieta's formulas, and (4) using higher-precision arithmetic.

Catastrophic Cancellation

Q: What is catastrophic cancellation?

Catastrophic cancellation occurs when subtracting two nearly equal floating-point numbers. The leading significant digits cancel, and the result retains far fewer significant digits than the operands. The subtraction does not create new error but reveals pre-existing rounding errors.

Q: What is the difference between catastrophic cancellation and loss of significance?

Catastrophic cancellation occurs in subtraction of nearly equal values and loses upper significant digits. Loss of significance occurs in addition/subtraction of values with very different magnitudes, where the smaller value's lower digits are lost. Cancellation is specific to subtraction; loss of significance can occur in addition too.

Goals

Understand the quantitative analysis of catastrophic cancellation, verify its effects through concrete numerical examples, and learn how to derive safe alternative formulas.

Prerequisites

Contents

1. Definition
2. Mechanism
3. Typical Occurrences
4. Avoidance Techniques
5. Comparison with Loss of Significance
6. Worked Example
7. FAQ
8. References

1. Definition

Catastrophic cancellation occurs when subtracting two nearly equal floating-point numbers. The leading significant digits cancel, and the result retains far fewer significant digits than the original operands.

The Essence of Cancellation

When $x \approx y$, computing $x - y$ causes all matching upper digits to cancel. The remaining lower digits contain rounding errors, so the relative error of the result is dramatically larger than that of the inputs.

Cancellation does not "create" new error; it "reveals" pre-existing rounding errors.

2. Mechanism

Consider 4-digit decimal floating-point arithmetic.

$x = 1.234 \times 10^0$ (true value $1.2345678\ldots$)

$y = 1.233 \times 10^0$ (true value $1.2334567\ldots$)

$x - y = 0.001 \times 10^0 = 1.000 \times 10^{-3}$

The original $x$ and $y$ had 4 significant digits, but the subtraction result has only 1 reliable digit. After normalization to $1.000 \times 10^{-3}$, the trailing three zeros are "garbage digits" carrying no information.

Figure 1. Mechanism of catastrophic cancellation. Subtracting nearly equal values causes upper digits to cancel, drastically reducing significant digits.

3. Typical Occurrences

Quadratic Formula

In the solution $x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ of $ax^2 + bx + c = 0$, when $b^2 \gg 4ac$, we get $\sqrt{b^2 - 4ac} \approx |b|$, and one of $-b + \sqrt{b^2-4ac}$ or $-b - \sqrt{b^2-4ac}$ suffers from cancellation.

Numerical Differentiation

In the difference formula $f'(x) \approx \dfrac{f(x+h) - f(x)}{h}$, when $h$ is very small, $f(x+h) \approx f(x)$ and the numerator suffers from cancellation.

Trigonometric Functions

Computing $1 - \cos x$ for $x \approx 0$ causes cancellation since $\cos x \approx 1 - x^2/2$.

Exponential Functions

Computing $e^x - 1$ for $x \approx 0$ causes cancellation since $e^x \approx 1$. Most languages provide expm1(x) for accurate evaluation of $e^x - 1$, and log1p(x) for $\log(1+x)$.

4. Avoidance Techniques

Rationalization

To avoid cancellation in $\sqrt{x+1} - \sqrt{x}$, rationalize the numerator:

$$\sqrt{x+1} - \sqrt{x} = \dfrac{(\sqrt{x+1} - \sqrt{x})(\sqrt{x+1} + \sqrt{x})}{\sqrt{x+1} + \sqrt{x}} = \dfrac{1}{\sqrt{x+1} + \sqrt{x}}$$

The right-hand side has no subtraction and is free of cancellation.

Taylor Expansion

Instead of computing $1 - \cos x$ for $x \approx 0$, use:

$$1 - \cos x = \dfrac{x^2}{2} - \dfrac{x^4}{24} + \cdots \approx \dfrac{x^2}{2}$$

Many programming languages provide built-in functions such as expm1(x) ($e^x - 1$) and log1p(x) ($\log(1+x)$).

Algebraic Identities

For the quadratic formula, compute the stable root $x_1$ first, then use Vieta's formula $x_1 x_2 = c/a$ to find the other root.

Addition Theorem

The identity $1 - \cos x = 2\sin^2(x/2)$ avoids cancellation.

5. Comparison with Loss of Significance

Property	Catastrophic Cancellation	Loss of Significance
Cause	Subtraction of nearly equal values	Addition/subtraction of values with very different magnitudes
What is lost	Upper significant digits	Lower digits of the smaller value
Operation	Subtraction only	Can occur in addition too
Result	Increased relative error	Disappearance of small value's information
Avoidance	Expression reformulation, expansion	Reordering operations, compensated summation

6. Worked Example

Example: Quadratic $x^2 - 10^8 x + 1 = 0$

The discriminant is $b^2 - 4ac = 10^{16} - 4 \approx 10^{16}$, so $\sqrt{b^2 - 4ac} \approx 10^8$.

Unstable computation: $x_2 = \dfrac{10^8 - \sqrt{10^{16} - 4}}{2}$ suffers severe cancellation.

Stable computation: First compute $x_1 = \dfrac{10^8 + \sqrt{10^{16} - 4}}{2} \approx 10^8$, then use Vieta's formula $x_2 = 1/x_1 \approx 10^{-8}$.

7. FAQ

Q1. What is catastrophic cancellation?

It occurs when subtracting two nearly equal floating-point numbers, causing the leading significant digits to cancel and dramatically reducing the significant digits of the result.

Q2. How does it differ from loss of significance?

Cancellation loses upper digits in subtraction of nearly equal values. Loss of significance loses lower digits of the smaller value in addition/subtraction of values with very different magnitudes.

Q3. How can it be avoided?

Rationalization, Taylor expansion, algebraic identities (Vieta's formulas), and higher-precision arithmetic.

8. References

Wikipedia "Catastrophic cancellation"
Wikipedia "Loss of significance"
D. Goldberg, "What Every Computer Scientist Should Know About Floating-Point Arithmetic," ACM Computing Surveys, vol. 23, no. 1, pp. 5--48, 1991.
N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, 2002.