Can the bisection method be extended to multiple dimensions?

For a scalar-valued function f: R^n -> R, one can parameterize the line segment between two points as g(t) = f(a + t(b-a)) and apply the standard one-dimensional bisection method. However, for a vector-valued system f: R^n -> R^n, there is no notion of sign, so the bisection method cannot be applied.

Bisection Method

Q: What is the bisection method?

Given a continuous function f with f(a)f(b)<0, the bisection method repeatedly halves the interval [a,b] and selects the subinterval that contains the root. Convergence is guaranteed by the Intermediate Value Theorem.

Q: What is the convergence rate of the bisection method?

The bisection method exhibits linear (first-order) convergence: each iteration halves the interval width. After n iterations the error is at most (b-a)/2^n. Although slower than Newton's method (quadratic convergence), the bisection method guarantees convergence.

Q: What are the disadvantages of the bisection method?

Its main disadvantages are slow convergence (only one bit of accuracy per iteration), the requirement that f have opposite signs at the endpoints of the initial interval, and the inability to handle systems of equations whose codomain is a vector space.

Goals

Understand the principle of the bisection method (Intermediate Value Theorem), its algorithm, linear convergence, stopping criteria, and comparison with Newton's and secant methods.

Prerequisites

Chapter 6: Absolute Error / Chapter 7: Relative Error
Chapter 9: Truncation Error
Continuous functions and the Intermediate Value Theorem

Contents

1. Principle
2. Algorithm
3. Convergence
4. Stopping Criteria
5. Advantages and Disadvantages
6. Worked Example
7. Frequently Asked Questions
8. References

1. Principle

The bisection method is a root-finding algorithm based on the Intermediate Value Theorem.

Intermediate Value Theorem

If $f$ is continuous on $[a,b]$ and $f(a)\cdot f(b) < 0$, then there exists at least one $c \in (a,b)$ such that $f(c) = 0$.

By applying this theorem iteratively, the interval is halved at each step, bracketing the root ever more tightly.

2. Algorithm

Bisection Method

Choose an initial interval $[a,b]$ such that $f$ is continuous and $f(a)\cdot f(b) < 0$.
Compute the midpoint $c = (a+b)/2$.
If $f(c) = 0$, then $c$ is a root; terminate.
If $f(a)\cdot f(c) < 0$, set $b \leftarrow c$; otherwise set $a \leftarrow c$.
Repeat steps 2--4 until a stopping criterion is satisfied.

3. Convergence

After $n$ iterations, the width of the interval containing the root is

$$|b_n - a_n| = \dfrac{b_0 - a_0}{2^n}$$

and the error of the midpoint $c_n$ satisfies

$$|c_n - x^*| \leq \dfrac{b_0 - a_0}{2^{n+1}}$$

This is linear convergence: each iteration yields approximately one additional bit of accuracy. When the initial interval has unit width, obtaining $k$ decimal digits of precision requires approximately $k \times \log_2 10 \approx 3.32k$ iterations.

Convergence speed comparison of the bisection method, Newton's method, and the secant method — Figure 2. Convergence speed comparison ($f(x) = e^x - 3$, root $x^* = \ln 3$, initial interval $[0, 3]$, measured values). Newton's method exhibits quadratic convergence (reaching machine precision in 7 iterations), the secant method shows superlinear convergence (9 iterations), while the bisection method converges linearly with oscillation, requiring 25 iterations.

4. Stopping Criteria

Iteration is terminated when any of the following conditions is satisfied. Here $\varepsilon$ and $\delta$ are user-specified tolerances (e.g., $10^{-8}$), and $N_{\max}$ is the maximum number of iterations (to prevent infinite loops).

Interval width: $|b_n - a_n| < \varepsilon$ (the root is located within precision $\varepsilon$)
Function value: $|f(c_n)| < \delta$ ($c_n$ is sufficiently close to a root)
Iteration count: $n \geq N_{\max}$ (computation time limit)

The interval width criterion is the most reliable. The required number of iterations can be determined in advance as $n \geq \lceil\log_2((b_0-a_0)/\varepsilon)\rceil - 1$.

4.1 Practical guidelines for the stopping criterion

Interval-width tolerance $\varepsilon$: a safe choice is roughly twice the required absolute accuracy (e.g. for $10^{-8}$ accuracy, set $\varepsilon = 5 \times 10^{-9}$). The error of the returned midpoint $c_n$ is at most $\varepsilon/2$.
Combine with the function-value test: relying on $|f(c_n)| < \delta$ alone risks early termination in regions where $f$ has a steep slope (a tiny $|f|$ does not necessarily imply a small error in $x$). Combine it with the interval-width test using a logical AND.
Maximum iteration count $N_{\max}$: set roughly twice the theoretical requirement $\lceil\log_2((b_0-a_0)/\varepsilon)\rceil$ as a safety margin to absorb floating-point round-off that may force extra iterations.

5. Advantages and Disadvantages

Advantages	Disadvantages
Convergence is guaranteed for continuous functions	Slow convergence (linear)
No derivative $f'(x)$ required	Requires opposite signs at endpoints of initial interval
Extremely simple to implement	Difficulty finding repeated roots (no sign change)
Number of iterations is predictable in advance	Difficult to extend to multiple dimensions

The fact that no derivative is required is a significant practical advantage. Newton's method converges quadratically but requires the computation of $f'(x)$ at each iteration. When an analytical expression for $f'(x)$ is unavailable, or when $f$ itself is the output of a numerical simulation, Newton's method becomes impractical. The secant method does not require derivatives either, but may diverge depending on the choice of initial values. The bisection method is "slow but reliable" and is also useful as a preprocessing step to provide initial values for other methods.

The reason that extension to multiple dimensions is difficult lies in the essence of the bisection method: bracketing the root between two points with opposite signs. Whether the bisection method is applicable depends not on the dimension of the domain but on the dimension of the codomain.

Scalar codomain ($f: \mathbb{R}^n \to \mathbb{R}$) -- The bisection method is applicable. By parameterizing the line segment between two points $\mathbf{a}$ and $\mathbf{b}$ in $n$-dimensional space as $g(t) = f(\mathbf{a} + t(\mathbf{b} - \mathbf{a}))$, the standard one-dimensional bisection method can be applied to $g: [0,1] \to \mathbb{R}$.
Vector codomain ($\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^n$) -- The bisection method is not applicable. Since the output is a vector, there is no notion of "positive or negative," and it is impossible to determine which side contains the root.

Repeated roots

The bisection method assumes $f(a) \cdot f(b) < 0$, but at a repeated root (a root $x^*$ with $f(x^*) = 0$ and $f'(x^*) = 0$) the function does not change sign in a neighborhood of the root, so the method cannot be applied directly.

Example: $f(x) = (x-1)^2$ has a repeated root at $x=1$, but $f(x) \ge 0$ for both $x < 1$ and $x > 1$. Regardless of how the initial interval $[a, b]$ is chosen, $f(a) f(b) \ge 0$, so the bisection precondition fails.

Workarounds:

Apply the bisection method to $f'(x)$ instead (a repeated root of $f$ is often a simple root of $f'$).
Switch to Newton's method. Newton does not require a sign change, so it can be used even at repeated roots where bisection cannot start. However, it loses its quadratic convergence; at a root of multiplicity $m$ it becomes linear with rate $1 - 1/m$. For a double root ($m=2$) the rate is $1/2$, the same as bisection; for $m \ge 3$ Newton is actually slower than bisection (whose rate stays at $1/2$).
Use modified Newton $x_{n+1} = x_n - m\, f(x_n)/f'(x_n)$ with known multiplicity $m$ to recover quadratic convergence.

6. Worked Example

Find the root $x^* = \sqrt{2} \approx 1.4142$ of $f(x) = x^2 - 2$ on the interval $[1, 2]$. We have $f(1)=-1 < 0$ and $f(2)=2 > 0$.

$n$	$a_n$	$b_n$	$c_n$	$f(c_n)$	Selection
0	1.0000	2.0000	1.5000	$+0.2500$	Left half
1	1.0000	1.5000	1.2500	$-0.4375$	Right half
2	1.2500	1.5000	1.3750	$-0.1094$	Right half
3	1.3750	1.5000	1.4375	$+0.0664$	Left half
4	1.3750	1.4375	1.4063	$-0.0225$	Right half
5	1.4063	1.4375	1.4219	$+0.0217$	Left half

The sign of $f(c_n)$ alternates as $+, -, -, +, -, +$, while the interval width decreases as $1.0 \to 0.5 \to 0.25 \to \cdots$, halving at each step and converging to the root $x^* = \sqrt{2}$.

Midpoints c_n plotted on the curve at each iteration, showing the shrinking intervals — Figure 3. Midpoints $c_n$ at each iteration ($f(x) = x^2 - 2$). Dashed lines from each $a_n$ and $b_n$ to the curve indicate the sign of $f$. The interval alternates between left and right halves, converging to the root $x^* = \sqrt{2}$.

7. Frequently Asked Questions

Q1. What is the bisection method?

Given a continuous function $f$ with $f(a)f(b)<0$, the bisection method repeatedly halves the interval and selects the subinterval containing the root. Convergence is guaranteed by the Intermediate Value Theorem.

Q2. What is the convergence rate of the bisection method?

It exhibits linear (first-order) convergence: each iteration halves the interval width. To obtain $k$ decimal digits of precision, approximately $3.32k$ iterations are required.

Q3. What are the disadvantages of the bisection method?

The main disadvantages are its slow convergence, the requirement that $f$ change sign over the initial interval, and the difficulty of extending it to multiple dimensions.

8. References

Wikipedia, "Bisection method" (English)
R. L. Burden & J. D. Faires, Numerical Analysis, 10th ed., Cengage, 2016.
W. H. Press et al., Numerical Recipes, 3rd ed., Cambridge, 2007.

Implementation in sangi

The algorithm in this article is available in sangi's bisection.