Gaussian Elimination
Goal
Master Gaussian elimination -- the most fundamental direct method for solving systems of linear equations -- including forward elimination, back substitution, pivot selection strategies, and its relationship with LU decomposition.
Prerequisites
Matrix-vector multiplication, elementary row operations on augmented matrices
Table of Contents
1. Overview
Gaussian elimination is the most fundamental direct method for solving a system of linear equations $A\mathbf{x} = \mathbf{b}$. It applies elementary row operations to the augmented matrix $[A|\mathbf{b}]$ to transform it into upper triangular form (forward elimination), then obtains the solution by back substitution.
Named after C. F. Gauss (1777--1855), similar techniques can be found in the ancient Chinese text The Nine Chapters on the Mathematical Art (Jiuzhang Suanshu), dating back to before the common era.
2. Forward Elimination
From the $n \times n$ coefficient matrix $A$ and the right-hand side vector $\mathbf{b}$, form the augmented matrix. For each column $k$ ($k = 1, 2, \ldots, n-1$), subtract a constant multiple of row $k$ from all rows below row $k$ to eliminate $a_{ik}$.
Elimination Multiplier
At step $k$, for $i > k$, define the multiplier $m_{ik}$ as
and subtract $m_{ik}$ times row $k$ from row $i$: $a_{ij}^{(k+1)} = a_{ij}^{(k)} - m_{ik}\, a_{kj}^{(k)}$.
3. Back Substitution
Once forward elimination yields the upper triangular system $U\mathbf{x} = \mathbf{c}$, the solution is obtained starting from the last row:
$$x_n = \dfrac{c_n}{u_{nn}}, \qquad x_k = \dfrac{1}{u_{kk}}\!\left(c_k - \displaystyle\sum_{j=k+1}^{n} u_{kj}\, x_j\right), \quad k = n{-}1, \ldots, 1$$4. Animated Computation Example
The following interactive widget demonstrates the complete process of forward elimination and back substitution on a concrete system of three linear equations, step by step.
Example Using Augmented Matrix
The same system of equations from the animation above is presented here using augmented matrix row operations.
Forward Elimination:
$$\left[\begin{array}{ccc|c} 2 & 1 & -1 & 8 \\ -3 & -1 & 2 & -11 \\ -2 & 1 & 2 & -3 \end{array}\right]$$$R_2 \leftarrow R_2 + \dfrac{3}{2}R_1$, $R_3 \leftarrow R_3 + R_1$:
$$\left[\begin{array}{ccc|c} 2 & 1 & -1 & 8 \\ 0 & \dfrac{1}{2} & \dfrac{1}{2} & 1 \\ 0 & 2 & 1 & 5 \end{array}\right]$$$R_3 \leftarrow R_3 - 4R_2$:
$$\left[\begin{array}{ccc|c} 2 & 1 & -1 & 8 \\ 0 & \dfrac{1}{2} & \dfrac{1}{2} & 1 \\ 0 & 0 & -1 & 1 \end{array}\right]$$Back Substitution: From row 3, $x_3 = -1$; substituting into row 2 gives $x_2 = 3$; substituting into row 1 gives $x_1 = 2$.
C++ implementation with partial pivoting
#include <vector>
#include <cmath>
#include <stdexcept>
std::vector<double> gaussianElimination(
std::vector<std::vector<double>> A,
std::vector<double> b)
{
const int n = static_cast<int>(A.size());
for (int k = 0; k < n - 1; ++k) {
// partial pivot: find the row with max |A[i][k]|
int pivot = k;
for (int i = k + 1; i < n; ++i)
if (std::abs(A[i][k]) > std::abs(A[pivot][k])) pivot = i;
if (pivot != k) { std::swap(A[k], A[pivot]); std::swap(b[k], b[pivot]); }
if (std::abs(A[k][k]) < 1e-12)
throw std::runtime_error("singular matrix");
// eliminate
for (int i = k + 1; i < n; ++i) {
double m = A[i][k] / A[k][k];
for (int j = k; j < n; ++j) A[i][j] -= m * A[k][j];
b[i] -= m * b[k];
}
}
// back substitution
std::vector<double> x(n);
for (int i = n - 1; i >= 0; --i) {
double s = b[i];
for (int j = i + 1; j < n; ++j) s -= A[i][j] * x[j];
x[i] = s / A[i][i];
}
return x;
}
Implementation notes
- Cost of row swaps: Physically copying whole row vectors becomes expensive for large matrices. Production code keeps a row permutation $\pi$ and accesses via
A[π[i]][j]. - Singularity threshold: Declare singularity when $|a_{kk}| < \tau \cdot \max_i |a_{ik}|$ (with $\tau$ on the order of 10-100 times the machine epsilon). An absolute threshold alone is fragile under scaling.
- Blocking: For performance, use blocked elimination (LAPACK's
dgetrf) that dispatches inner kernels to BLAS-3. The element-by-element loop above is fine for small problems only.
5. Pivot Selection
At step $k$ of forward elimination, if the pivot $a_{kk}^{(k)}$ is zero or close to zero, division by zero or large rounding errors may occur. To avoid this, pivot selection (pivoting) is performed.
5.1 Partial Pivoting
In column $k$, the element with the largest absolute value at or below row $k$ is selected, and the corresponding row swap is performed. Since swapping rows merely reorders the equations, the correspondence between variables is unaffected. In practice, partial pivoting is sufficient in most cases and is adopted by standard numerical libraries such as LAPACK.
5.2 Complete Pivoting (Full Pivoting)
In complete pivoting, at step $k$, the element with the largest absolute value is selected from all entries at or below row $k$ and at or to the right of column $k$, and both rows and columns are swapped.
A column swap corresponds to a reordering of variables. For example, swapping columns 2 and 3 interchanges the roles of $x_2$ and $x_3$. Therefore, after all elimination steps are complete, the column swap history must be traced in reverse order to restore the original variable ordering.
Partial vs. Complete: Complete pivoting offers slightly better numerical stability, but finding the maximum element requires an additional $O(n^2)$ cost. The search cost for partial pivoting is $O(n)$, and in practice, partial pivoting is sufficient in most cases.
5.3 Pivot Selection for Rational Arithmetic
The pivoting strategies described above assume floating-point arithmetic. When performing Gaussian elimination exactly using rational numbers (fractions with arbitrary-precision integers), there is no rounding error; instead, coefficient swell becomes the concern. During elimination, the number of digits in the numerators and denominators can grow explosively, making computation extremely slow.
To mitigate this problem, in contrast to floating-point arithmetic, it is effective to choose the nonzero element with the smallest absolute value as the pivot. More precisely, by selecting the element whose height $H(p/q) = \max(|p|, |q|)$ (where $p/q$ is in lowest terms) is minimized as the pivot, the numerators and denominators of the elimination multipliers remain small, thereby suppressing coefficient swell.
Note: In rational Gaussian elimination, besides pivot selection, other techniques exist to avoid coefficient swell, such as reducing intermediate values to lowest terms at each step, or using the Bareiss algorithm, which performs elimination using only integers without introducing fractions.
6. Computational Complexity
The computational complexity of Gaussian elimination for a system of $n$ linear equations is as follows.
- Forward Elimination: $\dfrac{2}{3}n^3 + O(n^2)$ floating-point operations
- Back Substitution: $n^2 + O(n)$ floating-point operations
- Overall: $O(n^3)$
7. Relationship with LU Decomposition
The forward elimination process of Gaussian elimination is equivalent to decomposing the matrix $A$ into the product of a lower triangular matrix $L$ and an upper triangular matrix $U$:
$$A = LU$$The elimination multipliers $m_{ik}$ become the subdiagonal elements of $L$, and the resulting upper triangular matrix is $U$. Once the $LU$ decomposition is computed, a solution for any different right-hand side $\mathbf{b}$ can be obtained in $O(n^2)$ operations, making it efficient when solving with the same coefficient matrix but multiple right-hand sides.
8. Pitfalls
- Growth factor and stability: The numerical stability of Gaussian elimination with partial pivoting depends on the growth factor $\rho_n = \max_{i,j,k} |a_{ij}^{(k)}| / \max_{i,j} |a_{ij}|$. The theoretical worst case is $\rho_n \le 2^{n-1}$ (attained by the Wilkinson matrix), but for practical matrices $\rho_n = O(n)$ almost always. Complete pivoting grows more slowly, with $\rho_n \le n^{1/2} (2 \cdot 3^{1/2} \cdots n^{1/(n-1)})^{1/2}$.
- Scaling sensitivity: When rows have widely differing scales, the largest-absolute-value criterion alone can pick a poor pivot. Some implementations adopt "scaled partial pivoting", which normalizes each row by $\|A_{i,:}\|_\infty$ before applying the criterion.
- Detecting singularity: When $|a_{kk}^{(k)}| < \tau \cdot \|A\|$ (with $\tau$ on the order of the machine epsilon), the matrix is numerically singular and the solver should switch to a least-squares or SVD-based path.
- Conditioning limits accuracy: The computed solution $\hat{\mathbf{x}}$ has relative error on the order of $\kappa(A) \cdot u$ (with $u$ the unit roundoff); when $\kappa(A) \approx 1/u$, all significant digits are lost. Estimating the condition number first (e.g., LAPACK's
?gecon) is recommended. - Iterative refinement: Compute the residual $\mathbf{r} = \mathbf{b} - A\hat{\mathbf{x}}$ in higher precision, solve $A\mathbf{d} = \mathbf{r}$ with the existing $LU$, and apply $\hat{\mathbf{x}} \leftarrow \hat{\mathbf{x}} + \mathbf{d}$ one or two times. This recovers accuracy whenever $\kappa(A) \cdot u \ll 1$.
- Dense-matrix assumption: Plain Gaussian elimination targets dense matrices. Applying it directly to a sparse matrix causes fill-in (zeros becoming nonzeros during elimination) and destroys sparsity.
Conditioning in practice: the Hilbert matrix
The Hilbert matrix $H_{ij} = 1/(i+j-1)$ is a classic ill-conditioned example. At $n = 10$, $\kappa_2(H_{10}) \approx 1.6 \times 10^{13}$. In double precision ($u \approx 1.1 \times 10^{-16}$),
$$\kappa_2(H_{10}) \cdot u \approx 1.8 \times 10^{-3}$$so the relative error is bounded by $\sim 10^{-3}$ -- only about 3 correct significant digits. At $n = 15$, $\kappa_2(H_{15}) \approx 10^{18}$ and the double-precision solution becomes essentially meaningless. Mitigations: (i) extended precision (multi-precision or quad), (ii) iterative refinement, (iii) switch to orthogonalization-based solvers (QR / SVD).
Sparse matrices and sparse LU
Even a 5-band tridiagonal matrix can suffer cascading fill-in under naive Gaussian elimination and end up effectively dense. Sparse $LU$ solvers therefore proceed in three stages:
- Reordering: A symmetric row/column permutation chosen to minimize fill-in -- Minimum Degree, AMD (Approximate Minimum Degree), Nested Dissection, etc.
- Symbolic factorization: Before any arithmetic, determine which positions will become nonzero and allocate the data structure.
- Numerical factorization: Compute the actual $LU$ values along the precomputed pattern.
Representative implementations: SuperLU (general purpose), UMFPACK (unsymmetric; the backend of SciPy spsolve), PARDISO (Intel MKL), MUMPS (parallel).
9. Frequently Asked Questions
Q1. What is Gaussian elimination?
Gaussian elimination is a direct method that solves a system of linear equations in two stages: forward elimination and back substitution. Its computational complexity is $O(n^3)$, and it is the most fundamental algorithm in numerical linear algebra.
Q2. Why is pivoting necessary?
If the pivot element is zero or close to zero, division by zero or large rounding errors can occur. Partial pivoting selects the element with the largest absolute value in the column as the pivot, thereby improving numerical stability.
Q3. What is the relationship between Gaussian elimination and LU decomposition?
The forward elimination process is equivalent to decomposing the matrix $A$ into $L$ (lower triangular) and $U$ (upper triangular). The elimination multipliers become the elements of $L$, and the eliminated matrix becomes $U$.
10. References
- Wikipedia, "Gaussian elimination"
- Wikipedia, "LU decomposition"
- G. H. Golub & C. F. Van Loan, Matrix Computations, 4th ed., Johns Hopkins, 2013 (Chapter 3 on $LU$ factorization; §3.4 on growth factors and the Wilkinson matrix).
- L. N. Trefethen & D. Bau III, Numerical Linear Algebra, SIAM, 1997 (Lectures 20-23 on $LU$ and partial pivoting).
- N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, 2002 (§9 on backward error analysis of Gaussian elimination; §12 on iterative refinement).
Implementation in sangi
The algorithm in this article is available in sangi's LinAlg module.