Proof of Row Rank = Column Rank (Strang style)

Chapter 11 Matrix Rank — Appendix

Goals of this page

Rigorously prove the theorem stated in Chapter 11 Matrix Rank §3: "for any matrix, row rank equals column rank," using the rank factorization $A = CB$ (Strang style). Rather than settling for a restatement of definitions, we extract inequalities in both directions from the symmetric structure of $A$ and $A^T$, and combine them.

Prerequisites

§1. Preparation: Row Space, Column Space, Dimension

We organize the terminology and basic identities used in the proof.

Terminology: row space, column space, dimension

Row space and column space: the row space $\mathrm{Row}(A)$ of $A$ is the subspace of $\mathbb{R}^n$ spanned by the row vectors of $A$; the column space $\mathrm{Col}(A)$ is the subspace of $\mathbb{R}^m$ spanned by the column vectors of $A$ (see Chapter 1 §4 subspaces and §5 bases and dimension).

Dimension: the dimension $\dim V$ of a subspace $V$ is the number of basis vectors (see Chapter 1 §5.3). Well-definedness of $\dim V$ (the basis count is independent of the choice of basis) is also established in Chapter 1. Linear independence is treated in Chapter 2.

Fact to fill in here: suppose a vector space $V$ is spanned by finitely many vectors $\boldsymbol{v}_1, \ldots, \boldsymbol{v}_k$ (they need not be linearly independent; some may be redundant — e.g., some may be linear combinations of the others). Pick a subset satisfying the following two conditions (let the selected count be $r$, with $0 \leq r \leq k$):

  • (i) the chosen $r$ vectors are linearly independent;
  • (ii) adding any of the remaining ($k - r$) vectors to the chosen ones breaks linear independence.

Then the chosen $r$ vectors form a basis of $V$, so $r = \dim V$.

Proof (sketch): by (ii), adding any unchosen vector $\boldsymbol{v}$ to the chosen set makes it linearly dependent, i.e., $\boldsymbol{v}$ is a linear combination of the chosen $r$ vectors (otherwise independence would be preserved). Hence each $\boldsymbol{v}_i$ is a linear combination of the chosen ones; since $\{\boldsymbol{v}_1, \ldots, \boldsymbol{v}_k\}$ spans $V$, the chosen $r$ vectors also span $V$. Combined with (i), they form a basis. $\blacksquare$

It follows that the maximum number of linearly independent elements in a spanning set equals $\dim V$ (both equal the basis count). Since the row space is spanned by all rows of $A$ and the column space is spanned by all columns of $A$:

$$\text{row rank}(A) = \dim \mathrm{Row}(A), \qquad \text{col rank}(A) = \dim \mathrm{Col}(A). \tag{1.1}$$

Moreover, since "rows of $A^T$ = columns of $A$" by the definition of transpose, the same set of vectors spans the same subspace, so

$$\mathrm{Row}(A^T) = \mathrm{Col}(A), \qquad \mathrm{Col}(A^T) = \mathrm{Row}(A). \tag{1.2}$$

Identities (1.1)(1.2) are used centrally in §2 below.

§2. Theorem: Row Rank = Column Rank

Theorem: row rank = column rank

For any matrix, row rank and column rank are equal. We therefore speak simply of "the rank" without distinguishing rows and columns.

Proof (Strang style): let $A$ be an $m \times n$ matrix with row rank $r$.

  1. By definition, the row space of $A$ has a basis of $r$ vectors $\boldsymbol{b}_1, \ldots, \boldsymbol{b}_r \in \mathbb{R}^n$.
  2. Each row $\boldsymbol{a}_i$ of $A$ is expressed uniquely in terms of this basis: $$\boldsymbol{a}_i = c_{i1}\boldsymbol{b}_1 + c_{i2}\boldsymbol{b}_2 + \cdots + c_{ir}\boldsymbol{b}_r \quad (i = 1, \ldots, m).$$
  3. Treating $\boldsymbol{a}_i$ and $\boldsymbol{b}_j$ as column vectors ($\mathbb{R}^n$), transposing each to row vectors $\boldsymbol{a}_i^\top$ and $\boldsymbol{b}_j^\top$, stacking vertically, and collecting the coefficients $c_{ij}$ into an $m \times r$ matrix, (2) is equivalent to the single matrix identity: $$ \underbrace{\begin{pmatrix} \boldsymbol{a}_1^\top \\ \boldsymbol{a}_2^\top \\ \vdots \\ \boldsymbol{a}_m^\top \end{pmatrix}}_{A\ (m \times n)} = \underbrace{\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1r} \\ c_{21} & c_{22} & \cdots & c_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ c_{m1} & c_{m2} & \cdots & c_{mr} \end{pmatrix}}_{C\ (m \times r)} \underbrace{\begin{pmatrix} \boldsymbol{b}_1^\top \\ \boldsymbol{b}_2^\top \\ \vdots \\ \boldsymbol{b}_r^\top \end{pmatrix}}_{B\ (r \times n)} $$ i.e. $$A = CB.$$ This is the rank factorization, the foundation on which SVD and low-rank approximation rest.
  4. Comparing the $k$-th column of both sides: $$A_{*k} = C\, B_{*k}$$ each column of $A$ is a linear combination of the $r$ columns of $C$. Hence the column space of $A$ has dimension at most $r$: $$\text{col rank}(A) \leq r = \text{row rank}(A). \tag{2.1}$$
  5. Apply the same argument to $A^T$ ($A^T$ is $n \times m$). To avoid notational collision, use $r', \boldsymbol{d}_j, e_{ij}$ for $A^T$.
    Extraction structure for proof steps 1-4 and 5 (symmetric) (a) A's i-th row → aᵢ (steps 1-4) ····· ····· ····· ····· row i A (m × n) transpose ····· aᵢ (n × 1, column) aᵢ ∈ ℝⁿ (b) Aᵀ's i-th row → a′ᵢ (step 5) ···· ···· ···· ···· ···· row i Aᵀ (n × m) transpose ···· a′ᵢ (m × 1, column) a′ᵢ ∈ ℝᵐ (= A's i-th column)
    Figure 1: the two vector extractions in the proof (symmetric structure). (a) For the first half (steps 1-4), extract the $i$-th row of $A$ ($m \times n$) and transpose to obtain the column vector $\boldsymbol{a}_i \in \mathbb{R}^n$. (b) For the second half (step 5), extract the $i$-th row of $A^T$ ($n \times m$) and transpose to obtain the column vector $\boldsymbol{a}'_i \in \mathbb{R}^m$; equivalently, $\boldsymbol{a}'_i$ is the $i$-th column of $A$. Both follow the same "row → transpose → column vector" pattern, which is why steps 1-4 apply to $A^T$ verbatim.

    5-1. Let $r'$ be the row rank of $A^T$. By definition, the row space of $A^T$ has a basis of $r'$ vectors $\boldsymbol{d}_1, \ldots, \boldsymbol{d}_{r'} \in \mathbb{R}^m$ (remark: rows of $A^T$ = columns of $A$, so these also form a basis of the column space of $A$).

    5-2. Let $\boldsymbol{a}'_i$ denote the $i$-th row of $A^T$ (viewed as a column vector, $\boldsymbol{a}'_i \in \mathbb{R}^m$). It is expressed uniquely in this basis: $$\boldsymbol{a}'_i = e_{i1}\boldsymbol{d}_1 + e_{i2}\boldsymbol{d}_2 + \cdots + e_{i r'}\boldsymbol{d}_{r'} \quad (i = 1, \ldots, n).$$

    5-3. Collecting into matrix form: $$ \underbrace{\begin{pmatrix} (\boldsymbol{a}'_1)^\top \\ (\boldsymbol{a}'_2)^\top \\ \vdots \\ (\boldsymbol{a}'_n)^\top \end{pmatrix}}_{A^T\ (n \times m)} = \underbrace{\begin{pmatrix} e_{11} & e_{12} & \cdots & e_{1 r'} \\ e_{21} & e_{22} & \cdots & e_{2 r'} \\ \vdots & \vdots & \ddots & \vdots \\ e_{n1} & e_{n2} & \cdots & e_{n r'} \end{pmatrix}}_{E\ (n \times r')} \underbrace{\begin{pmatrix} \boldsymbol{d}_1^\top \\ \boldsymbol{d}_2^\top \\ \vdots \\ \boldsymbol{d}_{r'}^\top \end{pmatrix}}_{D\ (r' \times m)} $$ i.e. $$A^T = ED.$$

    5-4. Comparing the $k$-th column of both sides: $$(A^T)_{*k} = E\, D_{*k}$$ each column of $A^T$ is a linear combination of the $r'$ columns of $E$. Hence $$\text{col rank}(A^T) \leq r' = \text{row rank}(A^T). \tag{2.2}$$

    5-5. Apply the basic identities (1.1), (1.2) of §1 to $A^T$ to obtain:

    • $\text{row rank}(A^T) = \dim \mathrm{Row}(A^T) = \dim \mathrm{Col}(A) = \text{col rank}(A)$
      (first equality: (1.1) applied to $A^T$; second: (1.2) gives $\mathrm{Row}(A^T) = \mathrm{Col}(A)$; third: (1.1) again)
    • $\text{col rank}(A^T) = \dim \mathrm{Col}(A^T) = \dim \mathrm{Row}(A) = \text{row rank}(A)$
      (first equality: (1.1) applied to $A^T$; second: (1.2) gives $\mathrm{Col}(A^T) = \mathrm{Row}(A)$; third: (1.1) again)

    Each equality is justified in the annotation below the line.

    5-6. Substituting into (2.2) yields

    $$\text{row rank}(A) \leq \text{col rank}(A). \tag{2.3}$$
  6. Combining the inequality (2.1) from step 4 and the inequality (2.3) from step 5-6: $$\text{col rank}(A) \leq \text{row rank}(A) \quad (2.1), \qquad \text{row rank}(A) \leq \text{col rank}(A) \quad (2.3)$$ gives $\text{row rank}(A) = \text{col rank}(A).$ $\blacksquare$

§3. References

  • Strang, G. (2016). Introduction to Linear Algebra (5th ed.). Wellesley-Cambridge Press. — concise treatment of the rank factorization $A = CB$.
  • Axler, S. (2015). Linear Algebra Done Right (3rd ed.). Springer.