MA2007B Linear Algebra I Lecture Note 18
Let $T:V\to W$ be a linear transformation. Let $\alpha=\{v_1,\ldots,v_n\}$ be a basis for $V$, and let $\beta=\{w_1,\ldots,w_m\}$ be a basis for $W$. Let $\{e_1,\ldots,e_n\}$ and $\{f_1,\ldots,f_m\}$ be the standard bases for $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. The transformation matrix $[T]_\alpha^\beta$ of $T$ is a matrix such that the following diagram commutes:
[Figure: commutative diagram with $T:V\to W$ on top, $[T]_\alpha^\beta:\mathbb{R}^n\to\mathbb{R}^m$ on the bottom, and vertical maps $\rho_V$, $\rho_W$]
where $\rho_V:v_i\mapsto e_i$ and $\rho_W:w_i\mapsto f_i$, i.e., we have the functional equation $\rho_W\circ T=[T]_\alpha^\beta\circ\rho_V$.
Example. Consider the linear transformation $T(x,y,z)=(x+y,y+z,z+x)$. Let $\{e_1,e_2,e_3\}$ be the standard basis of $\mathbb{R}^3$, and let $\{w_1,w_2,w_3\}=\left\{\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}0\\1\\1\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}\right\}$.
Then, since $T(e_1)=(1,0,1)=w_3$, $T(e_2)=(1,1,0)=w_1$, and $T(e_3)=(0,1,1)=w_2$, the transformation matrix is
$$\begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{bmatrix}.$$
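As a quick numerical check (a minimal numpy sketch; the matrices are taken from the example above), the $\beta$-coordinates of $T(e_j)$ can be computed by solving $BM=A$, where $A$ is the standard matrix of $T$ and the columns of $B$ are $w_1,w_2,w_3$:

```python
import numpy as np

# Standard matrix of T(x, y, z) = (x + y, y + z, z + x):
# row i gives the i-th output coordinate.
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)

# Columns of B are the target-basis vectors w1, w2, w3.
B = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]], dtype=float)

# [T]_alpha^beta satisfies B @ M = A, so M = B^{-1} A.
M = np.linalg.solve(B, A)
assert np.allclose(M, [[0, 1, 0],
                       [0, 0, 1],
                       [1, 0, 0]])
```

Column $j$ of $M$ holds the $\beta$-coordinates of $T(e_j)$, matching the matrix above.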
Example. Consider the differentiation operator $\frac{\partial}{\partial x}$ acting on polynomials of degree at most $n$. Let's find its transformation matrix. Consider the basis $\alpha=\{1,x,\ldots,x^n\}$ for $P_n(\mathbb{R})$ and the basis $\beta=\{1,x,\ldots,x^{n-1}\}$ for $P_{n-1}(\mathbb{R})$. We define the vertical maps $\rho_V$ by $x^i\mapsto e_{i+1}$ and $\rho_W$ by $x^i\mapsto f_{i+1}$, where $\{e_1,\ldots,e_{n+1}\}$ is the standard basis for $\mathbb{R}^{n+1}$ and $\{f_1,\ldots,f_n\}$ is the standard basis for $\mathbb{R}^n$. Therefore, $\rho_V(a_0+a_1x+\cdots+a_nx^n)=a_0e_1+a_1e_2+\cdots+a_ne_{n+1}$ and $\rho_W\left(\frac{\partial}{\partial x}(a_0+a_1x+\cdots+a_nx^n)\right)=a_1f_1+2a_2f_2+\cdots+na_nf_n$. We can then write the transformation matrix
$$\left[\frac{\partial}{\partial x}\right]_\alpha^\beta=\begin{bmatrix}0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 2 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & n\end{bmatrix}.$$
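A short numpy sketch (the degree bound $n=4$ is an arbitrary choice for illustration) that builds this $n\times(n+1)$ matrix and checks it against numpy's polynomial derivative:

```python
import numpy as np

n = 4  # arbitrary degree bound, chosen for illustration

# n x (n+1) differentiation matrix in the monomial bases: the column
# for x^j carries d/dx x^j = j x^(j-1) into the row for x^(j-1).
D = np.zeros((n, n + 1))
for j in range(1, n + 1):
    D[j - 1, j] = j

# Check on p(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4.
a = np.array([1., 2., 3., 4., 5.])   # rho_V coordinates of p
dp = D @ a                           # rho_W coordinates of p'
assert np.allclose(dp, np.polynomial.polynomial.polyder(a))
```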

Column space, row space, and range

Let AA be an m×nm\times n matrix. We can treat AA as a linear transformation from Rn\mathbb{R}^n to Rm\mathbb{R}^m. Let's explore the relationship between a matrix's column and row spaces and its corresponding linear transformation's range and null space.
The column space directly corresponds to the range of $A$. For any $y\in\operatorname{Im}(A)$, we have $y=Ax$ for some $x\in\mathbb{R}^n$. Let $v_1,\ldots,v_n$ be the column vectors of $A$. If we write $x=\begin{bmatrix} x_1\\x_2\\\vdots\\x_n \end{bmatrix}$, then we can express $y$ in the column form of $Ax$,
$$y=x_1v_1+x_2v_2+\cdots+x_nv_n.$$
Therefore, every element of $\operatorname{Im}(A)$ is a linear combination of the column vectors of $A$, and conversely every such linear combination equals $Ax$ for some $x$, so we conclude that $\operatorname{Im}(A)=C(A)$.
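The column-form identity above is easy to verify numerically (a small sketch; the matrix and vector are arbitrary illustrative data):

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 3.],
              [2., 0., 1., 1.]])   # an arbitrary 3 x 4 example
x = np.array([2., -1., 0., 3.])

# A @ x equals the linear combination x_1 v_1 + ... + x_n v_n
# of the columns v_j = A[:, j].
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))
assert np.allclose(A @ x, combo)
```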
The following theorem summarizes the relationship between the column space and the row space.
Theorem. Let $A$ be any $m\times n$ matrix.
1. $\operatorname{rank}(A^t)=\operatorname{rank}(A)$, i.e., $\dim(R(A))=\dim(C(A))$.
2. The rank of any matrix equals the maximum number of linearly independent rows of that matrix; that is, the rank of a matrix is the dimension of the subspace generated by the rows of that matrix.
3. The rows and columns of any matrix generate subspaces of the same dimension, numerically equal to the rank of the matrix.
Proof. The key insight is that elementary row operations preserve rank, since elementary matrices are invertible. Let $E_k\cdots E_1A$ be a triangular (row echelon) system obtained from $A$ by elementary matrices $E_i$. This gives us $\operatorname{rank}(E_k\cdots E_1A)=\operatorname{rank}(A)$. Since $E_k\cdots E_1A$ is a triangular system, its nonzero row vectors form a basis for the row space of $A$ (row operations do not change the row space). Observing that the pivot columns also form a triangular system, the dimension of the column space equals the number of nonzero rows. Thus $\dim(C(E_k\cdots E_1A))=\dim(R(E_k\cdots E_1A))$, and since the $E_i$ are invertible these dimensions agree with $\dim(C(A))$ and $\dim(R(A))$. $\blacksquare$
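Both facts used in the proof, that row rank equals column rank and that an invertible row operation preserves rank, can be spot-checked with numpy (an illustrative sketch; the matrix is arbitrary):

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],    # a multiple of row 0, so rank < 3
              [0., 1., 1., 0.]])

# Row rank equals column rank: rank(A^t) = rank(A).
assert np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A) == 2

# An elementary row operation E (add 3 * row 0 to row 1) is
# invertible, so multiplying by it preserves rank.
E = np.eye(3)
E[1, 0] = 3.
assert np.linalg.matrix_rank(E @ A) == np.linalg.matrix_rank(A)
```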
Remark. Let $A:\mathbb{R}^n\to\mathbb{R}^m$ be an $m\times n$ matrix. Since $\dim(C(A))=\dim(R(A))=\operatorname{rank}(A)$, we can use the rank-nullity theorem to show that the linear transformation induced by $A$ is not one-to-one (equivalently, the null space of $A$ is nontrivial) whenever $n>m$. The argument is as follows: first, we have
$$n=\dim(\mathbb{R}^n)=\operatorname{rank}(A)+\operatorname{nullity}(A)=\dim(R(A))+\operatorname{nullity}(A).$$
Since $A$ has only $m$ rows, $\dim(R(A))\le m$, so $\operatorname{nullity}(A)\ge n-m>0$. Hence the null space of $A$ is nontrivial, and $A$ is not one-to-one.
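A numerical illustration of this remark (an arbitrary wide matrix with $n=4>m=2$; the null-space basis is read off from the SVD):

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [0., 1., 1., 1.]])   # m = 2 rows, n = 4 columns
m, n = A.shape

rank = np.linalg.matrix_rank(A)
nullity = n - rank                 # rank-nullity theorem
assert rank <= m and nullity >= n - m > 0

# The last n - rank rows of Vt span the null space of A.
_, _, Vt = np.linalg.svd(A)
N = Vt[rank:]
assert np.allclose(A @ N.T, 0)     # every basis vector is killed by A
```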

How to find a basis for a null space

Finding a basis for the null space of $A$ is crucial because it determines the solution set of the linear equation $Ax=b$ whenever solutions exist. More precisely, we have the following equality:
$$\{x\mid Ax=b\}=x_0+N(A),$$
where $x_0$ is a solution of $Ax=b$ and $N(A)$ is the null space of $A$. This equality can be proved by set inclusion in both directions.
($\subseteq$) If $x'$ is a solution, then $A(x'-x_0)=Ax'-Ax_0=b-b=0$. Therefore $x'-x_0\in N(A)$, so $x'=x_0+(x'-x_0)\in x_0+N(A)$.
($\supseteq$) An element $x'\in x_0+N(A)$ can be written as $x_0+n$ for some $n\in N(A)$. We compute $A(x_0+n)=Ax_0+An=b+0=b$.
We use the following example to illustrate how to find a basis for a null space.
Example. Let us find all solutions for
$$\begin{cases} x+y+z=0\\ x+y+w=0\\ x+y+2z-w=0\\ z-w=0 \end{cases}$$
This is equivalent to finding a basis for the null space of the matrix
$$\begin{bmatrix} 1 & 1 & 1 & 0\\ 1 & 1 & 0 & 1\\ 1 & 1 & 2 & -1\\ 0 & 0 & 1 & -1 \end{bmatrix}.$$
Carrying Gaussian elimination to completion, we get
$$\begin{bmatrix} 1 & 1 & 0 & 1\\ 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
(After finding each pivot, we eliminate all entries above and below it by making them zero.)
Now, here's a useful trick to complete our task. We label each column with a variable. If the $i$-th column has a pivot, then we can express that variable in terms of the other variables. If the $i$-th column has no pivot, then the corresponding variable (call it $x_i$) is free to choose, and we add the trivial equation $x_i=x_i$ to our system of linear equations.
The last matrix above represents the following system of linear equations:
$$\begin{cases} x+y+w=0\\ z-w=0 \end{cases}\equiv \begin{cases} x=-y-w\\ y=y\\ z=w\\ w=w \end{cases}$$
Therefore, we have
$$\begin{bmatrix} x\\y\\z\\w \end{bmatrix}=y\begin{bmatrix} -1\\1\\0\\0 \end{bmatrix}+w\begin{bmatrix} -1\\0\\1\\1 \end{bmatrix}.$$
That is,
$$N(A)=\operatorname{span}\left(\left\{\begin{bmatrix} -1\\1\\0\\0 \end{bmatrix},\begin{bmatrix} -1\\0\\1\\1 \end{bmatrix}\right\}\right).$$
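We can verify this basis numerically (a quick numpy check of the example above): both vectors lie in $N(A)$, they are linearly independent, and their count matches $n-\operatorname{rank}(A)$.

```python
import numpy as np

A = np.array([[1., 1., 1., 0.],
              [1., 1., 0., 1.],
              [1., 1., 2., -1.],
              [0., 0., 1., -1.]])

n1 = np.array([-1., 1., 0., 0.])   # from the free variable y
n2 = np.array([-1., 0., 1., 1.])   # from the free variable w

assert np.allclose(A @ n1, 0) and np.allclose(A @ n2, 0)
assert np.linalg.matrix_rank(np.column_stack([n1, n2])) == 2  # independent
assert A.shape[1] - np.linalg.matrix_rank(A) == 2             # nullity = 2
```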

Determinant

Historically, determinants played a major role in the study of linear algebra. They served as a computational tool to determine whether the coefficient matrix of a linear system was singular. Additionally, Cramer's rule used determinants to give explicit formulas for the solutions of linear systems. Today, however, we primarily use determinants for computing eigenvalues.
In this chapter, we develop an intuitive way to define determinants. One can also define determinants axiomatically; from that perspective, one can prove that the only function satisfying the axioms is the one we define intuitively. We will present the axiomatic definition without developing the related theorems further. Although determinants are no longer central to mathematical research, some of their properties remain essential to know; we will cover these properties in workshops and homework assignments.

Intuitive definition of determinant

The goal is to find an invariant of square matrices that we can use to determine whether a linear equation $Ax=0$ has a unique solution. In $\mathbb{R}^2$, the system
$$\begin{cases} a_{11}x+a_{12}y=0\\ a_{21}x+a_{22}y=0 \end{cases}$$
has a unique solution if and only if the column vectors $v_1=\begin{bmatrix}a_{11}\\a_{21}\end{bmatrix}$ and $v_2=\begin{bmatrix}a_{12}\\a_{22}\end{bmatrix}$ span a nondegenerate parallelogram, i.e., the area of the parallelogram spanned by $v_1$ and $v_2$ is nonzero. We use the following special case to illustrate the idea that this area is $a_{11}a_{22}-a_{12}a_{21}$.
[Figure: the parallelogram spanned by $v_1$ and $v_2$, with its area computed by decomposing an enclosing region into triangles and rectangles]
The area of the parallelogram is $(a_{11}-a_{12})(a_{21}+a_{22})-(-a_{12}a_{22})-(a_{11}a_{21})=a_{11}a_{22}-a_{12}a_{21}$. This formula holds in general, and we leave the details to the reader (see 3Blue1Brown, "The determinant | Chapter 6, Essence of linear algebra").
This motivates the following definition.
Definition. Let $A$ be a $2\times 2$ matrix. The determinant of $A$ is the scalar $a_{11}a_{22}-a_{12}a_{21}$. We shall denote it by $\det(A)$.
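A minimal sketch of this definition (the sample matrices are arbitrary): the $2\times 2$ determinant checked against `np.linalg.det`, plus a singular example where the parallelogram collapses.

```python
import numpy as np

def det2(a11, a12, a21, a22):
    """Signed area of the parallelogram spanned by (a11, a21) and (a12, a22)."""
    return a11 * a22 - a12 * a21

# Nonzero determinant: Ax = 0 has only the trivial solution.
assert np.isclose(det2(2, 1, 1, 3),
                  np.linalg.det(np.array([[2., 1.], [1., 3.]])))

# Parallel columns: the parallelogram degenerates, the area is zero.
assert det2(1, 2, 2, 4) == 0
```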
Similarly, in $\mathbb{R}^3$, the system
$$\begin{cases} a_{11}x+a_{12}y+a_{13}z=0\\ a_{21}x+a_{22}y+a_{23}z=0\\ a_{31}x+a_{32}y+a_{33}z=0 \end{cases}$$
has a unique solution if and only if the volume of the parallelepiped spanned by the three column vectors is nonzero. To find this volume, mathematicians expressed it in terms of the entries $a_{ij}$. After carefully reorganizing the terms, the volume is given as
$$a_{11}(a_{22}a_{33}-a_{23}a_{32})-a_{12}(a_{21}a_{33}-a_{23}a_{31})+a_{13}(a_{21}a_{32}-a_{22}a_{31}).$$
You can check that each of the following two expressions is identical to it:
$$-a_{21}(a_{12}a_{33}-a_{13}a_{32})+a_{22}(a_{11}a_{33}-a_{13}a_{31})-a_{23}(a_{11}a_{32}-a_{12}a_{31})$$
or
$$a_{31}(a_{12}a_{23}-a_{13}a_{22})-a_{32}(a_{11}a_{23}-a_{13}a_{21})+a_{33}(a_{11}a_{22}-a_{12}a_{21}).$$
While there are three additional equivalent expressions, all six can be unified in the following definition.
Definition. Let $A$ be a $3\times 3$ matrix. The determinant of $A$ is the scalar calculated, for any $i\in\{1,2,3\}$, by
$$\sum_{j=1}^3(-1)^{i+j}a_{ij}\det(A_{ij}),$$
where $A_{ij}$ is the submatrix of $A$ obtained by deleting the $i$-th row and $j$-th column. The term $(-1)^{i+j}\det(A_{ij})$ is called the cofactor of $a_{ij}$.
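The definition translates directly into code. The following sketch (a naive recursive implementation for illustration, not an efficient one; the test matrix is arbitrary) expands along a chosen row and checks that every choice of row gives the same value as `np.linalg.det`:

```python
import numpy as np

def det_cofactor(A, i=0):
    """Determinant by cofactor expansion along row i (0-indexed)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # The submatrix A_ij: delete row i and column j.
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        total += (-1) ** (i + j) * A[i, j] * det_cofactor(minor)
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
# Every row choice agrees with numpy's determinant.
assert all(np.isclose(det_cofactor(A, i), np.linalg.det(A)) for i in range(3))
```

With 0-indexed rows and columns, the sign $(-1)^{i+j}$ has the same parity as in the 1-indexed formula above, so the two agree.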
We summarize our discussion in the following table.
| | Geometric point of view | Algebraic point of view |
|---|---|---|
| $\dim=2$ | The area of the parallelogram is nonzero. | $Ax=0$ has a unique solution. |
| $\dim=3$ | The volume of the parallelepiped is nonzero. | $Ax=0$ has a unique solution. |
| $\dim>3$ | ??? is nonzero. | $Ax=0$ has a unique solution. |
Our goal is to extend the definition of the determinant to $n\times n$ matrices with $n>3$. This definition must preserve the key property that a matrix $A$ has a nonzero determinant if and only if $Ax=0$ has a unique solution.
Let's examine the patterns we found for $n=2$ and $n=3$, and see whether we can extend them to $n>3$.