MA2007B Linear Algebra I Lecture Note 18
Let $T:V\to W$ be a linear transformation. Let $\alpha=\{v_1,\ldots,v_n\}$ be a basis for $V$, and let $\beta=\{w_1,\ldots,w_m\}$ be a basis for $W$. Let $\{e_1,\ldots,e_n\}$ and $\{f_1,\ldots,f_m\}$ be the standard bases for $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. The transformation matrix $[T]_\alpha^\beta$ of $T$ is a matrix such that the following diagram commutes:
[Figure: commutative diagram with $T:V\to W$ on top, $[T]_\alpha^\beta:\mathbb{R}^n\to\mathbb{R}^m$ on the bottom, and vertical maps $\rho_V$, $\rho_W$]
where $\rho_V:v_i\mapsto e_i$ and $\rho_W:w_i\mapsto f_i$, i.e., we have the functional equation $\rho_W\circ T=[T]_\alpha^\beta\circ\rho_V$.
Example. Consider the linear transformation $T(x,y,z)=(x+y,y+z,z+x)$. Let $\{e_1,e_2,e_3\}$ be the standard basis of $\mathbb{R}^3$, and let $\{w_1,w_2,w_3\}=\left\{\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}0\\1\\1\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}\right\}$.
Then, since $T(e_1)=(1,0,1)=w_3$, $T(e_2)=(1,1,0)=w_1$, and $T(e_3)=(0,1,1)=w_2$, the transformation matrix is
$$\begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{bmatrix}.$$
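As a quick numerical check (a minimal numpy sketch; the matrices are taken from the example above), the $\beta$-coordinates of $T(e_j)$ can be computed by solving $BM=A$, where $A$ is the standard matrix of $T$ and the columns of $B$ are $w_1,w_2,w_3$:

```python
import numpy as np

# Standard matrix of T(x, y, z) = (x + y, y + z, z + x):
# row i gives the i-th output coordinate.
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)

# Columns of B are the target-basis vectors w1, w2, w3.
B = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]], dtype=float)

# [T]_alpha^beta satisfies B @ M = A, so M = B^{-1} A.
M = np.linalg.solve(B, A)
assert np.allclose(M, [[0, 1, 0],
                       [0, 0, 1],
                       [1, 0, 0]])
```

Column $j$ of $M$ holds the $\beta$-coordinates of $T(e_j)$, matching the matrix above.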
Example. Consider the differentiation operator $\frac{\partial}{\partial x}$ acting on polynomials of degree at most $n$. Let's find its transformation matrix. Consider the basis $\alpha=\{1,x,\ldots,x^n\}$ for $P_n(\mathbb{R})$ and the basis $\beta=\{1,x,\ldots,x^{n-1}\}$ for $P_{n-1}(\mathbb{R})$. We define the vertical maps $\rho_V$ by $x^i\mapsto e_{i+1}$ and $\rho_W$ by $x^i\mapsto f_{i+1}$, where $\{e_1,\ldots,e_{n+1}\}$ is the standard basis for $\mathbb{R}^{n+1}$ and $\{f_1,\ldots,f_n\}$ is the standard basis for $\mathbb{R}^n$. Therefore, $\rho_V(a_0+a_1x+\cdots+a_nx^n)=a_0e_1+a_1e_2+\cdots+a_ne_{n+1}$ and $\rho_W\left(\frac{\partial}{\partial x}(a_0+a_1x+\cdots+a_nx^n)\right)=a_1f_1+2a_2f_2+\cdots+na_nf_n$. We can then write the transformation matrix
$$\left[\frac{\partial}{\partial x}\right]_\alpha^\beta=\begin{bmatrix}0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 2 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & n\end{bmatrix}.$$
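A short numpy sketch (the degree bound $n=4$ is an arbitrary choice for illustration) that builds this $n\times(n+1)$ matrix and checks it against numpy's polynomial derivative:

```python
import numpy as np

n = 4  # arbitrary degree bound, chosen for illustration

# n x (n+1) differentiation matrix in the monomial bases: the column
# for x^j carries d/dx x^j = j x^(j-1) into the row for x^(j-1).
D = np.zeros((n, n + 1))
for j in range(1, n + 1):
    D[j - 1, j] = j

# Check on p(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4.
a = np.array([1., 2., 3., 4., 5.])   # rho_V coordinates of p
dp = D @ a                           # rho_W coordinates of p'
assert np.allclose(dp, np.polynomial.polynomial.polyder(a))
```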

Column space, row space, and range

Let AA be an m×nm\times n matrix. We can treat AA as a linear transformation from Rn\mathbb{R}^n to Rm\mathbb{R}^m. Let's explore the relationship between a matrix's column and row spaces and its corresponding linear transformation's range and null space.
The column space directly corresponds to the range of $A$. For any $y\in\operatorname{Im}(A)$, we have $y=Ax$ for some $x\in\mathbb{R}^n$. Let $v_1,\ldots,v_n$ be the column vectors of $A$. If we write $x=\begin{bmatrix} x_1\\x_2\\\vdots\\x_n \end{bmatrix}$, then we can express $y$ in the column form of $Ax$,
$$y=x_1v_1+x_2v_2+\cdots+x_nv_n.$$
Therefore, every element of $\operatorname{Im}(A)$ is a linear combination of the column vectors of $A$, and conversely every such linear combination equals $Ax$ for some $x$, so we conclude that $\operatorname{Im}(A)=C(A)$.
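The column-form identity above is easy to verify numerically (a small sketch; the matrix and vector are arbitrary illustrative data):

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 3.],
              [2., 0., 1., 1.]])   # an arbitrary 3 x 4 example
x = np.array([2., -1., 0., 3.])

# A @ x equals the linear combination x_1 v_1 + ... + x_n v_n
# of the columns v_j = A[:, j].
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))
assert np.allclose(A @ x, combo)
```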
The following theorem summarizes the relationship between the column space and the row space.
Theorem. Let $A$ be any $m\times n$ matrix.
1. $\operatorname{rank}(A^t)=\operatorname{rank}(A)$, i.e., $\dim(R(A))=\dim(C(A))$.
2. The rank of any matrix equals the maximum number of linearly independent rows of that matrix; that is, the rank of a matrix is the dimension of the subspace generated by the rows of that matrix.
3. The rows and columns of any matrix generate subspaces of the same dimension, numerically equal to the rank of the matrix.
Proof. The key insight is that elementary row operations preserve rank, since elementary matrices are invertible. Let $E_k\cdots E_1A$ be a triangular (row echelon) system obtained from $A$ by elementary matrices $E_i$. This gives us $\operatorname{rank}(E_k\cdots E_1A)=\operatorname{rank}(A)$. Since $E_k\cdots E_1A$ is a triangular system, its nonzero row vectors form a basis for the row space of $A$ (row operations do not change the row space). Observing that the pivot columns also form a triangular system, the dimension of the column space equals the number of nonzero rows. Thus $\dim(C(E_k\cdots E_1A))=\dim(R(E_k\cdots E_1A))$, and since the $E_i$ are invertible these dimensions agree with $\dim(C(A))$ and $\dim(R(A))$. $\blacksquare$
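Both facts used in the proof, that row rank equals column rank and that an invertible row operation preserves rank, can be spot-checked with numpy (an illustrative sketch; the matrix is arbitrary):

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],    # a multiple of row 0, so rank < 3
              [0., 1., 1., 0.]])

# Row rank equals column rank: rank(A^t) = rank(A).
assert np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A) == 2

# An elementary row operation E (add 3 * row 0 to row 1) is
# invertible, so multiplying by it preserves rank.
E = np.eye(3)
E[1, 0] = 3.
assert np.linalg.matrix_rank(E @ A) == np.linalg.matrix_rank(A)
```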
Remark. Let $A:\mathbb{R}^n\to\mathbb{R}^m$ be an $m\times n$ matrix. Since $\dim(C(A))=\dim(R(A))=\operatorname{rank}(A)$, we can use the rank-nullity theorem to show that the linear transformation induced by $A$ is not one-to-one (equivalently, the null space of $A$ is nontrivial) whenever $n>m$. The argument is as follows: first, we have
$$n=\dim(\mathbb{R}^n)=\operatorname{rank}(A)+\operatorname{nullity}(A)=\dim(R(A))+\operatorname{nullity}(A).$$
Since $A$ has only $m$ rows, $\dim(R(A))\le m$, so $\operatorname{nullity}(A)\ge n-m>0$. Hence the null space of $A$ is nontrivial, and $A$ is not one-to-one.
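A numerical illustration of this remark (an arbitrary wide matrix with $n=4>m=2$; the null-space basis is read off from the SVD):

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [0., 1., 1., 1.]])   # m = 2 rows, n = 4 columns
m, n = A.shape

rank = np.linalg.matrix_rank(A)
nullity = n - rank                 # rank-nullity theorem
assert rank <= m and nullity >= n - m > 0

# The last n - rank rows of Vt span the null space of A.
_, _, Vt = np.linalg.svd(A)
N = Vt[rank:]
assert np.allclose(A @ N.T, 0)     # every basis vector is killed by A
```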

How to find a basis for a null space

Finding a basis for the null space of $A$ is crucial because it determines the solution set of the linear equation $Ax=b$ whenever solutions exist. More precisely, we have the following equality:
$$\{x\mid Ax=b\}=x_0+N(A),$$
where $x_0$ is a solution of $Ax=b$ and $N(A)$ is the null space of $A$. This equality can be proved by set inclusion in both directions.
($\subseteq$) If $x'$ is a solution, then $A(x'-x_0)=Ax'-Ax_0=b-b=0$. Therefore $x'-x_0\in N(A)$, so $x'=x_0+(x'-x_0)\in x_0+N(A)$.
($\supseteq$) An element $x'\in x_0+N(A)$ can be written as $x_0+n$ for some $n\in N(A)$. We compute $A(x_0+n)=Ax_0+An=b+0=b$.
We use the following example to illustrate how to find a basis for a null space.
Example. Let us find all solutions for
$$\begin{cases} x+y+z=0\\ x+y+w=0\\ x+y+2z-w=0\\ z-w=0 \end{cases}$$
This is equivalent to finding a basis for the null space of the matrix
$$\begin{bmatrix} 1 & 1 & 1 & 0\\ 1 & 1 & 0 & 1\\ 1 & 1 & 2 & -1\\ 0 & 0 & 1 & -1 \end{bmatrix}.$$
Carrying Gaussian elimination to completion, we get
$$\begin{bmatrix} 1 & 1 & 0 & 1\\ 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
(After finding each pivot, we eliminate all entries above and below it by making them zero.)
Now, here's a useful trick to complete our task. We label each column with a variable. If the $i$-th column has a pivot, then we can express that variable in terms of the other variables. If the $i$-th column has no pivot, then the corresponding variable (call it $x_i$) is free to choose, and we add the trivial equation $x_i=x_i$ to our system of linear equations.
The last matrix above represents the following system of linear equations:
$$\begin{cases} x+y+w=0\\ z-w=0 \end{cases}\equiv \begin{cases} x=-y-w\\ y=y\\ z=w\\ w=w \end{cases}$$
Therefore, we have
$$\begin{bmatrix} x\\y\\z\\w \end{bmatrix}=y\begin{bmatrix} -1\\1\\0\\0 \end{bmatrix}+w\begin{bmatrix} -1\\0\\1\\1 \end{bmatrix}.$$
That is,
$$N(A)=\operatorname{span}\left(\left\{\begin{bmatrix} -1\\1\\0\\0 \end{bmatrix},\begin{bmatrix} -1\\0\\1\\1 \end{bmatrix}\right\}\right).$$
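We can verify this basis numerically (a quick numpy check of the example above): both vectors lie in $N(A)$, they are linearly independent, and their count matches $n-\operatorname{rank}(A)$.

```python
import numpy as np

A = np.array([[1., 1., 1., 0.],
              [1., 1., 0., 1.],
              [1., 1., 2., -1.],
              [0., 0., 1., -1.]])

n1 = np.array([-1., 1., 0., 0.])   # from the free variable y
n2 = np.array([-1., 0., 1., 1.])   # from the free variable w

assert np.allclose(A @ n1, 0) and np.allclose(A @ n2, 0)
assert np.linalg.matrix_rank(np.column_stack([n1, n2])) == 2  # independent
assert A.shape[1] - np.linalg.matrix_rank(A) == 2             # nullity = 2
```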

Determinant

Historically, determinants played a major role in the study of linear algebra. They served as a computational tool to determine whether the coefficient matrix of a linear system was singular. Additionally, Cramer's rule used determinants to give explicit formulas for the solutions of linear systems. Today, however, we primarily use determinants for computing eigenvalues.
In this chapter, we develop an intuitive way to define determinants. One can also define determinants axiomatically; from that perspective, one can prove that the only function satisfying the axioms is the one we define intuitively. We will present the axiomatic definition without developing the related theorems further. Although determinants are no longer central to mathematical research, some of their properties remain essential to know; we will cover these properties in workshops and homework assignments.

Intuitive definition of determinant

The goal is to find an invariant of square matrices that we can use to determine whether a linear equation $Ax=0$ has a unique solution. In $\mathbb{R}^2$, the system
$$\begin{cases} a_{11}x+a_{12}y=0\\ a_{21}x+a_{22}y=0 \end{cases}$$
has a unique solution if and only if the column vectors $v_1=\begin{bmatrix}a_{11}\\a_{21}\end{bmatrix}$ and $v_2=\begin{bmatrix}a_{12}\\a_{22}\end{bmatrix}$ span a nondegenerate parallelogram, i.e., the area of the parallelogram spanned by $v_1$ and $v_2$ is nonzero. We use the following special case to illustrate the idea that this area is $a_{11}a_{22}-a_{12}a_{21}$.
[Figure: the parallelogram spanned by $v_1$ and $v_2$, with its area computed by decomposing an enclosing region into triangles and rectangles]
The area of the parallelogram is $(a_{11}-a_{12})(a_{21}+a_{22})-(-a_{12}a_{22})-(a_{11}a_{21})=a_{11}a_{22}-a_{12}a_{21}$. This formula holds in general, and we leave the details to the reader (see 3Blue1Brown, "The determinant | Chapter 6, Essence of linear algebra").
This motivates the following definition.
Definition. Let $A$ be a $2\times 2$ matrix. The determinant of $A$ is the scalar $a_{11}a_{22}-a_{12}a_{21}$. We shall denote it by $\det(A)$.
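A minimal sketch of this definition (the sample matrices are arbitrary): the $2\times 2$ determinant checked against `np.linalg.det`, plus a singular example where the parallelogram collapses.

```python
import numpy as np

def det2(a11, a12, a21, a22):
    """Signed area of the parallelogram spanned by (a11, a21) and (a12, a22)."""
    return a11 * a22 - a12 * a21

# Nonzero determinant: Ax = 0 has only the trivial solution.
assert np.isclose(det2(2, 1, 1, 3),
                  np.linalg.det(np.array([[2., 1.], [1., 3.]])))

# Parallel columns: the parallelogram degenerates, the area is zero.
assert det2(1, 2, 2, 4) == 0
```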
Similarly, in $\mathbb{R}^3$, the system
$$\begin{cases} a_{11}x+a_{12}y+a_{13}z=0\\ a_{21}x+a_{22}y+a_{23}z=0\\ a_{31}x+a_{32}y+a_{33}z=0 \end{cases}$$
has a unique solution if and only if the volume of the parallelepiped spanned by the three column vectors is nonzero. To find this volume, mathematicians expressed it in terms of the entries $a_{ij}$. After carefully reorganizing the terms, the volume is given as
$$a_{11}(a_{22}a_{33}-a_{23}a_{32})-a_{12}(a_{21}a_{33}-a_{23}a_{31})+a_{13}(a_{21}a_{32}-a_{22}a_{31}).$$
You can check that each of the following two expressions is identical to it:
$$-a_{21}(a_{12}a_{33}-a_{13}a_{32})+a_{22}(a_{11}a_{33}-a_{13}a_{31})-a_{23}(a_{11}a_{32}-a_{12}a_{31})$$
or
$$a_{31}(a_{12}a_{23}-a_{13}a_{22})-a_{32}(a_{11}a_{23}-a_{13}a_{21})+a_{33}(a_{11}a_{22}-a_{12}a_{21}).$$
While there are three additional equivalent expressions, all six can be unified in the following definition.
Definition. Let $A$ be a $3\times 3$ matrix. The determinant of $A$ is the scalar calculated, for any $i\in\{1,2,3\}$, by
$$\sum_{j=1}^3(-1)^{i+j}a_{ij}\det(A_{ij}),$$
where $A_{ij}$ is the submatrix of $A$ obtained by deleting the $i$-th row and $j$-th column. The term $(-1)^{i+j}\det(A_{ij})$ is called the cofactor of $a_{ij}$.
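The definition translates directly into code. The following sketch (a naive recursive implementation for illustration, not an efficient one; the test matrix is arbitrary) expands along a chosen row and checks that every choice of row gives the same value as `np.linalg.det`:

```python
import numpy as np

def det_cofactor(A, i=0):
    """Determinant by cofactor expansion along row i (0-indexed)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # The submatrix A_ij: delete row i and column j.
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        total += (-1) ** (i + j) * A[i, j] * det_cofactor(minor)
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
# Every row choice agrees with numpy's determinant.
assert all(np.isclose(det_cofactor(A, i), np.linalg.det(A)) for i in range(3))
```

With 0-indexed rows and columns, the sign $(-1)^{i+j}$ has the same parity as in the 1-indexed formula above, so the two agree.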
We summarize our discussion in the following table.
| | Geometric point of view | Algebraic point of view |
|---|---|---|
| $\dim=2$ | The area of the parallelogram is nonzero. | $Ax=0$ has a unique solution. |
| $\dim=3$ | The volume of the parallelepiped is nonzero. | $Ax=0$ has a unique solution. |
| $\dim>3$ | ??? is nonzero. | $Ax=0$ has a unique solution. |
Our goal is to extend the definition of the determinant to $n\times n$ matrices with $n>3$. This definition must preserve the key property that a matrix $A$ has a nonzero determinant if and only if $Ax=0$ has a unique solution.
Let's examine the patterns we found for $n=2$ and $n=3$, and see whether we can extend them to $n>3$.