Elementary row operations as linear transformations

When one performs an elementary row operation on the augmented matrix \([ A ~|~ b]\) for the system \(Ax=b\), one is actually applying a linear transformation to both sides of the system.

To illustrate the ideas, we consider each of the three kinds of elementary row operations on an example with \(A = \begin{bmatrix} 1 & 0 & 2\\ 2 & 6 & 0 \\-2 & 1 & 0\end{bmatrix}\), \(x = \begin{bmatrix} x_1\\x_2\\x_3\end{bmatrix}\), and \(b = \begin{bmatrix} -1\\-2\\1\end{bmatrix}\). Thus, the system is \(\begin{bmatrix} x_1 + 2x_3\\ 2x_1 + 6x_2 \\-2x_1 +x_2 \end{bmatrix} = \begin{bmatrix} -1\\-2\\1\end{bmatrix}\).

Interchanging two rows

Suppose we interchange rows 1 and 3. The resulting system is represented by \(\begin{bmatrix} -2x_1 +x_2 \\ 2x_1 + 6x_2 \\ x_1 + 2x_3 \end{bmatrix} = \begin{bmatrix} 1\\-2\\-1\end{bmatrix}\), which is exactly the same as \(T(Ax) = T(b)\) where \(T\) is the linear transformation given by \(T\left(\begin{bmatrix}u\\v\\w\end{bmatrix}\right) =\begin{bmatrix}w\\v\\u\end{bmatrix} = \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0\end{bmatrix} \begin{bmatrix}u\\v\\w\end{bmatrix}\).

Notice that applying \(T\) to both sides of \(T(Ax)=T(b)\) gives us back \(Ax=b\). This is to be expected since \(T\) interchanges the 1st and 3rd entries of a tuple. Doing the interchange twice amounts to having done nothing.

In general, the matrix that represents the linear transformation that interchanges row \(i\) and row \(j\) can be obtained by interchanging row \(i\) and row \(j\) of the identity matrix of size \(m\), where \(m\) is the number of equations in the system.
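The rule above can be checked numerically. The following is a minimal sketch in plain Python, using the \(A\) and \(b\) of the running example; the helper names `identity`, `swap_matrix`, and `matmul` are hypothetical and introduced only for illustration.

```python
# Sketch: build the matrix that interchanges rows 1 and 3 and apply it to the example.

def identity(m):
    """Return the m x m identity matrix as a list of rows."""
    return [[1 if r == c else 0 for c in range(m)] for r in range(m)]

def swap_matrix(m, i, j):
    """Identity of size m with rows i and j interchanged (1-based indices)."""
    E = identity(m)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 0, 2], [2, 6, 0], [-2, 1, 0]]
b = [[-1], [-2], [1]]        # b stored as a 3 x 1 column

P = swap_matrix(3, 1, 3)     # matrix of T
PA = matmul(P, A)            # A with rows 1 and 3 interchanged
Pb = matmul(P, b)            # b with entries 1 and 3 interchanged
PP = matmul(P, P)            # interchanging twice

print(PA, Pb, PP)
```

Here `PP` comes out as the identity matrix, which matches the observation that doing the interchange twice amounts to doing nothing.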

Multiplying a row by a nonzero constant

Suppose that row 2 of \([ A ~|~ b]\) is multiplied by \(\frac{1}{2}\). The resulting system is represented by \(\begin{bmatrix} x_1 + 2x_3 \\ x_1 + 3x_2 \\ -2x_1 +x_2 \end{bmatrix} = \begin{bmatrix} -1\\-1\\1\end{bmatrix}\), which is exactly the same as \(T(Ax) = T(b)\) where \(T\) is the linear transformation given by \(T\left(\begin{bmatrix}u\\v\\w\end{bmatrix}\right) =\begin{bmatrix}u\\ \frac{1}{2} v\\w\end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & \frac{1}{2} & 0\\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix}u\\v\\w\end{bmatrix}\).

Now, let \(U\) be the linear transformation given by \(U\left(\begin{bmatrix}u\\v\\w\end{bmatrix}\right) = \begin{bmatrix} 1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix}u\\v\\w\end{bmatrix}\). Then \(U(T(Ax)) = U(T(b))\) gives us back \(Ax=b\). Here, \(U\) represents the operation of multiplying row 2 by \(2\). Clearly, multiplying a row by \(2\) after it has been multiplied by \(\frac{1}{2}\) restores the original row.

In general, the matrix that represents the linear transformation that multiplies row \(i\) by some nonzero constant \(\alpha\) can be obtained by multiplying row \(i\) of the identity matrix by \(\alpha\).
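As a quick check of this rule, here is a minimal sketch in plain Python that builds the scaling matrices \(T\) and \(U\) from the example and confirms that \(U\) undoes \(T\). The helper names `scaling_matrix` and `matmul` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def scaling_matrix(m, i, alpha):
    """Identity of size m with row i multiplied by alpha (1-based index)."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    E = [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]
    E[i - 1][i - 1] = Fraction(alpha)
    return E

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 0, 2], [2, 6, 0], [-2, 1, 0]]
b = [[-1], [-2], [1]]                      # b stored as a 3 x 1 column

T = scaling_matrix(3, 2, Fraction(1, 2))   # multiply row 2 by 1/2
U = scaling_matrix(3, 2, 2)                # multiply row 2 by 2

TA = matmul(T, A)                          # row 2 of A halved
Tb = matmul(T, b)                          # entry 2 of b halved
UT = matmul(U, T)                          # U applied after T

print(TA, Tb, UT)
```

The guard against `alpha == 0` reflects the requirement that the constant be nonzero: multiplying a row by zero could not be undone.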

Adding a constant multiple of a row to another row

Again, we work with the system given by \(Ax = b\). Suppose that \(2\) times row 1 is added to row 3. The resulting system is represented by \(\begin{bmatrix} x_1 + 2x_3 \\ 2x_1 + 6x_2 \\ x_2 + 4x_3 \end{bmatrix} = \begin{bmatrix} -1\\-2\\-1\end{bmatrix}\), which is exactly the same as \(T(Ax) = T(b)\) where \(T\) is the linear transformation given by \(T\left(\begin{bmatrix}u\\v\\w\end{bmatrix}\right) =\begin{bmatrix}u\\v\\2u+w\end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 2 & 0 & 1\end{bmatrix} \begin{bmatrix}u\\v\\w\end{bmatrix}\).

What is the elementary row operation that undoes what we have done? Because we have added \(2\) times row 1 to row 3, to undo this, we have to subtract \(2\) times row 1 from row 3. Thus, the elementary row operation that undoes it is to add \(-2\) times row 1 to row 3. Taking the linear transformation \(U\) given by \(U\left(\begin{bmatrix}u\\v\\w\end{bmatrix}\right) = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -2 & 0 & 1\end{bmatrix} \begin{bmatrix}u\\v\\w\end{bmatrix}\), we see that \(U(T(Ax)) = U(T(b))\) gives us back \(Ax = b\).

In general, the matrix that represents the linear transformation that adds \(\alpha\) times row \(i\) to row \(j\) is obtained from the identity matrix by adding \(\alpha\) times row \(i\) to row \(j\). For example, the matrix that represents adding 3 times row 4 to row 2 in a system with 5 equations is \(\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 3 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}\). This matrix is obtained from the \(5\times 5\) identity matrix by adding 3 times row 4 to row 2.
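The construction can be sketched in plain Python as follows. The helper names `row_addition_matrix` and `matmul` are hypothetical; the data are the \(A\) and \(b\) of the running example and the \(5\times 5\) case just described.

```python
def row_addition_matrix(m, i, j, alpha):
    """Identity of size m with alpha times row i added to row j (1-based indices)."""
    if i == j:
        raise ValueError("i and j must be different rows")
    E = [[1 if r == c else 0 for c in range(m)] for r in range(m)]
    E[j - 1][i - 1] = alpha   # row j of the identity picks up alpha in column i
    return E

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 0, 2], [2, 6, 0], [-2, 1, 0]]
b = [[-1], [-2], [1]]                    # b stored as a 3 x 1 column

T = row_addition_matrix(3, 1, 3, 2)      # add 2 times row 1 to row 3
U = row_addition_matrix(3, 1, 3, -2)     # add -2 times row 1 to row 3

TA = matmul(T, A)
Tb = matmul(T, b)
UT = matmul(U, T)                        # U applied after T

E5 = row_addition_matrix(5, 4, 2, 3)     # add 3 times row 4 to row 2, 5 equations

print(TA, Tb, UT, E5)
```

Note that the only entry that differs from the identity sits in row \(j\), column \(i\), which is why a single assignment suffices.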

Row reduction as the result of a linear transformation

The discussion above shows that row reduction can be seen as a sequence of linear transformations applied to both sides of the system \(Ax=b\). Since the composition of linear transformations is a linear transformation, there must be a single linear transformation that captures the entire sequence. As a result, there is a single matrix \(M\) such that \(M(Ax) = Mb\) gives the result of the row reduction. We will see an example of computing \(M\).
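One way to obtain such an \(M\) is to apply the same sequence of row operations to the identity matrix. The sketch below, in plain Python, illustrates this on the running example. The particular sequence (interchange rows 1 and 3, then multiply row 2 by \(\frac{1}{2}\), then add \(2\) times row 1 to row 3) is an assumption chosen for illustration, not a full row reduction, and the helper names are hypothetical.

```python
from fractions import Fraction

def swap(M, i, j):
    """Return a copy of M with rows i and j interchanged (1-based indices)."""
    R = [row[:] for row in M]
    R[i - 1], R[j - 1] = R[j - 1], R[i - 1]
    return R

def scale(M, i, alpha):
    """Return a copy of M with row i multiplied by alpha."""
    R = [row[:] for row in M]
    R[i - 1] = [alpha * x for x in R[i - 1]]
    return R

def add_multiple(M, i, j, alpha):
    """Return a copy of M with alpha times row i added to row j."""
    R = [row[:] for row in M]
    R[j - 1] = [y + alpha * x for x, y in zip(R[i - 1], R[j - 1])]
    return R

def row_ops(M):
    """Illustrative sequence: R1 <-> R3, then R2 <- (1/2) R2, then R3 <- R3 + 2 R1."""
    M = swap(M, 1, 3)
    M = scale(M, 2, Fraction(1, 2))
    M = add_multiple(M, 1, 3, 2)
    return M

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 0, 2], [2, 6, 0], [-2, 1, 0]]
b = [[-1], [-2], [1]]                    # b stored as a 3 x 1 column
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

M = row_ops(I3)                          # the single matrix for the whole sequence
MA = matmul(M, A)                        # should agree with row_ops(A)
Mb = matmul(M, b)                        # should agree with row_ops(b)

print(M, MA, Mb)
```

Equivalently, \(M\) is the product of the three individual matrices, with the matrix of the first operation as the rightmost factor, since it is the first one applied to \(Ax\) and \(b\).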

Elementary matrices

In the above discussion, we also saw how each elementary row operation can be undone. In technical language, this means that the linear transformations corresponding to elementary row operations are invertible. This is an important fact because it means that applying elementary row operations to a system of linear equations neither loses nor gains solutions.

The matrices that correspond to these linear transformations are called elementary matrices. It turns out that any square matrix that corresponds to an invertible linear transformation can be written as a product of elementary matrices. This remarkable fact will be exploited later when we look at determinants.

Exercise

  1. Let \(A = \begin{bmatrix} 1 & 2 \\ -1 & 1\end{bmatrix}\). Give an elementary matrix \(M\) such that \(MA = \begin{bmatrix} 1 & 2 \\ 1 & 5 \end{bmatrix}\).

  2. Give the \(4\times 4\) matrix that represents the following sequence of elementary row operations applied to a system of 4 linear equations: \(R_2 \leftrightarrow R_4\), \(R_1 \leftarrow R_1 + 2R_3\), \(R_1 \leftarrow 2R_1\).