Gram-Schmidt Orthonormalization Process

Gram-Schmidt orthonormalization process

Let \(V\) be a subspace of \(\mathbb{R}^n\) of dimension \(k\). We look at how one can obtain an orthonormal basis for \(V\) starting with any basis for \(V\).

Let \(\{v_1,\ldots,v_k\}\) be a basis for \(V\), not necessarily orthonormal.

We will construct \(\{u_1,\ldots,u_k\}\) iteratively such that \(\{u_1,\ldots, u_p\}\) is an orthonormal basis for the span of \(\{v_1,\ldots,v_p\}\).

For \(p = 1\), we simply take \(u_1 = \frac{1}{\|v_1\|}\).

Supposed that \(u_1,\ldots,u_{p-1}\) is an orthonormal basis for the span of \(\{v_1,\ldots,v_p\}\).

Observe that we need to obtain a vector \(d\) such that \(\|d\| = 1\), \(u_i\cdot d=0\) for \(i = 1,\ldots, p-1\), and that \(v_p\) can be written as a linear combination of \(u_1,\ldots, u_{p-1},d\). Ignoring the condition \(\|d\| = 1\) for now, we want to solve for \(d\) in the following system: \begin{eqnarray*} u_1\cdot d & = & 0 \\ u_2\cdot d & = & 0 \\ & \vdots & \\ u_{p-1}\cdot d & = & 0 \\ \lambda_1 u_1 \cdots \cdot \cdots \lambda_{p-1} u_{p-1} + \alpha d & = & v_p \end{eqnarray*} Here, the variables are \(\lambda_1,\ldots,\lambda_{p-1}\) and \(d_1,\ldots,d_n\). Unfortunately, the last equation in the system is not linear. But notice that given any solution, we cannot have \(\alpha = 0\) since \(v_p\) is not in the span of \(u_1,\ldots, u_{p-1}\). Hence, assuming that \(\alpha \neq 0\), we can make the substituion \(u = \alpha d\) and obtain the system \begin{eqnarray*} \frac{1}{\alpha}u_1\cdot u & = & 0 \\ \frac{1}{\alpha}u_2\cdot u & = & 0 \\ & \vdots & \\ \frac{1}{\alpha}u_{p-1}\cdot u & = & 0 \\ \lambda_1 u_1 \cdots \cdot \cdots \lambda_{p-1} u_{p-1} + u & = & v_p \end{eqnarray*} or equivalently, \begin{eqnarray*} u_1\cdot u & = & 0 \\ u_2\cdot u & = & 0 \\ & \vdots & \\ u_{p-1} \cdot u & = & 0 \\ \lambda_1 u_1 \cdots \cdot \cdots \lambda_{p-1} u_{p-1} + u & = & v_p \end{eqnarray*} Using the last equation, we can write \(u = v_p - (\lambda_1 u_1 + \cdots + \lambda_{p-1} u_{p-1})\). Substituting this into the \(i\)th equation where \(i \in \{1,\ldots, p-1\}\), we obtain \[ u_i\cdot (v_p - (\lambda_1 u_1 + \cdots + \lambda_{p-1} u_{p-1})) = 0, \] which simplifies to \(u_i\cdot v_p = \lambda_i\). Thus, we must have \[u = v_p - ((u_1\cdot v_p)u_1 + (u_2 \cdot v_p) u_2 + \cdots + (u_{p-1} \cdot v_p) u_{p-1}).\] Note that \(u \neq 0\). Hence, we can take \(u_p = \frac{1}{\|u\|} u\).

The process can be summarized as follows: \begin{eqnarray*} u_1 & = & \frac{1}{\|v_1\|}v_1, \\ u_2 & = & \frac{1}{\|e_2\|}e_2 \text{ where } e_2 = v_2 - (u_1\cdot v_2)u_1, \\ u_3 & = & \frac{1}{\|e_3\|}e_3 \text{ where } e_3 = v_3 - ((u_1\cdot v_3)u_1 + (u_2\cdot v_3)u_2), \\ & \vdots & \\ u_i & = & \frac{1}{\|e_i\|}e_i \text{ where } e_i = v_i - ((u_1\cdot v_i)u_1 + (u_2\cdot v_i)u_2 + \cdots +(u_{i-1}\cdot v_i)u_{i-1})) \\ & \vdots & \\ u_k & = & \frac{1}{\|e_k\|}e_k \text{ where } e_k = v_k - ((u_1\cdot v_k)u_1 + (u_2\cdot v_k)u_2 + \cdots +(u_{k-1}\cdot v_k)u_{k-1})) \\ \end{eqnarray*}

Example

Let \(v_1 = \begin{bmatrix} 1\\ 1 \\1 \\ 1 \end{bmatrix}\), \(v_2 = \begin{bmatrix} 1\\ 1 \\-1 \\ -1 \end{bmatrix}\), \(v_3 = \begin{bmatrix} 0\\ -1 \\2 \\ 1 \end{bmatrix}\). We now apply the Gram-Schmidt process to \(\{v_1,v_2,v_3\}\).

Note that \(u_1 = \frac{1}{\|v_1\|}v_1 = \frac{1}{2}\begin{bmatrix} 1 \\ 1\\ 1\\ 1 \end{bmatrix}\).

Now, \(e_2 = v_2 - (u_1\cdot v_2) u_1 = \begin{bmatrix} 1\\1 \\ -1 \\-1\end{bmatrix} - (0)u_1 = \begin{bmatrix} 1\\1 \\ -1 \\-1\end{bmatrix}\). Hence, \(u_2 = \frac{1}{2}\begin{bmatrix} 1\\1 \\ -1 \\-1\end{bmatrix}\). (Note here that \(v_2\) and \(u_1\) are already orthogonal. Thus \(u_2\) is obtained simply by normalizing \(v_2\).)

Now, \begin{eqnarray*} e_3 & = & v_3 - (u_1\cdot v_3) u_1 - (u_2\cdot v_3) u_2 \\ & = & \begin{bmatrix} 0\\-1 \\ 2 \\1\end{bmatrix} - u_1 - (-2)u_2 \\ & = & \begin{bmatrix} 0\\-1 \\ 2 \\1\end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1\\1 \\1 \\1\end{bmatrix} +\begin{bmatrix} 1 \\ 1 \\ -1 \\-1\end{bmatrix} \\ & = & \begin{bmatrix} \frac{1}{2}\\ -\frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2}\end{bmatrix}. \end{eqnarray*} Since \(\|e_3\| = 1\), we have \(u_3 = \begin{bmatrix} \frac{1}{2}\\ -\frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2}\end{bmatrix}\). Thus, \(\{u_1,u_2,u_3\}\) is an orthonormal basis for \(\operatorname{span}(\{v_1,v_2,v_3\})\).