Let \(A\) be an \(n\times n\) matrix. \(A\) is said to be symmetric if \(A = A^\mathsf{T}\).
For example, each of the following matrices is symmetric: \(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\begin{bmatrix} \pi & 1 \\ 1 & \sqrt{2} \end{bmatrix}\), \(\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}\).
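As an optional numerical illustration (a minimal sketch assuming NumPy is available; the matrices are the examples above), symmetry can be tested by comparing a matrix with its transpose:

```python
import numpy as np

# The three example matrices above.
A1 = np.eye(2)
A2 = np.array([[np.pi, 1.0],
               [1.0, np.sqrt(2)]])
A3 = np.array([[1.0, 2.0, 3.0],
               [2.0, 4.0, 5.0],
               [3.0, 5.0, 6.0]])

# A matrix is symmetric exactly when it equals its transpose.
for A in (A1, A2, A3):
    print(np.array_equal(A, A.T))   # True, True, True
```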
Symmetric matrices are found in many applications such as control theory, statistical analyses, and optimization.
Real symmetric matrices have only real eigenvalues. We will establish the \(2\times 2\) case here. Proving the general case requires a bit of ingenuity.
Let \(A\) be a \(2\times 2\) symmetric matrix with real entries. Then \(A = \begin{bmatrix} a & b\\ b & c\end{bmatrix}\) for some real numbers \(a,b,c\). The eigenvalues of \(A\) are all values of \(\lambda\) satisfying \[ \left|\begin{array}{cc} a - \lambda & b \\ b & c - \lambda \end{array}\right | = 0.\] Expanding the left-hand side, we get \[ \lambda^2 -(a+c)\lambda + ac - b^2 = 0.\] The left-hand side is a quadratic in \(\lambda\) with discriminant \( (a+c)^2 - 4(ac - b^2) = (a-c)^2 + 4b^2\), which is a sum of two squares of real numbers and is therefore nonnegative for all real values of \(a,b,c\). Hence, all roots of the quadratic are real, and so all eigenvalues of \(A\) are real.
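The computation is easy to try out numerically. Here is a small sketch (assuming NumPy; the values of \(a,b,c\) are chosen arbitrarily) that evaluates the discriminant and compares the roots of the characteristic polynomial with the eigenvalues computed directly:

```python
import numpy as np

a, b, c = 1.0, 2.0, -3.0          # arbitrary real entries
A = np.array([[a, b],
              [b, c]])

# Discriminant of lambda^2 - (a+c)*lambda + (a*c - b^2):
disc = (a - c)**2 + 4 * b**2
print(disc >= 0)                  # True: a sum of two squares

# The roots of the characteristic polynomial are real ...
roots = np.roots([1.0, -(a + c), a * c - b**2])
print(roots)

# ... and they agree (up to ordering) with the eigenvalues of A.
print(np.linalg.eigvalsh(A))
```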
Real symmetric matrices not only have real eigenvalues, they are always diagonalizable. In fact, more can be said about the diagonalization.
We say that \(U \in \mathbb{R}^{n\times n}\) is orthogonal if \(U^\mathsf{T}U = UU^\mathsf{T} = I_n\). In other words, \(U\) is orthogonal if \(U^{-1} = U^\mathsf{T}\).
If we denote column \(j\) of \(U\) by \(u_j\), then the \((i,j)\)-entry of \(U^\mathsf{T}U\) is given by \(u_i\cdot u_j\). Since \(U^\mathsf{T}U = I_n\), we must have \(u_j\cdot u_j = 1\) for all \(j = 1,\ldots,n\) and \(u_i\cdot u_j = 0\) for all \(i\neq j\). Therefore, the columns of \(U\) are pairwise orthogonal and each column has norm 1; that is, each column is a unit vector (a vector in \(\mathbb{R}^n\) having norm 1). We say that the columns of \(U\) are orthonormal.
The identity matrix is trivially orthogonal. Here are two nontrivial orthogonal matrices: \(\displaystyle\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\), \(\displaystyle\frac{1}{9}\begin{bmatrix} -7 & 4 & 4 \\ 4 & -1 & 8 \\ 4 & 8 & -1 \end{bmatrix}\)
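As a numerical sanity check (a sketch assuming NumPy; the matrix is the \(3\times 3\) example above), one can verify orthogonality either by computing \(U^\mathsf{T}U\) or by checking that the columns are orthonormal:

```python
import numpy as np

U = np.array([[-7.0,  4.0,  4.0],
              [ 4.0, -1.0,  8.0],
              [ 4.0,  8.0, -1.0]]) / 9.0

# U is orthogonal precisely when U^T U = U U^T = I.
print(np.allclose(U.T @ U, np.eye(3)))                 # True
print(np.allclose(U @ U.T, np.eye(3)))                 # True

# Equivalently, the columns are orthonormal: each has norm 1 ...
print(np.allclose(np.linalg.norm(U, axis=0), 1.0))     # True
# ... and distinct columns are orthogonal.
print(np.isclose(U[:, 0] @ U[:, 1], 0.0))              # True
```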
A real square matrix \(A\) is orthogonally diagonalizable if there exist an orthogonal matrix \(U\) and a diagonal matrix \(D\) such that \(A = UDU^\mathsf{T}\). Orthogonal diagonalization is used quite extensively in certain statistical analyses.
An orthogonally diagonalizable matrix is necessarily symmetric. Indeed, \(( UDU^\mathsf{T})^\mathsf{T} = (U^\mathsf{T})^\mathsf{T}D^\mathsf{T}U^\mathsf{T} = UDU^\mathsf{T}\) since the transpose of a diagonal matrix is the matrix itself.
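This calculation can also be illustrated numerically (a sketch assuming NumPy; the orthogonal matrix is the \(2\times 2\) example above and the diagonal entries are arbitrary):

```python
import numpy as np

U = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)   # the 2x2 orthogonal example above
D = np.diag([5.0, -2.0])                   # arbitrary diagonal entries

A = U @ D @ U.T
# (U D U^T)^T = U D^T U^T = U D U^T, so A equals its own transpose.
print(np.allclose(A, A.T))                 # True
```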
The amazing thing is that the converse is also true: Every real symmetric matrix is orthogonally diagonalizable. The proof of this is a bit tricky. However, for the case when all the eigenvalues are distinct, there is a rather straightforward proof which we now give.
First, we claim that if \(A\) is a real symmetric matrix and \(u\) and \(v\) are eigenvectors of \(A\) with distinct eigenvalues \(\lambda\) and \(\gamma\), respectively, then \(u^\mathsf{T} v = 0\). To see this, observe that \(\lambda u^\mathsf{T} v = (\lambda u)^\mathsf{T} v = (Au)^\mathsf{T} v = u^\mathsf{T} A^\mathsf{T} v = u^\mathsf{T} A v = u^\mathsf{T} (\gamma v) = \gamma u^\mathsf{T} v\), where we used \(A^\mathsf{T} = A\) and \(Av = \gamma v\). Hence, if \(u^\mathsf{T} v\neq 0\), then \(\lambda = \gamma\), contradicting that they are distinct. This proves the claim.
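The claim is easy to observe numerically. Here is a small sketch (assuming NumPy; the symmetric matrix is chosen arbitrarily) that computes two eigenvectors with distinct eigenvalues and checks that they are orthogonal:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])         # symmetric, with distinct eigenvalues

# np.linalg.eig is a general-purpose routine; it does not impose any
# orthogonality on the eigenvectors it returns.
vals, V = np.linalg.eig(A)
u, v = V[:, 0], V[:, 1]

print(vals)                        # two distinct real eigenvalues
print(np.isclose(u @ v, 0.0))      # True: the eigenvectors are orthogonal
```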
Now, let \(A\in\mathbb{R}^{n\times n}\) be symmetric with distinct eigenvalues \(\lambda_1,\ldots,\lambda_n\). Then every eigenspace is spanned by a single vector; say \(u_i\) for the eigenvalue \(\lambda_i\), \(i = 1,\ldots, n\). We may assume that \(u_i \cdot u_i =1\) for \(i = 1,\ldots,n\). If not, simply replace \(u_i\) with \(\frac{1}{\|u_i\|}u_i\). This step is called normalization.
Let \(U\) be the \(n\times n\) matrix whose \(i\)th column is given by \(u_i\). Let \(D\) be the diagonal matrix with \(\lambda_i\) as the \(i\)th diagonal entry. Since the \(u_i\) are eigenvectors corresponding to distinct eigenvalues, they are linearly independent, so \(U\) is invertible and \(A = UDU^{-1}\).
To complete the proof, it suffices to show that \(U^\mathsf{T} = U^{-1}\). First, note that the \(i\)th diagonal entry of \(U^\mathsf{T}U\) is \(u_i^\mathsf{T}u_i = u_i \cdot u_i = 1\). Hence, all entries in the diagonal of \(U^\mathsf{T}U\) are 1.
Now, the \((i,j)\)-entry of \(U^\mathsf{T}U\), where \(i \neq j\), is given by \(u_i^\mathsf{T}u_j\). As \(u_i\) and \(u_j\) are eigenvectors with distinct eigenvalues, the claim above gives \(u_i^\mathsf{T}u_j = 0\).
Thus, \(U^\mathsf{T}U = I_n\). Since \(U\) is a square matrix, we have \(U^\mathsf{T} = U^{-1}\).
The above proof shows that in the case when the eigenvalues are distinct, one can find an orthogonal diagonalization by first diagonalizing the matrix in the usual way, obtaining a diagonal matrix \(D\) and an invertible matrix \(P\) such that \(A = PDP^{-1}\). Then normalizing each column of \(P\) to form the matrix \(U\), we will have \(A = U D U^\mathsf{T}\).
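The procedure just described translates directly into a short computation. Here is a sketch (assuming NumPy; the symmetric test matrix is arbitrary) that diagonalizes, normalizes the columns of \(P\), and checks that \(A = UDU^\mathsf{T}\):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])    # symmetric, with distinct eigenvalues

# Step 1: diagonalize in the usual way, A = P D P^{-1}.
eigvals, P = np.linalg.eig(A)
D = np.diag(eigvals)

# Step 2: normalize each column of P to obtain U.
U = P / np.linalg.norm(P, axis=0)

# U is then orthogonal and A = U D U^T.
print(np.allclose(U.T @ U, np.eye(3)))     # True
print(np.allclose(A, U @ D @ U.T))         # True
```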
Give an orthogonal diagonalization of \(A = \begin{bmatrix} 3 & -2 \\ -2 & 3\end{bmatrix}\).