
Lossy image compression

The following bitmap image is represented by an \(864\times 1,536\) matrix, call it \(A\), in which each entry is an 8-bit integer.

[Image: chess set]

The basic idea behind lossy image compression is to obtain a representation of a matrix close to \(A\) that requires less storage than storing all the entries of \(A\). Here, the matrix \(A\) is approximately equal to the product \(PQ\) of two matrices \(P\) and \(Q\) (stored in a format compatible with Octave/MATLAB) whose entries are 32-bit floats, where \(P\) is \(864\times 100\) and \(Q\) is \(100 \times 1,536\). The image corresponding to the product \(PQ\) is as follows:

[Image: chess set, compressed]

Even though the second image shows some degradation compared to the first, the differences are not too pronounced, especially when viewed on a small screen. In fact, the second image may be good enough for some practical purposes.

However, the main difference between the two lies in storage: storing all the entries of \(A\) requires \(864 \times 1,536 = 1,327,104\) bytes since each entry is an 8-bit integer, whereas storing the entries of \(P\) and \(Q\) requires a total of \(4\times(864\times 100 + 100\times 1,536) = 960,000\) bytes since each entry is a 32-bit float. Hence, storing \(P\) and \(Q\) instead of \(A\) leads to a reduction of more than 27%. (In fact, one can use 24-bit floats instead of 32-bit floats. Doing so results in more savings but requires a custom routine for saving and loading such matrices.)
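As a quick sanity check, the byte counts above can be reproduced with a few lines of Python (the choice of Python here is mine; the page itself only refers to Octave/MATLAB-compatible formats):

```python
# Storage for A: one 8-bit integer (1 byte) per entry.
m, n, r = 864, 1536, 100
bytes_A = m * n              # 1,327,104 bytes

# Storage for P (m x r) and Q (r x n): one 32-bit float (4 bytes) per entry.
bytes_PQ = 4 * (m * r + r * n)   # 960,000 bytes

savings = 1 - bytes_PQ / bytes_A
print(bytes_A, bytes_PQ, f"{savings:.1%}")  # 1327104 960000 27.7%
```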

This example raises two important questions:

  1. What do we mean when we say that a matrix is close to another matrix?

  2. How do we find the matrices \(P\) and \(Q\)?

We will give an answer to the first question here. The second question will be addressed later.

The Frobenius norm

To measure the “distance” between matrices, we make use of what is known as the Frobenius norm of a matrix.

Let \(A \in \mathbb{R}^{m \times n}\). The Frobenius norm of \(A\), denoted by \(\|A\|_F\), is the quantity \[\sqrt{ \sum_{i = 1}^m \sum_{j = 1}^n A_{i,j}^2}.\] In other words, it is the square root of the sum of the squares of the entries of \(A\).

With the Frobenius norm defined, if \(B \in \mathbb{R}^{m\times n}\), then the distance between \(A\) and \(B\), denoted by \(d(A,B)\), is the quantity \[\|A - B\|_F.\]

The smaller the value of \(d(A,B)\), the closer we say the matrix \(B\) is to \(A\). Note that \(d(A,B) = d(B,A)\).

Example

If \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1\end{bmatrix}\) and \(B = \begin{bmatrix} 1 & -1 \\ 2 & 0\end{bmatrix}\), then \[d(A,B) = \|A-B\|_F = \left\|\begin{bmatrix} 0 & 1 \\ -2 & 1\end{bmatrix}\right \|_F = \sqrt{ 0^2 + 1^2 + (-2)^2 + 1^2 } = \sqrt{6}. \]
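The example can be checked numerically; as a sketch (using NumPy, which is an assumption on my part), note that `np.linalg.norm` with the `'fro'` option computes exactly the Frobenius norm defined above:

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, -1.0], [2.0, 0.0]])

# d(A, B) = Frobenius norm of A - B: the square root of the
# sum of the squares of the entries.
d = np.linalg.norm(A - B, 'fro')
print(d, np.sqrt(6))  # both print 2.449489...
```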

Low-rank matrices for lossy compression

To compress images, we need to find good approximations that require less storage. Matrices of low rank turn out to be useful here.

To see why this could be so, suppose that \(B \in \mathbb{R}^{m\times n}\) is a matrix of rank \(r\). Recall that, via row reduction, we can obtain an invertible matrix \(M \in \mathbb{R}^{m\times m}\) and a matrix \(R \in \mathbb{R}^{m\times n}\) in RREF such that \(B = MR\).

Since \(B\) has rank \(r\), the bottom \(m-r\) rows of \(R\) contain only 0's. So if we let \(P\) be the matrix consisting of the first \(r\) columns of \(M\) and let \(Q\) be the matrix consisting of the first \(r\) rows of \(R\), then \(PQ = MR = B\), because the remaining columns of \(M\) are multiplied only by the zero rows of \(R\). Now \(P\) has \(mr\) entries and \(Q\) has \(rn\) entries, so if \(r\) is significantly less than \(\min(m, n)\), then storing \(P\) and \(Q\) instead of \(B\) requires much less storage. This is precisely the situation for the image example above.
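To make the storage count concrete, here is a small NumPy sketch. It builds a rank-\(r\) matrix \(B\), factors it as \(PQ\) (via a least-squares fit against \(r\) linearly independent columns of \(B\), rather than the \(M,R\) construction above; both yield a valid rank factorization), and compares entry counts:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 60, 80, 5

# Build a matrix of rank r as a product of random factors.
B = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Take P to be r linearly independent columns of B (the first r work
# here with probability 1 for random data), then solve P @ Q = B for Q.
P = B[:, :r]
Q, *_ = np.linalg.lstsq(P, B, rcond=None)

print(np.allclose(P @ Q, B))      # True: PQ reproduces B exactly
print(B.size, P.size + Q.size)    # 4800 entries vs 700 entries
```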

In summary, one way to perform lossy compression for black-and-white bitmaps is to find a low-rank matrix whose distance to the matrix of the original image is small. We will use singular value decomposition to obtain such a matrix.
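As a preview of that approach, here is a hedged sketch of how a truncated SVD produces such a pair \(P\), \(Q\); the rank-100 choice matches the example above, and the random matrix is only a stand-in for the actual image data:

```python
import numpy as np

def low_rank_factors(A, r):
    """Return P (m x r) and Q (r x n) such that P @ Q is the best
    rank-r approximation of A in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    P = (U[:, :r] * s[:r]).astype(np.float32)  # fold singular values into P
    Q = Vt[:r, :].astype(np.float32)
    return P, Q

# Random stand-in for the 864 x 1536 image matrix.
A = np.random.default_rng(1).random((864, 1536))
P, Q = low_rank_factors(A, 100)
print(P.shape, Q.shape)                  # (864, 100) (100, 1536)
print(np.linalg.norm(A - P @ Q, 'fro'))  # the distance d(A, PQ)
```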