A Gentle Introduction to the Art of Mathematics

8.5 The continuum hypothesis

The word “continuum” in the title of this section is used to indicate sets of points that have a certain continuity property. For example, in a real interval it is possible to move from one point to another, in a smooth fashion, without ever leaving the interval. In a range of rational numbers this is not possible, because there are irrational values in between every pair of rationals. There are many sets that behave as a continuum — the intervals \((a, b)\) or \([a, b]\), the entire real line \({\mathbb R}\), the \(x\)-\(y\) plane \({\mathbb R}\times {\mathbb R}\), a volume in 3-dimensional space (or for that matter the entire space \({\mathbb R}^3\)). It turns out that all of these sets have the same size.

The cardinality of the continuum, denoted c, is the cardinality of all of the sets above.

In the previous section, we mentioned the continuum hypothesis and how angry Cantor became when someone (König) tried to prove it was false. In this section, we’ll delve a little deeper into what the continuum hypothesis says and even take a look at CH’s big brother, GCH. Before doing so, it seems like a good idea to look into the equivalences we’ve asserted about all those sets above which (if you trust us) have the cardinality c.

We’ve already seen that an interval is equivalent to the entire real line but the notion that the entire infinite Cartesian plane has no more points in it than an interval one inch long defies our intuition. Our conception of dimensionality leads us to think that things of higher dimension must be larger than those of lower dimension. This preconception is false as we can see by demonstrating that a \(1 \times 1\) square can be put in one-to-one correspondence with the unit interval. Let \(S = \{ (x, y) {\,:\,}0 < x < 1 \land 0 < y < 1 \}\) and let \(I\) be the open unit interval \((0, 1)\). We can use the Cantor-Bernstein-Schröder theorem to show that \(S\) and \(I\) are equinumerous — we just need to find injections from \(I\) to \(S\) and vice versa.

Given an element \(r\) in \(I\) we can map it injectively to the point \((r, r)\) in \(S\). To go in the other direction, consider a point \((a, b)\) in \(S\) and write out the decimal expansions of \(a\) and \(b\): \[\begin{align*} a & = 0.a_1a_2a_3a_4a_5\ldots \\ b & = 0.b_1b_2b_3b_4b_5\ldots \end{align*}\]

as usual, if there are two decimal expansions for \(a\) and/or \(b\) we will make a consistent choice — say the infinite one.

From these decimal expansions, we can create the decimal expansion of a number in \(I\) by interleaving the digits of \(a\) and \(b\). Let \[ s = 0.a_1b_1a_2b_2a_3b_3 \ldots \]

be the image of \((a, b)\). If two different points get mapped to the same value \(s\) then both points have \(x\) and \(y\) coordinates that agree in every position of their decimal expansion (so they must really be equal). It is a little bit harder to create a bijective function from \(S\) to \(I\) (and thus to show the equivalence directly, without appealing to C-B-S). The problem is that, once again, we need to deal with the non-uniqueness of decimal representations of real numbers. If we make the choice that, whenever there is a choice to be made, we will use the non-terminating decimal expansions for our real numbers there will be elements of \(I\) not in the image of the map determined by interleaving digits (for example \(0.15401050902060503\ldots\) is the interleaving of the digits after the decimal point in \(\pi = 3.141592653\ldots\) and \(1/2 = 0.5\), this is clearly an element of \(I\) but it can’t be in the image of our map since \(1/2\) should be represented by \(0.4\overline{9}\) according to our convention. If we try other conventions for dealing with the non-uniqueness it is possible to find other examples that show simple interleaving will not be surjective. A slightly more subtle approach is required.

Presume that all decimal expansions are non-terminating (as we can, WLOG) and use the following approach: Write out the decimal expansion of the coordinates of a point \((a, b)\) in \(S\). Form the digits into blocks with as many 0’s as possible followed by a non-zero digit. Finally, interleave these blocks.

For example if \[ a = 0.124520047019902 \ldots \] and \[ b = 0.004015648000031 \ldots \] we would separate the digits into blocks as follows: \[ a = 0.1 \quad 2 \quad 4 \quad 5\quad 2 \quad 004\quad 7\quad 01\quad 9 \quad 9 \quad 02 \ldots \] and \[ b = 0.004 \quad 01 \quad 5 \quad 6 \quad 4 \quad 8 \quad 00003 \quad 1 \ldots \] and the number formed by interleaving them would be \[ s = 0.10042014556240048 \ldots \]

We’ve shown that the unit square, \(S\), and the unit interval, \(I\), have the same cardinality. These arguments can be extended to show that all of \({\mathbb R}\times {\mathbb R}\) also has this cardinality (c).

So now let’s turn to the continuum hypothesis.

We mentioned earlier in this chapter that the cardinality of \({\mathbb N}\) is denoted \(\aleph_0\). The fact that that capital letter aleph is wearing a subscript ought to make you wonder what other aleph-sub-something-or-others there are out there. What is \(\aleph_1\)? What about \(\aleph_2\)? Cantor presumed that there was a sequence of cardinal numbers (which is itself, of course, infinite) that give all of the possible infinities. The smallest infinite set that anyone seems to be able to imagine is \({\mathbb N}\), so Cantor called that cardinality \(\aleph_0\). What ever the “next” infinite cardinal is, is called \(\aleph_1\). It’s conceivable that there actually isn’t a “next” infinite cardinal after \(\aleph_0\) — it might be the case that the collection of infinite cardinal numbers isn’t well-ordered! In any case, if there is a “next” infinite cardinal, what is it? Cantor’s theorem shows that there is a way to build some infinite cardinal bigger than \(\aleph_0\) — just apply the power set construction. The continuum hypothesis just says that this bigger cardinality that we get by applying the power set construction is that “next” cardinality we’ve been talking about.

To re-iterate, we’ve shown that the power set of \({\mathbb N}\) is equivalent to the interval \((0,1)\) which is one of the sets whose cardinality is c. So the continuum hypothesis, the thing that got Georg Cantor so very heated up, comes down to asserting that \(\aleph_1 =\) c.

There really should be a big question mark over that. A really big question mark. It turns out that the continuum hypothesis lives in a really weird world… To this day, no one has the least notion of whether it is true or false. But wait! That’s not all! The real weirdness is that it would appear to be impossible to decide. Well, that’s not so bad — after all, we talked about undecidable sentences way back in the beginning of Chapter 2. Okay, so here’s the ultimate weirdness. It has been proved that one can’t prove the continuum hypothesis. It has also been proved that one can’t disprove the continuum hypothesis.

Having reached this stage in a book about proving things, I hope that the last two sentences in the previous paragraph caused some thought along the lines of “well, ok, with respect to what axioms?” to run through your head. So, if you did think something along those lines, pat yourself on the back. And if you didn’t, then recognize that you need to start thinking that way — things are proved or disproved only in a relative way, it depends what axioms you allow yourself to work with. The usual axioms for mathematics are called ZFC; the Zermelo-Frankel set theory axioms together with the axiom of choice. The “ultimate weirdness” we’ve been describing about the continuum hypothesis is a result due to a gentleman named Paul Cohen that says “CH is independent of ZFC.” More pedantically, it is impossible to either prove or disprove the continuum hypothesis within the framework of the ZFC axiom system.

It would be really nice to end this chapter by mentioning Paul Cohen, but there is one last thing we’d like to accomplish — explain what GCH means.

The generalized continuum hypothesis says that the power set construction is basically the only way to get from one infinite cardinality to the next. In other words, GCH says that not only does \({\mathcal P}({\mathbb N})\) have the cardinality known as \(\aleph_1\), but every other aleph number can be realized by applying the power set construction a bunch of times. Some people would express this symbolically by writing \[ \forall n \in {\mathbb N}, \quad \aleph_{n+1} = 2^{\aleph_n}. \]