A Gentle Introduction to the Art of Mathematics

6.5 Functions

The concept of a function is one of the most useful abstractions in mathematics. In fact, it is an abstraction that can be further abstracted! For instance an operator is an entity which takes functions as inputs and produces functions as outputs, thus an operator is to functions as functions themselves are to numbers. There are many operators that you have certainly encountered already — just not by that name. One of the most famous operators is “differentiation,” when you take the derivative of some function; the answer you obtain is another function. If two different people are given the same differentiation problem and they come up with different answers, we know that at least one of them has made a mistake! Similarly, if two calculations of the value of a function are made for the same input, they must match.

The property we are discussing used to be captured by saying that a function needs to be “well-defined.” The old school definition of a function was:

Definition 6.11 A function $f$ is a well-defined rule, that, given any input value $x$ produces a unique output⁵² value $f(x)$.

A more modern definition of a function is the following.

Definition 6.12 A function is a binary relation which does not contain distinct pairs having the same initial element.

When we think of a function as a special type of binary relation, the pairs that are “in” the function have the form $(x, f(x))$, that is, they consist of an input and the corresponding output.

We have gotten relatively used to relations “on” a set, but recall that the more general situation is that a binary relation is a subset of $A \times B$. In this setting, if the relation is actually a function $f$, we say that $f$ is a function from $A$ to $B$. Now, quite often there are input values that simply don’t work for a given function (for instance the well-known “you can’t take the square root of a negative” rule). Also, it is often the case that certain outputs just can’t happen. So, when dealing with a function as a relation contained in $A \times B$ there are actually four sets that are of interest — the sets $A$ and $B$ (of course) but also some sets that we’ll denote by $A'$ and $B'$. The set $A'$ consists of those elements of $A$ that actually appear as the first coordinate of a pair in the relation $f$. The set $B'$ consists of those elements of $B$ that actually appear as the second coordinate of a pair in the relation $f$. A generic example of how these four sets might look is given in Figure 6.22.

Figure 6.22: The sets related to an arbitrary function.

Sadly, only three of the sets we have just discussed are known to the mathematical world. The set we have denoted $A'$ is called the domain of the function $f$. The set we have denoted $B'$ is known as the range of the function $f$. The set we have denoted $B$ is called the codomain of the function $f$. The set we have been calling $A$ does not have a name. In fact, the formal definition of the term “function” has been rigged so that there is no difference between the sets $A$ and $A'$. This seems a shame, if you think of range and domain as being primary, doesn’t it seem odd that we have a way to refer to a superset of the range (i.e. the codomain) but no way of referring to a superset of the domain?

Nevertheless, this is just the way it is… There is only one set on the input side — the domain of our function.

The domain of any relation is expressed by writing $\operatorname{Dom}(\mathsf{R})$. Which is defined as follows.

Definition 6.13 If $\mathsf{R}$ is a relation from $A$ to $B$ then $\operatorname{Dom}(\mathsf{R})$ is a subset of $A$ defined by \[ \operatorname{Dom}(\mathsf{R}) = \{a \in A {\,:\,}\exists b \in B, (a,b) \in \mathsf{R}\} \]

We should point out that the notation just given for the domain of a relation $\mathsf{R}$, ($\operatorname{Dom}(\mathsf{R})$) has analogs for the other sets that are involved with a relation. We write $\operatorname{Cod}(\mathsf{R})$ to refer the the codomain of the relation, and $\operatorname{Rng}(\mathsf{R})$ to refer to the range.

Since we are now thinking of functions as special classes of relations, it follows that a function is just a set of ordered pairs. This means that the identity of a function is tied up, not just with a formula that gives the output for a given input, but also with what values can be used for those inputs. Thus, the function $f(x)=2x$ defined on ${\mathbb R}$ is a completely different animal from the function $f(x)=2x$ defined on ${\mathbb N}$. If you really want to specify a function precisely you must give its domain as well as a formula for it. Usually, one does this by writing a formula, then a semicolon, then the domain. For example, \[f(x)=x^2; \quad x \geq 0.\]

Okay, so, finally, we are prepared to give the real definition of a function.

Definition 6.14 If $A$ and $B$ are sets, then $f$ is a function from $A$ to $B$ (which is expressed symbolically by $f:A\longrightarrow B$), if and only if $f$ is a subset of $A\times B$, $\operatorname{Dom}(f)=A$ and \[(a,b) \in f \; \land \; (a,c) \in f \; \implies \; b=c.\]

Recapping, a function must have its domain equal to the set $A$ where its inputs come from. This is sometimes expressed by saying that a function is defined on its domain. A function’s range and codomain may be different however. In the event that the range and codomain are the same (that is, $\operatorname{Cod}(\mathsf{R}) = \operatorname{Rng}(\mathsf{R})$), we have a rather special situation and the function is graced by the appellation “surjection.” The term “onto” is also commonly used to describe a surjective function.

Exercise 6.9 There is an expression in mathematics, “Every function is onto its range.” that really doesn’t say very much. Why not?

If one has elements $x$ and $y$, of the domain and codomain, (respectively) and $y = f(x)$⁵³ then one may say that “$y$ is the image of $x$” or that “$x$ is a preimage of $y$.” Take careful note of the articles used in these phrases — we say “$y$ is the image of $x$” but “$x$ is a preimage of $y$.” This is because $y$ is uniquely determined by $x$, but not vice versa. For example, since the squares of $2$ and $-2$ are both $4$, if we consider the function $f(x) = x^2$, the image of (say) $2$ is $4$, but a preimage for $4$ could be either $2$ or $-2$.

It would be pleasant if there were a nice way to refer to the preimage of some element, $y$, of the range. One notation that you have probably seen before is “$f^{-1}(y)$.” There is a major difficulty with writing down such a thing. By writing “$f^{-1}$” you are making a rather vast presumption — that there actually is a function that serves as an inverse for $f$. Usually, there is not.

One can define an inverse for any relation, the inverse is formed by simply exchanging the elements in the ordered pairs that make up $\mathsf{R}$.

Definition 6.15 The inverse relation of a relation $\mathsf{R}$ is denoted $\mathsf{R}^{-1}$ and \[ \mathsf{R}^{-1} = \{ (y,x) {\,:\,}(x,y) \in \mathsf{R}\}. \]

In terms of graphs, the inverse and the original relation are related by being reflections in the line $y=x$. It is possible for one, both, or neither of these to be functions. The canonical example to keep in mind is probably $f(x) = x^2$ and its inverse.

Figure 6.23: The graph of $y = x^2$ and its reflection in the line $y = x$.

The graph that we obtain by reflecting $y=f(x)=x^2$ in the line $y=x$ (see Figure 6.23) doesn’t pass the vertical line test and so it is the graph of (merely) a relation – not of a function. The function $g(x) = \sqrt{x}$ that we all know and love is not truly the inverse of $f(x)$. In fact this function is defined to make a specific (and natural) choice — it returns the positive square root of a number. But this leads to a subtle problem; if we start with a negative number (say $-3$) and square it we get a positive number ($9$) and if we then come along and take the square root we get another positive number ($3$). This is problematic since we didn’t end up where we started which is what ought to happen if we apply a function followed by its inverse.

We’ll try to handle the general situation in a bit, but for the moment let’s consider the nice case: when the inverse of a function is also a function. When exactly does this happen? Well, we have just seen that the inverse of a function doesn’t necessarily pass the vertical line test, and it turns out that that is the predominant issue. So, under what circumstances does the inverse pass the vertical line test? When the original function passes the so-called horizontal line test (every horizontal line intersects the graph at most once). Thinking again about $f(x)=x^2$, there are some horizontal lines that miss the graph entirely, but all horizontal lines of the form $y=c$ where $c$ is positive will intersect the graph twice.

There are many functions that do pass the horizontal line test, for instance, consider $f(x) = x^3$. Such functions are known as injections, this is the same thing as saying a function is “one-to-one.” Injective functions can be inverted — the domain of the inverse function of $f$ will only be the range, $\operatorname{Rng}(f)$, which as we have seen may fall short of the being the entire codomain, since $\operatorname{Rng}(f) \subseteq \operatorname{Cod}(f)$.

Let’s first define injections in a way that is divorced from thinking about their graphs.

Definition 6.16 A function $f(x)$ is an injection iff for all pairs of inputs $x_1$ and $x_2$, if $f(x_1) = f(x_2)$ then $x_1=x_2$.

This is another of those defining properties that is designed so that when it is true it is vacuously true. An injective function never takes two distinct inputs to the same output. Perhaps the cleanest way to think about injective functions is in terms of preimages — when a function is injective, preimages are unique. Actually, this is a good time to mention something about surjective functions and preimages — if a function is surjective, every element of the codomain has a preimage. So, if a function has both of these properties it means that every element of the codomain has one (and only one) preimage.

A function that is both injective and surjective (one-to-one and onto) is known as a bijection. Bijections are tremendously important in mathematics since they provide a way of perfectly matching up the elements of two sets. You will probably spend a good bit of time in the future devising maps between sets and then proving that they are bijections, so we will start practicing that skill now…

Ordinarily, we will show that a function is a bijection by proving separately that it is both a surjection and an injection.

To show that a function is surjective we need to show that it is possible to find a preimage for every element of the codomain. If we happen to know what the inverse function is, then it is easy to find a preimage for an arbitrary element. In terms of the taxonomy for proofs that was introduced in Chapter 3, we are talking about a constructive proof of an existential statement. A function $f$ is surjective iff \[\forall y \in \operatorname{Cod}(f), \exists x \in \operatorname{Dom}(f), y = f(x).\] So to prove surjectivity is to find the $x$ that “works” for an arbitrary $y$. If this is done by literally naming $x$, we have proved the statement constructively.

To show that a function is an injection, we traditionally prove that the property used in the definition of an injective function is true. Namely, we suppose that $x_1$ and $x_2$ are distinct elements of $\operatorname{Dom}(f)$ and that $f(x_1)=f(x_2)$ and then we show that actually $x_1 = x_2$. This is in the spirit of a proof by contradiction — if there were actually distinct elements that get mapped to the same value then $f$ would not be injective, but by deducing that $x_1=x_2$ we are contradicting that presumption and so, are showing that $f$ is indeed an injection.

Let’s start by looking at a very simple example, $f(x)=2x-1; \; x \in {\mathbb N}$. Clearly this function is not a surjection if we are thinking that $\operatorname{Cod}(f)={\mathbb N}$ since the outputs are always odd. Let \[{\mathcal O} = \{1, 3, 5, 7, \ldots \}\] be the set of odd naturals.

Theorem 6.2 The function $f:{\mathbb Z}_{> 0}\longrightarrow {\mathcal O}$ defined by $f(x) = 2x-1$ is a bijection from ${\mathbb Z}_{> 0}$ to ${\mathcal O}$.

Proof. First we will show that $f$ is surjective. Consider an arbitrary element $y$ of the set $\mathcal O$. Since $y \in {\mathcal O}$ it follows that $y$ is both positive and odd. Thus there is an integer $k$, such that $y=2k+1$, but also $y>0$. From this it follows that $2k+1 >0$ and so $k > -1/2$. Since $k$ is also an integer, this last inequality implies that $k \in {\mathbb Z}_{\geq 0}$. (Recall that ${\mathbb Z}_{\geq 0}= \{0,1,2,3, \ldots \}$.) We can easily verify that a preimage for $y$ is $k+1$, since $f(k+1) = 2(k+1)-1 = 2k+2-1 = 2k+1 = y$.

Next we show that $f$ is injective. Suppose that there are two input values, $x_1$ and $x_2$ such that $f(x_1) = f(x_2)$. Then $2x_1-1 = 2x_2-1$ and simple algebra leads to $x_1=x_2$.

For a slightly more complicated example consider the function from ${\mathbb Z}_{> 0}$ to ${\mathbb Z}$ defined by \[ f(x) = \left\{ \begin{array}{cl} x/2 & \mbox{if $x$ is even} \\ -(x-1)/2 & \mbox{if $x$ is odd} \end{array} \right. \]

This function does quite a handy little job, it matches up the positive integer and the integers in pairs. Every even positive integer gets matched with a positive integer and every odd positive integer (except 1) gets matched with a negative integer (1 gets paired with 0). This function is really doing something remarkable — common sense would seem to indicate that the set of integers must be set than the set of positive integers after all ${\mathbb Z}_{> 0}$ is completely contained inside of ${\mathbb Z}$), but the function $f$ defined above serves to show that these two sets are exactly the same size!

Theorem 6.3 The function $f$ defined above is bijective.

Proof. First we will show that $f$ is surjective.

It suffices to find a preimage for an arbitrary element of ${\mathbb Z}$. Suppose that $y$ is a particular but arbitrarily chosen integer. There are two cases to consider: $y\leq 0$ and $y>0$.

If $y>0$ then $x=2y$ is a preimage for $y$. This follows easily since $x=2y$ is obviously even and so $x$’s image will be defined by the first case in the definition of $f$. Thus $f(x) = f(2y) = (2y)/2 = y$.

If $y \leq 0$ then $x=1-2y$ is a preimage for $y$. Clearly, $1-2y$ is odd whenever $y$ is an integer, thus this value for $x$ will fall into the second case in the definition of $f$. So, $f(x) = f(1-2y) = -((1-2y)-1)/2 = -(-2y)/2 = y$.

Since the cases $y>0$ and $y\leq 0$ are exhaustive (that is, every $y$ in ${\mathbb Z}$ falls into one or the other of these cases), and we have found a preimage for $y$ in both cases, it follows that $f$ is surjective.

Next, we will show that $f$ is injective.

Suppose that $x_1$ and $x_2$ are positive integers such that $f(x_1)=f(x_2)$. Consider the following three cases: $x_1$ and $x_2$ are both even, both odd, or have opposite parity.

If $x_1$ and $x_2$ are both even, then by the definition of $f$ we have $f(x_1) = x_1/2$ and $f(x_2) = x_2/2$ and since these functional values are equal, we have $x_1/2 = x_2/2$. Doubling both sides of this leads to $x_1=x_2$.

If $x_1$ and $x_2$ are both odd, then by the definition of $f$ we have $f(x_1) = -(x_1-1)/2$ and $f(x_2) = -(x_2-1)/2$ and since these functional values are equal, we have $-(x_1-1)/2 = -(x_2-1)/2$. A bit more algebra (doubling, negating and adding one to both sides) leads to $x_1=x_2$.

If $x_1$ and $x_2$ have opposite parity, we will assume w.l.o.g. that $x_1$ is even and $x_2$ is odd. The equality $f(x_1)=f(x_2)$ becomes $x_1/2 = -(x_2-1)/2$. Note that $x_1 \geq 2$ so $f(x_1) = x_1/2 \geq 1$. Also, note that $x_2 \geq 1$ so \[\begin{gather*} x_2 - 1 \geq 0 \\ (x_2-1)/2 \geq 0 \\ -(x_2-1)/2 \leq 0 \\ f(x_2) \leq 0. \end{gather*}\]

Therefore, we have a contradiction since it is impossible for the two values $f(x_1)$ and $f(x_2)$ to be equal while $f(x_1) \geq 1$ and $f(x_2) \leq 0$.

Since the last case under consideration leads to a contradiction, it follows that $x_1$ and $x_2$ never have opposite parities, and so the first two cases are exhaustive — in both of those cases we reached the desired conclusion that $x_1 = x_2$ so it follows that $f$ is injective.

We’ll conclude this section by mentioning that the ideas of “image” and “preimage” can be extended to sets. If $S$ is a subset of $\operatorname{Dom}(f)$ then the image of $S$ under $f$ is denoted $f(S)$ and \[ f(S) = \{ y {\,:\,}\exists x \in \operatorname{Dom}(f), x \in S \land y = f(x) \}. \]

Similarly, if $T$ is a subset of of $\operatorname{Rng}(f)$ we can define something akin to the preimage. The inverse image of the set $T$ under the function $f$ is denoted $f^{-1}(T)$ and \[ f^{-1}(T) = \{ x {\,:\,}\exists y \in \operatorname{Cod}(f), y \in T \land y=f(x) \}.\]

Essentially, we have extended the function $f$ so that it goes between the power sets of its codomain and range! This new notion gives us some elegant ways of restating what it means to be surjective and injective.

A function $f$ is surjective iff $f(\operatorname{Dom}(f)) = \operatorname{Cod}(f)$.

A function $f$ is injective iff the inverse images of singletons are always singletons. That is, \[ \forall y \in \operatorname{Rng}(f), \exists x \in \operatorname{Dom}(f), f^{-1}(\{y\}) = \{x\}. \]

6.5.1 Exercises

For each of the following functions, give its domain, range and a possible codomain.
1. $f(x) = \sin{(x)}$
2. $g(x) = e^x$
3. $h(x) = x^2$
4. $m(x) = \frac{x^2+1}{x^2-1}$
5. $n(x) = \lfloor x \rfloor$
Find a bijection from the set of odd squares, $\{1, 9, 25, 49, \ldots\}$, to the non-negative integers, ${\mathbb Z}_{\geq 0}= \{0,1,2,3, \ldots\}$. Prove that the function you just determined is both injective and surjective. Find the inverse function of the bijection above.
The natural logarithm function $\ln (x)$ is defined by a definite integral with the variable $x$ in the upper limit.

\[ \ln (x) = \int_{t=1}^{x} \frac{1}{t} \, \mbox{d}t. \]

From this definition we can deduce that $\ln (x)$ is strictly increasing on its entire domain, $(0, \infty)$. Why is this true?

We can use the above definition with $x=2$ to find the value of $\ln (2) \approx .693$. We will also take as given the following rule (which is valid for all logarithmic functions).

\[ \ln(a^b) = b \ln(a) \]

Use the above information to show that $\operatorname{Rng}(\ln) = {\mathbb R}$.
Georg Cantor developed a systematic way of listing the rational numbers. By “listing” a set one is actually developing a bijection from ${\mathbb N}$ to that set. The method known as “Cantor’s Snake” creates a bijection from the naturals to the non-negative rationals. First, we create an infinite table whose rows are indexed by positive integers and whose columns are indexed by non-negative integers — the entries in this table are rational numbers of the form “column index” / “row index.” We then follow a snake-like path that zig-zags across this table — whenever we encounter a rational number that we haven’t seen before (in lower terms) we write it down. This is indicated in Figure 6.24 by circling the entries.

Figure 6.24: Traversal of Cantor’s Snake.

Effectively this gives us a function $f$ which produces the rational number that would be found in a given position in this list. For example $f(0) = 0/1, f(1) = 1/1$ and $f(4) = 1/3$.

What is $f(25)$? What is $f(29)$? What is $f^{-1}(3/4)$? What is $f^{-1}(6/7)$?

The use of the notation $f(x)$ to indicate the output of function $f$ associated with input $x$ was instituted by Leonhard Euler, and so it is known as Euler notation.↩
Or, equivalently, $(x,y) \in f$.↩