Representing an integer in a negative base

Kevin Cheung

December 10, 2022

Base-\(b\) representation

Most of us are exposed to representing integers in base 10 at an early age. Eventually, we might get exposed to base-2, base-8, and base-16 representations, especially in an introductory computer science course. Other positive integer bases are of course possible. The most trivial base, base 1, leads to the unary representation which is simply a numeral of ones (with a negative sign in front if the integer is negative) having length equal to the absolute value of the integer that we want to represent.

A recent YouTube video with over 120,000 views talks about negative bases. It is rather fascinating. In this article, we will fill in the theoretical details and implement a function in Haskell that gives the representation of an integer in a negative integer base. It is probably beneficial to watch the video before reading on.

Existence

Let \(b\) be an integer with \(|b| \geq 2\). Let \(n\) be an integer. If there exists an integer \(k \geq 0\) and \(0 \leq a_0,\ldots,a_k < |b|\) such that \[n = a_k b^k + a_{k-1} b^{k-1} + \cdots + a_1 b^1 + a_0b^0,\] then \((a_k,a_{k-1},\ldots,a_0)\) is a base-\(b\) representation of \(n\). (Note of course that \(b^1 = b\) and \(b^0 = 1\).)

It is an easy, though not a trivial, exercise to to show by induction that if \((a_k,\ldots,a_0)\) is a base-\(b\) representation of \(0\), then \(a_k=\cdots=a_0=0\). The complication here is that the odd powers of \(b\) are negative.

If \(n \neq 0\), we say that the representation is proper if \(a_k \neq 0\). For example, \((1,1,0,1,0,1)\) is a proper base-\((-2)\) representation of \(-11\) since \[-11 = -32 + 16 + 4 + 1 = (-2)^5 + (-2)^4 + (-2)^2 + (-2)^0.\]

To establish the existence of base-\(b\) representations of nonzero integers, we first prove the following:

Lemma. If \((a_k,\ldots,a_0)\) and \((a'_k,\ldots,a'_0)\) are base-\(b\) representations of the same integer, then the two representations are identical.

We prove by induction on \(k\) the following statement:

If \((a_k,\ldots,a_0)\) and \((a'_k,\ldots,a'_0)\) are base-\(b\) of the same integer, then \(a_i = a'_i\) for \(i = 0,\ldots,k\).

If \(k = 0\), then \(n = a_0\) and \(n = a'_0\), giving \(a_0 = a'_0\). This establishes the base case.

For the induction hypothesis, assume that the statement holds for some \(k \geq 0\).

Suppose that \((a_{k+1},\ldots,a_0)\) and \((a'_{k+1},\ldots,a'_0)\) are base-\(b\) representations of the same integer \(n\). Then \(n \equiv a_0 {\ (\operatorname{mod}\ {|b|})}\) and \(n \equiv a'_0 {\ (\operatorname{mod}\ {|b|})}\). Hence, \(|b|\) divides \(a_0 - a'_0\). Since \(0 \leq a_0, a'_0 < |b|\), we must have \(a_0 = a'_0 = r\) for some \(r \in \{0,\ldots,|b|-1\}.\) It follows that \((a_{k+1},\ldots,a_1)\) and \((a'_{k+1},\ldots,a'_1)\) are base-\(b\) representations of \((n - r)/b\). By the induction hypothesis, we have that \(a_{i+1} = a'_{i+1}\) for \(i = 0,\ldots,k\). Combined with \(a_0 = a'_0 = r\), we have \(a_i = a'_i\) for \(i = 0,\ldots,k+1\). This completes the induction.

For each positive integer \(k\), define \(u_k := -\frac{m(m^{2k}-1)}{m+1}\) and \(v_k := \frac{m^{2k}-1}{m+1}\) where \(m = -b\).

Note that \(u_k\) and \(v_k\) are integers. To see this, recall the Factor Theorem:

If \(p(x)\) is a polynomial such that \(p(a) = 0\), then \(x-a\) divides \(p(x)\).

Let \(p(x) = x^{2k}-1\). Then \(p(-1) = (-1)^{2k} - 1 = 0\). Hence, \(x-(-1)\) divides \(p(x)\). Setting \(x\) to \(m\) gives us that \(m+1\) divides \(m^{2k}-1\).

Proposition. Let \(k\) be a positive integer. Every integer \(n \in \{u_k,\ldots,v_k\}\) has a unique base-\(b\) representation \((a_{2k-1},\ldots,a_0)\)

Let \(Z_k\) denote the set of integers expressible as \[a_{2k-1}b^{2k-1} + \cdots + a_1b^1 + a_0b^0\] where \(a_0,\ldots,a_{2k-1} \in \{0,\ldots,m-1\}.\) Note that \(|Z_k| = m^{2k}\) since no two choices of \(a_0,\ldots,a_{2k-1}\) give the same integer by the lemma above.

The greatest integer in \(Z_k\) is obtained by setting \(a_{2i}= m-1\) and \(a_{2i+1} = 0\) for \(i = 0,\ldots,k-1\), giving the value \[(m-1)\sum_{i=0}^{k-1} b^{2i} =(m-1)\sum_{i=0}^{k-1} \left(m^2\right)^{i} = (m-1)\frac{m^{2k}-1}{m^2-1} = v_k.\]

The least integer in \(Z_k\) is obtained by setting \(a_{2i+1}= m-1\) and \(a_{2i} = 0\) for \(i = 0,\ldots,k-1\), giving the value \[(m-1)\sum_{i=0}^{k-1} b^{2i+1} = -(m-1)\sum_{i=0}^{k-1} m^{2i+1} = -m(m-1)\frac{m^{2k}-1}{m^2-1} = u_k.\]

Hence, all the elements of \(Z_k\) lie in the range \(u_k,\ldots,v_k\). But the number of integers in this range is exactly \[\begin{align*} v_k - u_k + 1 &= \frac{m^{2k}-1}{m+1} - \left(-\frac{m(m^{2k}-1)}{m+1}\right) + 1 \\ & = (m^{2k} -1) + 1 \\ & = |Z_k|. \end{align*}\] It follows that every integer \(n\) in the range is in \(Z_k\) and therefore has a unique base-\(b\) representation \((a_{2k-1},\ldots,a_0)\)

Theorem. Every integer has a base-\(b\) representation.

It is easy to see that for any given integer \(n\), there exists \(k \geq 1\) such that \(n \in \{u_k,\ldots,v_k\}\). The result now follows from the proposition.

Uniqueness

The proposition also gives us uniquness of proper base-\(b\) representations.

Corollary. Two proper base-\(b\) representations of the same nonzero integer are identical.

Suppose that \((c_\ell,\ldots,c_0)\) and \((c'_{\ell'},\ldots,c'_0)\) are two proper base-\(b\) of the same nonzero integer \(n\). Choose \(k \geq \max\{\ell, \ell'\}\) such that \(n \in \{u_k, v_k\}\). With \(a_i := c_i\) for \(i = 1,\ldots,\ell\), \(a_i := 0\) for \(i = \ell+1,\ldots,2k-1\), \(a'_i := c'_i\) for \(i = 1,\ldots,\ell'\), and \(a'_i := 0\) for \(i = \ell'+1,\ldots,2k-1\), we have that \((a_{2k-1},\ldots,a_0)\) and \((a'_{2k-1},\ldots,a'_0)\) are base-\(b\) representations of \(n\). By the proposition, they must be identical and so \(\ell = \ell'\) and \(c_i = c'_i\) for \(i = 0,\ldots,\ell\).

Algorithm and implementation in Haskell

Now that we have established existence and uniqueness of the base-\(b\) representation of an integer, we can obtain a rather straightforward way to compute it as in the case when \(b\) is positive.

Let \(n\) be a nonzero integer. Suppose that its proper base-\(b\) representation is \((a_k,\ldots, a_0)\). That is, \(n = a_kb^k + \cdots + a_0b^0\). Then \(a_0\) must be the nonnegative remainder when \(n\) is divided by \(|b|\). It follows that \[(n-a_0)/b = a_kb^{k-1} + \cdots + a_1b^0.\] We can now repeat the process to obtain \(a_1\) and so on. Below is a Haskell function that outputs the digits as a list so that element \(i\) is \(a_i\).

negBaseRep :: Integral a => a -> a -> [a]
negBaseRep b n =
  if 0 <= n && n < -b
  then [n]
  else
    let (q, r) = n `divMod` (-b)
    in r:negBaseRep b ((n - r) `div` b)

We use divMod instead of quotRem because we want a nonnegative remainder. (Note that it is not necessarily true that \(|(n-r)/b| < |n|\) (e.g. \(n = -1\) with \(b = -2\)). Nevertheless, termination is guaranteed because we already know that the representation requires no power of \(b\) higher than \(b^k\) where \(k\) is such that \(n \in \{u_k,\ldots,v_k\}\).)

We can now test this in ghci:

λ> negBaseRep (-2) (-11)
[1,0,1,0,1,1]
λ> sum $ zipWith (*) it (iterate (*(-2)) 1)
-11
λ> negBaseRep (-3) (21)
[0,2,0,2,1]
λ> sum $ zipWith (*) it (iterate (*(-3)) 1)
21

What’s next

Near the end of the video, a number of extensions are mentioned. One is to extend the representation to include negative powers of \(b\) so that all real numbers can be represented. Another is to look at bases that are complex numbers like \(2i\). Since this article is getting long, we will have to leave the discussion of these extensions to the future.