Square-free numbers from given primes

Kevin Cheung

September 16, 2022

Generating square-free numbers from a given list of primes

A positive integer is called square-free if it is a product of distinct primes. For example, 1222 is square-free since its prime factorization is 2 × 13 × 47. However, 1204 is not square-free since its prime factorization is 2 × 2 × 7 × 43 with the prime factor 2 appearing twice.

In this article, we compare two ways of generating a sorted list of all positive square-free numbers containing only prime factors from a given list of primes. For simplicity, we assume that the given list of primes is sorted in ascending order.

Project setup

To streamline experimentations, we create a cabal project called squarefree with the following folder structure

squarefree
├── CHANGELOG.md
├── app
│   └── Main.hs
└── squarefree.cabal

where squarefree.cabal contains, among other things, the following

executable squarefree
    main-is:          Main.hs
    build-depends:    base ^>=4.14.3.0
                    , criterion
                    , arithmoi
    hs-source-dirs:   app
    default-language: Haskell2010

In other words, we the packages criterion and arithmoi as dependencies.

The initial content of Main.hs is

module Main where

import Criterion.Main (bench, bgroup, defaultMain, nf)
import Data.List (sort, subsequences)
import Math.NumberTheory.Primes (factorise, primes, unPrime)

naive :: [Integer] -> [Integer]
naive = undefined

mergeBased :: [Integer] -> [Integer]
mergeBased = undefined

numPrimes :: Int
numPrimes = 10

main :: IO ()
main = do
  let ps = map unPrime $ take numPrimes primes
  let n = 2^numPrimes - 1
  let n1 = 100
  let n2 = floor $ fromIntegral n * 0.50
  let n3 = floor $ fromIntegral n * 0.99
  defaultMain 
    [
      bgroup "naive" 
        [ bench (show n1) $ nf (\xs -> naive xs !! n1) ps
        , bench (show n2) $ nf (\xs -> naive xs !! n2) ps
        , bench (show n3) $ nf (\xs -> naive xs !! n3) ps
        ]
    , bgroup "mergeBased"
        [ bench (show n1) $ nf (\xs -> mergeBased xs !! n1) ps
        , bench (show n2) $ nf (\xs -> mergeBased xs !! n2) ps
        , bench (show n3) $ nf (\xs -> mergeBased xs !! n3) ps
        ]
    ]

We will gradually fill in the details for the functions naive and mergeBased.

Essentially, we want to see how fast accessing elements at various indices is. The speed clearly depends on how fast such elements are computed. The indices are chosen to be near the front, around the middle, and near the end.

Naïve method

For the first method, we generate a list of all possible nonemtpy sublists of the given list of primes, compute the product of the elements of each sublist, and then sort the resulting list. This is implemented in the following:

naive :: [Integer] -> [Integer]
naive ps = sort $ map product (tail $ subsequences ps)

Let us give this a try in ghci by issuing the command cabal repl in the squarefree folder:

λ> naive [2,5,11]
[2,5,10,11,22,55,110]

The result is as expected.

Merge-based method

One main drawback of naive is that the entire list must be constructed. If we are only interested in the entries near the front, then a lot of computation is wasted. Is there a way to generate the entries in the order they should appear? The following merge-based method does the trick.

mergeBased :: [Integer] -> [Integer]
mergeBased [] = []
mergeBased (p:ps) = let ys = mergeBased ps in p:merge ys (map (p*) ys)

merge :: (Ord a) => [a] -> [a] -> [a]
merge xs@(x:_) [] = xs
merge [] ys = ys
merge (x:xs) (y:ys) = case compare x y of
                        LT -> x:merge xs (y:ys)
                        GT -> y:merge (x:xs) ys
                        EQ -> x:merge xs ys

The function merge merges two sorted list into one sorted list and is used in merge sort. The reasoning behind mergeBased is that given an ascending list of primes p:ps, to obtain an ascending list of square-free numbers from these primes, we can first obtain a list of square-free numbers, which we call ys, from the primes in ps, merge it with the list with every element in ys multiplied by p, and put p in front of the result.

Results

With cabal run, we obtain the following results:

benchmarking naive/100
time                 101.7 μs   (100.7 μs .. 102.7 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 101.0 μs   (100.4 μs .. 101.6 μs)
std dev              1.990 μs   (1.551 μs .. 2.523 μs)
variance introduced by outliers: 15% (moderately inflated)

benchmarking naive/511
time                 155.3 μs   (154.3 μs .. 156.3 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 154.6 μs   (154.0 μs .. 155.2 μs)
std dev              2.087 μs   (1.815 μs .. 2.439 μs)

benchmarking naive/1012
time                 208.9 μs   (207.2 μs .. 210.7 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 208.8 μs   (207.5 μs .. 210.0 μs)
std dev              4.064 μs   (3.473 μs .. 4.871 μs)
variance introduced by outliers: 13% (moderately inflated)

benchmarking mergeBased/100
time                 4.672 μs   (4.651 μs .. 4.692 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.651 μs   (4.627 μs .. 4.668 μs)
std dev              68.44 ns   (57.62 ns .. 81.46 ns)
variance introduced by outliers: 13% (moderately inflated)

benchmarking mergeBased/511
time                 22.22 μs   (22.14 μs .. 22.31 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 22.24 μs   (22.19 μs .. 22.29 μs)
std dev              192.6 ns   (159.3 ns .. 237.1 ns)

benchmarking mergeBased/1012
time                 41.83 μs   (41.60 μs .. 42.08 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 41.66 μs   (41.56 μs .. 41.78 μs)
std dev              393.6 ns   (292.2 ns .. 521.1 ns)

Changing numPrimes to 20, we obtain

benchmarking naive/100
time                 204.5 ms   (195.9 ms .. 213.8 ms)
                     0.998 R²   (0.992 R² .. 1.000 R²)
mean                 211.4 ms   (206.9 ms .. 215.3 ms)
std dev              5.542 ms   (3.585 ms .. 8.460 ms)
variance introduced by outliers: 14% (moderately inflated)

benchmarking naive/524287
time                 814.7 ms   (784.7 ms .. 843.9 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 814.6 ms   (807.2 ms .. 822.0 ms)
std dev              9.378 ms   (4.381 ms .. 12.05 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking naive/1038089
time                 1.348 s    (1.286 s .. 1.439 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 1.365 s    (1.353 s .. 1.386 s)
std dev              20.47 ms   (1.835 ms .. 25.43 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking mergeBased/100
time                 4.989 μs   (4.967 μs .. 5.011 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.005 μs   (4.994 μs .. 5.017 μs)
std dev              38.32 ns   (29.76 ns .. 50.07 ns)

benchmarking mergeBased/524287
time                 47.19 ms   (45.64 ms .. 48.69 ms)
                     0.997 R²   (0.995 R² .. 0.999 R²)
mean                 47.18 ms   (46.56 ms .. 47.75 ms)
std dev              1.175 ms   (989.0 μs .. 1.397 ms)

benchmarking mergeBased/1038089
time                 89.15 ms   (86.89 ms .. 90.85 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 89.65 ms   (88.88 ms .. 90.40 ms)
std dev              1.272 ms   (921.1 μs .. 1.703 ms)

The results are overwhelmingly in favour of mergeBased.

Generating all square-free numbers

We could have ended the article if there weren’t the problem of generating all square-free positive integers without restrictions on the primes used. A lot can be said about these numbers. (See the Wikipedia article for more.)

There is no simple way to modify naive to accept an infinite list of primes. But mergeBased works fine without any modification:

λ> sf = 1:(mergeBased $ map unPrime primes)
λ> take 28 sf
[1,2,3,5,6,7,10,11,13,14,15,17,19,21,22,23,26,29,30,31,33,34,35,37,38,39,41,42]
λ> sf !! 999 -- the 1000th square-free integer
1637

One can check that these results are indeed correct. (By convention, 1 is regarded as a square-free integer.) Unfortunately, trying to get to the 50,000th square-free number takes more than a few seconds.

λ> sf !! 49999
82257

Surprisingly, the following code lifted from Rosetta Code is almost instantaneous on my 2020 27-inch iMac with a 3.8GHz 8-Core Intel Core i7 processor running macOS Monterey 12.6:

λ> sf' = filter (all ((== 1) . snd) . factorise) [1..]
λ> sf' !! 49999
82257

Remarkably, the following takes less than a second

λ> sf' !! 499999
822463

whereas the following didn’t terminate even after 10 minutes with peak memory usage by ghc over 60GB!

λ> sf !! 499999
^CInterrupted.

On the surface, running through every positive integer and filtering out those that are not square-free by looking at its prime factorization does not seem as elegant as the merge-based approach. Two immediate questions follow:

What exactly is making mergeBased so slow?
Is there an even faster way to generate square-free integers?

As of writing, I have no answers to this. If you have ideas, feel free to contact me.