random>=1.2
August 30, 2022
random
version 1.2There are a number of Haskell libraries for pseudo-random number
generation. Among these, perhaps random
is
the most popular based on a search on the term “random” on Hackage.
The random
library has a rather long history. Version 1.0
was available since 2007 and its usage has been covered in numerous
tutorials and books.
In 2020, version 1.2.0 was released. It was a massive effort by a number of individuals, solving the long-standing performance issue and introducing new tools for users of this library. Those interested to know a bit about the history surrouding version 1.2.0 can read Alexey Kuleshevich’s blog post or watch his talk at Zurich Haskell meetup. The talk contains an excellent introduction to what is new. The slides for the talk are hosted on Github. There is also a podcast with Leonhard Markert and Dominic Steinitz that is quite informative. The new features add tremendous power to the library, making it a great choice for situations where pseudo-random number generation is required.
The purpose of this article is to illustrate the basic usage of
random
version 1.2. It is meant to help Haskell programmers
new to pseudo-random number generation in Haskell to get started. The
readers are assumed to be familiar with monadic and stateful
computations in Haskell.
Working with random numbers in Haskell is arguably not as
straightforward as in imperative languages such as C/C++, Python, Julia,
etc. In Julia, for example, we can create a vector v
of
random numbers between 0 and 1 of length 100 as follows:
using Random
= rand(100) v
For reproducibility, we can specify a random seed before generating pseudo-random numbers.
In Haskell, the dependence on a generator is often quite explicit unless one uses an interface that is wrapped in a context where mutations can happen behind the scenes. The need to specify a generator even for the simplest pseudo-random number generation functions seems like an annoyance but such is the price to pay for pure computations. Fortunately, there are patterns that one can follow to make the experience as painless as possible.
In any case, random>=1.2
provides a pure
pseudo-random number interface through System.Random
and a monadic one through System.Random.Stateful
.
We will look at each in turn.
System.Random
System.Random
contains the functions
uniform :: (RandomGen g, Uniform a) => g -> (a, g)
uniformR :: (RandomGen g, UniformRange a) => (a, a) -> g -> (a, g)
The latter takes a range specified as a pair of (inclusive) end-values and a random generator, and returns an element from the range along with a new generator.
The type class RandomGen
requires at least the implementation of split
and one of
genWord32
, genWord64
, or next
together with genRange
. Implementing next
and
genRange
seems to be discouraged but is left in the
interface for backward compatibility.
The type class UniformRange
has been defined for a wide range of types which include types such as
Int
,
Word
,
and Double
. The
full list can be found in the documentation.
For the code that follows, we need to import System.Random
.
We can simulate a four-sided die and a six-sided die via the
functions
fourSidedDie :: (RandomGen g) => g -> (Int, g)
= uniformR (1, 4) fourSidedDie
and
sixSidedDie :: (RandomGen g) => g -> (Int, g)
= uniformR (1, 6) sixSidedDie
Casting simultaneously a four-sided die and a six-sided can then be simulated via the function
rollFourAndSix :: (RandomGen g) => g -> ((Int,Int), g)
= let (r4, g') = fourSidedDie g
rollFourAndSix g = sixSidedDie g'
(r6, g'') in ((r4, r6), g'')
“Threading through” the new generators in the let bindings looks a
bit tedious and suggests sequencing the computations monadically which
we will see later on. In the meantime, we can put the function into
action in ghci
:
> rollFourAndSix (mkStdGen 134)
λ3,3),StdGen {unStdGen = SMGen 15797293889117377061 16743392734764332447}) ((
The function mkStdGen
takes an integer random seed
(134
in this case) and returns an object of type StdGen
which
is an instance of RandomGen
. (In
the above, SMGen
is from
splitmix
.)
Unless the value of the seed to mkStdGen
is changed, we
will get the same result every time the code above is run with no sign
of randomness. This of course is to be expected since
rollFourAndSix
is a pure function and it will always
returns the same output on the same input.
Aside: The custom prompt λ>
for the
REPL in the above can be set via :set prompt "λ> "
It
can also be added to .ghci
in the home folder to set it as
the default.
We can simulate a sequence of pairs of rolls until the sum of the values is at least eight with
untilAtLeastEight :: (RandomGen g) => g -> ([(Int,Int)], g)
untilAtLeastEight g| uncurry (+) r >= 8 = ([r], g')
| otherwise = let (rs, g'') = untilAtLeastEight g'
in (r:rs, g'')
where
= rollFourAndSix g (r, g')
Let us give this a run in ghci
:
> untilAtLeastEight (mkStdGen 134)
λ3,3),(4,3),(3,1),(1,5),(2,6)],StdGen {unStdGen = SMGen 2170483177555623709 16743392734764332447}) ([(
Implement the function
untilFourSidedBigger :: (RandomGen g) => g -> ([Int], g)
that alternately rolls a 6-sided die and a 4-sided die, starting with a 6-sided die, until the value of the 4-sided die roll is strictly bigger than the value of immediately preceding 6-sided die roll.
We saw in the above that every time we generate a pseudo-random
number, we need a new generator. Hence, if we want to obtain a sequence
of pseudo-random numbers, we can wrap functions that return random
numbers in some sort of context (monad) that mutates the generator and
sequence our computations monadically. There are a number of ways
towards achieving this goal. In fact, System.Random.Stateful
provides all the necessary machinery for monadic pseudo-random number
generation. However, we opt to first roll our own “stateful” solution to
develop better intuition.
State
One option for our purpose is the State
monad.
Let us begin by creating a file named stateful.hs
:
-- stateful.hs
module Stateful where
import Control.Monad.State
import System.Random
fourSidedDie' :: (RandomGen g) => State g Int
= do
fourSidedDie' <- gets $ uniformR (1,4)
(val, g)
put greturn val
Note that fourSidedDie'
is not a function; it is an
Int
wrapped in State g
. To
obtain a die roll and a new generator, we perform
runState fourSidedDie' g
for some g
of type that is an instance of RandomGen
. Let
us try this in ghci
using mkStdGen :: Int -> StdGen
to create a generator:
> :l stateful.hs
λ1 of 1] Compiling Stateful ( stateful.hs, interpreted )
[Ok, one module loaded.
> runState fourSidedDie' (mkStdGen 134)
λ3,StdGen {unStdGen = SMGen 17500645228062596230 16743392734764332447}) (
We can use the function
state :: MonadState s m => (s -> (a, s)) -> m a
to simplify fourSidedDie'
to
fourSidedDie'' :: (RandomGen g) => State g Int
= state $ uniformR (1,4) fourSidedDie''
Adding this to stateful.hs
and trying it out in
ghci
, we obtain the same result:
> :l stateful.hs
λ1 of 1] Compiling Stateful ( stateful.hs, interpreted )
[> runState fourSidedDie'' (mkStdGen 134)
λ3,StdGen {unStdGen = SMGen 17500645228062596230 16743392734764332447}) (
Using State
makes it
easy to generate a list of pseudo-random numbers as illustrated by the
following function that simulates rolling a k-sided die n times:
kSidedRolls :: (RandomGen g) => Int -> Int -> g -> ([Int], g)
= let kSided = state $ uniformR (1, k)
kSidedRolls k n g in runState (replicateM n kSided) g
We can now simulate rolling a 12-sided die 10 times in
ghci
:
> :l stateful.hs
λ1 of 1] Compiling Stateful ( stateful.hs, interpreted )
[> kSidedRolls (mkStdGen 134) 12 10
λ3,3,12,11,3,9,10,8,6,2],StdGen {unStdGen = SMGen 13803821895484298649 16743392734764332447}) ([
ST
Another option is to wrap a generator within an STRef
and
sequence the generation of pseudo-random numbers in ST s
. Note
that the type s
here is a phantom type and is not
the type of a random generator. Working with ST
can feel a
bit magical. However, no attempt will be made in this article to
demystify it as some good answers are already available on Stack
Overflow. Meanwhile, we illustrate using ST
with a
concrete example.
Let us add the following imports to stateful.hs
:
import Control.Monad.ST
import Data.STRef
We define an alternate version of kSidedRolls
, using
ST
under
the hood:
kSidedRolls' :: (RandomGen g) => Int -> Int -> g -> ([Int], g)
=
kSidedRolls' k n g $ do
runST <- newSTRef g
gRef <- replicateM n (kSided gRef)
rs <- readSTRef gRef
g' return (rs, g')
where
= do
kSided gRef <- uniformR (1, k) <$> readSTRef gRef
(val, g')
writeSTRef gRef g'return val
> :l stateful.hs
λ1 of 1] Compiling Stateful ( stateful.hs, interpreted )
[> kSidedRolls' (mkStdGen 134) 12 10
λ3,3,12,11,3,9,10,8,6,2],StdGen {unStdGen = SMGen 13803821895484298649 16743392734764332447}) ([
Note that the output is exactly the same as that of
kSidedRolls
above. But we might ask, is there any practical
difference between kSidedRolls
and
kSidedRolls'
? To find out, let us do a bit of benchmarking
using criterion
.
(Those new to criterion
could consult this
tutorial.)
To manage the growing codebase and accommodate future examples, we
create a cabal
project called randomthings
with the following folder structure
randomthings
├── CHANGELOG.md
├── app
│ └── Main.hs
├── lib
│ └── stateful.hs
└── randomthings.cabal
where randomthings.cabal
contain the lines
library stateful
exposed-modules: Stateful
hs-source-dirs: lib
build-depends: base ^>=4.14.3.0
, random
, mtl
, vector
ghc-options: -Wall -O2
default-language: Haskell2010
executable randomthings
main-is: Main.hs
-- Modules included in this executable, other than Main.
-- other-modules:
-- LANGUAGE extensions used by modules in this package.
-- other-extensions:
build-depends: base ^>=4.14.3.0
, criterion
, random
, stateful
hs-source-dirs: app
default-language: Haskell2010
and the content of Main.hs
is
module Main where
import Criterion.Main (bench, bgroup, defaultMain, nf)
import System.Random (mkStdGen)
import Stateful
main :: IO ()
= do
main let g = mkStdGen 134
let k = 6
let n1 = 10000
let n2 = 100000
let n3 = 1000000
defaultMain
["State" [ bench (show n1) $ nf (kSidedRolls k n1) g
bgroup show n2) $ nf (kSidedRolls k n2) g
, bench (show n3) $ nf (kSidedRolls k n3) g
, bench (
]"ST" [ bench (show n1) $ nf (kSidedRolls' k n1) g
, bgroup show n2) $ nf (kSidedRolls' k n2) g
, bench (show n3) $ nf (kSidedRolls' k n3) g
, bench (
] ]
On my machine, cabal -v0 run
gives the following
results:
benchmarking State/10000
time 706.8 μs (702.2 μs .. 712.0 μs)
1.000 R² (0.999 R² .. 1.000 R²)
mean 716.2 μs (711.3 μs .. 724.7 μs)
std dev 20.84 μs (12.31 μs .. 32.71 μs)
variance introduced by outliers: 20% (moderately inflated)
benchmarking State/100000
time 7.297 ms (7.255 ms .. 7.344 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 7.236 ms (7.202 ms .. 7.270 ms)
std dev 96.04 μs (78.23 μs .. 120.2 μs)
benchmarking State/1000000
time 72.95 ms (72.34 ms .. 73.76 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 72.59 ms (72.33 ms .. 72.96 ms)
std dev 554.9 μs (345.3 μs .. 741.5 μs)
benchmarking ST/10000
time 279.9 μs (278.0 μs .. 281.8 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 279.6 μs (278.3 μs .. 280.6 s)
std dev 3.767 μs (3.366 μs .. 4.438 μs)
benchmarking ST/100000
time 5.023 ms (5.006 ms .. 5.040 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 4.999 ms (4.976 ms .. 5.031 ms)
std dev 80.93 μs (58.87 μs .. 106.1 μs)
benchmarking ST/1000000
time 56.60 ms (55.52 ms .. 57.56 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 56.75 ms (56.07 ms .. 58.09 ms)
std dev 1.757 ms (740.4 μs .. 3.160 ms)
Judging from the mean times, the ST
version is
speedier. However, we can notice that as the number of rolls increases,
the relative performance gap decreases. It might be that the overhead in
list creation begins to dominate the computation time as the list
becomes longer. To test this, let us use vectors instead of lists.
We add the following to stateful.hs
:
import Data.Vector.Unboxed (Vector)
import qualified Data.Vector.Unboxed as V
kSidedRollsV :: (RandomGen g) => Int -> Int -> g -> (Vector Int , g)
= let kSided = state $ uniformR (1, k)
kSidedRollsV k n g in runState (V.replicateM n kSided) g
kSidedRollsV' :: (RandomGen g) => Int -> Int -> g -> (Vector Int, g)
=
kSidedRollsV' k n g $ do
runST <- newSTRef g
gRef <- V.replicateM n (kSided gRef)
rs <- readSTRef gRef
g' return (rs, g')
where
= do
kSided gRef <- uniformR (1, k) <$> readSTRef gRef
(val, g')
writeSTRef gRef g'return val
We also replace the defaultMain
section in
Main.hs
with
defaultMain
[ "State/Vector" [ bench (show n1) $ nf (kSidedRollsV k n1) g
bgroup show n2) $ nf (kSidedRollsV k n2) g
, bench (show n3) $ nf (kSidedRollsV k n3) g
, bench (
]"ST/Vector" [ bench (show n1) $ nf (kSidedRolls' k n1) g
, bgroup show n2) $ nf (kSidedRollsV' k n2) g
, bench (show n3) $ nf (kSidedRollsV' k n3) g
, bench (
] ]
Here are the results:
benchmarking State/Vector/10000
time 516.3 μs (513.4 μs .. 519.4 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 529.3 μs (522.5 μs .. 539.7 μs)
std dev 26.82 μs (18.43 μs .. 35.02 μs)
variance introduced by outliers: 45% (moderately inflated)
benchmarking State/Vector/100000
time 4.974 ms (4.923 ms .. 5.033 ms)
0.999 R² (0.999 R² .. 0.999 R²)
mean 5.262 ms (5.196 ms .. 5.365 ms)
std dev 241.3 μs (151.7 μs .. 396.2 μs)
variance introduced by outliers: 24% (moderately inflated)
benchmarking State/Vector/1000000
time 49.52 ms (48.76 ms .. 50.27 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 51.15 ms (50.62 ms .. 52.00 ms)
std dev 1.314 ms (821.0 μs .. 2.094 ms)
benchmarking ST/Vector/10000
time 162.5 μs (161.4 μs .. 163.4 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 161.3 μs (160.7 μs .. 162.0 μs)
std dev 2.126 μs (1.831 μs .. 2.480 μs)
benchmarking ST/Vector/100000
time 1.677 ms (1.669 ms .. 1.685 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.679 ms (1.674 ms .. 1.684 ms)
std dev 16.41 μs (14.24 μs .. 20.49 μs)
benchmarking ST/Vector/1000000
time 15.93 ms (15.83 ms .. 16.01 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 16.03 ms (15.97 ms .. 16.10 ms)
std dev 165.3 μs (124.5 μs .. 222.1 μs)
For each vector lenght, the mean time of the ST
version is
just over 30% of the mean time of the State
version.
Also, the ST
version
with vectors is substantially faster than ST
version
with lists. In particular, for 1 million entries, the mean time for the
ST
version with vectors is 28.2% of the mean time for the ST
version
with lists. And for 100,000 entries, it is 33.6%.
Indeed, there is a lot of overhead in creating long lists. For ideas on generating large vectors of random numbers even faster, see this response on Github.
System.Random.Stateful
The System.Random.Stateful
interface can be a bit overwhelming at first sight. There are a number
of available options designed for different purposes. We will begin by
walking through the types and discussing some of the nuances.
First of all, the monadic counterpart to uniform
is
uniformM :: (Uniform a, StatefulGen g m) => g -> m a
The type class StatefulGen
is
multi-parametered and therefore requires the language extension
MultiParamTypeClasses
. Let us get some info in
ghci
:
> import System.Random.Stateful
λ> :i StatefulGen
λtype StatefulGen :: * -> (* -> *) -> Constraint
class Monad m => StatefulGen g m where
uniformWord32R :: GHC.Word.Word32 -> g -> m GHC.Word.Word32
uniformWord64R :: GHC.Word.Word64 -> g -> m GHC.Word.Word64
uniformWord8 :: g -> m GHC.Word.Word8
uniformWord16 :: g -> m GHC.Word.Word16
uniformWord32 :: g -> m GHC.Word.Word32
uniformWord64 :: g -> m GHC.Word.Word64
uniformShortByteString :: Int
-> g
-> m bytestring-0.10.12.0:Data.ByteString.Short.Internal.ShortByteString
uniformShortByteString :: MonadIO m =>
defaultInt
-> g
-> m bytestring-0.10.12.0:Data.ByteString.Short.Internal.ShortByteString
{-# MINIMAL (uniformWord32 | uniformWord64) #-}
-- Defined in ‘System.Random.Internal’
instance RandomGen g => StatefulGen (STGenM g s) (ST s)
-- Defined in ‘System.Random.Stateful’
instance (RandomGen g, MonadIO m) => StatefulGen (IOGenM g) m
-- Defined in ‘System.Random.Stateful’
instance (RandomGen g, MonadIO m) => StatefulGen (AtomicGenM g) m
-- Defined in ‘System.Random.Stateful’
instance (RandomGen g, MonadState g m) =>
StatefulGen (StateGenM g) m
-- Defined in ‘System.Random.Internal’
We can see four defined instances already. If we add import Control.Concurrent.STM
,
we will also see instance RandomGen g => StatefulGen (TGenM g) STM
which is available from version 1.2.1 onwards. In particular, the
instance
instance (RandomGen g, MonadState g m) =>
StatefulGen (StateGenM g) m
means that there exists, for example, an implementation of uniformWord32::StatefulGen g m => g -> m Word32
that takes an input of type StateGenM g
and returns something of type m Word32
.
These instances bring into light the existence of various monadic
adapters: StateGenM
,
AtomicGenM
,
IOGenM
,
STGenM
and TGenM
. The
documentation describes how they differ in performance and safety in the
presence of exceptions and concurrency. The following functions help
create the appropriate adapter from from a pure pseudo-random
generator:
newAtomicGenM :: MonadIO m => g -> m (AtomicGenM g)
newIOGenM :: MonadIO m => g -> m (IOGenM g)
newSTGenM :: g -> ST s (STGenM g s)
newTGenM :: g -> STM (TGenM g)
StateGenM g
is
a bit special as it is a phantom type with constructor StateGenM
:
> :t StateGenM
λStateGenM :: StateGenM g
Recall the type signature of uniformM
:
uniformM :: (Uniform a, StatefulGen h m) => h -> m a
(We have renamed the type of the input argument to h
to
minimize confusion in what follows.) The input argument type
h
can be anything provided that there exists an instance
definition for StatefulGen h m
.
Hence, the function
= uniformM StateGenM f
is a specialization of uniformM
with type
Uniform a, StatefulGen (StateGenM g) m) => m a (
The instance definition that matches StatefulGen (StateGenM g) m)
is
instance (RandomGen g, MonadState g m) => StatefulGen (StateGenM g) m
Hence, the type of f
simplifies to
Uniform a, RandomGen g, MonadState g m) => m a (
which agrees with what we get in ghci
:
> :t f
λf :: (Uniform a, RandomGen g, MonadState g m) => m a
To obtain an element of type that is an instance of Uniform
(e.g. Int
), we apply
runState
to f
and an initial generator:
> fst $ runState f (mkStdGen 137)::Int
λ7879794327570578227
Instead, we could have used runStateGen_
provided by the
interface:
> runStateGen_ (mkStdGen 137) uniformM::Int
λ7879794327570578227
We can use runStateGen
instead if also want the new
generator:
> runStateGen (mkStdGen 137) uniformM::(Int, StdGen)
λ7879794327570578227,StdGen {unStdGen = SMGen 11285859549637045894 7641485672361121627}) (
We mention in passing that the library mwc-random
since version 0.15.0 has the following instance definition:
import qualified System.Random.Stateful as Random
⋮instance (s ~ PrimState m, PrimMonad m) => Random.StatefulGen (Gen s) m where
= uniformR (0, u)
uniformWord32R u {-# INLINE uniformWord32R #-}
= uniformR (0, u)
uniformWord64R u {-# INLINE uniformWord64R #-}
= uniform
uniformWord8 {-# INLINE uniformWord8 #-}
= uniform
uniformWord16 {-# INLINE uniformWord16 #-}
= uniform
uniformWord32 {-# INLINE uniformWord32 #-}
= uniform
uniformWord64 {-# INLINE uniformWord64 #-}
= unsafeSTToPrim (Random.genShortByteStringST n (uniform g))
uniformShortByteString n g {-# INLINE uniformShortByteString #-}
The constraint PrimMonad m
essentially says that m
is IO
or ST s
. PrimState m
is
an associated
data type. Those interested in learning more can see the discussion
on Stack
Overflow.
For this example, we need random
and
mwc-random
instsalled. Let us define a couple of
functions
> import System.Random.Stateful
λ> :{
λkSidedDieM :: (StatefulGen g m) => Int -> g -> m Int
= uniformRM (1, k)
kSidedDieM k
kSidedRollsM :: (StatefulGen g m) => Int -> Int -> g -> m [Int]
= replicateM n . kSidedDieM k
kSidedRollsM k n :}
We can now simulate rolling a 6-sided die 12 times in a few different ways using different adapters:
> pureGen = mkStdGen 134
λ> :{
λgo :: (StatefulGen g m) => g -> m [Int]
= kSidedRollsM 6 12
go :}
> runStateGen_ pureGen go
λ3,3,4,3,3,1,5,5,2,6,6,6]
[> newIOGenM pureGen >>= go
λ3,3,4,3,3,1,5,5,2,6,6,6]
[> newAtomicGenM pureGen >>= go
λ3,3,4,3,3,1,5,5,2,6,6,6]
[> runST $ newSTGenM pureGen >>= go
λ3,3,4,3,3,1,5,5,2,6,6,6] [
We can also use a generator from mwc-random
:
> import qualified System.Random.MWC as MWC
λ> MWC.create >>= go
λ3,4,3,1,4,6,1,6,1,4,2,2] [
Write benchmarking code comparing the performance of the different
monadic adapters and the generator from mwc-random
.
We have covered the basics of random>=1.2
. The
improved performance and the new features help make the library become a
unifying framework for pseudo-random number generation in Haskell. The
flexibility of the inferface offers fine-grained control on computing
with generate pseudo-random numbers depending on the needs of an
application. Consequently, the library has an inherent complexity that
takes time to master. In future articles, we will look at more examples
of usage and other advanced features of the library.