Zero-dependency random number generation in Rust

12 minute read Published: 2023-01-03

Let's investigate how to generate random numbers without external dependencies in Rust.

https://xkcd.com/221/

Random numbers are very interesting. It feels like magic that we can generate such unpredictable entropies from deterministic sources.

But how does this happen? Before jumping into the generation of random numbers in Rust, let's understand the process of random number generation and how true randomness could never be created without special hardware.

Is random really random?

First of all, nothing is truly random when we consider the modern computers that we use on daily basis. This is because they are designed to be deterministic. What this means is that, given some initial state and an operation to perform, we can predict exactly how the machine will evolve. Simply put, for the given inputs 2 and 2, we know the operation 2+2 will be outputting 4. This makes total sense since we don't want to wonder what an addition will do, we just want to 'add'. However, this situation of certainty is incompatible with generating truly random numbers. From the definition of "truly", we understand that there should be absolutely no way to predict it, regardless of the information we have. Although it seems impossible to predict a random number without certain quantum-mechanical processes, since it would require knowing the state of every molecule in the atmosphere, we still cannot say that what we generate is truly random.

But then, how do we generate 'random' numbers? Well, the answer is, in most cases, random is not so random. That's when pseudorandomness steps into the scene.

Pseudo-random sequences typically exhibit statistical randomness while being generated by an entirely deterministic causal process.

Cryptographically secure pseudorandom number generators (CSPRNGs) are very important in computing. Without them, we wouldn't have encryption algorithms. They emulate randomness by taking a statistically random sequence and using formulas to make it "random enough".

The math behind PRNG might be complex but in general, it requires two steps:

  1. Provide the PRNG with an arbitrary seed.
  2. Ask for the next random number.

The seed is a starting point for creating random numbers. If the seed changes, the generated numbers also change. If the same seed is used, the same number is generated. The current timestamp is often used as a unique seed value. That precise time never occurs again, so a PRNG with that seed should produce a unique set of random numbers.

Now that we understand how random is not truly random and we can use pseudorandom with a seed to generate "random enough" numbers, let's see the ways of using PRNG in Rust and whether it is possible to do without extra dependencies. Unfortunately, there isn't a widely accepted and conventional way to generate random numbers by only using the core/standard library so we will have to be a little bit creative.

But hey, it will be fun!


Using rand

Probably the most straightforward of generating random numbers is using the rand crate.

Here are some examples:

use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();

    let n1: u8 = rng.gen();
    let n2: u16 = rng.gen();

    // Prints 57, 31141, 1371217128, 1901352909, 0.5792015593471697, etc.
    println!("Random u8: {}", n1);
    println!("Random u16: {}", n2);
    println!("Random u32: {}", rng.gen::<u32>());
    println!("Random i32: {}", rng.gen::<i32>());
    println!("Random float: {}", rng.gen::<f64>());
}

Generates random numbers with help of random-number generator rand::Rng obtained via rand::thread_rng. Each thread has an initialized generator. Integers are uniformly distributed over the range of the type, and floating point numbers are uniformly distributed from 0 up to but not including 1.

If we want to give a seed and generate the same number every time we run the program, then we can change the PRNG implementation as follows:

use rand::{Rng, SeedableRng};
use rand_pcg::Pcg64;

const SEED: u64 = 42;

fn main() {
    // Uses PCG random number generator (XSL RR 128/64 (LCG) variant).
    let mut rng = Pcg64::seed_from_u64(SEED);

    // Prints "Random u8: 9" every time we run.
    println!("Random u8: {}", rng.gen::<u8>());
}

However, we're still far from our objective, which is generating random numbers without dependencies. Even rand with default features has several transitive dependencies:

$ cargo tree

testapp v0.1.0
└── rand v0.8.5
    ├── libc v0.2.139
    ├── rand_chacha v0.3.1
    │   ├── ppv-lite86 v0.2.17
    │   └── rand_core v0.6.4
    │       └── getrandom v0.2.8
    │           ├── cfg-if v1.0.0
    │           └── libc v0.2.139
    └── rand_core v0.6.4 (*)

Maybe we can look into reducing the transitive dependency count somehow before trying to achieve our 0-dependency goal.

I'm not sure how usable rand will be in this case, but we can simply remove the default features:

[dependencies]
rand = { version = "0.8.5", default-features = false }

Then we have:

$ cargo tree

testapp v0.1.0
└── rand v0.8.5
    └── rand_core v0.6.4

Still not looking good? Well, luckily, we have an alternative crate to use.


Using fastrand

fastrand is a simple and fast random number generator that uses wyhash. This hash function is also simple and fast but not cryptographically secure.

So our previous examples with rand become the following:

fn main() {
    // Prints 119, 67, 233, etc.
    println!("Random u8: {}", fastrand::u8(..));

    // Pick an arbitrary number as seed.
    fastrand::seed(42);

    // Prints e.g. 52 every time we run.
    println!("Random u8 with seed: {}", fastrand::u8(..));
}

How about our dependency count?

$ cargo tree

testapp v0.1.0
└── fastrand v1.8.0

Well, it's good, but still not quite there. Technically, this approach is not 0-deps since fastrand itself is a dependency to our program.

Our goal is to have a single function that is simple enough for getting a random number every time we call it.

Hmm, let's see what we can do.


Zero-dependency Random Number Generation

There is more than one way to simulate entropy to generate random numbers and I will be mentioning the most suggested/interesting ways that I came across on Reddit and Rust users forum.


Nanoseconds

Here is the poor man's random number generator which only takes the nanoseconds of the current time:

use std::error::Error;
use std::time::{SystemTime, UNIX_EPOCH};

fn main() -> Result<(), Box<dyn Error>> {
    let nanos = SystemTime::now().duration_since(UNIX_EPOCH)?.subsec_nanos();

    // Prints 864479511, 455850730, etc.
    println!("Random number: {nanos}");
    Ok(())
}

In this example, we take the current time which is expressed as duration since epoch, and extract the number of nanoseconds elapsed in the current second. For example, if the current time is "10:20:30.123456789" then our 'random' number will be 123456789. The problem here is that the number will be always between zero and 1 billion. We need to use a bit of modulo arithmetic to get a random number in the desired range.

Well, this was suboptimal. Let's see what else we can do.


/dev/urandom

In Unix-like operating systems, /dev/random and /dev/urandom are special files that serve as cryptographically secure pseudorandom number generators. They allow access to environmental noise collected from device drivers and other sources.

The difference between /dev/random and /dev/urandom is that /dev/random only returns random bytes within the estimated number of bits of noise in the entropy pool whereas /dev/urandom does not block waiting for more entropy. Thus /dev/urandom is theoretically vulnerable to a cryptographic attack on the algorithms used by the driver and less random than /dev/random. /dev/random is more suitable for very high-quality randomness applications such as one-time pad or key generation.

(If you would like to learn more about these magical files, see Myths about /dev/urandom)

In our case, we don't want to wait for the entropy pool until additional environmental noise is gathered so we can use /dev/urandom.

(Edit after Foxboron's comment: Apparently since the March of 2022 / Kernel 5.6, there is no difference between /dev/random and /dev/urandom. See this article.)

An example usage of /dev/urandom which generates a random string is the following:

$ cat /dev/urandom | base64 | head -c 10

66Dphl5MAh

In Rust, we can utilize /dev/urandom like so:

use std::fs::File;
use std::io::{Read, Result};

fn main() -> Result<()> {
    let mut rng = File::open("/dev/urandom")?;

    let mut buffer = [0u8; 1];
    rng.read_exact(&mut buffer)?;

    // Prints 92, 119, 122, etc.
    println!("Random u8: {}", buffer[0]);
    Ok(())
}

Although this solution will work perfectly fine on Unix-based operating systems, we won't be able to run this on other operating systems such as Microsoft Windows since /dev/urandom doesn't exist.

But hey, we got our RNG without extra dependencies!

$ cargo tree

testapp v0.1.0

Interestingly enough, OsRng implementation of rand also depends on this approach:

The implementation is provided by the getrandom crate.

And getrandom crate reads /dev/urandom under the hood:

Uses getrandom system call if available, otherwise /dev/urandom after successfully polling /dev/random.

Wait, did I just read "getrandom system call"? I wonder if we can we utilize that for generating random numbers.


getrandom(2)

#include <sys/random.h>

ssize_t getrandom(void *buf, size_t buflen, unsigned int flags);

The getrandom() system call fills the buffer pointed to by buf with up to buflen random bytes. These bytes can be used to seed user-space random number generators or for cryptographic purposes. By default, getrandom() draws entropy from the urandom source (i.e., the same source as the /dev/urandom device). This behavior can be changed via the flags argument.

In order to use this system call in Rust, we need to define own our extern function or use the libc crate since Unix systems expose the syscalls through libc.

// https://man7.org/linux/man-pages/man2/getrandom.2.html
#[cfg(not(target_os = "windows"))]
#[link(name = "c")]
extern "C" {
    fn getrandom(buf: *const u8, buflen: usize, flags: u32) -> usize;
}

fn main() {
    let buffer = [0u8; 1];
    unsafe { getrandom(buffer.as_ptr(), 1, 2) };

    // Prints 78, 51, 241, etc.
    println!("Seed: {:?}", buffer[0]);
}

In our example, we defined an extern function with the same name as our system call (getrandom) to link to C library on the system so that we can interoperate with the C code. When we run this program, we can see that random seeds are generated using the getrandom() system call.

The downside of this approach is that it will only work on systems that getrandom() system call exists and it is inescapably using unsafe code.

But yeah, it is 0-deps on the surface, except the system call itself.


rand() / srand()

#include <stdlib.h>

int rand(void);
void srand(unsigned int seed);

Another external way of generating random numbers is to call the C library function rand(). Along with srand(), you can set the seed to use for a pseudorandom number generator. If srand() is not called, the rand() seed is set as if srand(1) was called at the program start.

We can implement this in Rust similar to getrandom() system call. For the seed, we can use the nanoseconds:

use std::error::Error;
use std::time::{SystemTime, UNIX_EPOCH};

#[link(name = "c")]
extern "C" {
    fn rand() -> i32;
    fn srand(seed: u32);
}

fn main() -> Result<(), Box<dyn Error>> {
    let seed = SystemTime::now().duration_since(UNIX_EPOCH)?.subsec_nanos();

    unsafe {
        srand(seed);

        // Prints 1741856104, 1321400234, etc.
        println!("Random number: {}", rand());
    }

    Ok(())
}

This has the same disadvantages as getrandom, it requires the C library for linking and has unsafe code.


getauxval(3)

#include <sys/auxv.h>

unsigned long getauxval(unsigned long type);

The getauxval() function retrieves values from the auxiliary vector, a mechanism that the kernel's ELF binary loader uses to pass certain information to user space when a program is executed.

Each entry in the auxiliary vector consists of a pair of values: a type that identifies what this entry represents, and a value for that type. Given the argument type, getauxval() returns the corresponding value.

// https://man7.org/linux/man-pages/man3/getauxval.3.html
#[cfg(not(target_os = "windows"))]
#[link(name = "c")]
extern "C" {
    fn getauxval(r#type: u64) -> u64;
}

fn main() {
    let rnd = unsafe { *(getauxval(25) as *const [u8; 1]) };

    // Prints 206, 120, 76, etc.
    println!("Random number: {:?}", rnd[0]);
}

The type argument of getauxval() ("25" in this case) is defined in libc as follows:

// The address of sixteen bytes containing a random value.
pub const AT_RANDOM: ::c_ulong = 25;

This solution was originally suggested by mina86ng on Reddit.


Raw pointers

Let's go for our last attempt.

The bottom line is, if we allocate something, take the pointer to it, retrieve the address, and cast it to an integer; we will get a random number. The address in the memory is not totally random, but probably random enough to do stuff™.

fn main() {
    let pointer = Box::into_raw(Box::new(42));

    // Prints 94560791661472, 94796207967136, etc.
    println!("Random number: {}", pointer as usize);
}

The problem with that approach is, apart from using a memory address for a random number, the numbers are always even since they are a memory address aligned to the size of an integer.

Also, the code above has memory leaks due to not handling the memory after calling Box::into_raw.

$ valgrind -v target/debug/testapp

==985609== HEAP SUMMARY:
==985609==     in use at exit: 4 bytes in 1 blocks
==985609==   total heap usage: 12 allocs, 11 frees, 3,185 bytes allocated
==985609==
==985609== Searching for pointers to 1 not-freed blocks
==985609== Checked 110,200 bytes
==985609==
==985609== LEAK SUMMARY:
==985609==    definitely lost: 4 bytes in 1 blocks
==985609==    indirectly lost: 0 bytes in 0 blocks
==985609==      possibly lost: 0 bytes in 0 blocks
==985609==    still reachable: 0 bytes in 0 blocks
==985609==         suppressed: 0 bytes in 0 blocks

So we can edit the code as follows for manual cleanup by explicitly running the destructor and deallocating the memory:

use std::alloc::{dealloc, Layout};

fn main() {
    let pointer = Box::into_raw(Box::new(42));

    // Prints 94560791661472, 94796207967136, etc.
    println!("Random number: {}", pointer as usize);

    unsafe {
        std::ptr::drop_in_place(pointer);
        dealloc(pointer as *mut u8, Layout::new::<u32>());
    }
}
$ valgrind -v target/debug/testapp

==986501== HEAP SUMMARY:
==986501==     in use at exit: 0 bytes in 0 blocks
==986501==   total heap usage: 12 allocs, 12 frees, 3,185 bytes allocated
==986501==
==986501== All heap blocks were freed -- no leaks are possible

Yay! No leaks.

But... Damn. Why are we even dealing with this? Let's just use a raw pointer via *const T:

fn main() {
    let num = 42u8;
    let address = &num as *const u8;

    // Prints 140736885020951, 140731585258103, etc.
    println!("{}", address as usize);
}

In this example, obtaining an address like this is completely safe. But we will need to use unsafe Rust if we want to do anything with it.

However, thanks to spacejam's comment, we can see that it is not a good entropy source.

ASLR (address space layout randomization) implementations tend to have weaknesses that manifest in ways that might not be obvious just looking at the scrambled output.

fn main() {
    // Prints:
    // 0x000055a7a8b55410
    // 0x000055a4d561b410
    // 0x000055c683ec3410
    // 0x0000560f7b369410
    // etc.
    dbg![main as *const u8];
}

More succinctly:

fn main() {
    // Prints:
    //
    // 1088
    // 1088
    // 1088
    // etc.
    dbg![main as *const u8 as usize % 4096];
}

While the stack in the previous example gets a different distribution due to its reliance on a stack frame’s placement in the address space, there are still sharp edges about every system’s ASLR implementation that it’s kind of frustrating that only exploit writers seem to pay any attention to, despite their significant impacts on performance.

Another comment about how bad this approach would be is the following:

The "randomness" comes from ASLR, which is extremely limited on 32b systems (8~16 bits of entropy). 64b systems have more room, but it's still absolutely awful as an rng, IIRC on x86_64 Linux it defaults to 28 bits (you can check /proc/sys/vm/mmap_rnd_bits for the value on your system, last I checked it could be set from 28 to 32 inclusive).

If the allocator uses brk instead of mmap, it's worse, you get 13 bits of entropy out of the brk ASLR. Or less if you use huge pages.


std::collections::hash_map::RandomState

This solution was suggested by several people after I shared the blog post so thanks!

Simply, we can use the RandomState of the standard library's HashMap implementation and generate a random number from the hasher:

use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

fn main() {
    let hasher = RandomState::new().build_hasher();

    // Prints 938021294471563529, 17633394154785464152, etc.
    println!("Random number: {}", hasher.finish());
}

It probably makes more sense to use these random values as seed rather than random number. For example, we can have the following function for generating a random seed and combine it with a suggestion by matklad:

use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

pub fn random_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

// Pseudorandom number generator from the "Xorshift RNGs" paper by George Marsaglia.
//
// https://github.com/rust-lang/rust/blob/1.55.0/library/core/src/slice/sort.rs#L559-L573
pub fn random_numbers() -> impl Iterator<Item = u32> {
    let mut random = 92u32;
    std::iter::repeat_with(move || {
        random ^= random << 13;
        random ^= random >> 17;
        random ^= random << 5;
        random
    })
}

Final thoughts

Of course, there might be other ways to generate a random number without dependencies. This article aims to show the simplest ways without going too much into mathematical details.

Thanks to everyone who commented in the forums thread, it was definitely helpful to read all the replies!

Let me know via the comments below if there are any other cool ways for generating random numbers.

See you next time! 🐻