The Mersenne Twister is a pseudorandom number generator (PRNG) used by the numpy library. This article explains what you need to know before using it.
As a data scientist, I need to generate random numbers for my algorithms. It puzzled me that the deterministic processes in my computer could generate random sequences, so I took a deeper look, and here is what I learned:
Your computer does not generate truly random numbers but uses algorithms such as the Mersenne Twister to produce pseudorandom sequences.
It is important to know when and how the seed of your pseudorandom generator is set; otherwise you might get bad surprises, especially when developing multiprocess applications.
Standard random generators are highly predictable and should not be used for cryptographic or gambling purposes.
With numpy, the quickest way to obtain random numbers between 0 and 1 is to use the following:
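A minimal sketch of such a call (the printed values will differ on each run, since no seed is fixed):

```python
import numpy as np

# np.random.rand() draws from numpy's global generator,
# which is seeded automatically the first time it is used
print(f"A first random number: {np.random.rand()}")
print(f"A second random number: {np.random.rand()}")
```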
A first random number: 0.2332029758567754
A second random number: 0.7277750980801885
Or (almost) equivalently:
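The second form might look like this, with a dedicated generator object instead of the global one (again, the values printed will vary from run to run):

```python
import numpy as np

# A dedicated RandomState instance, seeded at construction time
random_state = np.random.RandomState()
print(f"A first random number: {random_state.rand()}")
print(f"A second random number: {random_state.rand()}")
```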
A first random number: 0.8492693008307766
A second random number: 0.9858307170084044
In both cases, we are using what is called a pseudorandom number generator, or PRNG. Indeed, whenever we call a Python function such as np.random.rand(), the output can only be deterministic and cannot be truly random. Hence, numpy has to come up with a trick to generate sequences of numbers that look random and behave as if they came from a purely random source. This is what PRNGs are.
The most widespread PRNG is the Mersenne Twister (MT) generator, designed in 1997 by M. Matsumoto and T. Nishimura. Schematically, it works as follows:
Step 1: We initialize our random generator with random_state = np.random.RandomState() and generate an internal state of 624 integers.
Step 2: Each time we call random_state.rand(), we apply two operations: first, a twist on the internal state to obtain a new internal state; then a tempering function on the newly obtained state, which produces the float between 0 and 1 returned by random_state.rand(). (This is the schematic picture; in the actual implementation, as we will see below, the twist is only applied once every 624 draws.)
The twister and tempering functions are totally deterministic. Hence, the value of the first call to random_state.rand() depends only on the initial internal state, which therefore should be generated… randomly. In practice, RandomState() generates the initial internal state from a seed value taken from the internal clock of your computer. Alternatively, you can specify the seed yourself with RandomState(seed=your_favorite_seed_value), which is useful for making your code deterministic.
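For example, a fixed seed makes the sequence reproducible:

```python
import numpy as np

# Two generators initialized with the same seed start from
# identical internal states, hence produce identical sequences
rs1 = np.random.RandomState(seed=42)
rs2 = np.random.RandomState(seed=42)
assert (rs1.rand(3) == rs2.rand(3)).all()
```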
The twister and tempering functions are made of elementary bitwise and matrix operations. You can find their definitions in the original Mersenne Twister paper, or more simply in their C implementation in the numpy GitHub repository:
rk_state is defined as a C structure whose only relevant fields for our purposes are a list of 624 integers representing the state, stored in rk_state.key, and a position, rk_state.pos. The twister function is only applied once every 624 calls, when state->pos reaches 624. In fact, the internal state stored in state->key contains enough information to generate 624 different random numbers. The tempering function consists of a handful of bitwise operations.
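To make the mechanics concrete, here is a pure-Python sketch of the seeding, twist and tempering steps. The constants are those of MT19937; the function names are mine, not numpy's:

```python
N, M = 624, 397                       # state length and twist offset
MATRIX_A = 0x9908B0DF
UPPER_MASK, LOWER_MASK = 0x80000000, 0x7FFFFFFF

def init_state(seed):
    """Fill the 624-word internal state from a 32-bit seed."""
    key = [seed & 0xFFFFFFFF]
    for i in range(1, N):
        prev = key[-1]
        key.append((1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF)
    return key

def twist(key):
    """Regenerate all 624 words of the state (done once every 624 draws)."""
    for i in range(N):
        y = (key[i] & UPPER_MASK) | (key[(i + 1) % N] & LOWER_MASK)
        key[i] = key[(i + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)

def temper(y):
    """Bitwise tempering applied to one word of the state at each draw."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y

def mt_stream(seed):
    """Yield 32-bit pseudorandom integers, one per draw."""
    key, pos = init_state(seed), N    # pos == N forces a twist on first draw
    while True:
        if pos == N:
            twist(key)
            pos = 0
        yield temper(key[pos])
        pos += 1
```

With the reference seed 5489 used by the original C implementation, the first output is 3499211612 and the 10000th is 4123659995, matching the published test values for MT19937.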
The Mersenne Twister is so widely used for two simple reasons:
It is fast to evaluate: a few nanoseconds on a recent laptop, according to timeit.
It produces sequences of numbers that look pretty random. To evaluate the randomness of a sequence, we can run various batteries of statistical tests, which the Mersenne Twister PRNG passes.
The MT PRNG should not be used as such for cryptographic purposes! Indeed, the tempering function is easily inverted: from each output of the rk_random function we can infer one word of state->key, and after 624 calls to rk_random we know the state of the generator perfectly and can predict the rest of the pseudorandom sequence. In conclusion, do not use numpy.random for any cryptography or gambling development; look instead for specialized modules such as secrets or Crypto.Random.
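To illustrate the inversion, each of the four tempering steps can be undone one bit-group at a time. A sketch of such an "untemper" function (the helper names are mine):

```python
MASK32 = 0xFFFFFFFF

def temper(y):
    """MT19937 tempering: what is applied to one state word per draw."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & MASK32

def _undo_right_shift_xor(y, shift):
    # Invert y ^= y >> shift by re-applying until all 32 bits are fixed
    res = y
    for _ in range(32 // shift + 1):
        res = y ^ (res >> shift)
    return res

def _undo_left_shift_and(y, shift, mask):
    # Invert y ^= (y << shift) & mask, fixing `shift` low bits per pass
    res = y
    for _ in range(32 // shift + 1):
        res = y ^ ((res << shift) & mask)
    return res & MASK32

def untemper(z):
    """Recover the internal state word that produced the output z."""
    z = _undo_right_shift_xor(z, 18)
    z = _undo_left_shift_and(z, 15, 0xEFC60000)
    z = _undo_left_shift_and(z, 7, 0x9D2C5680)
    z = _undo_right_shift_xor(z, 11)
    return z
```

After untempering 624 consecutive outputs, an attacker holds the full internal state and can replay the generator from there.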
Be careful when you use numpy.random in multiprocess applications, as it can lead to misleading behaviors, as shown in the following:
Output:
'Bad practice: '
[array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9])]
'Good practice: '
[array([8, 9, 4, 5, 1, 0, 8, 1, 5, 4]),
array([5, 1, 3, 3, 3, 0, 0, 1, 0, 8]),
array([1, 9, 9, 9, 2, 9, 4, 3, 2, 1]),
array([4, 3, 6, 2, 6, 1, 2, 9, 5, 2]),
array([6, 3, 5, 9, 7, 1, 7, 4, 8, 5])]
You can see that the different calls to bad_practice in our multiprocess template always generate the same output. In fact, the RandomState is not re-initialized for each process, so all the processes share the same initial seed and initial internal state. In the good-practice case, the random state is initialized differently for each process.
In one of the squares, I randomly drew in red 100,000 pairs of points (x, y) with a uniform distribution over [0, 1]², while the other square contains 100,000 points coming from another stochastic process. Can you guess which one is uniformly distributed?
Actually, the uniformly random one is the one on the left! It is quite misleading, as uniformly random processes can create more clusters and gaps than we expect. The image on the right was generated by first putting red dots on a regular grid and then shifting each dot by a small random step (the output is therefore stochastic, but it is not a uniform distribution). In fact, the human brain is quite bad at generating and/or detecting random processes, but that would be the topic of another post!
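For the curious, the two point clouds can be reproduced along these lines (the grid size and jitter amplitude are my guesses at the original settings):

```python
import numpy as np

n_points = 100_000

# Left square: points drawn uniformly in the unit square
uniform = np.random.rand(n_points, 2)

# Right square: points on a regular grid, each shifted by a small
# random step -- stochastic, but not uniformly distributed
side = int(np.sqrt(n_points))          # roughly a 316 x 316 grid
xs, ys = np.meshgrid(np.linspace(0, 1, side), np.linspace(0, 1, side))
grid = np.column_stack([xs.ravel(), ys.ravel()])
jittered = grid + np.random.normal(scale=0.3 / side, size=grid.shape)
```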