Update

2024-06-13 20:56:20 +02:00 · 2024-06-13 20:56:20 +02:00 · 757bcb3cf1
commit 757bcb3cf1
parent bc0419bd2b
19 changed files with 1827 additions and 1796 deletions
--- a/pseudorandomness.md
+++ b/pseudorandomness.md
@ -114,6 +114,10 @@ void randomSeed(uint32_t seed)
 - **Inverse transform method**: In this method we take the inverse function of our desired distribution's cumulative distribution function (function that for any *x* says the probability the outcome will be lower than or equal to *x*) and we pass to it a number in range 0 to 1 generated with the vanilla uniform distribution -- this gives us the number in the desired distribution. However for this we have to know the cumulative distribution function's inverse function.
 - ...

+{ If you want to look at some C code that tries to generate good pseudorandom numbers, take a look at gsl (GNU Scientific Library). ~drummyfish }
+
+TODO: some algorithms from the gsl library described at https://www.gnu.org/software/gsl/doc/html/rng.html
+
 ## Quality/Testing Of Pseudorandom Sequences/Generators

 This topic alone can be extremely complex, you could probably do a [PhD](phd.md) on it, let's just do some overview for mere noobs like us.
@ -128,7 +132,7 @@ However the core of a pseudorandom generator is the quality of the sequence itse
 - **Try to [compress](compression.md) the sequence**: Truly random data should be basically impossible to compress -- you can exploit this fact and try to compress your sequence with some compression programs. It is ideal if the compression programs end up enlarging the file.
 - **Statistical tests**: Here you use objective mathematical tests -- there exist many very advanced tests, we'll only mention the very simple ones.
  - **[Histogram](histogram.md)**: Generate many numbers (but not more than the generator's period) and make a histogram (i.e. for every number count how many times it appeared) -- all numbers should appear roughly with the same frequency. If you make a nice generator, you should even see exactly the same count for every value generated -- this is explained above. Also count 1 and 0 bits in the whole sequence -- again there should be about the same number of them (exactly the same if you do it correctly). But keep in mind this only checks if you have correct frequencies of numbers, it says nothing about their distribution. Even a sequence 1, 2, 3, 4, 5, .... will pass this.
-  - **Averaging any non-short interval should be close to middle value**: In a random sequence it should hold that if you take any interval that's not too short -- let's say at leas 100 numbers in a row -- the average value should very likely be close to the middle value (the longer the interval, the closer it should be). You can test your sequence like this. This already takes into account even the distribution of the numbers.
+  - **Averaging any non-short interval should be close to middle value**: In a random sequence it should hold that if you take any interval that's not too short -- let's say at least 100 numbers in a row -- the average value should very likely be close to the middle value (the longer the interval, the closer it should be). You can test your sequence like this. This already takes into account even the distribution of the numbers.
  - **[Fourier transform](fourier_transform.md)** (and similar methods that give you the spectrum) -- the spectrum of the data should have equal amount of all frequencies, just like white noise.
  - **[Correlation](correlation.md) coefficients**: This is kind of the real proof of randomness, ideally no values should be correlated in your data, so you can try to compute some correlation coefficients, for example try to compute how much correlation there is between consecutive numbers (it's similar to plotting the data as coordinates and seeing if they form a line or not) -- you should ideally find no significant correlations.
  - **Chi square test**: Very common test for this kind of thing, see also *poker test*.