TrueRNG fails the Chi-square Test

This topic contains 5 replies, has 5 voices, and was last updated by  alexander.shen 4 years, 1 month ago.

    #603

    Idiope
    Member

    Hi,

    I’ve been playing with the TrueRNG now for a few days, and I noticed that TrueRNG’s output seems to almost always fail the Chi-square test of ent.

    To demonstrate this, I made two 1GB files: one from TrueRNG’s output and one from my /dev/urandom.

    The results of the Chi-square test for the files:

    
    :~$ ent 1GB.TrueRNG
    Entropy = 7.999961 bits per byte.
    
    Optimum compression would reduce the size
    of this 1000000000 byte file by 0 percent.
    
    Chi square distribution for 1000000000 samples is 53486.30, and randomly
    would exceed this value 0.01 percent of the times.
    
    Arithmetic mean value of data bytes is 127.4841 (127.5 = random).
    Monte Carlo value for Pi is 3.142127773 (error 0.02 percent).
    Serial correlation coefficient is -0.000082 (totally uncorrelated = 0.0).
    
    :~$ ent 1GB.urandom
    Entropy = 8.000000 bits per byte.
    
    Optimum compression would reduce the size
    of this 1000000000 byte file by 0 percent.
    
    Chi square distribution for 1000000000 samples is 251.49, and randomly
    would exceed this value 50.00 percent of the times.
    
    Arithmetic mean value of data bytes is 127.4978 (127.5 = random).
    Monte Carlo value for Pi is 3.141621229 (error 0.00 percent).
    Serial correlation coefficient is -0.000000 (totally uncorrelated = 0.0).
    

    From the manual of ent:

    The chi-square test is the most commonly used test for the randomness of data, and is extremely sensitive to errors in pseudorandom sequence generators. The chi-square distribution is calculated for the stream of bytes in the file and expressed as an absolute number and a percentage which indicates how frequently a truly random sequence would exceed the value calculated. We interpret the percentage as the degree to which the sequence tested is suspected of being non-random. If the percentage is greater than 99% or less than 1%, the sequence is almost certainly not random. If the percentage is between 99% and 95% or between 1% and 5%, the sequence is suspect. Percentages between 90% and 95% and 5% and 10% indicate the sequence is “almost suspect”.

    So the result should be close to 50%, and a result of 0.01% is really bad.
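
    For anyone who wants to reproduce the figure outside of ent, here is a minimal Python sketch of the same byte-level chi-square calculation (my own sketch, assuming scipy is available; ent itself is a C program):

    # Minimal sketch of ent's byte-level chi-square figure: count the 256 byte
    # frequencies, compute the chi-square statistic against a uniform expectation,
    # and report how often a truly random stream would exceed it (survival
    # function with 255 degrees of freedom).
    import sys
    from collections import Counter
    from scipy.stats import chi2

    def chi_square_report(path):
        data = open(path, "rb").read()
        n = len(data)
        expected = n / 256.0                      # uniform expectation per byte value
        counts = Counter(data)
        stat = sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))
        p_exceed = chi2.sf(stat, df=255)          # fraction of random streams exceeding stat
        print(f"Chi square for {n} samples is {stat:.2f}, "
              f"randomly exceeded {100 * p_exceed:.2f} percent of the time.")

    if __name__ == "__main__":
        chi_square_report(sys.argv[1])            # e.g. python chisq.py 1GB.TrueRNG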

    To demonstrate it better, I split the files into 100MB chunks and ran the tests again:

    
    :~$ split 1GB.TrueRNG -n 10 --filter=ent |grep exceed
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    would exceed this value 0.01 percent of the times.
    
    :~$ split 1GB.urandom -n 10 --filter=ent |grep exceed
    would exceed this value 10.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 90.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 97.50 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 5.00 percent of the times.
    would exceed this value 90.00 percent of the times.
    

    I still think that TrueRNG is good and really cheap for what you get. But I think that it should only be used as a source of entropy for PRNGs. So I would suggest that you work on your whitening algorithms. Meanwhile, you could remove the suggestion of using TrueRNG’s output raw, and the claim that it passes all the industry standard tests:

    “The random data can then be used to fill the entropy pool in an operating system, or used directly in a custom application.”

    “Passes all the industry standard tests (Dieharder, ENT, Rngtest, etc.)”

    And while you’re making changes, please update the PDF instructions. Currently, in the dieharder test you haven’t actually tested the TrueRNG output, but rather the MT19937 PRNG built into dieharder.

    The right command would be:
    dieharder -g 201 -a -f random.dat

    Then the rng_name changes from MT19937 to file_input_raw

    I made the same mistake at first, until I started to wonder why different tests fail on the same file on different runs, and even after that, figuring out the right command wasn’t obvious.

    #608

    Ubld.it Staff
    Moderator

    Thanks for the feedback on the TrueRNG.

    Being a hardware random number generator, the TrueRNG has a tiny amount of analog-to-digital converter (ADC) bias that is inherent in physical devices. In your example, this bias is about 0.01247%. This has a negligible effect on the entropy and shouldn’t be an issue for most applications.
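
    (That number can be reproduced from the arithmetic mean in your ent output, presumably as the relative deviation from the ideal 127.5:)

    # Presumed derivation of the ~0.01247% bias figure: the relative deviation
    # of the measured byte mean from the ideal 127.5 in the 1GB.TrueRNG run above.
    ideal_mean = 127.5
    measured_mean = 127.4841        # from the ent output for 1GB.TrueRNG
    bias = (ideal_mean - measured_mean) / ideal_mean
    print(f"{bias:.5%}")            # prints 0.01247%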

    For the dieharder test, we missed the lack of the -g 201 command line argument in the published test. Future tests will be run with this option.

    In general, the dieharder tests often fail for hardware random number generators for a variety of reasons. From the dieharder man page:

    “The null hypothesis for random number generator testing is “This
    generator is a perfect random number generator, and for any choice of
    seed produces a infinitely long, unique sequence of numbers that have
    all the expected statistical properties of random numbers, to all
    orders”. Note well that we *know* that this hypothesis is technically
    false for all software generators as they are periodic and do not have
    the correct entropy content for this statement to ever be true.*
    However, many hardware generators fail a priori as well, as they
    contain subtle bias or correlations due to the deterministic physics
    that underlies them.* Nature is often *unpredictable* but it is rarely
    *random* and the two words don’t (quite) mean the same thing!”

    The current version of TrueRNG has a minimal XOR whitening algorithm, just enough to remove some of the bias without impacting throughput. We do have an experimental version that changes the whitening algorithm to null out most of the ADC bias, with a small decrease in throughput. Although this has minimal effect on the entropy of the stream, it does help to pass the Chi Square and some of the Dieharder tests (with the right command line options this time!).
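
    (As an illustration of the general principle only, not our firmware: XORing two independent biased bit streams roughly squares the bias, at the cost of half the throughput. A toy Python sketch, with a made-up bias value:)

    # Toy illustration of XOR whitening (not the TrueRNG firmware): XORing two
    # independent streams whose bits are 1 with probability 0.5 + e yields a
    # stream with bias of roughly 2*e^2, at half the throughput.
    import random

    def biased_bits(n, p_one=0.51):
        # Hypothetical raw source: each bit is 1 with probability p_one.
        return [1 if random.random() < p_one else 0 for _ in range(n)]

    def xor_whiten(stream_a, stream_b):
        # Combine two independent raw streams bit by bit with XOR.
        return [a ^ b for a, b in zip(stream_a, stream_b)]

    n = 1_000_000
    raw = biased_bits(n)
    whitened = xor_whiten(biased_bits(n), biased_bits(n))
    print("raw mean:     ", sum(raw) / n)        # roughly 0.51
    print("whitened mean:", sum(whitened) / n)   # roughly 0.4998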

    With the changed whitener, we’ve captured two 1.8GB files and run the ent, rngtest, and dieharder tests on the data. We have duplicated your ent test on 100MB chunks via “split test1.8GB.txt -n 18 --filter=ent”. When grep’d for the Chi Square value, we got:

    For test1.8G-1.dat
    would exceed this value 50.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 99.50 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 10.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 75.00 percent of the times.

    For test1.8G-2.dat
    would exceed this value 25.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 99.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 2.50 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 10.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 50.00 percent of the times.

    For comparison, we reran the dieharder test on 1GB of data using the original whitener and got a few weak results and one failed test (dab_bytedistrib, which uses chi-squared internally). The results are somewhat better with the changed whitener, but there is still a single failure in the dab_bytedistrib test. We’re guessing that the small amount of bias that remains from the physical device accumulates over such a large dataset, causing the failure.

    After testing, we plan to incorporate the updated whitener into the baseline. We will also be correcting our manual, and thanks for pointing that out.

    If you would be interested in a version with the updated whitening, email us at info@ubld.it and we will work something out.

    #805

    gerryb4160
    Member

    Hi,

    I can see that for some purposes, like statistical analysis, a TRNG that passes all the RNG tests is a must. However, for my purposes, which are mainly creating OTP (one-time pad) files, I really only need an unreproducible stream of random numbers. So the TrueRNG meets all of my needs.

    I may buy more TrueRNG devices, so I’m curious when the new version will be out.

    Regards,

    #806

    Ubld.it Staff
    Moderator

    Hey there,

    So, in regard to the statistical analysis: the updated whitener (which is technically version 1.01, if you are checking your USB descriptors) passes the dieharder tests and FIPS 140 just fine. For this specific chi-squared test, we are really happy with the performance of the updated whitener. You will still get some occasional results below 5% depending on how you chunk the data, and that is due to the nature of the test. It doesn’t mean anything is wrong with it statistically, and it is less of a concern with a TRNG. Most of these tests were made for PRNGs and don’t really apply to TRNGs, but we like to try and meet them if we can. As for availability, we should have stock real soon; we’ve made some slight improvements to the hardware design and we are still running it through its paces, but it won’t be long.

    #2262

    A few years have passed, but no one has corrected this thread. It is based on misconceptions about the Chi-Sq test of randomness.

    The ‘result’ from the test, given uniform random data, should vary uniformly between 0.0 and 1.0.
    When testing fresh, fully uniform data from an RNG against an acceptance criterion of, say, between 1% and 99%, you would expect 2% of samples to fail even when fed uniform random data.

    The Chi-Sq test is not a test with a pass/fail result and a P-value. It’s a transformation of the bias of the data (which follows a chi-sq distribution) to a uniform distribution. If the results of multiple tests are hard up against an endpoint, then you have some bias.

    If you see this from an entropy source, that is fine. Bias is expected from an entropy source and you should expect to be feeding that into an entropy extractor to get uniform data.
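
    A quick simulation makes the 1%/99% point. This is my own sketch using numpy/scipy rather than ent: feed genuinely uniform bytes to the byte-level chi-square test many times, and the reported percentage is itself uniform, so a 1%/99% cut-off rejects about 2% of perfectly good samples.

    # Simulate repeated chi-square tests on genuinely uniform bytes and count
    # how often the exceedance percentage falls outside the [1%, 99%] window.
    import numpy as np
    from scipy.stats import chisquare

    rng = np.random.default_rng()
    p_values = []
    for _ in range(1000):
        sample = rng.integers(0, 256, size=100_000)     # uniform random bytes
        counts = np.bincount(sample, minlength=256)
        stat, p = chisquare(counts)                     # uniform expectation by default
        p_values.append(p)

    p_values = np.array(p_values)
    outside = np.mean((p_values < 0.01) | (p_values > 0.99))
    print(f"fraction outside [1%, 99%]: {outside:.3f}")  # expect roughly 0.02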

    Now what is completely out of whack is this bit…
    would exceed this value 25.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 99.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 2.50 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 75.00 percent of the times.
    would exceed this value 10.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 50.00 percent of the times.
    would exceed this value 25.00 percent of the times.
    would exceed this value 50.00 percent of the times.

    What is that? Why are most results quantised to 75.00, 50.00 or 25.00? Why do none have anything in the bottom two significant digits? That is fantastically unlikely in uniform random data. It suggests skulduggery. I refer you to section 8.11, page 170 of my book “Random Number Generators, Principles and Practices”, titled ‘Results That are “Too Good”’, for a description of the statistics of such results.

    Example runs of the chi-sq test over 10 samples of 1MiByte of uniform random data (from RdRand) give this:
    58.49%, 64.21%, 1.26%, 72.63%, 40.34%, 55.45%, 78.88%, 52.52%, 52.12%, 32.92%

    At 4 digits of precision there are 10,000 uniformly likely results from random data. The quoted results above seem to be stuck on 5 of those results. (5/10000)^18 is a very very small probability. The result is bunk.
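
    For scale, evaluating the probability quoted above (assuming the 18 results were independent and confined to the same 5 of the 10,000 possible four-digit values):

    # Magnitude of the probability quoted above.
    print((5 / 10000) ** 18)   # about 3.8e-60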

    I recommend my improved version of ent – djent (https://github.com/dj-on-github/djent) for testing such data since you have better control over symbol sizes which you could match to the ADC symbol size for correct analysis.

    I suggest posting actual data samples so we can run our own tests.

    #2265

    The generator that I have (v3) does not show this strange behavior (using the ent utility from Ubuntu 18.04). A typical result looks like:
    racaf@shen:/lacie/bigrandomdata/truerng$ ent random2.bin
    Entropy = 8.000000 bits per byte.

    Optimum compression would reduce the size
    of this 4437100000 byte file by 0 percent.

    Chi square distribution for 4437100000 samples is 265.26, and randomly
    would exceed this value 31.64 percent of the times.

    Arithmetic mean value of data bytes is 127.5004 (127.5 = random).
    Monte Carlo value for Pi is 3.141643794 (error 0.00 percent).
    Serial correlation coefficient is -0.000010 (totally uncorrelated = 0.0).

    —-
    For 100MB chunks:

    Chi square distribution for 104857600 samples is 295.96, and randomly
    would exceed this value 3.97 percent of the times.

    Chi square distribution for 104857600 samples is 232.56, and randomly
    would exceed this value 84.00 percent of the times.

    Chi square distribution for 104857600 samples is 276.14, and randomly
    would exceed this value 17.35 percent of the times.

    Chi square distribution for 104857600 samples is 236.02, and randomly
    would exceed this value 79.74 percent of the times.

    Chi square distribution for 104857600 samples is 271.25, and randomly
    would exceed this value 23.14 percent of the times.

    Chi square distribution for 104857600 samples is 231.29, and randomly
    would exceed this value 85.42 percent of the times.

    Chi square distribution for 104857600 samples is 303.83, and randomly
    would exceed this value 1.94 percent of the times.

    Chi square distribution for 104857600 samples is 264.75, and randomly
    would exceed this value 32.43 percent of the times.

    Chi square distribution for 104857600 samples is 268.44, and randomly
    would exceed this value 26.95 percent of the times.

    Chi square distribution for 104857600 samples is 285.71, and randomly
    would exceed this value 9.05 percent of the times.

    Chi square distribution for 104857600 samples is 263.80, and randomly
    would exceed this value 33.91 percent of the times.

    Chi square distribution for 104857600 samples is 260.11, and randomly
    would exceed this value 39.97 percent of the times.

    Chi square distribution for 104857600 samples is 285.44, and randomly
    would exceed this value 9.22 percent of the times.

    Chi square distribution for 104857600 samples is 237.08, and randomly
    would exceed this value 78.32 percent of the times.

    Chi square distribution for 104857600 samples is 306.80, and randomly
    would exceed this value 1.45 percent of the times.

    Chi square distribution for 104857600 samples is 228.14, and randomly
    would exceed this value 88.57 percent of the times.

    Chi square distribution for 104857600 samples is 248.90, and randomly
    would exceed this value 59.59 percent of the times.

    Chi square distribution for 104857600 samples is 265.29, and randomly
    would exceed this value 31.59 percent of the times.

    Chi square distribution for 104857600 samples is 263.25, and randomly
    would exceed this value 34.79 percent of the times.

    —-
    But the question still stands: is there any more detailed information about the whitening algorithms, beyond these quotes from the documentation: “entropy mixing algorithm takes in 20 bits of entropy and outputs 8 bits to ensure that maximum entropy is maintained. The algorithm uses multiplication in a Galois field similar to a cyclic redundancy check to mix the ADC inputs thoroughly while spreading the entropy evenly across all bits” and “We split the data into multiple streams and use the XOR method to reduce bias (whiten) the output data… A significant amount of time was spent getting the whitening correct without reducing the throughput too much. Currently, the random data is XORed/downselected at about a 20:1 rate to whiten while keeping maximum entropy”?
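
    I could not find anything more detailed. Purely as a guess at what “multiplication in a Galois field similar to a cyclic redundancy check” could look like, here is a speculative Python sketch that folds a 20-bit word down to 8 bits by polynomial reduction over GF(2); the polynomial and every other detail here are my own assumptions, not the documented TrueRNG algorithm:

    # Speculative sketch only: reduce a 20-bit raw word modulo a degree-8
    # polynomial over GF(2), CRC-style, so every output bit depends on many
    # input bits. The polynomial (x^8 + x^2 + x + 1) is an arbitrary choice.
    CRC8_POLY = 0x107   # x^8 + x^2 + x + 1, with the x^8 term written explicitly

    def mix20_to_8(raw20: int) -> int:
        # Fold a 20-bit value down to 8 bits via carry-less (mod-2) division.
        reg = raw20
        for bit in range(19, 7, -1):          # clear bits 19..8, high to low
            if reg & (1 << bit):
                reg ^= CRC8_POLY << (bit - 8)
        return reg & 0xFF                     # the 8-bit remainder

    print(hex(mix20_to_8(0xABCDE)))           # example 20-bit input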
