Diehard tests

The diehard tests are a battery of statistical tests for measuring the quality of a random number generator (RNG). They were developed by George Marsaglia over several years and first published in 1995 on a CD-ROM of random numbers.[1] In 2006, the original diehard tests were extended into the dieharder tests.[2]

Most of the tests in DIEHARD return a p-value, which should be uniform on [0,1) if the input file contains truly independent random bits. Those p-values are obtained by p = F(X), where F is the assumed distribution of the sample random variable X – often normal. But that assumed F is just an asymptotic approximation, for which the fit will be worst in the tails. Thus you should not be surprised with occasional p-values near 0 or 1, such as 0.0012 or 0.9983. When a bit stream really FAILS BIG, you will get ps of 0 or 1 to six or more places. Since there are many tests, it is not unlikely that a p < 0.025 or p > 0.975 means that the RNG has "failed the test at the 0.05 level". We expect a number of such events ps happen among the hundreds of events DIEHARD produces, even conditioned on the random number generator being perfect.

History

An initial battery of randomness tests for RNGs was suggested in the 1969 first edition of The Art of Computer Programming by Donald Knuth (Volume 2, Chapter 3.3: Statistical tests). Knuth's tests were then supplanted by George Marsaglia's Diehard tests (1995) consisting of fifteen different tests. The inability to modify the test parameters or add new tests led to the development of the TestU01 library, introduced in 2007 by Pierre L’Ecuyer and Richard Simard of the Université de Montréal.

Birthday spacings

Choose random points on a large interval. The spacings between the points should be asymptotically exponentially distributed.[3] The name is based on the birthday paradox.

Choose m birthdays in a year of n days. List the spacings between the birthdays. If j is the number of values that occur more than once in that list, then j is asymptotically Poisson-distributed with mean m3 / (4n). Experience shows n must be quite large, say n ≥ 218, for comparing the results to the Poisson distribution with that mean. This test uses n = 224 and m = 29, so that the underlying distribution for j is taken to be Poisson with λ = 227 / 226 = 2. A sample of 500 js is taken, and a chi-square goodness of fit test provides a p value. The first test uses bits 1–24 (counting from the left) from integers in the specified file. Then the file is closed and reopened. Next, bits 2–25 are used to provide birthdays, then 3–26 and so on to bits 9–32. Each set of bits provides a p-value, and the nine p-values provide a sample for a KSTEST.

Overlapping permutations

Analyze sequences of five consecutive random numbers. The 120 possible orderings should occur with statistically equal probability.

This is the OPERM5 test. It looks at a sequence of one million 32-bit random integers. Each set of five consecutive integers can be in one of 120 states, for the 5! possible orderings of five numbers. Thus the 5th, 6th, 7th, ... numbers each provide a state. As many thousands of state transitions are observed, cumulative counts are made of the number of occurrences of each state. Then the quadratic form in the weak inverse of the 120×120 covariance matrix yields a test equivalent to the likelihood ratio test that the 120 cell counts came from the specified (asymptotically) normal distribution with the specified 120×120 covariance matrix (with rank 99). This version uses 1000000 integers, twice. This test may have unresolved bugs resulting in consistently poor p-values.[4]

Ranks of matrices

Select some number of bits from some number of random numbers to form a matrix over {0,1}, then determine the rank of the matrix. Count the ranks.

31×31

The leftmost 31 bits of 31 random integers from the test sequence are used to form a 31×31 binary matrix over the field {0,1}. The rank is determined. That rank can be from 0 to 31, but ranks < 28 are rare, and their counts are pooled with those for rank 28. Ranks are found for 40000 such random matrices and a chi-square test is performed on counts for ranks 31, 30, 29 and ≤ 28.

32×32

A random 32×32 binary matrix is formed, each row a 32-bit random integer. The rank is determined. That rank can be from 0 to 32, ranks less than 29 are rare, and their counts are pooled with those for rank 29. Ranks are found for 40000 such random matrices and a chi square test is performed on counts for ranks 32, 31, 30 and ≤ 29.

6×8

From each of six random 32-bit integers from the generator under test, a specified byte is chosen, and the resulting six bytes form a 6×8 binary matrix whose rank is determined. That rank can be from 0 to 6, but ranks 0, 1, 2, 3 are rare; their counts are pooled with those for rank 4. Ranks are found for 100000 random matrices, and a chi square test is performed on counts for ranks 6, 5 and ≤ 4.

Monkey tests

Treat sequences of some number of bits as "words". Count the overlapping words in a stream. The number of "words" that do not appear should follow a known distribution. The name is based on the infinite monkey theorem.

Count the 1s

Count the 1 bits in each of either successive or chosen bytes. Convert the counts to "letters", and count the occurrences of five-letter "words".

Successive bytes

Consider the file under test as a stream of bytes (four per 32-bit integer). Each byte can contain from none to eight 1s, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the stream of bytes provide a string of overlapping 5-letter words, each "letter" taking values A, B, C, D, E. The letters are determined by the number of 1s in a byte 0, 1, or 2 yield A, 3 yields B, 4 yields C, 5 yields D and 6, 7 or 8 yield E. Thus we have a monkey at a typewriter hitting five keys with various probabilities (37, 56, 70, 56, 37 over 256). There are 55 possible 5-letter words, and from a string of 256000 (overlapping) 5-letter words, counts are made on the frequencies for each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chisquare test Q5–Q4, the difference of the naive Pearson sums of (OBS-EXP)2 / EXP on counts for 5- and 4-letter cell counts.

Chosen bytes

Consider the file under test as a stream of 32-bit integers. From each integer, a specific byte is chosen, say the leftmost bits 1 to 8. Each byte can contain from 0 to 8 1s, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the specified bytes from successive integers provide a string of (overlapping) 5-letter words, each "letter" taking values A, B, C, D, E. The letters are determined by the number of 1s, in that byte 0, 1, or 2 → A, 3 → B, 4 → C, 5 → D, and 6, 7 or 8 → E. Thus we have a monkey at a typewriter hitting five keys with various probabilities 37, 56, 70, 56, 37 over 256. There are 55 possible 5-letter words, and from a string of 256000 (overlapping) 5-letter words, counts are made on the frequencies for each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chisquare test Q5 – Q4, the difference of the naive Pearson sums of (OBS − EXP)2 / EXP on counts for 5- and 4-letter cell counts.

Parking lot test

Randomly place unit circles in a 100×100 square. A circle is successfully parked if it does not overlap an existing successfully parked one. After 12,000 tries, the number of successfully parked circles should follow a certain normal distribution.

In a square of side 100, randomly "park" a car – a circle of radius 1. Then try to park a 2nd, a 3rd, and so on, each time parking "by ear". That is, if an attempt to park a car causes a crash with one already parked, try again at a new random location. (To avoid path problems, consider parking helicopters rather than cars.) Each attempt leads to either a crash or a success, the latter followed by an increment to the list of cars already parked. If we plot n: the number of attempts, versus k the number successfully parked, we get a curve that should be similar to those provided by a perfect random number generator. Theory for the behavior of such a random curve seems beyond reach, and as graphics displays are not available for this battery of tests, a simple characterization of the random experiment is used: k, the number of cars successfully parked after n = 12000 attempts. Simulation shows that k should average 3523 with sigma 21.9 and is very close to normally distributed. Thus (k − 3523) / 21.9 should be a standard normal variable, which, converted to a uniform variable, provides input to a KSTEST based on a sample of 10.

Minimum distance test

Randomly place 8000 points in a 10000×10000 square, then find the minimum distance between the pairs. The square of this distance should be exponentially distributed with a certain mean.

It does this 100 times choose n = 8000 random points in a square of side 10000. Find d, the minimum distance between the (n2n) / 2 pairs of points. If the points are truly independent uniform, then d2, the square of the minimum distance should be (very close to) exponentially distributed with mean 0.995. Thus 1 − exp(−d2 / 0.995) should be uniform on [0,1) and a KSTEST on the resulting 100 values serves as a test of uniformity for random points in the square. Test numbers = 0 mod 5 are printed but the KSTEST is based on the full set of 100 random choices of 8000 points in the 10000×10000 square.

Random spheres test

Randomly choose 4000 points in a cube of edge 1000. Center a sphere on each point, whose radius is the minimum distance to another point. The smallest sphere's volume should be exponentially distributed with a certain mean.

Choose 4000 random points in a cube of edge 1000. At each point, center a sphere large enough to reach the next closest point. Then the volume of the smallest such sphere is (very close to) exponentially distributed with mean 120π / 3. Thus the radius cubed is exponential with mean 30. (The mean is obtained by extensive simulation). The 3D spheres test generates 4000 such spheres 20 times. Each min radius cubed leads to a uniform variable by means of 1 − exp(−r3 / 30), then a KSTEST is done on the 20 p-values.

Squeeze test

Multiply 231 by random floats on (0,1) until you reach 1. Repeat this 100000 times. The number of floats needed to reach 1 should follow a certain distribution.

Random integers are floated to get uniforms on [0,1). Starting with k = 231 = 2147483648, the test finds j, the number of iterations necessary to reduce k to 1, using the reduction k = ceiling(k×U), with U provided by floating integers from the file being tested. Such js are found 100000 times, then counts for the number of times j was ≤ 6, 7, ..., 47, ≥ 48 are used to provide a chi-square test for cell frequencies.

Overlapping sums test

Generate a long sequence of random floats on (0,1). Add sequences of 100 consecutive floats. The sums should be normally distributed with characteristic mean and variance.

Integers are floated to get a sequence U(1), U(2), ... of uniform [0,1) variables. Then overlapping sums, S(1) = U(1) + ... + U(100), S(2) = U(2) + ... + U(101), ... are formed. The Ss are virtually normal with a certain covariance matrix. A linear transformation of the Ss converts them to a sequence of independent standard normals, which are converted to uniform variables for a KSTEST. The p-values from ten KSTESTs are given still another KSTEST.

Wald–Wolfowitz runs test

Generate a long sequence of random floats on (0,1). Count ascending and descending runs. The counts should follow a certain distribution.

It counts runs up, and runs down, in a sequence of uniform [0,1) variables, obtained by floating the 32-bit integers in the specified file. This example shows how runs are counted: 0.123, 0.357, 0.789, 0.425, 0.224, 0.416, 0.95 contains an up-run of length 3, a down-run of length 2 and an up-run of (at least) 2, depending on the next values. The covariance matrices for the runs-up and runs-down are well known, leading to chi-square tests for quadratic forms in the weak inverses of the covariance matrices. Runs are counted for sequences of length 10000. This is done ten times. Then repeated.

Craps test

Play 200000 games of craps, counting the wins and the number of throws per game. Each count should follow a certain distribution.

It plays 200000 games of craps, finds the number of wins and the number of throws necessary to end each game. The number of wins should be (very close to) a normal with mean 200000p and variance 200000p(1 − p), with p = 244 / 495. Throws necessary to complete the game can vary from 1 to infinity, but counts for all > 21 are lumped with 21. A chi-square test is made on the no.-of-throws cell counts. Each 32-bit integer from the test file provides the value for the throw of a die, by floating to [0,1), multiplying by 6 and taking 1 plus the integer part of the result.

Bitstream test

The file under test is viewed as a stream of bits. Call them b1, b2, ... . Consider an alphabet with two "letters", 0 and 1, and think of the stream of bits as a succession of 20-letter "words", overlapping. Thus the first word is b1b2...b20, the second is b2b3...b21, and so on. The bitstream test counts the number of missing 20-letter (20-bit) words in a string of 221 overlapping 20-letter words. There are 220 possible 20-letter words. For a truly random string of 221 + 19 bits, the number of missing words j should be (very close to) normally distributed with mean 141,909 and sigma 428. Thus (j−141909) / 428 should be a standard normal variate (z score) that leads to a uniform [0,1) p value. The test is repeated twenty times.

OPSO, OQSO and DNA tests

OPSO means overlapping-pairs-sparse-occupancy. The OPSO test considers 2-letter words from an alphabet of 1024 letters. Each letter is determined by a specified ten bits from a 32-bit integer in the sequence to be tested. OPSO generates 221 (overlapping) 2-letter words (from 221 + 1 "keystrokes") and counts the number of missing words – that is 2-letter words which do not appear in the entire sequence. That count should be very close to normally distributed with mean 141909, sigma 290. Thus (missingwrds-141909) / 290 should be a standard normal variable. The OPSO test takes 32 bits at a time from the test file and uses a designated set of ten consecutive bits. It then restarts the file for the next designated 10 bits, and so on. OQSO means overlapping-quadruples-sparse-occupancy. The test OQSO is similar, except that it considers 4-letter words from an alphabet of 32 letters, each letter determined by a designated string of 5 consecutive bits from the test file, elements of which are assumed 32-bit random integers. The mean number of missing words in a sequence of 221 four-letter words, (221 + 3 "keystrokes"), is again 141909, with sigma = 295. The mean is based on theory; sigma comes from extensive simulation. The DNA test considers an alphabet of 4 letters C, G, A, T, determined by two designated bits in the sequence of random integers being tested. It considers 10-letter words, so that as in OPSO and OQSO, there are 220 possible words, and the mean number of missing words from a string of 221 (overlapping) 10-letter words (221 + 9 "keystrokes") is 141909. The standard deviation sigma = 339 was determined as for OQSO by simulation. (Sigma for OPSO, 290, is the true value (to three places), not determined by simulation.

References

  1. ^ "The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness". Florida State University. 1995. Archived from the original on 2016-01-25.
  2. ^ Brown, Robert G. "dieharder". Retrieved 2023-09-25.
  3. ^ Renyi, 1953, p194
  4. ^ "Robert G. Brown's General Tools Page". Archived from the original on 2017-07-03.

Further reading

Read other articles:

Herstal Bendera Senjata api FN berukir Herstal (pengucapan bahasa Prancis: [ɛʁstal]; bahasa Wallonia: Hèsta), sebelumnya dinamakan Heristal atau Héristal, adalah sebuah kota madya di Provinsi Liège, Belgia. Terletak di sepanjang sungai Meuse, di wilayah Wallonia. Herstal diaglomerasi ke dalam Liège Raya. Kota madya Herstal mencakup bekas komune Milmort, Vottem, dan sebagian dari Liers (bagian lainnya termasuk ke dalam Juprelle). Pabrik senjata Fabrique Nationale atau biasa dise...

 

Claudia LlosaFestival Film GuadalajaraLahirClaudia Llosa Bueno15 November 1976 (umur 47)Lima, PeruPekerjaanSutradara Claudia Llosa (kelahiran 15 November 1976)[1] adalah seorang sutradara, penulis dan produser Peru. Kehidupan dan karier Ia lahir di Lima, belajar di Kolese Newton dan meraih sebuah gelar dalam bidang pembelajaran komunikasi di Universitas Lima. Ia merupakan kemenakan dari penulis Peru Mario Vargas Llosa dan sutradara Luis Llosa. Pada akhir 1990an, ia berpindah ke ...

 

Julien Faubert Faubert, 2010Informasi pribadiNama lengkap Julien Alex Thomas Faubert[1]Tanggal lahir 1 Agustus 1983 (umur 40)Tempat lahir Le Havre, PrancisTinggi 180 m (590 ft 7 in)[2]Posisi bermain Winger / Bek kananInformasi klubKlub saat ini Fréjus Saint-RaphaëlKarier junior1998–2002 CannesKarier senior*Tahun Tim Tampil (Gol)2002–2004 Cannes 45 (4)2004–2007 Bordeaux 99 (10)2007–2012 West Ham United 89 (2)2009 → Real Madrid (loan) 2 (0)2012 E...

Artikel ini sebatang kara, artinya tidak ada artikel lain yang memiliki pranala balik ke halaman ini.Bantulah menambah pranala ke artikel ini dari artikel yang berhubungan atau coba peralatan pencari pranala.Tag ini diberikan pada Januari 2023. Anjing gembala Pirenia Anjing gembala Pirenia bermuka licin Nama lain Anjing gembala kecil Nama panggilan Pyr Shep Negara asal Prancis/Spanyol Ciri-ciri Klasifikasi & standar FCI Grup 1 Seksi 1 #141 standar AKC Gembala standar CKC Grup 7 (Gembala) ...

 

Pour les articles homonymes, voir Moscovici. Pierre Moscovici Pierre Moscovici en 2017. Fonctions Premier président de la Cour des comptes En fonction depuis le 3 juin 2020(3 ans, 10 mois et 2 jours) Prédécesseur Didier MigaudSophie Moati (intérim) Commissaire européen aux Affaires économiques et monétaires, à la Fiscalité et à l'Union douanière 1er novembre 2014 – 30 novembre 2019 (5 ans et 29 jours) Président Jean-Claude Juncker Commission Juncker Pr�...

 

Warren CookCook (kiri) dalam The Immigrant (1917)Lahir(1878-05-23)23 Mei 1878Boston, MassachusettsMeninggal2 Mei 1939(1939-05-02) (umur 60)New York, New YorkPekerjaanPemeranTahun aktif1914-1927 Warren Cook (23 Mei 1878 – 2 Mei 1939) adalah seorang pemeran film Amerika Serikat pada era film bisu. Cook lahir di Boston, Massachusetts. Pada 1901, ia tampil dalam The Shaughraun di Castle Square Theatre, Boston. Ia merupakan bagian dari perusahaan saham yang berbasis di C...

Синелобый амазон Научная классификация Домен:ЭукариотыЦарство:ЖивотныеПодцарство:ЭуметазоиБез ранга:Двусторонне-симметричныеБез ранга:ВторичноротыеТип:ХордовыеПодтип:ПозвоночныеИнфратип:ЧелюстноротыеНадкласс:ЧетвероногиеКлада:АмниотыКлада:ЗавропсидыКласс:Пт�...

 

Part of a series onBritish law Acts of Parliament of the United Kingdom Year      1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 ...

 

Simaroubaceae Ailanthus altissima Klasifikasi ilmiah Kerajaan: Plantae (tanpa takson): Angiospermae (tanpa takson): Eudikotil (tanpa takson): Rosidae Ordo: Sapindales Famili: SimaroubaceaeDC.[1] Genera lihat teks. Peta persebarana anggota famili Simaroubaceae Sinonim Ailanthaceae J.Agardh Castelaceae J.Agardh Holacanthaceae Jadin, nom. inval. Leitneriaceae Benth. & Hook.f., nom. cons. Simabaceae Horan. Soulameaceae Endl.[1] Simaroubaceae atau Suku pasakbumi-pasakbumian ad...

Exegesis of the Quran For other uses, see Tafsir (disambiguation). Quran History Waḥy First revelation Asbab al-Nuzul Historicity Manuscripts Samarkand Kufic Quran Sanaa manuscript Topkapi manuscript Birmingham manuscript Divisions Surah List Meccan Medinan Āyah Juz' Muqatta'at Content Prophets Women Animals Legends Miracles Parables Science Eschatology God Reading Qāriʾ Hifz Tajwid Tarteel Ahruf Qira'at Translations List English Ahmadiyya Exegesis List Hermeneutics Esotericism Abrogatio...

 

Pakistani cricketer Salman IrshadIrshad in 2022Personal informationBorn (1995-12-03) 3 December 1995 (age 28)Rawalakot, Azad Kashmir, PakistanHeight6 ft 2 in (188 cm)[1]BattingRight-handedBowlingRight-arm fast-mediumRoleBowlerDomestic team information YearsTeam2015-2016AJK Jaguars2017Hawkesbury Cricket Club2018; 2020Lahore Qalandars (squad no. 99)2019–2021Northern2021-presentMirpur Royals (squad no. 99)2022-presentPeshawar Zalmi (squad no...

 

Edith Holman Nazionalità  Regno Unito Tennis Carriera Singolare1 Vittorie/sconfitte Titoli vinti Miglior ranking Risultati nei tornei del Grande Slam  Australian Open -  Roland Garros -  Wimbledon SF (1912, 1913)  US Open - Altri tornei  Giochi olimpici (1920) Doppio1 Vittorie/sconfitte Titoli vinti Miglior ranking Risultati nei tornei del Grande Slam  Australian Open  Roland Garros  Wimbledon SF (1919, 1922)  US Open Altri tornei  Gioc...

此条目序言章节没有充分总结全文内容要点。 (2019年3月21日)请考虑扩充序言,清晰概述条目所有重點。请在条目的讨论页讨论此问题。 哈萨克斯坦總統哈薩克總統旗現任Қасым-Жомарт Кемелұлы Тоқаев卡瑟姆若马尔特·托卡耶夫自2019年3月20日在任任期7年首任努尔苏丹·纳扎尔巴耶夫设立1990年4月24日(哈薩克蘇維埃社會主義共和國總統) 哈萨克斯坦 哈萨克斯坦政府...

 

British political party The subject of this article is participating in the 2024 general election to the House of Commons of the United Kingdom on 4 July, and has had no MPs in the House of Commons since Parliament was dissolved on 30 May. Some parts of this article may be out of date during this period. Please feel free to improve this article (but note that updates without valid and reliable references will be removed) or discuss changes on the talk page. Conservative and Unionist...

 

كسوف حلقي للشمس. صلاة الكسوف سنة مؤكدة، وعدد ركعاتها: «ركعتان» ويُسن الخروج لأدائها في جماعة، بلا أذان ولا إقامة. صلاة خسوف القمر جهرية لأنها صلاة ليلية، وصلاة كسوف الشمس سرّية لأنها صلاة نهارية. وقتها من ابتداء الكسوف إلى ذهابه ولا تصلى حتى يرى الناس الكسوف لقوله - صَلَّى ا...

Ibex siberia Status konservasi Risiko Rendah  (IUCN 3.1)[1] Klasifikasi ilmiah Kerajaan: Animalia Filum: Chordata Kelas: Mamalia Ordo: Artiodactyla Famili: Bovidae Genus: Capra Spesies: C. sibirica Nama binomial Capra sibiricaPallas, 1776 Ibex siberia (Capra sibirica) adalah fauna khas Asia Tengah. Tadinya, hewan ini diklasifikasikan sebagai salah satu varietas dari ibex alpen. Tetapi akhirnya dipisahkan menjadi spesies tersendiri. Ibex siberia merupakan spesies genus Capra...

 

2020年夏季奥林匹克运动会阿根廷代表團阿根廷国旗IOC編碼ARGNOC阿根廷奧林匹克委員會網站www.coarg.org.ar(西班牙文)2020年夏季奥林匹克运动会(東京)2021年7月23日至8月8日(受2019冠状病毒病疫情影响推迟,但仍保留原定名称)運動員189參賽項目26个大项旗手開幕式:圣地亚哥·兰赫和塞西莉亞·卡蘭薩(帆船)[1]閉幕式:佩德羅·伊巴拉(曲棍球)[2]獎牌榜排名第...

 

South African court that handles labour law cases Labour Court of South AfricaEstablished1995LocationJohannesburg, Cape Town, Durban, Port ElizabethComposition methodPresidential appointment on the advice of the JSC and NEDLACAuthorized byLabour Relations Act, 1995Appeals toLabour Appeal CourtWebsitewww.justice.gov.za/labourcourt/Judge PresidentCurrentlyD MlamboDeputy Judge PresidentCurrentlyB Waglay The Labour Court is a South African court that handles labour law cases, that is, disputes ar...

Russian politician and economist You can help expand this article with text translated from the corresponding article in Russian. (February 2024) Click [show] for important translation instructions. Machine translation, like DeepL or Google Translate, is a useful starting point for translations, but translators must revise errors as necessary and confirm that the translation is accurate, rather than simply copy-pasting machine-translated text into the English Wikipedia. Do not translate ...

 

Алюминий← Магний | Кремний → 13 B↑Al↓Ga Периодическая система элементов13Al Внешний вид простого вещества Образец алюминия Свойства атома Название, символ, номер Алюминий / Aluminium (Al), 13 Группа, период, блок 13 (устар. 3), 3, p-элемент Атомная масса (молярная масса) 26,9815386(8)[1]...