Anderson–Darling test

The Anderson–Darling test is a statistical test of whether a given sample of data is drawn from a given probability distribution. In its basic form, the test assumes that there are no parameters to be estimated in the distribution being tested, in which case the test and its critical values are distribution-free. However, the test is most often used in contexts where a family of distributions is being tested, in which case the parameters of that family need to be estimated, and this must be accounted for by adjusting either the test statistic or its critical values. When applied to testing whether a normal distribution adequately describes a set of data, it is one of the most powerful statistical tools for detecting most departures from normality.[1][2] K-sample Anderson–Darling tests are available for testing whether several collections of observations can be modelled as coming from a single population, where the distribution function does not have to be specified.

In addition to its use as a test of fit for distributions, it can be used in parameter estimation as the basis for a form of minimum distance estimation procedure.

The test is named after Theodore Wilbur Anderson (1918–2016) and Donald A. Darling (1915–2014), who invented it in 1952.[3]

The single-sample test

The Anderson–Darling and Cramér–von Mises statistics belong to the class of quadratic EDF statistics (tests based on the empirical distribution function).[2] If the hypothesized distribution is $F$, and the empirical (sample) cumulative distribution function is $F_n$, then the quadratic EDF statistics measure the distance between $F$ and $F_n$ by

$$n \int_{-\infty}^{\infty} \left(F_n(x) - F(x)\right)^2 w(x) \, dF(x),$$

where $n$ is the number of elements in the sample, and $w(x)$ is a weighting function. When the weighting function is $w(x) = 1$, the statistic is the Cramér–von Mises statistic. The Anderson–Darling (1954) test[4] is based on the distance

$$A^2 = n \int_{-\infty}^{\infty} \frac{\left(F_n(x) - F(x)\right)^2}{F(x)\left(1 - F(x)\right)} \, dF(x),$$

which is obtained when the weight function is $w(x) = [F(x)(1 - F(x))]^{-1}$. Thus, compared with the Cramér–von Mises distance, the Anderson–Darling distance places more weight on observations in the tails of the distribution.
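To make the tail weighting concrete, the following minimal sketch evaluates the two weight functions at a few values of u = F(x). The function names `cvm_weight` and `ad_weight` are illustrative only and do not come from any library.

```python
def cvm_weight(u):
    """Cramér–von Mises weight: constant 1 for every u = F(x)."""
    return 1.0


def ad_weight(u):
    """Anderson–Darling weight 1 / (u (1 - u)) for u = F(x), with 0 < u < 1."""
    return 1.0 / (u * (1.0 - u))


# Compare the weights at the median and progressively further into the tail.
for u in (0.5, 0.1, 0.01):
    print(f"F(x) = {u}: CvM weight = {cvm_weight(u)}, AD weight = {ad_weight(u):.2f}")
```

At the median the Anderson–Darling weight is 4, while at F(x) = 0.01 it exceeds 100, so the same squared discrepancy counts far more in the tails.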

Basic test statistic

The Anderson–Darling test assesses whether a sample comes from a specified distribution. It makes use of the fact that, given a hypothesized underlying distribution and assuming the data do arise from this distribution, applying the hypothesized cumulative distribution function (CDF) to the data yields values that should follow a uniform distribution. The transformed data can then be tested for uniformity with a distance test (Shapiro 1980). The formula for the test statistic $A$ to assess whether data $\{Y_1 < \cdots < Y_n\}$ (note that the data must be put in order) come from a CDF $F$ is

$$A^2 = -n - S,$$

where

$$S = \sum_{i=1}^{n} \frac{2i - 1}{n} \left[\ln F(Y_i) + \ln\left(1 - F(Y_{n+1-i})\right)\right].$$

The test statistic can then be compared against the critical values of the theoretical distribution. In this case, no parameters are estimated in relation to the cumulative distribution function $F$.
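As an illustration, the statistic A² = −n − S, with S the sum over i of ((2i − 1)/n)[ln F(Y_i) + ln(1 − F(Y_{n+1−i}))], can be computed as in the sketch below. It assumes a fully specified continuous CDF passed in as a Python callable. The function name `anderson_darling`, the helper `uniform_cdf`, and the sample values are hypothetical.

```python
import math


def anderson_darling(sample, cdf):
    """Return A^2 for `sample` against a fully specified CDF `cdf`.

    Assumes 0 < cdf(y) < 1 for every observation; otherwise the
    logarithms below are undefined.
    """
    y = sorted(sample)          # the formula requires ordered data
    n = len(y)
    s = 0.0
    for i in range(1, n + 1):   # i is 1-based, as in the formula
        # y[i - 1] is Y_i and y[n - i] is Y_{n+1-i} in 0-based indexing
        s += (2 * i - 1) / n * (math.log(cdf(y[i - 1]))
                                + math.log(1.0 - cdf(y[n - i])))
    return -n - s


def uniform_cdf(u):
    """CDF of the uniform distribution on (0, 1), used only as an example."""
    return u


# Hypothetical data, tested against the uniform(0, 1) distribution.
data = [0.12, 0.35, 0.48, 0.61, 0.77, 0.93]
print(anderson_darling(data, uniform_cdf))
```

For a single observation at the median, the formula reduces to −1 − 2 ln 0.5 = 2 ln 2 − 1, which is a convenient sanity check.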

Tests for families of distributions

Essentially the same test statistic can be used in the test of fit of a family of distributions, but then it must be compared against the critical values appropriate to that family of theoretical distributions and dependent also on the method used for parameter estimation.

Test for normality

Empirical testing has found[5] that the Anderson–Darling test is not quite as good as the Shapiro–Wilk test, but is better than other tests. Stephens[1] found $A^2$ to be one of the best empirical distribution function statistics for detecting most departures from normality.

The computation differs based on what is known about the distribution:[6]

  • Case 0: The mean and the variance are both known.
  • Case 1: The variance is known, but the mean is unknown.
  • Case 2: The mean is known, but the variance is unknown.
  • Case 3: Both the mean and the variance are unknown.

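The $n$ observations, $X_i$, for $i = 1, \ldots, n$, of the variable $X$ must be sorted such that $X_1 \le X_2 \le \cdots \le X_n$, and the notation in the following assumes that the $X_i$ represent the ordered observations. Let

$$\hat{\mu} = \begin{cases} \mu, & \text{if the mean is known,} \\ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, & \text{otherwise,} \end{cases}$$

$$\hat{\sigma}^2 = \begin{cases} \sigma^2, & \text{if the variance is known,} \\ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2, & \text{if the variance is unknown but the mean is known,} \\ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, & \text{otherwise.} \end{cases}$$

The values $X_i$ are standardized to create new values $Y_i$, given by

$$Y_i = \frac{X_i - \hat{\mu}}{\hat{\sigma}}.$$

With the standard normal CDF $\Phi$, $A^2$ is calculated using

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[\ln \Phi(Y_i) + \ln\left(1 - \Phi(Y_{n+1-i})\right)\right].$$

An alternative expression in which only a single observation is dealt with at each step of the summation is:

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} \left[(2i - 1) \ln \Phi(Y_i) + \left(2(n - i) + 1\right) \ln\left(1 - \Phi(Y_i)\right)\right].$$

A modified statistic can be calculated using

$$A^{*2} = \begin{cases} A^2 \left(1 + \frac{4}{n} - \frac{25}{n^2}\right), & \text{if the variance and the mean are both unknown,} \\ A^2, & \text{otherwise.} \end{cases}$$

If $A^2$ or $A^{*2}$ exceeds a given critical value, then the hypothesis of normality is rejected with some significance level. The critical values are given in the table below for values of $A^{*2}$.[1][7]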

Note 1: If $\hat{\sigma} = 0$ or any $\Phi(Y_i)$ equals 0 or 1, then $A^2$ cannot be calculated and is undefined.

Note 2: The above adjustment formula is taken from Shorack & Wellner (1986, p239). Care is required in comparisons across different sources as often the specific adjustment formula is not stated.

Note 3: Stephens[1] notes that the test becomes better when the parameters are computed from the data, even if they are known.

Note 4: Marsaglia & Marsaglia[7] provide a more accurate result for Case 0 at 85% and 99%.

Critical values by case, sample size n, and significance level ("-" marks an entry not given):

Case   n      15%     10%     5%      2.5%    1%
0      ≥ 5    1.621   1.933   2.492   3.070   3.878
1      -      -       0.908   1.105   1.304   1.573
2      ≥ 5    -       1.760   2.323   2.904   3.690
3      10     0.514   0.578   0.683   0.779   0.926
3      20     0.528   0.591   0.704   0.815   0.969
3      50     0.546   0.616   0.735   0.861   1.021
3      100    0.559   0.631   0.754   0.884   1.047
3      ∞      0.576   0.656   0.787   0.918   1.092
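The Case 3 computation (mean and variance both estimated from the data) can be sketched as follows, using only the Python standard library. The function name `ad_normal_case3` and the sample values are hypothetical. The modified statistic uses the factor 1 + 4/n − 25/n², and the result is compared with the tabulated 5% critical value for n = 10 (0.683).

```python
import math
from statistics import NormalDist, mean, stdev


def ad_normal_case3(sample):
    """Return (A^2, A*^2) for a normality test with unknown mean and variance.

    A^2 is undefined if the sample standard deviation is 0 or if any
    Phi(Y_i) equals 0 or 1 in floating point.
    """
    x = sorted(sample)
    n = len(x)
    mu_hat = mean(x)
    sigma_hat = stdev(x)                  # uses the n - 1 denominator
    phi = NormalDist().cdf                # standard normal CDF
    y = [(v - mu_hat) / sigma_hat for v in x]
    s = sum((2 * i - 1) * (math.log(phi(y[i - 1]))
                           + math.log(1.0 - phi(y[n - i])))
            for i in range(1, n + 1))
    a2 = -n - s / n
    a2_star = a2 * (1.0 + 4.0 / n - 25.0 / n ** 2)
    return a2, a2_star


# Hypothetical sample of n = 10 observations.
data = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 3.6]
a2, a2_star = ad_normal_case3(data)
critical_5pct_n10 = 0.683                 # Case 3, n = 10, 5% level
print(a2, a2_star, "reject" if a2_star > critical_5pct_n10 else "do not reject")
```

The sketch deliberately raises an error (from `math.log` or a division by zero) in the undefined situations rather than returning a value.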

Alternatively, for case 3 above (both mean and variance unknown), D'Agostino (1986)[6] in Table 4.7 on p. 123 and on pages 372–373 gives the adjusted statistic:

$$A^{*2} = A^2 \left(1 + \frac{0.75}{n} + \frac{2.25}{n^2}\right),$$

and normality is rejected if $A^{*2}$ exceeds 0.631, 0.754, 0.884, 1.047, or 1.159 at the 10%, 5%, 2.5%, 1%, and 0.5% significance levels, respectively; the procedure is valid for sample sizes of at least $n = 8$. The formulas for computing the p-values for other values of $A^{*2}$ are given in Table 4.9 on p. 127 in the same book.
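The decision rule based on D'Agostino's adjustment, A*² = A²(1 + 0.75/n + 2.25/n²), can be sketched as below. The function names and the input value of A² are hypothetical; the critical values are the five quoted above. The p-value formulas from Table 4.9 are not reproduced here.

```python
# (significance level, critical value) pairs quoted in the text
CRITICAL_VALUES = [(0.10, 0.631), (0.05, 0.754), (0.025, 0.884),
                   (0.01, 1.047), (0.005, 1.159)]


def dagostino_adjusted(a2, n):
    """Return A*^2 = A^2 (1 + 0.75/n + 2.25/n^2); stated as valid for n >= 8."""
    if n < 8:
        raise ValueError("the procedure is stated as valid only for n >= 8")
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)


def smallest_rejecting_level(a2_star):
    """Return the smallest tabulated level at which normality is rejected,
    or None if A*^2 does not exceed even the 10% critical value."""
    rejected = [alpha for alpha, crit in CRITICAL_VALUES if a2_star > crit]
    return min(rejected) if rejected else None


# Hypothetical unadjusted statistic for a sample of size 20.
a2_star = dagostino_adjusted(0.8, 20)
print(a2_star, smallest_rejecting_level(a2_star))
```

With these inputs the adjusted statistic lies between the 5% and 2.5% critical values, so normality is rejected at the 5% level but not at the 2.5% level.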

Tests for other distributions

Above, it was assumed that the variable $X_i$ was being tested for a normal distribution. Any other family of distributions can be tested, but the test for each family is implemented by using a different modification of the basic test statistic, which is then compared against critical values specific to that family of distributions. The modifications of the statistic and tables of critical values are given by Stephens (1986)[2] for the exponential, extreme-value, Weibull, gamma, logistic, Cauchy, and von Mises distributions. Tests for the (two-parameter) log-normal distribution can be implemented by transforming the data using a logarithm and using the above test for normality. Details of the required modifications to the test statistic and of the critical values for the normal distribution and the exponential distribution have been published by Pearson & Hartley (1972, Table 54). Details for these distributions, with the addition of the Gumbel distribution, are also given by Shorack & Wellner (1986, p. 239). Details for the logistic distribution are given by Stephens (1979). A test for the (two-parameter) Weibull distribution can be obtained by making use of the fact that the logarithm of a Weibull variate has a Gumbel distribution.
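The Weibull-to-Gumbel relationship can be verified directly. As a sketch, assume a parametrization that the text does not state: $X$ has a two-parameter Weibull distribution with shape $k > 0$ and scale $\lambda > 0$, so that $P(X \le x) = 1 - \exp\left(-(x/\lambda)^k\right)$ for $x \ge 0$. Then, for $Y = \ln X$,

$$P(Y \le y) = P(X \le e^{y}) = 1 - \exp\left(-\left(\frac{e^{y}}{\lambda}\right)^{k}\right) = 1 - \exp\left(-\exp\left(\frac{y - \ln \lambda}{1/k}\right)\right),$$

which is the CDF of a Gumbel (smallest extreme value) distribution with location $\ln \lambda$ and scale $1/k$. Under this assumption, testing $X$ for a Weibull distribution is equivalent to testing $\ln X$ for this Gumbel form.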

Non-parametric k-sample tests

Fritz Scholz and Michael A. Stephens (1987) discuss a test, based on the Anderson–Darling measure of agreement between distributions, for whether a number of random samples with possibly different sample sizes may have arisen from the same distribution, where this distribution is unspecified.[8] The R package kSamples and the Python package SciPy implement this rank test for comparing k samples, among several other such rank tests.[9][10]

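For $k$ samples the statistic can be computed as follows, under the assumption that the distribution function $F_i$ of the $i$-th sample is continuous:

$$A_{kN}^2 = \frac{1}{N} \sum_{i=1}^{k} \frac{1}{n_i} \sum_{j=1}^{N-1} \frac{\left(N M_{ij} - j n_i\right)^2}{j (N - j)},$$

where

  • $n_i$ is the number of observations in the $i$-th sample
  • $N$ is the total number of observations in all samples
  • $Z_1 < \cdots < Z_N$ is the pooled ordered sample
  • $M_{ij}$ is the number of observations in the $i$-th sample that are not greater than $Z_j$.[8]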
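A direct translation of the k-sample formula, A²_kN = (1/N) Σ_i (1/n_i) Σ_{j=1}^{N−1} (N·M_ij − j·n_i)² / (j(N − j)), is sketched below. It assumes no ties among the pooled observations (consistent with continuous distribution functions) and uses only the standard library. The function name `k_sample_ad` and the example data are hypothetical. SciPy's `scipy.stats.anderson_ksamp` is documented as returning a normalized statistic, so its output is not expected to match this raw value.

```python
import bisect


def k_sample_ad(samples):
    """Return the raw k-sample Anderson–Darling statistic A^2_kN.

    `samples` is a list of k sequences; ties in the pooled data are
    assumed not to occur.
    """
    pooled = sorted(v for s in samples for v in s)   # Z_1 < ... < Z_N
    big_n = len(pooled)                              # N
    total = 0.0
    for s in samples:
        s_sorted = sorted(s)
        n_i = len(s_sorted)
        inner = 0.0
        for j in range(1, big_n):                    # j = 1, ..., N - 1
            z_j = pooled[j - 1]
            # M_ij: observations in sample i that are not greater than Z_j
            m_ij = bisect.bisect_right(s_sorted, z_j)
            inner += (big_n * m_ij - j * n_i) ** 2 / (j * (big_n - j))
        total += inner / n_i
    return total / big_n


# Two hypothetical, perfectly interleaved samples.
print(k_sample_ad([[1, 3, 5], [2, 4, 6]]))
```

For the interleaved example, each sample contributes an inner sum of 1.8 + 0 + 1 + 0 + 1.8 = 4.6, so the statistic is (4.6/3 + 4.6/3)/6 = 9.2/18.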

References

  1. Stephens, M. A. (1974). "EDF Statistics for Goodness of Fit and Some Comparisons". Journal of the American Statistical Association. 69 (347): 730–737. doi:10.2307/2286009. JSTOR 2286009.
  2. Stephens, M. A. (1986). "Tests Based on EDF Statistics". In D'Agostino, R. B.; Stephens, M. A. (eds.). Goodness-of-Fit Techniques. New York: Marcel Dekker. ISBN 0-8247-7487-6.
  3. Anderson, T. W.; Darling, D. A. (1952). "Asymptotic theory of certain "goodness-of-fit" criteria based on stochastic processes". Annals of Mathematical Statistics. 23 (2): 193–212. doi:10.1214/aoms/1177729437.
  4. Anderson, T. W.; Darling, D. A. (1954). "A Test of Goodness-of-Fit". Journal of the American Statistical Association. 49 (268): 765–769. doi:10.2307/2281537. JSTOR 2281537.
  5. Razali, Nornadiah; Wah, Yap Bee (2011). "Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests". Journal of Statistical Modeling and Analytics. 2 (1): 21–33.
  6. D'Agostino, Ralph B. (1986). "Tests for the Normal Distribution". In D'Agostino, R. B.; Stephens, M. A. (eds.). Goodness-of-Fit Techniques. New York: Marcel Dekker. ISBN 0-8247-7487-6.
  7. Marsaglia, G. (2004). "Evaluating the Anderson-Darling Distribution". Journal of Statistical Software. 9 (2): 730–737. CiteSeerX 10.1.1.686.1363. doi:10.18637/jss.v009.i02.
  8. Scholz, F. W.; Stephens, M. A. (1987). "K-sample Anderson–Darling Tests". Journal of the American Statistical Association. 82 (399): 918–924. doi:10.1080/01621459.1987.10478517.
  9. "kSamples: K-Sample Rank Tests and their Combinations". R Project.
  10. "The Anderson-Darling test for k-samples. Scipy package".

Further reading

  • Corder, G.W., Foreman, D.I. (2009). Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach. Wiley. ISBN 978-0-470-45461-9
  • Mehta, S. (2014) Statistics Topics ISBN 978-1499273533
  • Pearson E.S., Hartley, H.O. (Editors) (1972) Biometrika Tables for Statisticians, Volume II. CUP. ISBN 0-521-06937-8.
  • Shapiro, S.S. (1980) How to test normality and other distributional assumptions. In: The ASQC basic references in quality control: statistical techniques 3, pp. 1–78.
  • Shorack, G.R., Wellner, J.A. (1986) Empirical Processes with Applications to Statistics, Wiley. ISBN 0-471-86725-X.
  • Stephens, M.A. (1979) Test of fit for the logistic distribution based on the empirical distribution function, Biometrika, 66(3), 591–5.
