Studentized residual

In statistics, a studentized residual is the dimensionless ratio resulting from the division of a residual by an estimate of its standard deviation, both expressed in the same units. It is a form of a Student's t-statistic, with the estimate of error varying between points.

Studentizing residuals is an important technique in the detection of outliers. It is one of several techniques named in honor of William Sealy Gosset, who wrote under the pseudonym "Student" (e.g., Student's t-distribution). Dividing a statistic by a sample standard deviation is called studentizing, in analogy with standardizing and normalizing.

Motivation

The key reason for studentizing is that, in regression analysis of a multivariate distribution, the variances of the residuals at different input variable values may differ, even if the variances of the errors at these different input variable values are equal. The issue is the difference between errors and residuals in statistics, particularly the behavior of residuals in regressions.

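Consider the simple linear regression model

$$Y = \alpha_0 + \alpha_1 X + \varepsilon.$$

Given a random sample (Xi, Yi), i = 1, ..., n, each pair (Xi, Yi) satisfies

$$Y_i = \alpha_0 + \alpha_1 X_i + \varepsilon_i,$$

where the errors $\varepsilon_i$ are independent and all have the same variance $\sigma^2$. The residuals are not the true errors, but estimates, based on the observable data. When the method of least squares is used to estimate $\alpha_0$ and $\alpha_1$, then the residuals $\widehat{\varepsilon}_i$, unlike the errors $\varepsilon_i$, cannot be independent since they satisfy the two constraints

$$\sum_{i=1}^n \widehat{\varepsilon}_i = 0$$

and

$$\sum_{i=1}^n \widehat{\varepsilon}_i x_i = 0.$$

(Here $\varepsilon_i$ is the ith error, and $\widehat{\varepsilon}_i$ is the ith residual.)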
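The two constraints can be checked numerically. The following is a minimal sketch, assuming a simple linear regression fitted by ordinary least squares; the data values and the helper names `fit_line` and `residuals` are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a0 + a1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    return a0, a1


def residuals(xs, ys):
    """Observed value minus fitted value at each data point."""
    a0, a1 = fit_line(xs, ys)
    return [y - (a0 + a1 * x) for x, y in zip(xs, ys)]


# Illustrative data (hypothetical, chosen only for this sketch).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.5]

res = residuals(xs, ys)
sum_res = sum(res)                                # first constraint
sum_res_x = sum(r * x for r, x in zip(res, xs))   # second constraint
print(sum_res, sum_res_x)  # both vanish up to floating-point rounding
```

Because the n residuals satisfy two linear constraints, only n − 2 of them are free to vary, which is why they cannot be independent.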

The residuals, unlike the errors, do not all have the same variance: the variance decreases as the corresponding x-value gets farther from the average x-value. This is not a feature of the data itself, but of the regression better fitting values at the ends of the domain. It is also reflected in the influence functions of various data points on the regression coefficients: endpoints have more influence. This can also be seen because the residuals at endpoints depend greatly on the slope of a fitted line, while the residuals at the middle are relatively insensitive to the slope. The fact that the variances of the residuals differ, even though the variances of the true errors are all equal to each other, is the principal reason for the need for studentization.
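A small simulation can illustrate this effect. The sketch below assumes normally distributed errors with equal variance, an arbitrary true line y = 1 + 2x, and a fixed random seed; all of these choices, and the function names, are illustrative assumptions rather than part of the argument above.

```python
import random


def residuals(xs, ys):
    """Residuals from a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def empirical_residual_variances(xs, n_rep=20000, sigma=1.0, seed=0):
    """Estimate the variance of the residual at each x by repeated sampling."""
    rng = random.Random(seed)
    sums_sq = [0.0] * len(xs)
    for _ in range(n_rep):
        # Every error has the same standard deviation sigma.
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, sigma) for x in xs]
        for k, r in enumerate(residuals(xs, ys)):
            sums_sq[k] += r * r
    # Residuals have mean zero, so the mean square estimates the variance.
    return [s / n_rep for s in sums_sq]


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
variances = empirical_residual_variances(xs)
print([round(v, 2) for v in variances])
```

Although every error is drawn with the same variance, the estimated residual variances at the two end points come out clearly smaller than the one at the middle point.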

It is not simply a matter of the population parameters (mean and standard deviation) being unknown – it is that regressions yield different residual distributions at different data points, unlike point estimators of univariate distributions, which share a common distribution for residuals.

Background

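For this simple model, the design matrix is

$$X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$$

and the hat matrix H is the matrix of the orthogonal projection onto the column space of the design matrix:

$$H = X (X^T X)^{-1} X^T.$$

The leverage $h_{ii}$ is the ith diagonal entry in the hat matrix. The variance of the ith residual is

$$\operatorname{var}(\widehat{\varepsilon}_i) = \sigma^2 (1 - h_{ii}).$$

In case the design matrix X has only two columns (as in the example above), this is equal to

$$\operatorname{var}(\widehat{\varepsilon}_i) = \sigma^2 \left(1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{\sum_{j=1}^n (x_j - \bar{x})^2}\right).$$

In the case of an arithmetic mean, the design matrix X has only one column (a vector of ones), and this is simply:

$$\operatorname{var}(\widehat{\varepsilon}_i) = \sigma^2 \left(1 - \frac{1}{n}\right).$$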
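As a rough check of the two-column case, the leverages can be computed both from the hat matrix and from the closed form 1/n + (x_i − x̄)² / Σ_j (x_j − x̄)². The sketch below uses plain Python with an explicit 2 × 2 inverse; the function names and the x-values are illustrative assumptions.

```python
def leverages_from_hat_matrix(xs):
    """Diagonal of H = X (X^T X)^{-1} X^T, where row i of X is (1, x_i)."""
    n = len(xs)
    sx = sum(xs)
    sxx_raw = sum(x * x for x in xs)
    # X^T X = [[n, sx], [sx, sxx_raw]]; invert the 2 x 2 matrix directly.
    det = n * sxx_raw - sx * sx
    inv00, inv01, inv11 = sxx_raw / det, -sx / det, n / det
    # h_ii = (1, x_i) (X^T X)^{-1} (1, x_i)^T
    return [inv00 + 2.0 * inv01 * x + inv11 * x * x for x in xs]


def leverages_closed_form(xs):
    """h_ii = 1/n + (x_i - x_bar)^2 / sum_j (x_j - x_bar)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1.0 / n + (x - x_bar) ** 2 / sxx for x in xs]


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
h_matrix = leverages_from_hat_matrix(xs)
h_closed = leverages_closed_form(xs)

# Relative residual variance var(residual_i) / sigma^2 = 1 - h_ii.
relative_variance = [1.0 - h for h in h_closed]
print(h_closed)
print(relative_variance)
```

For these equally spaced x-values the leverage is largest at the two end points, so the residual variance is smallest there; the leverages also sum to the number of columns of X.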

Calculation

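Given the definitions above, the studentized residual is then

$$t_i = \frac{\widehat{\varepsilon}_i}{\widehat{\sigma} \sqrt{1 - h_{ii}}},$$

where $h_{ii}$ is the leverage, and $\widehat{\sigma}$ is an appropriate estimate of σ (see below).

In the case of a mean, this is equal to:

$$t_i = \frac{\widehat{\varepsilon}_i}{\widehat{\sigma} \sqrt{(n-1)/n}}.$$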
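The mean case can be traced step by step from the hat matrix. The following short derivation is a sketch that writes the vector of n ones as $\mathbf{1}_n$ and the sample mean as $\bar{Y}$ (notation introduced here for illustration):

```latex
\begin{aligned}
X &= \mathbf{1}_n, \qquad X^T X = n, \\
H &= X (X^T X)^{-1} X^T = \tfrac{1}{n}\,\mathbf{1}_n \mathbf{1}_n^T
   \;\Longrightarrow\; h_{ii} = \tfrac{1}{n}, \\
\widehat{\varepsilon}_i &= Y_i - \bar{Y}, \qquad
\operatorname{var}(\widehat{\varepsilon}_i) = \sigma^2 \left(1 - \tfrac{1}{n}\right), \\
t_i &= \frac{Y_i - \bar{Y}}{\widehat{\sigma}\sqrt{1 - 1/n}}
     = \frac{Y_i - \bar{Y}}{\widehat{\sigma}\sqrt{(n-1)/n}}.
\end{aligned}
```

Every observation has the same leverage 1/n in this case, so all residuals share a common variance and the studentizing factor is the same for each point.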

Internal and external studentization

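The usual estimate of $\sigma^2$, which yields the internally studentized residual, is

$$\widehat{\sigma}^2 = \frac{1}{n-m} \sum_{j=1}^n \widehat{\varepsilon}_j^{\,2},$$

where m is the number of parameters in the model (2 in our example).

But if the i th case is suspected of having an improbably large error, then that error would also not be normally distributed. Hence it is prudent to exclude the i th observation from the process of estimating the variance when one is considering whether the i th case may be an outlier, and instead use the estimate that yields the externally studentized residual, which is

$$\widehat{\sigma}_{(i)}^2 = \frac{1}{n-m-1} \sum_{\substack{j=1 \\ j \ne i}}^n \widehat{\varepsilon}_j^{\,2},$$

based on all the residuals except the suspect i th residual. Here the restriction $j \ne i$ emphasizes that the residuals $\widehat{\varepsilon}_j$ for a suspect case i are computed with the i th case excluded from the fit.

If the estimate $\widehat{\sigma}^2$, which includes the i th case, is used, then the result is called the internally studentized residual, $t_i$ (also known as the standardized residual [1]). If the estimate $\widehat{\sigma}_{(i)}^2$ is used instead, excluding the i th case, then the result is called the externally studentized residual, $t_{i(i)}$.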
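Both variants can be sketched for a simple linear regression (m = 2) in plain Python. The external version below refits the line without case i and estimates the variance from the remaining residuals, then divides the full-fit residual by that estimate. The data, including the deliberately outlying fifth point, and all function names are illustrative assumptions.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a0 + a1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - a1 * x_bar, a1


def leverages(xs):
    """h_ii = 1/n + (x_i - x_bar)^2 / sum_j (x_j - x_bar)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1.0 / n + (x - x_bar) ** 2 / sxx for x in xs]


def studentized_residuals(xs, ys, m=2):
    """Return (internal, external) studentized residuals as two lists."""
    n = len(xs)
    a0, a1 = fit_line(xs, ys)
    res = [y - (a0 + a1 * x) for x, y in zip(xs, ys)]
    h = leverages(xs)

    # Internal: sigma^2 is estimated from all n residuals.
    sigma2 = sum(r * r for r in res) / (n - m)
    internal = [r / math.sqrt(sigma2 * (1.0 - hi)) for r, hi in zip(res, h)]

    # External: for each i, refit without case i and estimate sigma^2
    # from the residuals of the remaining n - 1 cases.
    external = []
    for i in range(n):
        xs_i = xs[:i] + xs[i + 1:]
        ys_i = ys[:i] + ys[i + 1:]
        b0, b1 = fit_line(xs_i, ys_i)
        sse_i = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs_i, ys_i))
        sigma2_i = sse_i / (n - m - 1)
        external.append(res[i] / math.sqrt(sigma2_i * (1.0 - h[i])))
    return internal, external


# Hypothetical data: the fifth observation is far from the trend of the rest.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 9.0, 6.1]
internal, external = studentized_residuals(xs, ys)
for x, ti, te in zip(xs, internal, external):
    print(f"x={x:.0f}  internal={ti:+.3f}  external={te:+.3f}")
```

For the suspect point the external value is much larger in magnitude than the internal one, because its own deviation no longer inflates the variance estimate in the denominator.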

Distribution

If the errors are independent and normally distributed with expected value 0 and variance $\sigma^2$, then the probability distribution of the ith externally studentized residual is a Student's t-distribution with n − m − 1 degrees of freedom, and can range from $-\infty$ to $+\infty$.

On the other hand, the internally studentized residuals are in the range $0 \pm \sqrt{\nu}$, where ν = n − m is the number of residual degrees of freedom. If $t_i$ represents the internally studentized residual, and again assuming that the errors are independent identically distributed Gaussian variables, then:[2]

$$t_i \sim \sqrt{\nu}\,\frac{t}{\sqrt{t^2 + \nu - 1}},$$

where t is a random variable distributed as Student's t-distribution with ν − 1 degrees of freedom. In fact, this implies that $t_i^2/\nu$ follows the beta distribution B(1/2, (ν − 1)/2). The distribution above is sometimes referred to as the tau distribution;[2] it was first derived by Thompson in 1935.[3]
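The transformation behind the tau distribution, t_i = √ν · t / √(t² + ν − 1), can be explored directly. The sketch below implements that map and its algebraic inverse and shows that the image of any t stays within ±√ν. The function names and the choice ν = 5 are assumptions made for illustration.

```python
import math


def internal_from_t(t, nu):
    """Map a Student's t value (nu - 1 degrees of freedom) to the tau scale."""
    return math.sqrt(nu) * t / math.sqrt(t * t + nu - 1)


def t_from_internal(t_i, nu):
    """Algebraic inverse of internal_from_t; requires |t_i| < sqrt(nu)."""
    return t_i * math.sqrt((nu - 1) / (nu - t_i * t_i))


nu = 5
for t in (-50.0, -2.0, 0.0, 0.5, 3.0, 1e6):
    t_i = internal_from_t(t, nu)
    print(f"t={t:>10}  t_i={t_i:+.6f}  bound={math.sqrt(nu):.6f}")
```

However large t becomes, the transformed value only approaches √ν, which matches the bounded range of the internally studentized residuals.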

When ν = 3, the internally studentized residuals are uniformly distributed between $-\sqrt{3}$ and $+\sqrt{3}$. If there is only one residual degree of freedom, the above formula for the distribution of internally studentized residuals does not apply. In this case, the $t_i$ are all either +1 or −1, with 50% chance for each.

The standard deviation of the distribution of internally studentized residuals is always 1, but this does not imply that the standard deviation of all the $t_i$ of a particular experiment is 1. For instance, the internally studentized residuals when fitting a straight line going through (0, 0) to the points (1, 4), (2, −1), (2, −1) are $\sqrt{2},\ -\sqrt{5}/5,\ -\sqrt{5}/5$, and the standard deviation of these is not 1.
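This example can be reproduced in a few lines. The sketch below fits the one-parameter model y = βx (so m = 1 and the leverage is x_i² / Σ_j x_j²) and computes the internally studentized residuals; the variable names are illustrative.

```python
import math
import statistics

xs = [1.0, 2.0, 2.0]
ys = [4.0, -1.0, -1.0]
n, m = len(xs), 1  # one parameter: the slope of a line through the origin

# Least-squares slope for y = beta * x.
sum_x2 = sum(x * x for x in xs)
beta = sum(x * y for x, y in zip(xs, ys)) / sum_x2

res = [y - beta * x for x, y in zip(xs, ys)]
h = [x * x / sum_x2 for x in xs]            # leverages for a single column
sigma2 = sum(r * r for r in res) / (n - m)  # internal variance estimate

t = [r / math.sqrt(sigma2 * (1.0 - hi)) for r, hi in zip(res, h)]
sample_sd = statistics.stdev(t)

print(t)          # compare with sqrt(2), -sqrt(5)/5, -sqrt(5)/5
print(sample_sd)  # differs from 1
```

Here ν = n − m = 2, so the first residual sits exactly at the upper bound √ν = √2 of the admissible range.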

Note that any pair of studentized residuals $t_i$ and $t_j$ (where $i \ne j$) are not i.i.d. They have the same distribution, but they are not independent, because the residuals are constrained to sum to 0 and to be orthogonal to the columns of the design matrix.

Software implementations

Many programs and statistics packages, such as R and Python, include implementations of studentized residuals.

Language/Program | Function | Notes
R | rstandard(model, ...) | internally studentized; see [1]
R | rstudent(model, ...) | externally studentized; see [1]


References

  1. Regression Deletion Diagnostics, R documentation.
  2. Allen J. Pope (1976), "The statistics of residuals and the detection of outliers", U.S. Dept. of Commerce, National Oceanic and Atmospheric Administration, National Ocean Survey, Geodetic Research and Development Laboratory, 136 pages, eq. (6).
  3. Thompson, William R. (1935). "On a Criterion for the Rejection of Observations and the Distribution of the Ratio of Deviation to Sample Standard Deviation". The Annals of Mathematical Statistics. 6 (4): 214–219. doi:10.1214/aoms/1177732567.
