Ward's method

In statistics, Ward's method is a criterion applied in hierarchical cluster analysis. Ward's minimum variance method is a special case of the objective function approach originally presented by Joe H. Ward, Jr.[1] Ward suggested a general agglomerative hierarchical clustering procedure, where the criterion for choosing the pair of clusters to merge at each step is based on the optimal value of an objective function. This objective function could be "any function that reflects the investigator's purpose." Many of the standard clustering procedures are contained in this very general class. To illustrate the procedure, Ward used the example where the objective function is the error sum of squares, and this example is known as Ward's method or more precisely Ward's minimum variance method.

The nearest-neighbor chain algorithm can be used to find the same clustering defined by Ward's method, in time proportional to the size of the input distance matrix and space linear in the number of points being clustered.

The minimum variance criterion

Ward's minimum variance criterion minimizes the total within-cluster variance. To implement this method, at each step find the pair of clusters that leads to minimum increase in total within-cluster variance after merging. This increase is a weighted squared distance between cluster centers. At the initial step, all clusters are singletons (clusters containing a single point). To apply a recursive algorithm under this objective function, the initial distance between individual objects must be (proportional to) squared Euclidean distance.

The initial cluster distances in Ward's minimum variance method are therefore defined to be the squared Euclidean distance between points:

Note: In software that implements Ward's method, it is important to check whether the function arguments should specify Euclidean distances or squared Euclidean distances.

Lance–Williams algorithms

Ward's minimum variance method can be defined and implemented recursively by a Lance–Williams algorithm. The Lance–Williams algorithms are an infinite family of agglomerative hierarchical clustering algorithms which are represented by a recursive formula for updating cluster distances at each step (each time a pair of clusters is merged). At each step, it is necessary to optimize the objective function (find the optimal pair of clusters to merge). The recursive formula simplifies finding the optimal pair.

Suppose that clusters and were next to be merged. At this point all of the current pairwise cluster distances are known. The recursive formula gives the updated cluster distances following the pending merge of clusters and . Let

  • , , and be the pairwise distances between clusters , , and , respectively,
  • be the distance between the new cluster and .

An algorithm belongs to the Lance-Williams family if the updated cluster distance can be computed recursively by

where and are parameters, which may depend on cluster sizes, that together with the cluster distance function determine the clustering algorithm. Several standard clustering algorithms such as single linkage, complete linkage, and group average method have a recursive formula of the above type. A table of parameters for standard methods is given by several authors.[2][3][4]

Ward's minimum variance method can be implemented by the Lance–Williams formula. For disjoint clusters and with sizes and respectively:

Hence Ward's method can be implemented as a Lance–Williams algorithm with

Variations

The popularity of the Ward's method has led to variations of it. For instance, Wardp introduces the use of cluster specific feature weights, following the intuitive idea that features could have different degrees of relevance at different clusters. [5]

References

  1. ^ Ward, J. H., Jr. (1963), "Hierarchical Grouping to Optimize an Objective Function", Journal of the American Statistical Association, 58, 236–244.
  2. ^ Cormack, R. M. (1971), "A Review of Classification", Journal of the Royal Statistical Society, Series A, 134(3), 321-367.
  3. ^ Gordon, A. D. (1999), Classification, 2nd Edition, Chapman and Hall, Boca Raton.
  4. ^ Milligan, G. W. (1979), "Ultrametric Hierarchical Clustering Algorithms", Psychometrika, 44(3), 343–346.
  5. ^ R.C. de Amorim (2015). "Feature Relevance in Ward's Hierarchical Clustering Using the Lp Norm" (PDF). Journal of Classification. 32 (1): 46–62. doi:10.1007/s00357-015-9167-1. S2CID 18099326.

Further reading

  • Everitt, B. S., Landau, S. and Leese, M. (2001), Cluster Analysis, 4th Edition, Oxford University Press, Inc., New York; Arnold, London. ISBN 0340761199
  • Hartigan, J. A. (1975), Clustering Algorithms, New York: Wiley.
  • Jain, A. K. and Dubes, R. C. (1988), Algorithms for Clustering Data, New Jersey: Prentice–Hall.
  • Kaufman, L. and Rousseeuw, P. J. (1990), Finding Groups in Data: An Introduction to Cluster Analysis, New York: Wiley.

Read other articles:

Disambiguazione – Carpe rimanda qui. Se stai cercando la località della Liguria, vedi Carpe (Toirano). Come leggere il tassoboxCarpa Carpa comune, Cyprinus carpio Stato di conservazione Rischio minimo Classificazione scientifica Dominio Eukaryota Regno Animalia Sottoregno Eumetazoa Superphylum Deuterostomia Phylum Chordata Subphylum Vertebrata Infraphylum Gnathostomata Superclasse Osteichthyes Classe Actinopterygii Sottoclasse Neopterygii Infraclasse Teleostei Superordine Ostariop...

 

American state election 1878 Michigan gubernatorial election ← 1876 November 5, 1878 1880 →   Nominee Charles Croswell Orlando M. Barnes Henry S. Smith Party Republican Democratic Greenback Popular vote 126,280 78,503 73,313 Percentage 44.66% 27.76% 25.93% County results Croswell:     40-50%     50-60%     60-70%Barnes:     30-40%     40-...

 

Part of a series onBritish law Acts of Parliament of the United Kingdom Year      1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 ...

Muhammad Layas, 2010 Mohammed H. Layas is a prominent Libyan politician and investment banker. Education He obtained a Bachelor of Arts degree in Accounting and Business Management from the University of Benghazi, Libya, and a Diploma of the Institute of Economic Development, in Washington. Politics Layas has a long history of involvement in Libyan politics, serving as a foreign diplomat for Libya before the 1969 Al-Fatah Revolution which brought Muammar Gaddafi into power. Currently, Mr. La...

 

For the York, Pennsylvania, radio station, see WARM-FM. Radio station in Scranton, PennsylvaniaWARMSimulcasting WLGD DallasScranton, PennsylvaniaBroadcast areaScranton/Wilkes-Barre/HazletonFrequency590 kHzBrandingBigfoot Legends 101.7 & 107.7ProgrammingFormatClassic countryAffiliationsCompass Media NetworksOwnershipOwnerSeven Mountains Media(Southern Belle, LLC)Sister stationsWLGDHistoryFirst air date1940Technical information[1]Licensing authorityFCCFacility ID70504ClassBPower1,8...

 

Southern California transit company Red Car redirects here. For the taxis of Hong Kong, see Taxicabs of Hong Kong. For the British town, see Redcar. Pacific ElectricPacific Electric Building and Main StreetOverviewHeadquartersLos Angeles, CaliforniaReporting markPELocaleGreater Los Angeles AreaDates of operation1901–1961 (passenger), 1965 (freight)SuccessorSouthern Pacific (freight)Los Angeles Metropolitan Transit Authority, Los Angeles Metro Rail (passenger)TechnicalTrack gauge4 f...

土库曼斯坦总统土库曼斯坦国徽土库曼斯坦总统旗現任谢尔达尔·别尔德穆哈梅多夫自2022年3月19日官邸阿什哈巴德总统府(Oguzkhan Presidential Palace)機關所在地阿什哈巴德任命者直接选举任期7年,可连选连任首任萨帕尔穆拉特·尼亚佐夫设立1991年10月27日 土库曼斯坦土库曼斯坦政府与政治 国家政府 土库曼斯坦宪法 国旗 国徽 国歌 立法機關(英语:National Council of Turkmenistan) ...

 

2020年夏季奥林匹克运动会波兰代表團波兰国旗IOC編碼POLNOC波蘭奧林匹克委員會網站olimpijski.pl(英文)(波兰文)2020年夏季奥林匹克运动会(東京)2021年7月23日至8月8日(受2019冠状病毒病疫情影响推迟,但仍保留原定名称)運動員206參賽項目24个大项旗手开幕式:帕维尔·科热尼奥夫斯基(游泳)和马娅·沃什乔夫斯卡(自行车)[1]闭幕式:卡罗利娜·纳亚(皮划艇)&#...

 

Cet article est une ébauche concernant une localité italienne et la Lombardie. Vous pouvez partager vos connaissances en l’améliorant (comment ?) selon les recommandations des projets correspondants. Pianello del Lario Administration Pays Italie Région Lombardie  Province Côme  Code postal 22010 Code ISTAT 013183 Préfixe tel. 0344 Démographie Gentilé pianellesi Population 1 050 hab. (31-12-2010[1]) Densité 117 hab./km2 Géographie Coordonnées 46°...

يفتقر محتوى هذه المقالة إلى الاستشهاد بمصادر. فضلاً، ساهم في تطوير هذه المقالة من خلال إضافة مصادر موثوق بها. أي معلومات غير موثقة يمكن التشكيك بها وإزالتها. (ديسمبر 2018) إكوادور كأس العالم 2006 الاتحاد المشرف اتحاد إكوادور لكرة القدم البلد المضيف  ألمانيا المدرب كأس العالم...

 

Rod Fanni Informasi pribadiTanggal lahir 6 Desember 1981 (umur 42)Tempat lahir Martigues, PrancisTinggi 1,86 m (6 ft 1 in)Posisi bermain BekInformasi klubKlub saat ini MarseilleNomor 24Karier junior1999–2000 MartiguesKarier senior*Tahun Tim Tampil (Gol)2000–2002 Martigues 63 (1)2002–2005 Lens 39 (0)2004–2005 → Châteauroux (pinjaman) 34 (1)2005–2007 Nice 57 (0)2007–2011 Rennes 114 (2)2011– Marseille 19 (0)Tim nasional2008– Prancis 5 (0) * Penampilan dan ...

 

У этого термина существуют и другие значения, см. Глюкоза, Чистякова и Ионова. Глюк’ozaНаталья Ильинична Чистякова-Ионова Основная информация Имя при рождении Наталья Ильинична Ионова Дата рождения 7 июня 1986(1986-06-07) (37 лет) Место рождения Москва, РСФСР, СССР[1] Ст...

Swedish social scientist ProfessorLucas GottzénBornLucas Forsberg (1977-11-08) 8 November 1977 (age 46)NationalitySwedishAcademic workDisciplineMen's studiesInstitutionsStockholm University Lucas Martin Gottzén (born 8 November 1977), formerly Lucas Forsberg, is a Swedish social scientist and an expert on men's studies and violence. He is full Professor of Child and Youth Studies at Stockholm University.[1] He was editor-in-chief of the men's studies journal Norma from 2014 to ...

 

Казахстан на Олимпийских играх Код МОК KAZ НОК Национальный олимпийский комитет Республики Казахстан Зимние Олимпийские игры в Нагано Спортсмены 60 в 8 видах спорта Знаменосец Владимир Смирнов МедалиМесто 20 Золото Серебро Бронза Всего 0 0 2 2 Участие в летних Оли�...

 

Ne doit pas être confondu avec Fleuve marin côtier. Un fleuve côtier est un « petit cours d'eau[1] qui prend naissance près des côtes et qui se jette dans la mer »[2],[3] ou dans l'océan, avec un débit permanent[4]. Le Lay est un fleuve côtier de la Vendée. Définition Le fleuve côtier[5] est donc un fleuve particulier : avec un débit permanent (soit un débit d'étiage non nul) et une embouchure marine pour acquérir la qualification de fleuve, avec une longueur ...

Disambiguazione – Se stai cercando informazioni sul dimensionamento in campo elettrotecnico, vedi Dimensionamento delle linee elettriche. Questa voce sull'argomento ingegneria è solo un abbozzo. Contribuisci a migliorarla secondo le convenzioni di Wikipedia. Segui i suggerimenti del progetto di riferimento. In ingegneria, nell'ambito della progettazione, il processo di dimensionamento consiste nell'individuazione dei parametri di natura geometrica, fisica ecc.. che caratterizzano l'o...

 

Phonology and phonetics of Slovene For assistance with IPA transcriptions of Slovene for Wikipedia articles, see Help:IPA/Slovene. This article contains phonetic transcriptions in the International Phonetic Alphabet (IPA). For an introductory guide on IPA symbols, see Help:IPA. For the distinction between [ ], / / and ⟨ ⟩, see IPA § Brackets and transcription delimiters. This article is about the phonology and phonetics of standard Slovene. Consonant...

 

この記事は英語版の対応するページを翻訳することにより充実させることができます。(2024年5月)翻訳前に重要な指示を読むには右にある[表示]をクリックしてください。 英語版記事を日本語へ機械翻訳したバージョン(Google翻訳)。 万が一翻訳の手がかりとして機械翻訳を用いた場合、翻訳者は必ず翻訳元原文を参照して機械翻訳の誤りを訂正し、正確な翻訳にし�...

Scots as spoken in Ulster, Ireland Ulster ScotsUlstèr-ScotchUllans(Braid)Scots,[1][2] Scotch[3][4]Native toIrelandRegionUlsterEthnicityUlster ScotsLanguage familyIndo-European GermanicWest GermanicNorth Sea GermanicAnglo-FrisianAnglicScotsUlster ScotsEarly formsProto-Indo-European Proto-Germanic Proto-English Northumbrian Old English Official statusRecognised minoritylanguage inNorthern IrelandRegulated byThe cross-border Boord o Ulstèr-Scotch,...

 

Mitsui E&SNama asli三井E&SJenisPublik (K.K)Kode emitenTYO: 7003Komponen Nikkei 225ISINJP3891600003IndustriPermesinanPembuatan kapalDidirikan17 November 1917; 106 tahun lalu (1917-11-17)KantorpusatTsukiji, Chuo-ku, Tokyo 104-8439, JepangWilayah operasiSeluruh duniaTokohkunciTakao Tanaka(Presiden dan CEO)ProdukKapal muatan curahKapal tanker minyakKapal peti kemasFPSOMesin dieselDerek peti kemasPabrik bahan kimiaKilang minyakFasilitas pengolahan airFasilitas pengolahan limbahPendidi...