Boosting (machine learning)

In machine learning (ML), boosting is an ensemble metaheuristic for primarily reducing bias (as opposed to variance).[1] It can also improve the stability and accuracy of ML classification and regression algorithms. Hence, it is prevalent in supervised learning for converting weak learners to strong learners.[2]

The concept of boosting is based on the question posed by Kearns and Valiant (1988, 1989):[3][4] "Can a set of weak learners create a single strong learner?" A weak learner is defined as a classifier that is only slightly correlated with the true classification. A strong learner is a classifier that is arbitrarily well-correlated with the true classification. Robert Schapire answered the question in the affirmative in a paper published in 1990.[5] This has had significant ramifications in machine learning and statistics, most notably leading to the development of boosting.[6]

Initially, the hypothesis boosting problem simply referred to the process of turning a weak learner into a strong learner.[3] Algorithms that achieve this quickly became known as "boosting". Freund and Schapire's arcing (Adapt[at]ive Resampling and Combining),[7] as a general technique, is more or less synonymous with boosting.[8]

Algorithms

While boosting is not algorithmically constrained, most boosting algorithms consist of iteratively learning weak classifiers with respect to a distribution and adding them to a final strong classifier. When they are added, they are weighted in a way that is related to the weak learners' accuracy. After a weak learner is added, the data weights are readjusted, known as "re-weighting". Misclassified input data gain a higher weight and examples that are classified correctly lose weight.[note 1] Thus, future weak learners focus more on the examples that previous weak learners misclassified.

An illustration presenting the intuition behind the boosting algorithm, consisting of the parallel learners and weighted dataset

There are many boosting algorithms. The original ones, proposed by Robert Schapire (a recursive majority gate formulation),[5] and Yoav Freund (boost by majority),[9] were not adaptive and could not take full advantage of the weak learners. Schapire and Freund then developed AdaBoost, an adaptive boosting algorithm that won the prestigious Gödel Prize.

Only algorithms that are provable boosting algorithms in the probably approximately correct learning formulation can accurately be called boosting algorithms. Other algorithms that are similar in spirit[clarification needed] to boosting algorithms are sometimes called "leveraging algorithms", although they are also sometimes incorrectly called boosting algorithms.[9]

The main variation between many boosting algorithms is their method of weighting training data points and hypotheses. AdaBoost is very popular and the most significant historically as it was the first algorithm that could adapt to the weak learners. It is often the basis of introductory coverage of boosting in university machine learning courses.[10] There are many more recent algorithms such as LPBoost, TotalBoost, BrownBoost, xgboost, MadaBoost, LogitBoost, and others. Many boosting algorithms fit into the AnyBoost framework,[9] which shows that boosting performs gradient descent in a function space using a convex cost function.

Object categorization in computer vision

Given images containing various known objects in the world, a classifier can be learned from them to automatically classify the objects in future images. Simple classifiers built based on some image feature of the object tend to be weak in categorization performance. Using boosting methods for object categorization is a way to unify the weak classifiers in a special way to boost the overall ability of categorization.[citation needed]

Problem of object categorization

Object categorization is a typical task of computer vision that involves determining whether or not an image contains some specific category of object. The idea is closely related with recognition, identification, and detection. Appearance based object categorization typically contains feature extraction, learning a classifier, and applying the classifier to new examples. There are many ways to represent a category of objects, e.g. from shape analysis, bag of words models, or local descriptors such as SIFT, etc. Examples of supervised classifiers are Naive Bayes classifiers, support vector machines, mixtures of Gaussians, and neural networks. However, research[which?] has shown that object categories and their locations in images can be discovered in an unsupervised manner as well.[11]

Status quo for object categorization

The recognition of object categories in images is a challenging problem in computer vision, especially when the number of categories is large. This is due to high intra class variability and the need for generalization across variations of objects within the same category. Objects within one category may look quite different. Even the same object may appear unalike under different viewpoint, scale, and illumination. Background clutter and partial occlusion add difficulties to recognition as well.[12] Humans are able to recognize thousands of object types, whereas most of the existing object recognition systems are trained to recognize only a few,[quantify] e.g. human faces, cars, simple objects, etc.[13][needs update?] Research has been very active on dealing with more categories and enabling incremental additions of new categories, and although the general problem remains unsolved, several multi-category objects detectors (for up to hundreds or thousands of categories[14]) have been developed. One means is by feature sharing and boosting.

Boosting for binary categorization

AdaBoost can be used for face detection as an example of binary categorization. The two categories are faces versus background. The general algorithm is as follows:

  1. Form a large set of simple features
  2. Initialize weights for training images
  3. For T rounds
    1. Normalize the weights
    2. For available features from the set, train a classifier using a single feature and evaluate the training error
    3. Choose the classifier with the lowest error
    4. Update the weights of the training images: increase if classified wrongly by this classifier, decrease if correctly
  4. Form the final strong classifier as the linear combination of the T classifiers (coefficient larger if training error is small)

After boosting, a classifier constructed from 200 features could yield a 95% detection rate under a false positive rate.[15]

Another application of boosting for binary categorization is a system that detects pedestrians using patterns of motion and appearance.[16] This work is the first to combine both motion information and appearance information as features to detect a walking person. It takes a similar approach to the Viola-Jones object detection framework.

Boosting for multi-class categorization

Compared with binary categorization, multi-class categorization looks for common features that can be shared across the categories at the same time. They turn to be more generic edge like features. During learning, the detectors for each category can be trained jointly. Compared with training separately, it generalizes better, needs less training data, and requires fewer features to achieve the same performance.

The main flow of the algorithm is similar to the binary case. What is different is that a measure of the joint training error shall be defined in advance. During each iteration the algorithm chooses a classifier of a single feature (features that can be shared by more categories shall be encouraged). This can be done via converting multi-class classification into a binary one (a set of categories versus the rest),[17] or by introducing a penalty error from the categories that do not have the feature of the classifier.[18]

In the paper "Sharing visual features for multiclass and multiview object detection", A. Torralba et al. used GentleBoost for boosting and showed that when training data is limited, learning via sharing features does a much better job than no sharing, given same boosting rounds. Also, for a given performance level, the total number of features required (and therefore the run time cost of the classifier) for the feature sharing detectors, is observed to scale approximately logarithmically with the number of class, i.e., slower than linear growth in the non-sharing case. Similar results are shown in the paper "Incremental learning of object detectors using a visual shape alphabet", yet the authors used AdaBoost for boosting.

Convex vs. non-convex boosting algorithms

Boosting algorithms can be based on convex or non-convex optimization algorithms. Convex algorithms, such as AdaBoost and LogitBoost, can be "defeated" by random noise such that they can't learn basic and learnable combinations of weak hypotheses.[19][20] This limitation was pointed out by Long & Servedio in 2008. However, by 2009, multiple authors demonstrated that boosting algorithms based on non-convex optimization, such as BrownBoost, can learn from noisy datasets and can specifically learn the underlying classifier of the Long–Servedio dataset.

See also

Implementations

  • scikit-learn, an open source machine learning library for Python
  • Orange, a free data mining software suite, module Orange.ensemble
  • Weka is a machine learning set of tools that offers variate implementations of boosting algorithms like AdaBoost and LogitBoost
  • R package GBM (Generalized Boosted Regression Models) implements extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine.
  • jboost; AdaBoost, LogitBoost, RobustBoost, Boostexter and alternating decision trees
  • R package adabag: Applies Multiclass AdaBoost.M1, AdaBoost-SAMME and Bagging
  • R package xgboost: An implementation of gradient boosting for linear and tree-based models.

Notes

  1. ^ Some boosting-based classification algorithms actually decrease the weight of repeatedly misclassified examples; for example boost by majority and BrownBoost.

References

  1. ^ Leo Breiman (1996). "BIAS, VARIANCE, AND ARCING CLASSIFIERS" (PDF). TECHNICAL REPORT. Archived from the original (PDF) on 2015-01-19. Retrieved 19 January 2015. Arcing [Boosting] is more successful than bagging in variance reduction
  2. ^ Zhou Zhi-Hua (2012). Ensemble Methods: Foundations and Algorithms. Chapman and Hall/CRC. p. 23. ISBN 978-1439830031. The term boosting refers to a family of algorithms that are able to convert weak learners to strong learners
  3. ^ a b Michael Kearns(1988); Thoughts on Hypothesis Boosting, Unpublished manuscript (Machine Learning class project, December 1988)
  4. ^ Michael Kearns; Leslie Valiant (1989). "Crytographic limitations on learning Boolean formulae and finite automata". Proceedings of the twenty-first annual ACM symposium on Theory of computing - STOC '89. Vol. 21. ACM. pp. 433–444. doi:10.1145/73007.73049. ISBN 978-0897913072. S2CID 536357.
  5. ^ a b Schapire, Robert E. (1990). "The Strength of Weak Learnability" (PDF). Machine Learning. 5 (2): 197–227. CiteSeerX 10.1.1.20.723. doi:10.1007/bf00116037. S2CID 53304535. Archived from the original (PDF) on 2012-10-10. Retrieved 2012-08-23.
  6. ^ Leo Breiman (1998). "Arcing classifier (with discussion and a rejoinder by the author)". Ann. Stat. 26 (3): 801–849. doi:10.1214/aos/1024691079. Schapire (1990) proved that boosting is possible. (Page 823)
  7. ^ Yoav Freund and Robert E. Schapire (1997); A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, 55(1):119-139
  8. ^ Leo Breiman (1998); Arcing Classifier (with Discussion and a Rejoinder by the Author), Annals of Statistics, vol. 26, no. 3, pp. 801-849: "The concept of weak learning was introduced by Kearns and Valiant (1988, 1989), who left open the question of whether weak and strong learnability are equivalent. The question was termed the boosting problem since a solution 'boosts' the low accuracy of a weak learner to the high accuracy of a strong learner. Schapire (1990) proved that boosting is possible. A boosting algorithm is a method that takes a weak learner and converts it into a strong one. Freund and Schapire (1997) proved that an algorithm similar to arc-fs is boosting.
  9. ^ a b c Llew Mason, Jonathan Baxter, Peter Bartlett, and Marcus Frean (2000); Boosting Algorithms as Gradient Descent, in S. A. Solla, T. K. Leen, and K.-R. Muller, editors, Advances in Neural Information Processing Systems 12, pp. 512-518, MIT Press
  10. ^ Emer, Eric. "Boosting (AdaBoost algorithm)" (PDF). MIT. Archived (PDF) from the original on 2022-10-09. Retrieved 2018-10-10.
  11. ^ Sivic, Russell, Efros, Freeman & Zisserman, "Discovering objects and their location in images", ICCV 2005
  12. ^ A. Opelt, A. Pinz, et al., "Generic Object Recognition with Boosting", IEEE Transactions on PAMI 2006
  13. ^ M. Marszalek, "Semantic Hierarchies for Visual Object Recognition", 2007
  14. ^ "Large Scale Visual Recognition Challenge". December 2017.
  15. ^ P. Viola, M. Jones, "Robust Real-time Object Detection", 2001
  16. ^ Viola, P.; Jones, M.; Snow, D. (2003). Detecting Pedestrians Using Patterns of Motion and Appearance (PDF). ICCV. Archived (PDF) from the original on 2022-10-09.
  17. ^ A. Torralba, K. P. Murphy, et al., "Sharing visual features for multiclass and multiview object detection", IEEE Transactions on PAMI 2006
  18. ^ A. Opelt, et al., "Incremental learning of object detectors using a visual shape alphabet", CVPR 2006
  19. ^ P. Long and R. Servedio. 25th International Conference on Machine Learning (ICML), 2008, pp. 608--615.
  20. ^ Long, Philip M.; Servedio, Rocco A. (March 2010). "Random classification noise defeats all convex potential boosters" (PDF). Machine Learning. 78 (3): 287–304. doi:10.1007/s10994-009-5165-z. S2CID 53861. Archived (PDF) from the original on 2022-10-09. Retrieved 2015-11-17.

Further reading

Read other articles:

Vista aérea de la Plaza, en 2017. La plaza de la República es uno de los grandes espacios abiertos de la Ciudad de México en donde se llevan a cabo grandes eventos culturales, artísticos, políticos y civiles. El enorme espacio se ubica al Poniente de los límites del Centro Histórico de la Ciudad de México, localizado dentro de la demarcación que corresponde a la Colonia Tabacalera, en la Alcaldía Cuauhtémoc. Su origen y traza original de grandes dimensiones corresponde al solar que...

 

 

Pertempuran Suomussalmi adalah pertempuran antara pasukan Finlandia dan Uni Soviet yang menjadi bagian dari Perang Musim Dingin. Pertempuran ini berlangsung dari 7 Desember 1939 sampai dengan 8 Januari 1940. Finlandia memenangkan pertempuran ini melawan pasukan superior. Suomussalmi dianggap sebagai kemenangan Finlandia paling jelas, paling penting, dan paling signifikan di bagian utara Finlandia.[1] Di Finlandia, pertempuran ini masih dilihat sebagai simbol dari keseluruhan Perang Mu...

 

 

Mesjid Istana Tojo yang berada di Taliboi, Tojo, Sulawesi Tengah Kerajaan Tojo1770–1951Ibu kotaTaliboi[1]Bahasa yang umum digunakanBare'e (resmi), Indonesia (Suku Bare'e sebagai suku asli di kerajaan Tojo tidak memiliki sistem tulisan, tetapi sejak adanya kerajaan Tojo tahun 1770, Suku Bare'e memakai alfabet Latin (resmi), aksara Lontara, dan abjad Arab)Agama Islam, dan LamoaPemerintahanKerajaanRaja, Mokole Wea (raja perempuan), Jena (kata ganti raja) Sejarah • ...

Simone de BeauvoirSimone de BeauvoirLahir(1908-01-09)9 Januari 1908Paris, PrancisMeninggal14 April 1986(1986-04-14) (umur 78)Paris, PrancisEraFilsafat abad ke-20KawasanFilsafat BaratAliranEksistensialismeFeminisme PrancisMarxisme BaratMinat utamaFilsafat politik, feminisme, etika, fenomenologi eksistensialGagasan pentingEtika ambiguitas, etika feminis, feminisme eksistensial Dipengaruhi Descartes, Wollstonecraft, Kant, Hegel, Husserl, Kierkegaard, Heidegger, Marx, Nietzsche, S...

 

 

Public university in Riverside, California This article contains academic boosterism which primarily serves to praise or promote the subject and may be a sign of a conflict of interest. Please improve this article by removing peacock terms, weasel words, and other promotional material. (July 2023) (Learn how and when to remove this template message) University of California, RiversideMottoFiat lux (Latin)Motto in EnglishLet there be lightTypePublic land-grant research universityEstablish...

 

 

Nama ini menggunakan cara penamaan Spanyol: nama keluarga pertama atau paternalnya adalah Baldé dan nama keluarga kedua atau maternalnya adalah Diao. Keita Baldé Baldé pada tahun 2022Informasi pribadiNama lengkap Keita Baldé DiaoTanggal lahir 8 Maret 1995 (umur 29)Tempat lahir Arbúcies, SpanyolTinggi 1,81 m (5 ft 11+1⁄2 in)Posisi bermain PenyerangInformasi klubKlub saat ini Espanyol(pinjaman dari Spartak Moskow)Nomor 9Karier junior2004–2011 Barcelona2010�...

Ne doit pas être confondu avec Seyssel. Seyssuel Château des Archevêques de Seyssuel. Administration Pays France Région Auvergne-Rhône-Alpes Département Isère Arrondissement Vienne Intercommunalité Vienne Condrieu Agglomération Maire Mandat Frédéric Belmonte 2020-2026 Code postal 38200 Code commune 38487 Démographie Gentilé Seyssuelois(es) Populationmunicipale 2 080 hab. (2021 ) Densité 213 hab./km2 Géographie Coordonnées 45° 33′ 39″ nord, 4...

 

 

Location of Alleghany County in Virginia This is a list of the National Register of Historic Places listings in Alleghany County, Virginia. This is intended to be a complete list of the properties and districts on the National Register of Historic Places in Alleghany County, Virginia, United States. The locations of National Register properties and districts for which the latitude and longitude coordinates are included below, may be seen in an online map.[1] There are 14 properties a...

 

 

Cheppes-la-PrairiecomuneCheppes-la-Prairie – Veduta LocalizzazioneStato Francia RegioneGrand Est Dipartimento Marna ArrondissementChâlons-en-Champagne CantoneChâlons-en-Champagne-3 TerritorioCoordinate48°50′N 4°28′E / 48.833333°N 4.466667°E48.833333; 4.466667 (Cheppes-la-Prairie)Coordinate: 48°50′N 4°28′E / 48.833333°N 4.466667°E48.833333; 4.466667 (Cheppes-la-Prairie) Superficie19,6 km² Abitanti184[1] (2009) Dens...

Black Mass - L'ultimo gangsterJohnny Depp nei panni di James Bulger in una scena del filmTitolo originaleBlack Mass Lingua originaleinglese Paese di produzioneStati Uniti d'America Anno2015 Durata122 min Rapporto2,35:1 Generebiografico, drammatico, gangster RegiaScott Cooper Soggettodal libro di Dick Lehr e Gerard O'Neill SceneggiaturaJez Butterworth, Mark Mallouk ProduttoreScott Cooper, John Lesher, Brian Oliver, Patrick McCormick, Tyler Thompson Produttore esecutivoBrett Ratner, Jam...

 

 

Федеральное агентство по делам Содружества Независимых Государств, соотечественников, проживающих за рубежом, и по международному гуманитарному сотрудничествусокращённо: Россотрудничество Общая информация Страна  Россия Юрисдикция Россия Дата создания 6 сентября...

 

 

  「俄亥俄」重定向至此。关于其他用法,请见「俄亥俄 (消歧义)」。 俄亥俄州 美國联邦州State of Ohio 州旗州徽綽號:七葉果之州地图中高亮部分为俄亥俄州坐标:38°27'N-41°58'N, 80°32'W-84°49'W国家 美國加入聯邦1803年3月1日,在1953年8月7日追溯頒定(第17个加入联邦)首府哥倫布(及最大城市)政府 • 州长(英语:List of Governors of {{{Name}}}]]) •&...

Storrington. Storrington adalah desa besar di distrik Horsham, Sussex Barat, Inggris, dan salah satu dari dua paroki sipil yang ada di Storrington and Sullington. Storrington terletak di sisi utara South Downs. Pada 2006, populasi desa ini sekitar 4.600 jiwa.[1] Terdapat satu jalan sebagai pusat perbelanjaan utama yang disebut High Street. Jalan A283 menghubungkan Storrington ke Steyning di timur dan Pulborough di barat. Referensi ^ Parish Population Projections Diarsipkan 2005-08-28 ...

 

 

 烏克蘭總理Прем'єр-міністр України烏克蘭國徽現任杰尼斯·什米加尔自2020年3月4日任命者烏克蘭總統任期總統任命首任維托爾德·福金设立1991年11月后继职位無网站www.kmu.gov.ua/control/en/(英文) 乌克兰 乌克兰政府与政治系列条目 宪法 政府 总统 弗拉基米尔·泽连斯基 總統辦公室 国家安全与国防事务委员会 总统代表(英语:Representatives of the President of Ukraine) 总...

 

 

عمارة مصرية قديمةمعلومات عامةالفترة الزمنية ق. 3100 BC-300 ADالبلد مصر والسودانتعديل - تعديل مصدري - تعديل ويكي بيانات العمارة في مصر القديمة هي الهندسة المعمارية المستخدمة في البناء والتشييد في نواحي عدة من تصاميم هندسية وأدوات وطرق مستخدمة في عملية البناء بمصر القديمة فقد أجم...

Belgian cyclist Igor DecraenePersonal informationBorn(1996-01-26)26 January 1996Waregem, BelgiumDied30 August 2014(2014-08-30) (aged 18)Zulte, BelgiumTeam informationDisciplineRoadRoleRiderRider typeTime trialistAmateur team2013–2014Tieltse Rennerclub Igor Decraene (26 January 1996, in Waregem – 30 August 2014, in Zulte) was a Belgian cyclist. In 2013, he became the UCI world junior men's time trial champion. The event took place on 24 September in Florence, Tuscany, Italy. In 2...

 

 

Football stadium in Shimizu-ku, Shizuoka, Japan IAI Stadium NihondairaI Sta'DairaFormer namesNihondaira Sports Stadium (1991–2009)Outsourcing Stadium Nihondaira (2009–2013)Location Shizuoka, Shizuoka, JapanCoordinates34°59′04″N 138°28′52″E / 34.98444°N 138.48111°E / 34.98444; 138.48111OwnerShizuoka CityOperatorShizuoka City Public Facility CorporationCapacity20,248Field size135 m × 73 mSurfaceGrassConstructionOpened1991Expanded1994TenantsShimizu S-Puls...

 

 

Pour les articles homonymes, voir Pasternak. Boris Pasternak Boris Pasternak en 1959. Données clés Nom de naissance Boris Leonidovitch Pasternak Naissance 10 février 1890 Moscou, Empire russe Décès 30 mai 1960 (à 70 ans) Peredelkino, Union soviétique Activité principale Poète, romancier, traducteur Distinctions Prix Nobel de littérature (1958) Auteur Langue d’écriture russe Œuvres principales Le Docteur Jivago modifier Boris Leonidovitch Pasternak (en russe : Бори...

Mexican business-class hotel brand Fiesta Inn logo A hotel in Ecatepec de Morelos Fiesta Inn is a Mexican business-class hotel brand.[1][2] It is owned by Grupo Posadas, Inc., which also owns other Mexican hotel brands, including the upscale Fiesta Americana and Fiesta Americana Grand, ultra-luxury Live Aqua, One, and the eco-tourist Explorean. There are 61 hotels operated under this brand throughout the country. Recognition Many Fiesta Inn hotels have received recognition for...

 

 

American politician (1923–2009) Warren HearnesHearnes in 196746th Governor of MissouriIn officeJanuary 11, 1965 – January 8, 1973LieutenantThomas EagletonWilliam S. MorrisPreceded byJohn M. DaltonSucceeded byKit BondChair of the National Governors AssociationIn officeAugust 9, 1970 – September 12, 1971Preceded byJohn Arthur LoveSucceeded byArch A. Moore Jr.31st Secretary of State of MissouriIn officeJanuary 9, 1961 – January 11, 1965GovernorJohn M. DaltonPre...