Hyperparameter (machine learning)

In machine learning, a hyperparameter is a parameter that can be set in order to define any configurable part of a model's learning process. Hyperparameters can be classified as either model hyperparameters (such as the topology and size of a neural network) or algorithm hyperparameters (such as the learning rate and the batch size of an optimizer). These are named hyperparameters in contrast to parameters, which are characteristics that the model learns from the data.
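
The distinction between parameters and hyperparameters can be illustrated with a minimal sketch. The function name train_linear_model, the toy data, and the chosen values are assumptions made for illustration only: the slope w and intercept b are parameters learned from the data, while learning_rate and epochs are algorithm hyperparameters fixed before training.

```python
def train_linear_model(xs, ys, learning_rate, epochs):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # parameters: learned from the data
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# Hyperparameters: chosen by the practitioner before training starts.
w, b = train_linear_model(xs, ys, learning_rate=0.05, epochs=2000)
print(f"learned parameters: w={w:.4f}, b={b:.4f}")
```

Changing learning_rate or epochs changes how (and whether) training converges, but neither value is itself updated by the training loop.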

Hyperparameters are not required by every model or algorithm. Some simple algorithms, such as ordinary least squares regression, require none. The LASSO algorithm, by contrast, adds a regularization hyperparameter to ordinary least squares, which must be set before training.[1] Even models and algorithms without a strict requirement to define hyperparameters may not produce meaningful results if these are not carefully chosen. Optimal values for hyperparameters are not always easy to predict: some hyperparameters may have no meaningful effect, or one important variable may be conditional upon the value of another. Often a separate process of hyperparameter tuning is needed to find a suitable combination for the data and task.
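
The contrast between ordinary least squares and LASSO can be sketched for the simplest case of a single feature with no intercept. This is an illustrative assumption, not a general implementation: the name alpha for the regularization strength and the objective scaling (1 / (2n)) * sum((y - w*x)^2) + alpha * |w| are choices made here, under which the minimizer has a closed form given by soft-thresholding.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope for y ~ w * x (no hyperparameter)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def lasso_slope(xs, ys, alpha):
    """Minimize (1 / (2n)) * sum((y - w*x)**2) + alpha * |w| for one feature."""
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n  # averaged x*y term
    z = sum(x * x for x in xs) / n                # averaged squared feature
    # Soft-thresholding: shrink rho toward zero by alpha, clipping at zero.
    shrunk = max(abs(rho) - alpha, 0.0)
    return (shrunk if rho >= 0 else -shrunk) / z


# Hypothetical toy data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_ols = ols_slope(xs, ys)
w_lasso_small = lasso_slope(xs, ys, alpha=0.5)    # mild shrinkage
w_lasso_large = lasso_slope(xs, ys, alpha=100.0)  # slope shrunk to zero
print(w_ols, w_lasso_small, w_lasso_large)
```

With alpha set to zero the LASSO slope coincides with the least squares slope; larger values shrink it, and the value must be supplied before the fit is computed.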

As well as improving model performance, hyperparameters can be used by researchers to introduce robustness and reproducibility into their work, especially if it uses models that incorporate random number generation.

Considerations

The time required to train and test a model can depend upon the choice of its hyperparameters.[2] A hyperparameter is usually of continuous or integer type, leading to mixed-type optimization problems.[2] The existence of some hyperparameters is conditional upon the value of others, e.g. the size of each hidden layer in a neural network can be conditional upon the number of layers.[2]
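
A mixed-type search space with a conditional hyperparameter can be sketched as follows. The ranges, the candidate layer sizes, and the function name sample_configuration are hypothetical; the point is only that the number of layer-size hyperparameters that exist depends on the value drawn for the number of layers.

```python
import random


def sample_configuration(rng):
    """Draw one configuration from a hypothetical mixed-type search space."""
    num_layers = rng.randint(1, 4)             # integer hyperparameter
    learning_rate = 10 ** rng.uniform(-4, -1)  # continuous, log-uniform
    # Conditional hyperparameters: one size per layer, so how many of them
    # exist depends on the value drawn for num_layers.
    layer_sizes = [rng.choice([16, 32, 64, 128]) for _ in range(num_layers)]
    return {
        "num_layers": num_layers,
        "learning_rate": learning_rate,
        "layer_sizes": layer_sizes,
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
configs = [sample_configuration(rng) for _ in range(5)]
for config in configs:
    print(config)
```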

Difficult-to-learn parameters

The objective function is typically non-differentiable with respect to hyperparameters. As a result, in most instances, hyperparameters cannot be learned using gradient-based optimization methods (such as gradient descent), which are commonly employed to learn model parameters. Such hyperparameters describe a model representation that cannot be learned by common optimization methods but that nonetheless affects the loss function. An example is the tolerance hyperparameter for errors in support vector machines.

Untrainable parameters

Sometimes, hyperparameters cannot be learned from the training data because they aggressively increase the capacity of a model and can push the loss function to an undesired minimum (overfitting to the data), as opposed to correctly mapping the richness of the structure in the data. For example, if the degree of a polynomial fitted in a regression model is treated as a trainable parameter, the degree increases until the model fits the data perfectly, yielding low training error but poor generalization performance.
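
The polynomial example can be sketched in a few lines. Everything here is an assumption for illustration: the toy data (a linear trend plus fixed perturbations), the helper names, and the use of normal equations with Gaussian elimination, which is adequate only for low degrees and well-scaled inputs. Selecting the degree by training error picks the highest degree available, whereas held-out error penalizes it.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients (lowest power first) via normal equations."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, m):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, m + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (rows[i][m] - tail) / rows[i][i]
    return coeffs


def mean_squared_error(coeffs, xs, ys):
    preds = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


# Hypothetical data: y = 2x plus fixed perturbations standing in for noise.
xs_train = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
ys_train = [2 * x + e for x, e in
            zip(xs_train, [0.3, -0.2, 0.25, -0.3, 0.2, -0.25])]
xs_val = [-0.8, -0.4, 0.0, 0.4, 0.8]
ys_val = [2 * x + e for x, e in zip(xs_val, [-0.1, 0.15, -0.05, 0.1, -0.15])]

degrees = range(6)
fits = [fit_polynomial(xs_train, ys_train, d) for d in degrees]
train_errors = [mean_squared_error(c, xs_train, ys_train) for c in fits]
val_errors = [mean_squared_error(c, xs_val, ys_val) for c in fits]

# Treating the degree as "trainable" amounts to minimizing training error.
best_by_training = min(degrees, key=lambda d: train_errors[d])
best_by_validation = min(degrees, key=lambda d: val_errors[d])
print("degree chosen by training error:", best_by_training)
print("degree chosen by held-out error:", best_by_validation)
```

With six training points, a degree-5 polynomial passes through every point, so training error alone always favors it.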

Tunability

Most performance variation can be attributed to just a few hyperparameters.[3][2][4] The tunability of an algorithm, hyperparameter, or interacting hyperparameters is a measure of how much performance can be gained by tuning it.[5] For an LSTM, while the learning rate followed by the network size are its most crucial hyperparameters,[6] batching and momentum have no significant effect on its performance.[7]

Although some research has advocated the use of mini-batch sizes in the thousands, other work has found the best performance with mini-batch sizes between 2 and 32.[8]

Robustness

An inherent stochasticity in learning directly implies that the empirical hyperparameter performance is not necessarily its true performance.[2] Methods that are not robust to simple changes in hyperparameters, random seeds, or even different implementations of the same algorithm cannot be integrated into mission critical control systems without significant simplification and robustification.[9]

Reinforcement learning algorithms, in particular, require measuring their performance over a large number of random seeds, and also measuring their sensitivity to choices of hyperparameters.[9] Their evaluation with a small number of random seeds does not capture performance adequately due to high variance.[9] Some reinforcement learning methods, e.g. DDPG (Deep Deterministic Policy Gradient), are more sensitive to hyperparameter choices than others.[9]
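
Evaluation over several random seeds can be sketched as below. The function noisy_training_run is a hypothetical stand-in for a real stochastic training run, and its score formula is invented for illustration; only the pattern of repeating the run per seed and reporting the mean and spread is the point.

```python
import random
import statistics


def noisy_training_run(learning_rate, seed):
    """Stand-in for a stochastic training run; returns a final score.

    The formula is invented: a deterministic part that depends on the
    hyperparameter plus seed-dependent Gaussian noise.
    """
    rng = random.Random(seed)
    base_score = 1.0 - abs(learning_rate - 0.01) * 10
    return base_score + rng.gauss(0.0, 0.1)


def evaluate_over_seeds(learning_rate, seeds):
    """Return the mean and sample standard deviation of the score over seeds."""
    scores = [noisy_training_run(learning_rate, seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


mean_score, std_score = evaluate_over_seeds(learning_rate=0.01, seeds=range(20))
print(f"mean={mean_score:.3f} std={std_score:.3f} over 20 seeds")
```

A single seed yields one draw from this distribution, which is why a comparison between two hyperparameter settings based on one run each can be misleading.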

Optimization

Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model, one which minimizes a predefined loss function on given independent (held-out) data.[2] The objective function takes a tuple of hyperparameters and returns the associated loss.[2] Typically these methods are not gradient-based, and instead apply concepts from derivative-free optimization or black-box optimization.
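
Random search is one simple derivative-free method and can be sketched as follows. The objective validation_loss is a hypothetical black box with an invented loss surface, and the sampling ranges are assumptions; in practice the objective would train a model with the given hyperparameters and return its loss on held-out data.

```python
import math
import random


def validation_loss(config):
    """Hypothetical black-box objective: hyperparameter tuple -> loss."""
    learning_rate, num_units = config
    # Invented surface with its minimum at learning_rate=0.01, num_units=64.
    return (math.log10(learning_rate) + 2) ** 2 + ((num_units - 64) / 64) ** 2


def random_search(objective, num_trials, seed=0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_trials):
        config = (10 ** rng.uniform(-4, 0), rng.randint(8, 256))
        loss = objective(config)  # only function values are used, no gradients
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(validation_loss, num_trials=200)
print("best configuration:", best_config, "loss:", best_loss)
```

The search treats the objective purely as a function from a hyperparameter tuple to a loss value, which is why it also works when the objective is non-differentiable or involves integer-valued hyperparameters.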

Reproducibility

Apart from tuning hyperparameters, machine learning involves storing and organizing the parameters and results, and making sure they are reproducible.[10] In the absence of a robust infrastructure for this purpose, research code often evolves quickly and compromises essential aspects like bookkeeping and reproducibility.[11] Online collaboration platforms for machine learning go further by allowing scientists to automatically share, organize and discuss experiments, data, and algorithms.[12] Reproducibility can be particularly difficult for deep learning models.[13] For example, research has shown that deep learning models depend very heavily even on the random seed selection of the random number generator.[14]
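
A minimal sketch of the bookkeeping involved, assuming a toy experiment whose only source of randomness is a seeded generator: the configuration, including the seed, is stored together with the result so that the run can be repeated later. The function run_experiment and the configuration keys are hypothetical; dedicated tools provide far more than this.

```python
import json
import random


def run_experiment(config):
    """Toy experiment whose only source of randomness is a seeded generator."""
    rng = random.Random(config["seed"])
    # Stand-in for training: the "result" is just a sum of seeded random draws.
    result = sum(rng.random() for _ in range(config["num_samples"]))
    return {"config": config, "result": result}


config = {"seed": 42, "num_samples": 100, "learning_rate": 0.01}
record = run_experiment(config)

# Store the hyperparameters next to the result so the run can be repeated.
serialized = json.dumps(record, sort_keys=True)

# Later: reload the record and rerun with exactly the stored configuration.
restored = json.loads(serialized)
rerun = run_experiment(restored["config"])
print("reproduced:", rerun["result"] == record["result"])
```

Real training pipelines usually have further sources of nondeterminism (for example hardware or library behavior) that a stored seed alone does not control.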

References

  1. Yang, Li; Shami, Abdallah (2020-11-20). "On hyperparameter optimization of machine learning algorithms: Theory and practice". Neurocomputing. 415: 295–316. arXiv:2007.15745. doi:10.1016/j.neucom.2020.07.061. ISSN 0925-2312. S2CID 220919678.
  2. Claesen, Marc; De Moor, Bart (2015). "Hyperparameter Search in Machine Learning". arXiv:1502.02127. Bibcode:2015arXiv150202127C.
  3. Leyton-Brown, Kevin; Hoos, Holger; Hutter, Frank (January 27, 2014). "An Efficient Approach for Assessing Hyperparameter Importance": 754–762 – via proceedings.mlr.press.
  4. van Rijn, Jan N.; Hutter, Frank (2017). "Hyperparameter Importance Across Datasets". arXiv:1710.04725. Bibcode:2017arXiv171004725V.
  5. Probst, Philipp; Bischl, Bernd; Boulesteix, Anne-Laure (2018). "Tunability: Importance of Hyperparameters of Machine Learning Algorithms". arXiv:1802.09596. Bibcode:2018arXiv180209596P.
  6. Greff, K.; Srivastava, R. K.; Koutník, J.; Steunebrink, B. R.; Schmidhuber, J. (October 23, 2017). "LSTM: A Search Space Odyssey". IEEE Transactions on Neural Networks and Learning Systems. 28 (10): 2222–2232. arXiv:1503.04069. doi:10.1109/TNNLS.2016.2582924. PMID 27411231. S2CID 3356463.
  7. Breuel, Thomas M. (2015). "Benchmarking of LSTM networks". arXiv:1508.02774. Bibcode:2015arXiv150802774B.
  8. "Revisiting Small Batch Training for Deep Neural Networks" (2018). arXiv:1804.07612. Bibcode:2018arXiv180407612M.
  9. Mania, Horia; Guy, Aurelia; Recht, Benjamin (2018). "Simple random search provides a competitive approach to reinforcement learning". arXiv:1803.07055. Bibcode:2018arXiv180307055M.
  10. Greff, Klaus; Schmidhuber, Jürgen (2015). "Introducing Sacred: A Tool to Facilitate Reproducible Research" (PDF).
  11. Greff, Klaus; et al. (2017). "The Sacred Infrastructure for Computational Research" (PDF). Archived from the original (PDF) on 2020-09-29. Retrieved 2018-04-06.
  12. Vanschoren, Joaquin; et al. (2014). "OpenML: networked science in machine learning". arXiv:1407.7722. Bibcode:2014arXiv1407.7722V.
  13. Villa, Jennifer; Zimmerman, Yoav (25 May 2018). "Reproducibility in ML: why it matters and how to achieve it". Determined AI Blog. Retrieved 31 August 2020.
  14. Bethard, S. (2022). "We need to talk about random seeds". arXiv:2210.13393.
