Bfloat16 floating-point format

The bfloat16 (brain floating point)[1][2] floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. This format is a shortened (16-bit) version of the 32-bit IEEE 754 single-precision floating-point format (binary32), intended to accelerate machine learning and near-sensor computing.[3] It preserves the approximate dynamic range of 32-bit floating-point numbers by retaining 8 exponent bits, but supports only 8 bits of precision rather than the 24-bit significand of the binary32 format. Even more than single-precision 32-bit floating-point numbers, bfloat16 numbers are unsuitable for integer calculations, but this is not their intended use. Bfloat16 is used to reduce the storage requirements and increase the calculation speed of machine learning algorithms.[4]

The bfloat16 format was developed by Google Brain, an artificial intelligence research group at Google. It is utilized in many CPUs, GPUs, and AI processors, such as Intel Xeon processors (AVX-512 BF16 extensions), Intel Data Center GPU, Intel Nervana NNP-L1000, Intel FPGAs,[5][6][7] AMD Zen, AMD Instinct, NVIDIA GPUs, Google Cloud TPUs,[8][9][10] AWS Inferentia, AWS Trainium, ARMv8.6-A,[11] and Apple's M2[12] (and therefore A15 chips and later). Many libraries support bfloat16, such as CUDA,[13] Intel oneAPI Math Kernel Library, AMD ROCm,[14] AMD Optimizing CPU Libraries, PyTorch, and TensorFlow.[10][15] On these platforms, bfloat16 may also be used in mixed-precision arithmetic, where bfloat16 numbers may be operated on and expanded to wider data types.

bfloat16 floating-point format

bfloat16 has the following format:

  • Sign bit: 1 bit
  • Exponent width: 8 bits
  • Significand precision: 8 bits (7 explicitly stored, plus an implicit leading bit), as opposed to 24 bits in binary32

The bfloat16 format, being a shortened IEEE 754 single-precision 32-bit float, allows fast conversion to and from an IEEE 754 single-precision 32-bit float. In conversion to the bfloat16 format, the exponent bits are preserved while the significand field can be reduced by truncation (thus corresponding to round toward 0) or other rounding mechanisms, ignoring the NaN special case. Preserving the exponent bits maintains the 32-bit float's range of ≈ 10^−38 to ≈ 3 × 10^38.[16]
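The truncating conversion amounts to a 16-bit shift of the binary32 bit pattern. As a minimal sketch in Python (the function names are invented for illustration, not any library's actual API; the struct module provides the binary32 bit pattern, and the example inputs are exactly representable so no extra rounding occurs):

import struct

def binary32_to_bfloat16_trunc(x: float) -> int:
    """Convert to bfloat16 by truncation (round toward 0), ignoring NaNs.

    Keeping the upper 16 bits of the binary32 pattern preserves the sign
    bit and all 8 exponent bits; only the low 16 fraction bits are lost.
    """
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]  # binary32 bit pattern
    return bits32 >> 16                                    # keep the top 16 bits

def bfloat16_to_binary32(bits16: int) -> float:
    """Widen a bfloat16 bit pattern to binary32 by appending 16 zero bits."""
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]

print(hex(binary32_to_bfloat16_trunc(1.0)))  # 0x3f80
print(bfloat16_to_binary32(0xc000))          # -2.0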

The bits are laid out as follows (each example below encodes the value 0.15625):

IEEE half-precision 16-bit float
  sign    exponent (5 bit)    fraction (10 bit)
   0      01100               0100000000
  bit 15  bits 14–10          bits 9–0

bfloat16
  sign    exponent (8 bit)    fraction (7 bit)
   0      01111100            0100000
  bit 15  bits 14–7           bits 6–0

Nvidia's TensorFloat-32 (19 bits)
  sign    exponent (8 bit)    fraction (10 bit)
   0      01111100            0100000000
  bit 18  bits 17–10          bits 9–0

AMD's fp24 format
  sign    exponent (7 bit)    fraction (16 bit)
   0      0111100             0100000000000000
  bit 23  bits 22–16          bits 15–0

Pixar's PXR24 format
  sign    exponent (8 bit)    fraction (15 bit)
   0      01111100            010000000000000
  bit 23  bits 22–15          bits 14–0

IEEE 754 single-precision 32-bit float
  sign    exponent (8 bit)    fraction (23 bit)
   0      01111100            01000000000000000000000
  bit 31  bits 30–23          bits 22–0

Exponent encoding

The bfloat16 binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 127, also known as the exponent bias in the IEEE 754 standard.

  • Emin = 01H − 7FH = −126
  • Emax = FEH − 7FH = 127
  • Exponent bias = 7FH = 127

Thus, in order to get the true exponent as defined by the offset-binary representation, the offset of 127 has to be subtracted from the value of the exponent field.

The minimum and maximum values of the exponent field (00H and FFH) are interpreted specially, like in the IEEE 754 standard formats.

Exponent       | Significand zero | Significand non-zero   | Equation
00H            | zero, −0         | subnormal numbers      | (−1)^signbit × 2^−126 × 0.significandbits
01H, ..., FEH  | normalized value (both cases)             | (−1)^signbit × 2^(exponentbits−127) × 1.significandbits
FFH            | ±infinity        | NaN (quiet, signaling) |

The minimum positive normal value is 2^−126 ≈ 1.18 × 10^−38 and the minimum positive (subnormal) value is 2^(−126−7) = 2^−133 ≈ 9.2 × 10^−41.
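The three rows of the table above translate directly into a small decoder. The following Python sketch (the function name is invented for illustration) reproduces the minimum normal and subnormal values just derived:

def decode_bfloat16(bits: int) -> float:
    """Decode a 16-bit bfloat16 pattern per the table above (bias 127)."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 7) & 0xFF  # 8-bit biased exponent field
    fraction = bits & 0x7F         # 7 stored significand bits
    if exponent == 0x00:           # zero and subnormal numbers
        value = (fraction / 2**7) * 2.0**-126
    elif exponent == 0xFF:         # infinities and NaNs
        value = float("nan") if fraction else float("inf")
    else:                          # normalized: implicit leading 1
        value = (1 + fraction / 2**7) * 2.0**(exponent - 127)
    return -value if sign else value

print(decode_bfloat16(0x0080))  # 2^-126 ≈ 1.175e-38 (min positive normal)
print(decode_bfloat16(0x0001))  # 2^-133 ≈ 9.2e-41 (min positive subnormal)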

Rounding and conversion

The most common use case is conversion between IEEE 754 binary32 and bfloat16. This section describes the conversion process and its rounding scheme. Other format conversions to or from bfloat16 are possible (for example, between int16 and bfloat16) but are not covered here.

  • From binary32 to bfloat16. When bfloat16 was first introduced as a storage format,[15] the conversion from IEEE 754 binary32 (32-bit floating point) to bfloat16 was truncation (round toward 0). Later, once bfloat16 became an input format for matrix multiplication units, the conversion gained various rounding mechanisms depending on the hardware platform: Google TPUs use round-to-nearest-even in the conversion;[17] ARM uses the non-IEEE round-to-odd mode;[18] NVIDIA supports converting float numbers to bfloat16 precision in round-to-nearest-even mode[19] (a bit-level sketch of round-to-nearest-even follows this list).
  • From bfloat16 to binary32. Since binary32 can represent all exact values in bfloat16, the conversion simply pads the significand with 16 zero bits.[17]
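For round-to-nearest-even specifically, a well-known bit trick adds a rounding bias before truncating. A Python sketch under stated assumptions (invented function name; NaN handling omitted, which real hardware must treat separately):

import struct

def binary32_bits_to_bfloat16_rne(bits32: int) -> int:
    """Round a binary32 bit pattern to bfloat16, round-to-nearest-even."""
    lsb = (bits32 >> 16) & 1         # lowest bit that survives truncation
    rounded = bits32 + 0x7FFF + lsb  # ties round toward an even (0) lsb
    return (rounded >> 16) & 0xFFFF

# 1 + 2^-8 = 1.00390625 lies exactly halfway between the bfloat16
# neighbors 1.0 and 1.0078125; round-to-nearest-even picks 1.0, whose
# last significand bit is 0 (even).
bits = struct.unpack(">I", struct.pack(">f", 1.00390625))[0]
print(hex(binary32_bits_to_bfloat16_rne(bits)))  # 0x3f80, i.e. 1.0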

Encoding of special values

Positive and negative infinity

Just as in IEEE 754, positive and negative infinity are represented with their corresponding sign bits, all 8 exponent bits set (FFH) and all significand bits zero. Explicitly,

val    s_exponent_signcnd
+inf = 0_11111111_0000000
-inf = 1_11111111_0000000

Not a Number

Just as in IEEE 754, NaN values are represented with either sign bit, all 8 exponent bits set (FFH) and not all significand bits zero. Explicitly,

val    s_exponent_signcnd
+NaN = 0_11111111_klmnopq
-NaN = 1_11111111_klmnopq

where at least one of k, l, m, n, o, p, or q is 1. As with IEEE 754, NaN values can be quiet or signaling, although there are no known uses of signaling bfloat16 NaNs as of September 2018.

Range and precision

Bfloat16 is designed to maintain the number range from the 32-bit IEEE 754 single-precision floating-point format (binary32), while reducing the precision from 24 bits to 8 bits. This means that the precision is between two and three decimal digits, and bfloat16 can represent finite values up to about 3.4 × 10^38.
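Both figures follow directly from the field widths, as this quick check in Python shows:

import math

# 8 significand bits (7 stored + 1 implicit) give about
# 8 * log10(2) significant decimal digits:
print(8 * math.log10(2))  # 2.408..., i.e. between two and three digits
# Consecutive bfloat16 values in [1, 2) are spaced 2^-7 apart:
print(2.0**-7)            # 0.0078125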

Examples

These examples are given in bit representation, in hexadecimal and binary, of the floating-point value. This includes the sign, (biased) exponent, and significand.

3f80 = 0 01111111 0000000 = 1
c000 = 1 10000000 0000000 = −2
7f7f = 0 11111110 1111111 = (2^8 − 1) × 2^−7 × 2^127 ≈ 3.38953139 × 10^38 (max finite positive value in bfloat16 precision)
0080 = 0 00000001 0000000 = 2^−126 ≈ 1.175494351 × 10^−38 (min normalized positive value in bfloat16 precision and single-precision floating point)

The maximum positive finite value of a normal bfloat16 number is 3.38953139 × 10^38, slightly below (2^24 − 1) × 2^−23 × 2^127 ≈ 3.402823466 × 10^38, the maximum finite positive value representable in single precision.
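Both maxima can be verified against their closed forms with two lines of Python:

print((2**8 - 1) * 2.0**-7 * 2.0**127)    # 3.3895313892515355e+38 (bfloat16 max)
print((2**24 - 1) * 2.0**-23 * 2.0**127)  # 3.4028234663852886e+38 (binary32 max)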

Zeros and infinities

0000 = 0 00000000 0000000 = 0
8000 = 1 00000000 0000000 = −0
7f80 = 0 11111111 0000000 = infinity
ff80 = 1 11111111 0000000 = −infinity

Special values

4049 = 0 10000000 1001001 = 3.140625 ≈ π (pi)
3eab = 0 01111101 0101011 = 0.333984375 ≈ 1/3

NaNs

ffc1 = x 11111111 1000001 => qNaN
ff81 = x 11111111 0000001 => sNaN

References

  1. ^ Teich, Paul (2018-05-10). "Tearing Apart Google's TPU 3.0 AI Coprocessor". The Next Platform. Retrieved 2020-08-11. Google invented its own internal floating point format called "bfloat" for "brain floating point" (after Google Brain).
  2. ^ Wang, Shibo; Kanwar, Pankaj (2019-08-23). "BFloat16: The secret to high performance on Cloud TPUs". Google Cloud. Retrieved 2020-08-11. This custom floating point format is called "Brain Floating Point Format," or "bfloat16" for short. The name flows from "Google Brain", which is an artificial intelligence research group at Google where the idea for this format was conceived.
  3. ^ Tagliavini, Giuseppe; Mach, Stefan; Rossi, Davide; Marongiu, Andrea; Benini, Luca (2018). "A transprecision floating-point platform for ultra-low power computing". 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE). pp. 1051–1056. arXiv:1711.10374. doi:10.23919/DATE.2018.8342167. ISBN 978-3-9819263-0-9. S2CID 5067903.
  4. ^ Dr. Ian Cutress (2020-03-17). "Intel's Cooper Lake Plans: Why is BF16 Important?". Retrieved 2020-05-12. The bfloat16 standard is a targeted way of representing numbers that give the range of a full 32-bit number, but in the data size of a 16-bit number, keeping the accuracy close to zero but being a bit more loose with the accuracy near the limits of the standard. The bfloat16 standard has a lot of uses inside machine learning algorithms, by offering better accuracy of values inside the algorithm while affording double the data in any given dataset (or doubling the speed in those calculation sections).
  5. ^ Khari Johnson (2018-05-23). "Intel unveils Nervana Neural Net L-1000 for accelerated AI training". VentureBeat. Retrieved 2018-05-23. ...Intel will be extending bfloat16 support across our AI product lines, including Intel Xeon processors and Intel FPGAs.
  6. ^ Michael Feldman (2018-05-23). "Intel Lays Out New Roadmap for AI Portfolio". TOP500 Supercomputer Sites. Retrieved 2018-05-23. Intel plans to support this format across all their AI products, including the Xeon and FPGA lines
  7. ^ Lucian Armasu (2018-05-23). "Intel To Launch Spring Crest, Its First Neural Network Processor, In 2019". Tom's Hardware. Retrieved 2018-05-23. Intel said that the NNP-L1000 would also support bfloat16, a numerical format that's being adopted by all the ML industry players for neural networks. The company will also support bfloat16 in its FPGAs, Xeons, and other ML products. The Nervana NNP-L1000 is scheduled for release in 2019.
  8. ^ "Available TensorFlow Ops | Cloud TPU | Google Cloud". Google Cloud. Retrieved 2018-05-23. This page lists the TensorFlow Python APIs and graph operators available on Cloud TPU.
  9. ^ Elmar Haußmann (2018-04-26). "Comparing Google's TPUv2 against Nvidia's V100 on ResNet-50". RiseML Blog. Archived from the original on 2018-04-26. Retrieved 2018-05-23. For the Cloud TPU, Google recommended we use the bfloat16 implementation from the official TPU repository with TensorFlow 1.7.0. Both the TPU and GPU implementations make use of mixed-precision computation on the respective architecture and store most tensors with half-precision.
  10. ^ a b Tensorflow Authors (2018-07-23). "ResNet-50 using BFloat16 on TPU". Google. Retrieved 2018-11-06.
  11. ^ "BFloat16 extensions for Armv8-A". community.arm.com. 29 August 2019. Retrieved 2019-08-30.
  12. ^ "AArch64: add support for newer Apple CPUs · llvm/llvm-project@677da09". GitHub. Retrieved 2023-05-08.
  13. ^ "CUDA Library bloat16 Intrinsics".
  14. ^ "ROCm version history". github.com. Retrieved 2019-10-23.
  15. ^ a b Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, Rif A. Saurous (2017-11-28). TensorFlow Distributions (Report). arXiv:1711.10604. Bibcode:2017arXiv171110604D. Accessed 2018-05-23. All operations in TensorFlow Distributions are numerically stable across half, single, and double floating-point precisions (as TensorFlow dtypes: tf.bfloat16 (truncated floating point), tf.float16, tf.float32, tf.float64). Class constructors have a validate_args flag for numerical asserts.
  16. ^ "Livestream Day 1: Stage 8 (Google I/O '18) - YouTube". Google. 2018-05-08. Retrieved 2018-05-23. In many models this is a drop-in replacement for float-32
  17. ^ a b "The bfloat16 numerical format". Google Cloud. Retrieved 2023-07-11. On TPU, the rounding scheme in the conversion is round to nearest even and overflow to inf.
  18. ^ "Arm A64 Instruction Set Architecture". developer.arm.com. Retrieved 2023-07-26. Uses the non-IEEE Round-to-Odd rounding mode.
  19. ^ "1.3.5. Bfloat16 Precision Conversion and Data Movement" (PDF). docs.nvidia.com. p. 199. Retrieved 2023-07-26. Converts float number to nv_bfloat16 precision in round-to-nearest-even mode and returns nv_bfloat16 with converted value.
