Multiply–accumulate operation

In computing, especially digital signal processing, the multiply–accumulate (MAC) or multiply–add (MAD) operation is a common step that computes the product of two numbers and adds that product to an accumulator. The hardware unit that performs the operation is known as a multiplier–accumulator (MAC unit); the operation itself is also often called a MAC or a MAD operation. The MAC operation modifies an accumulator a:

a ← a + (b × c)

When done with floating-point numbers, it might be performed with two roundings (typical in many DSPs), or with a single rounding. When performed with a single rounding, it is called a fused multiply–add (FMA) or fused multiply–accumulate (FMAC).

Modern computers may contain a dedicated MAC, consisting of a multiplier implemented in combinational logic followed by an adder and an accumulator register that stores the result. The output of the register is fed back to one input of the adder, so that on each clock cycle, the output of the multiplier is added to the register. Combinational multipliers require a large amount of logic, but can compute a product much more quickly than the method of shifting and adding typical of earlier computers. Percy Ludgate was the first to conceive a MAC in his Analytical Machine of 1909,[1] and the first to exploit a MAC for division (using multiplication seeded by reciprocal, via the convergent series (1 + x)⁻¹). The first modern processors to be equipped with MAC units were digital signal processors, but the technique is now also common in general-purpose processors.[2][3][4][5]
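The feedback loop described above can be modelled in a few lines of Python. This is only a behavioural sketch, not a description of any particular hardware: the class name MacUnit, the function dot, the width parameter, and the wrap-around on overflow (integer arithmetic modulo a power of two) are assumptions made for illustration.

```python
class MacUnit:
    """Toy model of an integer multiplier-accumulator (hypothetical, for illustration)."""

    def __init__(self, width=32):
        self.mask = (1 << width) - 1  # the accumulator register holds `width` bits
        self.acc = 0                  # accumulator register, cleared at start

    def step(self, b, c):
        """One 'clock cycle': add the product b * c to the accumulator."""
        product = b * c                              # multiplier output
        self.acc = (self.acc + product) & self.mask  # adder fed by the register's own output
        return self.acc


def dot(xs, ys, width=32):
    """Dot product computed as a sequence of MAC steps."""
    mac = MacUnit(width)
    for x, y in zip(xs, ys):
        mac.step(x, y)
    return mac.acc
```

For example, dot([1, 2, 3], [4, 5, 6]) accumulates 4, then 14, then 32, while an 8-bit accumulator wraps 200 × 2 around to 144.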

In floating-point arithmetic

When done with integers, the operation is typically exact (computed modulo some power of two). However, floating-point numbers have only finite precision, so floating-point arithmetic is generally not associative or distributive. (See Floating-point arithmetic § Accuracy problems.) Therefore, it makes a difference to the result whether the multiply–add is performed with two roundings, or in one operation with a single rounding (a fused multiply–add). IEEE 754-2008 specifies that it must be performed with one rounding, yielding a more accurate result.[6]

Fused multiply–add

A fused multiply–add (FMA or fmadd)[7] is a floating-point multiply–add operation performed in one step (fused operation), with a single rounding. That is, where an unfused multiply–add would compute the product b × c, round it to N significant bits, add the result to a, and round back to N significant bits, a fused multiply–add would compute the entire expression a + (b × c) to its full precision before rounding the final result to N significant bits.
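The difference between one and two roundings can be seen in a small Python sketch. The helper names fused_multiply_add and unfused_multiply_add are hypothetical; the fused version is emulated with exact rational arithmetic followed by one conversion to a binary64 float, and it assumes finite inputs. It is an illustration of the definition, not a substitute for a hardware FMA instruction.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate a + (b * c) with a single rounding (finite inputs only)."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)  # exact rational arithmetic
    return float(exact)                              # one rounding to binary64


def unfused_multiply_add(a, b, c):
    """a + (b * c) with two roundings: one after the multiply, one after the add."""
    return a + b * c


b = c = 1.0 + 2.0**-30   # b * c = 1 + 2^-29 + 2^-60 exactly
a = -(1.0 + 2.0**-29)    # cancels everything except the 2^-60 term

unfused = unfused_multiply_add(a, b, c)  # product rounds to 1 + 2^-29, so the sum is 0.0
fused = fused_multiply_add(a, b, c)      # keeps the 2^-60 term
```

Here the unfused result loses the low-order term of the product entirely, while the fused result retains it.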

A fast FMA can speed up and improve the accuracy of many computations that involve the accumulation of products, such as dot products and polynomial evaluation by Horner's rule.

Fused multiply–add can usually be relied on to give more accurate results. However, William Kahan has pointed out that it can cause problems if used unthinkingly.[8] If x² − y² is evaluated as ((x × x) − y × y) (following Kahan's suggested notation in which redundant parentheses direct the compiler to round the (x × x) term first) using fused multiply–add, then the result may be negative even when x = y, because the first multiplication discards low-significance bits that the fused second multiplication retains. This could then lead to an error if, for instance, the square root of the result is then evaluated.
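The effect Kahan describes can be reproduced with an emulated FMA. In this Python sketch the helper fused_multiply_add and the function diff_of_squares_fused are hypothetical names; the fused step is emulated with exact rational arithmetic and a single rounding, assuming finite binary64 inputs.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate a + (b * c) with a single rounding (finite inputs only)."""
    return float(Fraction(a) + Fraction(b) * Fraction(c))


def diff_of_squares_fused(x, y):
    """x*x - y*y where x*x is rounded first and -y*y is fused into the subtraction."""
    xx = x * x                            # rounded product
    return fused_multiply_add(xx, -y, y)  # xx - y*y with one rounding


x = 1.0 + 2.0**-30
plain = x * x - x * x                # both products round identically, giving 0.0
fused = diff_of_squares_fused(x, x)  # exposes the rounding error of x*x
```

With this x, the exact square is 1 + 2^-29 + 2^-60, the rounded square drops the last term, and the fused result is therefore the small negative number -2^-60; passing it to math.sqrt would raise a ValueError.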

When implemented inside a microprocessor, an FMA can be faster than a multiply operation followed by an add. However, standard industrial implementations based on the original IBM RS/6000 design require a 2N-bit adder to compute the sum properly.[9]

Another benefit of including this instruction is that it allows an efficient software implementation of division (see division algorithm) and square root (see methods of computing square roots) operations, thus eliminating the need for dedicated hardware for those operations.[10]
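As a rough illustration of how division can be built from FMA steps, the following Python sketch uses a Newton–Raphson reciprocal iteration followed by one correction of the quotient. It is not the Goldschmidt algorithm of the cited paper, and it makes several simplifying assumptions: the divisor must lie in [1, 2), the linear seed and the iteration count are chosen only for simplicity, and fused_multiply_add and divide are hypothetical names, with the FMA emulated by exact rational arithmetic and a single rounding.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate a + (b * c) with a single rounding (finite inputs only)."""
    return float(Fraction(a) + Fraction(b) * Fraction(c))


def divide(n, d, iterations=6):
    """Approximate n / d for 1 <= d < 2 using only multiply and FMA steps."""
    if not 1.0 <= d < 2.0:
        raise ValueError("this sketch only handles 1 <= d < 2")
    y = fused_multiply_add(1.5, -0.5, d)    # crude linear seed for 1/d: 1.5 - 0.5*d
    for _ in range(iterations):
        e = fused_multiply_add(1.0, -d, y)  # residual e = 1 - d*y
        y = fused_multiply_add(y, y, e)     # refined reciprocal y + y*e
    q = n * y                               # first quotient estimate
    r = fused_multiply_add(n, -d, q)        # remainder r = n - d*q
    return fused_multiply_add(q, r, y)      # corrected quotient q + r*y
```

The residual and remainder steps rely on the single rounding of the FMA: computed with a separately rounded product, 1 − d·y and n − d·q would lose most of their significant bits to cancellation.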

Dot product instruction

Some machines combine multiple fused multiply–add operations into a single step, e.g. performing a four-element dot product on two 128-bit SIMD registers, a0×b0 + a1×b1 + a2×b2 + a3×b3, with single-cycle throughput.

Support

The FMA operation is included in IEEE 754-2008.

The Digital Equipment Corporation (DEC) VAX's POLY instruction is used for evaluating polynomials with Horner's rule using a succession of multiply and add steps. Instruction descriptions do not specify whether the multiply and add are performed using a single FMA step.[11] This instruction has been a part of the VAX instruction set since its original 11/780 implementation in 1977.
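Horner's rule maps naturally onto a chain of multiply–add steps, one per coefficient. The Python sketch below shows the general technique only; it does not model the VAX POLY instruction, and the coefficient order, the function name horner, and the emulated fused_multiply_add helper (exact rational arithmetic with one rounding, finite inputs only) are assumptions for illustration.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate a + (b * c) with a single rounding (finite inputs only)."""
    return float(Fraction(a) + Fraction(b) * Fraction(c))


def horner(coefficients, x):
    """Evaluate a polynomial; coefficients run from highest degree to constant term."""
    acc = 0.0
    for coeff in coefficients:
        acc = fused_multiply_add(coeff, acc, x)  # acc = coeff + acc * x
    return acc
```

For instance, horner([2.0, 3.0, 1.0], 2.0) evaluates 2x² + 3x + 1 at x = 2 through the intermediate values 2, 7 and 15.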

The 1999 standard of the C programming language supports the FMA operation through the fma() standard math library function and the automatic transformation of a multiplication followed by an addition (contraction of floating-point expressions), which can be explicitly enabled or disabled with standard pragmas (#pragma STDC FP_CONTRACT). The GCC and Clang C compilers do such transformations by default for processor architectures that support FMA instructions. With GCC, which does not support the aforementioned pragma,[12] this can be globally controlled by the -ffp-contract command line option.[13]

The fused multiply–add operation was introduced as "multiply–add fused" in the IBM POWER1 (1990) processor,[14] and has been added to numerous other processors since then.

References

  1. "The Feasibility of Ludgate's Analytical Machine". Archived from the original on 2019-08-07. Retrieved 2020-08-30.
  2. Lyakhov, Pavel; Valueva, Maria; Valuev, Georgii; Nagornov, Nikolai (January 2020). "A Method of Increasing Digital Filter Performance Based on Truncated Multiply-Accumulate Units". Applied Sciences. 10 (24): 9052. doi:10.3390/app10249052.
  3. Tung Thanh Hoang; Sjalander, M.; Larsson-Edefors, P. (May 2009). "Double Throughput Multiply-Accumulate unit for FlexCore processor enhancements". 2009 IEEE International Symposium on Parallel & Distributed Processing. pp. 1–7. doi:10.1109/IPDPS.2009.5161212. ISBN 978-1-4244-3751-1. S2CID 14535090.
  4. Kang, Jongsung; Kim, Taewhan (2020-03-01). "PV-MAC: Multiply-and-accumulate unit structure exploiting precision variability in on-device convolutional neural networks". Integration. 71: 76–85. doi:10.1016/j.vlsi.2019.11.003. ISSN 0167-9260. S2CID 211264132.
  5. "mad - ps". 20 November 2019. Retrieved 2021-08-14.
  6. Whitehead, Nathan; Fit-Florea, Alex (2011). "Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs" (PDF). nvidia. Retrieved 2013-08-31.
  7. "fmadd instrs". IBM.
  8. Kahan, William (1996-05-31). "IEEE Standard 754 for Binary Floating-Point Arithmetic".
  9. Quinnell, Eric (May 2007). Floating-Point Fused Multiply–Add Architectures (PDF) (PhD thesis). Retrieved 2011-03-28.
  10. Markstein, Peter (November 2004). Software Division and Square Root Using Goldschmidt's Algorithms (PDF). 6th Conference on Real Numbers and Computers. CiteSeerX 10.1.1.85.9648.
  11. "VAX instruction of the week: POLY". Archived from the original on 2020-02-13.
  12. "Bug 20785 - Pragma STDC * (C99 FP) unimplemented". gcc.gnu.org. Retrieved 2022-02-02.
  13. "Optimize Options (Using the GNU Compiler Collection (GCC))". gcc.gnu.org. Retrieved 2022-02-02.
  14. Montoye, R. K.; Hokenek, E.; Runyon, S. L. (January 1990). "Design of the IBM RISC System/6000 floating-point execution unit". IBM Journal of Research and Development. 34 (1): 59–70. doi:10.1147/rd.341.0059.
  15. "Godson-3 Emulates x86: New MIPS-Compatible Chinese Processor Has Extensions for x86 Translation".
  16. Hollingsworth, Brent (October 2012). "New "Bulldozer" and "Piledriver" Instructions". AMD Developer Central.
  17. "Intel adds 22nm octo-core 'Haswell' to CPU design roadmap". The Register. Archived from the original on 2012-02-17. Retrieved 2008-08-19.
  18. "STM32 Cortex-M33 MCUs programming manual" (PDF). ST. Retrieved 2024-05-06.
