Cache coherence

If two clients have a cached copy of a particular memory block and one client changes the block, the other client's copy must be invalidated or updated. If it is not, the system is in an incoherent state: it contains two different records of the same memory block which both claim to be up-to-date.
Incoherent caches: The caches have different values of a single address location.

In computer architecture, cache coherence is the uniformity of shared resource data that is stored in multiple local caches. In a cache coherent system, if multiple clients have a cached copy of the same region of a shared memory resource, all copies are the same. Without cache coherence, a change made to the region by one client may not be seen by others, and errors can result when the data used by different clients is mismatched.[1]

A cache coherence protocol is used to maintain cache coherency. The two main types are snooping and directory-based protocols.

Cache coherence is of particular relevance in multiprocessing systems, where each CPU may have its own local cache of a shared memory resource.

Coherent caches: The value in all the caches' copies is the same.

Overview

In a shared memory multiprocessor system with a separate cache memory for each processor, it is possible to have many copies of shared data: one copy in the main memory and one in the local cache of each processor that requested it. When one of the copies of data is changed, the other copies must reflect that change. Cache coherence is the discipline which ensures that the changes in the values of shared operands (data) are propagated throughout the system in a timely fashion.[2]

The following are the requirements for cache coherence:[3]

Write Propagation
Changes to the data in any cache must be propagated to other copies (of that cache line) in the peer caches.
Transaction Serialization
Reads/Writes to a single memory location must be seen by all processors in the same order.

Theoretically, coherence can be performed at the load/store granularity. However, in practice it is generally performed at the granularity of cache blocks.[4]

Definition

Coherence defines the behavior of reads and writes to a single address location.[3]

One type of data occurring simultaneously in different cache memory is called cache coherence, or in some systems, global memory.

In a multiprocessor system, consider that more than one processor has cached a copy of the memory location X. The following conditions are necessary to achieve cache coherence:[5]

  1. In a read made by a processor P to a location X that follows a write by the same processor P to X, with no writes to X by another processor occurring between the write and the read instructions made by P, X must always return the value written by P.
  2. In a read made by a processor P1 to location X that follows a write by another processor P2 to X, with no other writes to X made by any processor occurring between the two accesses and with the read and write being sufficiently separated, X must always return the value written by P2. This condition defines the concept of coherent view of memory. Propagating the writes to the shared memory location ensures that all the caches have a coherent view of the memory. If processor P1 reads the old value of X, even after the write by P2, we can say that the memory is incoherent.

The above conditions satisfy the Write Propagation criteria required for cache coherence. However, they are not sufficient as they do not satisfy the Transaction Serialization condition. To illustrate this better, consider the following example:

A multi-processor system consists of four processors - P1, P2, P3 and P4, all containing cached copies of a shared variable S whose initial value is 0. Processor P1 changes the value of S (in its cached copy) to 10 following which processor P2 changes the value of S in its own cached copy to 20. If we ensure only write propagation, then P3 and P4 will certainly see the changes made to S by P1 and P2. However, P3 may see the change made by P1 after seeing the change made by P2 and hence return 10 on a read to S. P4 on the other hand may see changes made by P1 and P2 in the order in which they are made and hence return 20 on a read to S. The processors P3 and P4 now have an incoherent view of the memory.

Therefore, in order to satisfy Transaction Serialization, and hence achieve Cache Coherence, the following condition along with the previous two mentioned in this section must be met:

  • Writes to the same location must be sequenced. In other words, if location X received two different values A and B, in this order, from any two processors, the processors can never read location X as B and then read it as A. The location X must be seen with values A and B in that order.[6]

The alternative definition of a coherent system is via the definition of sequential consistency memory model: "the cache coherent system must appear to execute all threads’ loads and stores to a single memory location in a total order that respects the program order of each thread".[4] Thus, the only difference between the cache coherent system and sequentially consistent system is in the number of address locations the definition talks about (single memory location for a cache coherent system, and all memory locations for a sequentially consistent system).

Another definition is: "a multiprocessor is cache consistent if all writes to the same memory location are performed in some sequential order".[7]

Rarely, but especially in algorithms, coherence can instead refer to the locality of reference. Multiple copies of the same data can exist in different cache simultaneously and if processors are allowed to update their own copies freely, an inconsistent view of memory can result.

Coherence mechanisms

The two most common mechanisms of ensuring coherency are snooping and directory-based, each having their own benefits and drawbacks.[8] Snooping based protocols tend to be faster, if enough bandwidth is available, since all transactions are a request/response seen by all processors. The drawback is that snooping isn't scalable. Every request must be broadcast to all nodes in a system, meaning that as the system gets larger, the size of the (logical or physical) bus and the bandwidth it provides must grow. Directories, on the other hand, tend to have longer latencies (with a 3 hop request/forward/respond) but use much less bandwidth since messages are point to point and not broadcast. For this reason, many of the larger systems (>64 processors) use this type of cache coherence.

Snooping

First introduced in 1983,[9] snooping is a process where the individual caches monitor address lines for accesses to memory locations that they have cached.[5] The write-invalidate protocols and write-update protocols make use of this mechanism.
For the snooping mechanism, a snoop filter reduces the snooping traffic by maintaining a plurality of entries, each representing a cache line that may be owned by one or more nodes. When replacement of one of the entries is required, the snoop filter selects for the replacement of the entry representing the cache line or lines owned by the fewest nodes, as determined from a presence vector in each of the entries. A temporal or other type of algorithm is used to refine the selection if more than one cache line is owned by the fewest nodes.[10]

Directory-based

In a directory-based system, the data being shared is placed in a common directory that maintains the coherence between caches. The directory acts as a filter through which the processor must ask permission to load an entry from the primary memory to its cache. When an entry is changed, the directory either updates or invalidates the other caches with that entry.

Distributed shared memory systems mimic these mechanisms in an attempt to maintain consistency between blocks of memory in loosely coupled systems.[11]

Coherence protocols

Coherence protocols apply cache coherence in multiprocessor systems. The intention is that two clients must never see different values for the same shared data.

The protocol must implement the basic requirements for coherence. It can be tailor-made for the target system or application.

Protocols can also be classified as snoopy or directory-based. Typically, early systems used directory-based protocols where a directory would keep a track of the data being shared and the sharers. In snoopy protocols, the transaction requests (to read, write, or upgrade) are sent out to all processors. All processors snoop the request and respond appropriately.

Write propagation in snoopy protocols can be implemented by either of the following methods:

Write-invalidate
When a write operation is observed to a location that a cache has a copy of, the cache controller invalidates its own copy of the snooped memory location, which forces a read from main memory of the new value on its next access.[5]
Write-update
When a write operation is observed to a location that a cache has a copy of, the cache controller updates its own copy of the snooped memory location with the new data.

If the protocol design states that whenever any copy of the shared data is changed, all the other copies must be "updated" to reflect the change, then it is a write-update protocol. If the design states that a write to a cached copy by any processor requires other processors to discard or invalidate their cached copies, then it is a write-invalidate protocol.

However, scalability is one shortcoming of broadcast protocols.

Various models and protocols have been devised for maintaining coherence, such as MSI, MESI (aka Illinois), MOSI, MOESI, MERSI, MESIF, write-once, Synapse, Berkeley, Firefly and Dragon protocol.[2] In 2011, ARM Ltd proposed the AMBA 4 ACE[12] for handling coherency in SoCs. The AMBA CHI (Coherent Hub Interface) specification[13] from ARM Ltd, which belongs to AMBA5 group of specifications defines the interfaces for the connection of fully coherent processors.

See also

References

  1. ^ Marowka, Ami (2010-01-01). "Chapter 2 - Pitfalls and Issues of Manycore Programming". Advances in Computers. Vol. 79. Elsevier. pp. 71–117. doi:10.1016/s0065-2458(10)79002-1.
  2. ^ a b E. Thomadakis, Michael (2011). The Architecture of the Nehalem Processor and Nehalem-EP SMP Platforms (PDF). Texas A&M University. p. 30. Archived from the original (PDF) on 2014-08-11.
  3. ^ a b Yan, Solihin. Fundamentals of parallel multicore architecture. OCLC 884540034.
  4. ^ a b Sorin, Daniel J.; Hill, Mark D.; Wood, David Allen (2011-01-01). A primer on memory consistency and cache coherence. Morgan & Claypool Publishers. OCLC 726930429.
  5. ^ a b c Patterson and Hennessy. Computer Organization and Design - 4th Edition. ISBN 978-0-12-374493-7.
  6. ^ Neupane, Mahesh (April 16, 2004). "Cache Coherence" (PDF). Archived from the original (PDF) on 20 June 2010.
  7. ^ Steinke, Robert C.; Nutt, Gary J. (2004-09-01). "A Unified Theory of Shared Memory Consistency". J. ACM. 51 (5): 800–849. arXiv:cs/0208027. doi:10.1145/1017460.1017464. ISSN 0004-5411. S2CID 3206071.
  8. ^ Patterson, David A.; Hennessy, John L. (1990). Computer Architecture A Quantitative Approach. Morgan Kaufmann Publishers. pp. 467–468. ISBN 1-55860-069-8.
  9. ^ "Ravishankar, Chinya; Goodman, James (February 28, 1983). "Cache Implementation for Multiple Microprocessors"" (PDF). Proceedings of IEEE COMPCON: 346–350.
  10. ^ Rasmus Ulfsnes (June 2013). "Design of a Snoop Filter for Snoop-Based Cache Coherency Protocols" Archived 2014-02-01 at the Wayback Machine (PDF). diva-portal.org. Norwegian University of Science and Technology. Retrieved 2014-01-20.
  11. ^ "Lecture 18: Snooping vs. Directory Based Coherency" (PDF). Berkeley.edu. Retrieved 14 May 2023.
  12. ^ Kriouile (16 September 2013). Formal Analysis of the ACE Specification for Cache Coherent Systems-on-Chip. In Formal Methods for Industrial Critical Systems. Springer Berlin Heidelberg. ISBN 978-3-642-41010-9.
  13. ^ Ltd, Arm. "AMBA | AMBA 5". Arm Developer. Retrieved 2021-04-27.

Further reading

Read other articles:

Chemical compound MedrogestoneClinical dataTrade namesColprone, othersOther namesMetrogestone; Medrogesterone; AY-62022, NSC-123018, R-13615; 6,17α-Dimethyl-6-dehydroprogesterone; 6,17α-Dimethyl-4,6-pregnadiene-3,20-dionePregnancycategory X Routes ofadministrationBy mouthDrug classProgestogen; ProgestinATC codeG03DB03 (WHO) G03FB07 (WHO) (with estrogen)Legal statusLegal status In general: ℞ (Prescription only) Pharmacokinetic dataBioavailabilityNearly 100%[1&#...

 

Prefiks nomor telepon di Malaysia. +60 adalah kode telepon milik negara Malaysia yang berkawasan dia Asia tenggara. Prefiks Area 02 Nihil; bekas kode akses domestik ke  Singapore (Dihentikan pada 15 Mei 2017 oleh Telekom Malaysia dan oleh semua penyedia lainnya pada 1 Juli 2017.)[1][2] 03  Selangor Kuala Lumpur PutrajayaGenting Highlands,  Pahang 04  Perlis Kedah PenangPengkalan Hulu,  Perak 05  PerakCameron Highlands,  Pahang Hulu ...

 

معتمدية   البلد تونس  جزء من ولاية في تونس  تعديل مصدري - تعديل   معتمدية صيادة، ولاية المنستير المعتمديّة هي تقسيم إداري يستخدم في تونس. ويمثل المستوى الثاني للتقسيم الإداري بالجمهورية التونسية، حيث ترجع المعتمدية بالنظر إلى الولاية كما تنقسم إلى بلديات (مدن) ثم ...

1902 1910 Élections législatives françaises de 1906 585 députés à la Chambre des députés 6 et 20 mai 1906 Type d’élection Élections législatives PRRRS – Ferdinand Sarrien Députés élus 132  28 FR – Alexandre Ribot Députés élus 78  49 ALP – Jacques Piou Députés élus 66  23 ARD – Louis Barthou Députés élus 90  28 SFIO – Jean Jaurès Députés élus 54  11 RI – Georges Clemen...

 

Peta bahasa dan dialek di wilayah Bogor Raya. Di Kabupaten Bogor terdapat dua bahasa daerah yang secara dominan digunakan oleh penduduknya di samping bahasa Indonesia sebagai bahasa resmi, yakni bahasa Sunda dan Betawi. Wilayah penggunaan bahasa Sunda[a] meliputi hampir seluruh wilayah Kabupaten Bogor, sedangkan bahasa Betawi hanya dituturkan di bagian tengah dan utara Kabupaten Bogor.[2][3] Saat ini, di Kabupaten Bogor sangat rentan akan pergeseran bahasa. Pergeseran ...

 

Mount Coot-thaQueensland—Legislative AssemblyMount Coot-tha (2008–)StateQueenslandDates current1950–2017NamesakeMount Coot-thaElectors30,979 (2015)Area29 km2 (11.2 sq mi)Coordinates27°28′S 152°58′E / 27.467°S 152.967°E / -27.467; 152.967 Mount Coot-tha was an electoral district in the Legislative Assembly of Queensland in the state of Queensland, Australia from 1950 to 2017.[1] The electoral district encompassed suburbs in Bri...

Village and municipality in Slovakia Church of Saint John Nepomuk Haniska (Hungarian: Enyicke) is a village and municipality in Košice-okolie District in the Kosice Region of eastern Slovakia. History Historically, the village was first mentioned in 1288. Haniska has been its official name since 1921, and was a co-name in the early 20th century and most of the 19th century. Its Hungarian name, its original and former co-name, is Enyiczke, variously spelled Enjiczke, Enjicske, and Enyicske. G...

 

Ular karung Acrochordus javanicus Status konservasiRisiko rendahIUCN176718 TaksonomiKerajaanAnimaliaFilumChordataKelasReptiliaOrdoSquamataFamiliAcrochordidaeGenusAcrochordusSpesiesAcrochordus javanicus Hornst., 1787 lbs Ular karung atau yang sering disebut ular belalai gajah, adalah sejenis ular air yang endemik di kepulauan Nusantara. Ular ini sekerabat dengan ular kadut (Acrochordus granulatus). Dalam bahasa inggris dikenal dengan nama Elephant trunk snake, sedangkan nama ilmiahnya adalah A...

 

Cet article est une ébauche concernant le jeu vidéo. Vous pouvez partager vos connaissances en l’améliorant (comment ?) (voir l’aide à la rédaction).   Liste des listes de jeux vidéo  La liste de jeux Xbox 360 compatibles avec la Xbox One répertorie les jeux vidéo Xbox 360 disponibles sur la console Xbox One, toutes régions confondues. La Xbox One est rétrocompatible à partir de novembre 2015. Quand Phil Spencer annonce cette fonctionnalité, une liste of...

City in Aleppo Governorate, Syria This article is about the city. For other uses, see Aleppo (disambiguation). Halab redirects here. For other uses, see Halab (disambiguation). City in SyriaAleppo ﺣَﻠَﺐCity Ancient City of AleppoAleppo Citadel • The entrance to al-Madina SouqGreat Mosque of Aleppo • Baron HotelSaint Elijah Cathedral • Queiq RiverPanorama of Aleppo at night SealNickname(s): Al-Shahbaa (Arabic: الشَّهْبَاء, romanized: ash-Shahb�...

 

Malaysian singer This biography of a living person needs additional citations for verification. Please help by adding reliable sources. Contentious material about living persons that is unsourced or poorly sourced must be removed immediately from the article and its talk page, especially if potentially libelous.Find sources: Che'Nelle – news · newspapers · books · scholar · JSTOR (September 2016) (Learn how and when to remove this message) Che'NelleChe...

 

This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: C'est Cheese – news · newspapers · books · scholar · JSTOR (December 2009) (Learn how and when to remove this message) 1995 studio album by The Arrogant WormsC'est CheeseStudio album by The Arrogant WormsReleased1995GenreComedyLabelArrogant Worms Record...

1900年美國總統選舉 ← 1896 1900年11月6日 1904 → 447張選舉人票獲勝需224張選舉人票投票率73.2%[1] ▼ 6.1 %   获提名人 威廉·麥金利 威廉·詹寧斯·布賴恩 政党 共和黨 民主党 家鄉州 俄亥俄州 內布拉斯加州 竞选搭档 西奧多·羅斯福 阿德萊·史蒂文森一世 选举人票 292 155 胜出州/省 28 17 民選得票 7,228,864 6,370,932 得票率 51.6% 45.5% 總統選舉結果地圖,紅色代表�...

 

Pusat kota Denizli Denizli merupakan kota yang terletak di Turki bagian barat. Penduduknya berjumlah 525.497 jiwa (2012). Kota ini merupakan ibu kota provinsi Denizli. Kota ini memiliki luas wilayah 798,75 km². Kota ini memiliki angka kepadatan penduduk sebesar 690 jiwa/km². Kota kembar Amasya, Turki Bursa, Turki[1] Muş, Turki Tokat, Turki Almelo, Belanda Pavlodar, Kazakhstan Tbilisi, Georgia Brăila, Romania Samara, Rusia Rhodes, Yunani Betzdorf, Jerman Mogilev, Belarus Larissa, Y...

 

You can help expand this article with text translated from the corresponding article in French. (May 2023) Click [show] for important translation instructions. Machine translation, like DeepL or Google Translate, is a useful starting point for translations, but translators must revise errors as necessary and confirm that the translation is accurate, rather than simply copy-pasting machine-translated text into the English Wikipedia. Do not translate text that appears unreliable or low-qua...

Israeli electoral alliance This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: United Torah Judaism – news · newspapers · books · scholar · JSTOR (January 2022) (Learn how and when to remove this message) United Torah Judaism יהדות התורה‎LeaderYitzhak GoldknopfFounded1992; 32 years a...

 

Melbourne, ArkansasKotaLokasi Melbourne di Izard County, Arkansas.Koordinat: 36°3′35″N 91°53′40″W / 36.05972°N 91.89444°W / 36.05972; -91.89444Koordinat: 36°3′35″N 91°53′40″W / 36.05972°N 91.89444°W / 36.05972; -91.89444NegaraAmerika SerikatNegara bagianArkansasCountyIzardLuas[1] • Total6,98 sq mi (18,07 km2) • Luas daratan6,98 sq mi (18,07 km2) • L...

 

American singer, actor, political activist, and gridiron football player (1898–1976) This article is about the singer and activist. For his son, see Paul Robeson Jr. Paul RobesonRobeson in 1942BornPaul Leroy Robeson(1898-04-09)April 9, 1898Princeton, New Jersey, U.S.DiedJanuary 23, 1976(1976-01-23) (aged 77)Philadelphia, Pennsylvania, U.S.Resting placeFerncliff Cemetery (Greenburgh, New York)Education Rutgers University, New Brunswick (BA) New York University Columbia University (LLB) ...

13th century King of Germany Henry (VII)Depiction in the Chronica regia Coloniensis, 13th centuryKing of Germany (formally King of the Romans)Reign1220–1235Coronation8 May 1222 (Aachen)PredecessorFrederick IISuccessorConrad IVKing of SicilyReign1212–1217PredecessorFrederick IISuccessorFrederick IIKing of ItalyReign1220–1235PredecessorFrederick IISuccessorConrad IVBorn1211Kingdom of SicilyDied12 February 1242 (aged 30–31)Martirano, Calabria,Kingdom of SicilyBurialCosenza, Cala...

 

Guglielmo I d'Inghilterradetto il ConquistatoreGuglielmo I il Conquistatore ritratto sull'Arazzo di BayeuxRe d'InghilterraStemma In carica25 dicembre 1066 –9 settembre 1087 Incoronazione25 dicembre 1066 PredecessoreEdgardo II (de iure)Aroldo II (de facto) SuccessoreGuglielmo II Duca di Normandiacome Guglielmo IIIn carica3 luglio 1035 –9 settembre 1087 PredecessoreRoberto I SuccessoreRoberto II NascitaFalaise, 8 novembre 1028 MorteRouen, 9 settembre 1087 (58 anni) Luog...