Architectural motif in neural networks for aggregating information.
In neural networks, a pooling layer is a kind of network layer that downsamples and aggregates information that is dispersed among many vectors into fewer vectors.[1] It has several uses. It removes redundant information, reducing the amount of computation and memory required, makes the model more robust to small variations in the input, and increases the receptive field of neurons in later layers in the network.
Convolutional neural network pooling
Pooling is most commonly used in convolutional neural networks (CNN). Below is a description of pooling in 2-dimensional CNNs. The generalization to n-dimensions is immediate.
As notation, we consider a tensor $x \in \mathbb{R}^{H \times W \times C}$, where $H$ is height, $W$ is width, and $C$ is the number of channels. A pooling layer outputs a tensor $y \in \mathbb{R}^{H' \times W' \times C'}$.
We define two variables $f, s$ called "filter size" (aka "kernel size") and "stride". Sometimes, it is necessary to use a different filter size and stride for horizontal and vertical directions. In such cases, we define 4 variables $f_H, f_W, s_H, s_W$.
The receptive field of an entry in the output tensor $y$ is the set of all entries in $x$ that can affect that entry.
Max pooling
Max Pooling (MaxPool) is commonly used in CNNs to reduce the spatial dimensions of feature maps.
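Define
$$\mathrm{MaxPool}(x \mid f, s)_{0,0,0} = \max(x_{0:f-1,\, 0:f-1,\, 0})$$
where $0:f-1$ means the range $0, 1, \dots, f-1$. Both endpoints are included, which is where care is needed to avoid an off-by-one error. The next entry is
$$\mathrm{MaxPool}(x \mid f, s)_{1,0,0} = \max(x_{s:s+f-1,\, 0:f-1,\, 0})$$
and so on. The receptive field of $y_{i,j,c}$ is $x_{is:is+f-1,\, js:js+f-1,\, c}$, so in general,
$$\mathrm{MaxPool}(x \mid f, s)_{i,j,c} = \max(x_{is:is+f-1,\, js:js+f-1,\, c})$$
If the horizontal and vertical filter sizes and strides differ, then in general,
$$\mathrm{MaxPool}(x \mid f_H, f_W, s_H, s_W)_{i,j,c} = \max(x_{is_H:is_H+f_H-1,\, js_W:js_W+f_W-1,\, c})$$
More succinctly, we can write $y_k = \max(\{x_{k'} \mid k' \text{ in the receptive field of } k\})$.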
If $H$ is not expressible as $ks + f$ where $k$ is a non-negative integer (and likewise for $W$), then for computing the entries of the output tensor on the boundaries, max pooling would attempt to take as inputs variables off the tensor. In this case, how those non-existent variables are handled depends on the padding conditions.
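As a minimal sketch of the definition above, assuming a single channel stored as a list of rows and a "valid" convention in which windows that would extend past the border are simply dropped (one possible padding choice among several), max pooling could be written as follows. The name `max_pool2d` is illustrative and not taken from any particular library.

```python
def max_pool2d(x, f, s):
    """Max-pool one channel x (a list of rows) with filter size f and stride s.

    Assumption: windows that would run past the border are skipped
    ("valid" handling), so no padding values are ever read.
    """
    H, W = len(x), len(x[0])
    out_h = (H - f) // s + 1  # number of windows that fit vertically
    out_w = (W - f) // s + 1  # number of windows that fit horizontally
    return [
        [
            max(x[k][l]
                for k in range(i * s, i * s + f)   # rows is .. is+f-1
                for l in range(j * s, j * s + f))  # cols js .. js+f-1
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


x = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(max_pool2d(x, f=2, s=2))  # [[6, 5], [7, 9]]
print(max_pool2d(x, f=3, s=1))  # [[9, 9], [9, 9]]
```

Python's `range(a, a + f)` excludes its upper end, so it covers exactly the inclusive range $a : a+f-1$ used in the text.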
Global Max Pooling (GMP) is a specific kind of max pooling where the output tensor $y \in \mathbb{R}^{C}$ has one entry per channel and the receptive field of $y_c$ is all of $x_{0:H-1,\, 0:W-1,\, c}$. That is, it takes the maximum over each entire channel. It is often used just before the final fully connected layers in a CNN classification head.
Average pooling
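Average pooling (AvgPool) is defined similarly:
$$\mathrm{AvgPool}(x \mid f, s)_{i,j,c} = \mathrm{average}(x_{is:is+f-1,\, js:js+f-1,\, c}) = \frac{1}{f^2} \sum_{k=is}^{is+f-1} \sum_{l=js}^{js+f-1} x_{k,l,c}$$
Global Average Pooling (GAP) is defined similarly to GMP. It was first proposed in Network-in-Network.[2] Similarly to GMP, it is often used just before the final fully connected layers in a CNN classification head.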
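The following sketch illustrates average pooling and the two global variants on a single channel. It assumes the same list-of-rows layout and "valid" border handling as a plain sliding window; all function names are illustrative.

```python
def avg_pool2d(x, f, s):
    """Average-pool one channel x with filter size f and stride s."""
    H, W = len(x), len(x[0])
    out_h = (H - f) // s + 1
    out_w = (W - f) // s + 1
    return [
        [
            sum(x[k][l]
                for k in range(i * s, i * s + f)
                for l in range(j * s, j * s + f)) / (f * f)
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def global_avg_pool(x):
    """Average over the entire channel (GAP)."""
    return sum(sum(row) for row in x) / (len(x) * len(x[0]))


def global_max_pool(x):
    """Maximum over the entire channel (GMP)."""
    return max(max(row) for row in x)


x = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(avg_pool2d(x, f=2, s=2))  # [[3.5, 2.0], [2.5, 6.0]]
print(global_avg_pool(x))       # 3.5
print(global_max_pool(x))       # 9
```

For a tensor with $C$ channels, applying the global functions to each channel separately gives a vector of length $C$, which is what a classification head then consumes.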
Interpolations
There are some interpolations of max pooling and average pooling.
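Mixed Pooling is a weighted sum of max pooling and average pooling.[3] That is,
$$\mathrm{MixedPool}(x \mid f, s, w) = w\, \mathrm{MaxPool}(x \mid f, s) + (1 - w)\, \mathrm{AvgPool}(x \mid f, s)$$
where $w \in [0, 1]$ is either a hyperparameter, a learnable parameter, or randomly sampled anew every time.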
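As a small worked example, the blend can be evaluated on a single receptive field. This sketch assumes the field has already been flattened into a list and treats the mixing weight `w` as a fixed hyperparameter; the name `mixed_pool` is illustrative.

```python
def mixed_pool(window, w):
    """Blend max and average pooling over one receptive field, with w in [0, 1]."""
    return w * max(window) + (1 - w) * sum(window) / len(window)


window = [1.0, 3.0, 4.0, 6.0]
print(mixed_pool(window, 1.0))  # 6.0, pure max pooling
print(mixed_pool(window, 0.0))  # 3.5, pure average pooling
print(mixed_pool(window, 0.5))  # 4.75, halfway between the two
```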
Lp Pooling is like average pooling, but uses the Lp norm average instead of the arithmetic average:
$$y = \left(\frac{1}{N} \sum_i |x_i|^p\right)^{1/p}$$
where $N$ is the size of the receptive field, and $p \geq 1$ is a hyperparameter. If all activations are non-negative, then average pooling is the case of $p = 1$, and max pooling is the case of $p \to \infty$. Square-root pooling is the case of $p = 2$.[4]
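The limiting behaviour can be checked numerically. The sketch below assumes a flattened receptive field of non-negative activations and uses a large finite `p` as a stand-in for $p \to \infty$; the name `lp_pool` is illustrative.

```python
def lp_pool(window, p):
    """Lp-norm average of one receptive field: (mean of |x_i|^p)^(1/p)."""
    n = len(window)
    return (sum(abs(v) ** p for v in window) / n) ** (1.0 / p)


window = [1.0, 3.0, 4.0, 6.0]
print(lp_pool(window, 1))    # 3.5, the arithmetic average
print(lp_pool(window, 2))    # square-root pooling, between average and max
print(lp_pool(window, 200))  # close to 6.0, the maximum
```

Very large values of `p` can overflow floating-point arithmetic for large activations, so a practical implementation would likely rescale by the maximum first.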
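Stochastic pooling samples a random activation $x_i$ from the receptive field with probability $\frac{x_i}{\sum_j x_j}$, which assumes non-negative activations. In expectation, it yields $\frac{\sum_i x_i^2}{\sum_i x_i}$, a weighted average of the activations in which larger activations receive more weight.[5]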
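A minimal sketch of the sampling step, assuming non-negative activations with a positive sum and sampling probabilities proportional to the activations themselves. The helper `expected_value` computes the mean of that sampling distribution directly; both names are illustrative.

```python
import random


def stochastic_pool(window, rng):
    """Sample one activation, with probability proportional to its value.

    Assumption: all activations are non-negative and their sum is positive.
    """
    return rng.choices(window, weights=window, k=1)[0]


def expected_value(window):
    """Mean of the sampling distribution: sum_i p_i * x_i with p_i = x_i / sum(x)."""
    total = sum(window)
    return sum(v * v / total for v in window)


rng = random.Random(0)  # seeded only so that runs are repeatable
window = [1.0, 3.0, 4.0, 6.0]
print(stochastic_pool(window, rng))  # one of the four activations
print(expected_value(window))        # 62/14, roughly 4.43
```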
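Softmax pooling is like max pooling, but uses softmax, i.e. $\frac{\sum_i e^{\beta x_i} x_i}{\sum_i e^{\beta x_i}}$ where $\beta > 0$. Average pooling is the case of $\beta \downarrow 0$, and max pooling is the case of $\beta \uparrow \infty$.[4]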
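The two limits can be illustrated with a short sketch. It assumes a flattened receptive field and subtracts the maximum before exponentiating, a standard numerical-stability step that leaves the result unchanged; the name `softmax_pool` is illustrative.

```python
import math


def softmax_pool(window, beta):
    """Softmax-weighted average of one receptive field with inverse temperature beta."""
    m = max(window)
    # Shifting by the maximum avoids overflow and cancels in the ratio.
    weights = [math.exp(beta * (v - m)) for v in window]
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)


window = [1.0, 3.0, 4.0, 6.0]
print(softmax_pool(window, 1e-9))  # close to 3.5, the average
print(softmax_pool(window, 1.0))   # between the average and the maximum
print(softmax_pool(window, 50.0))  # close to 6.0, the maximum
```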
Local Importance-based Pooling generalizes softmax pooling by $\frac{\sum_i e^{g(x_i)} x_i}{\sum_i e^{g(x_i)}}$ where $g$ is a learnable function.[6]
Other poolings
Spatial pyramidal pooling applies max pooling (or any other form of pooling) in a pyramid structure. That is, it applies global max pooling, then applies max pooling to the image divided into 4 equal parts, then 16, etc. The results are then concatenated. It is a hierarchical form of global pooling, and similar to global pooling, it is often used just before a classification head.[7]
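The pyramid structure can be sketched for a single channel as follows. This assumes max pooling at every level, grid sizes of 1, 2 and 4 cells per side, and an input at least as large as the finest grid so that no cell is empty; the function name and the integer cell boundaries are illustrative choices.

```python
def spatial_pyramid_pool(x, levels=(1, 2, 4)):
    """Max-pool one channel over an n-by-n grid for each n in levels, then concatenate.

    Assumption: height and width are at least max(levels), so every cell is non-empty.
    """
    H, W = len(x), len(x[0])
    out = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                r0, r1 = i * H // n, (i + 1) * H // n  # row span of cell (i, j)
                c0, c1 = j * W // n, (j + 1) * W // n  # column span of cell (i, j)
                out.append(max(x[k][l]
                               for k in range(r0, r1)
                               for l in range(c0, c1)))
    return out


x = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(spatial_pyramid_pool(x, levels=(1, 2)))  # [9, 6, 5, 7, 9]
```

The output length depends only on `levels` (here 1 + 4 = 5 values per channel), not on the input size, which is why the result can feed a fixed-size classification head.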
Region of Interest Pooling (also known as RoI pooling) is a variant of max pooling used in R-CNNs for object detection.[8] It is designed to take an arbitrarily-sized input matrix, and output a fixed-sized output matrix.
Covariance pooling computes the covariance matrix of the vectors $\{x_{k,l,0:C-1}\}_{k \in is:is+f-1,\, l \in js:js+f-1}$, which is then flattened to a $C^2$-dimensional vector $y_{i,j,0:C^2-1}$. Global covariance pooling is used similarly to global max pooling. As average pooling computes the average, which is a first-degree statistic, and covariance is a second-degree statistic, covariance pooling is also called "second-order pooling". It can be generalized to higher-order poolings.[9][10]
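A minimal sketch of the computation for one receptive field, assuming the field is given as a list of $C$-dimensional channel vectors and that the covariance is normalized by the number of vectors (dividing by $N - 1$ instead would be an equally plausible convention). The name `covariance_pool` is illustrative.

```python
def covariance_pool(vectors):
    """Covariance matrix of a list of C-dimensional vectors, flattened to C*C values.

    Assumption: population covariance, i.e. division by the number of vectors.
    """
    n, C = len(vectors), len(vectors[0])
    mean = [sum(v[c] for v in vectors) / n for c in range(C)]
    cov = [
        [sum((v[a] - mean[a]) * (v[b] - mean[b]) for v in vectors) / n
         for b in range(C)]
        for a in range(C)
    ]
    return [cov[a][b] for a in range(C) for b in range(C)]  # row-major flattening


# Three spatial positions, two channels; channel 1 is twice channel 0.
vectors = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
print(covariance_pool(vectors))  # approximately [2.667, 5.333, 5.333, 10.667]
```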
Blur Pooling means applying a blurring method before downsampling. For example, the Rect-2 blur pooling means taking an average pooling at $f = 2, s = 1$, then taking every second pixel (identity with $s = 2$).[11]
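The Rect-2 example can be sketched in two steps, blur then subsample. This assumes a single channel stored as a list of rows and no padding at the border; both function names are illustrative.

```python
def avg_pool2d(x, f, s):
    """Average-pool one channel x with filter size f and stride s (no padding)."""
    H, W = len(x), len(x[0])
    return [
        [
            sum(x[k][l]
                for k in range(i * s, i * s + f)
                for l in range(j * s, j * s + f)) / (f * f)
            for j in range((W - f) // s + 1)
        ]
        for i in range((H - f) // s + 1)
    ]


def rect2_blur_pool(x):
    """Rect-2 blur pooling: 2x2 average blur at stride 1, then keep every second pixel."""
    blurred = avg_pool2d(x, f=2, s=1)
    return [row[::2] for row in blurred[::2]]


x = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(rect2_blur_pool(x))  # [[3.5, 2.0], [2.5, 6.0]]
```

With this particular box filter the result coincides with plain 2x2 average pooling at stride 2. Separating the blur from the subsampling matters once a wider blur kernel is substituted.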
Vision Transformer pooling
In Vision Transformers (ViT), there are the following common kinds of poolings.
BERT-like pooling uses a dummy [CLS] token ("classification"). For classification, the output at [CLS] is the classification token, which is then processed by a LayerNorm-feedforward-softmax module into a probability distribution, which is the network's prediction of class probability distribution. This is the one used by the original ViT[12] and Masked Autoencoder.[13]
Global average pooling (GAP) does not use the dummy token, but simply takes the average of all output tokens as the classification token. It was mentioned in the original ViT as being equally good.[12]
Multihead attention pooling (MAP) applies a multiheaded attention block to pooling. Specifically, it takes as input a list of vectors $x_1, x_2, \dots, x_n$, which might be thought of as the output vectors of a layer of a ViT. It then applies a feedforward layer $\mathrm{FFN}$ on each vector, resulting in a matrix $V = [\mathrm{FFN}(x_1), \dots, \mathrm{FFN}(x_n)]$. This is then sent to a multiheaded attention, resulting in $\mathrm{MultiheadedAttention}(Q, V, V)$, where $Q$ is a matrix of trainable parameters.[14] This was first proposed in the Set Transformer architecture.[15]
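The core of attention pooling can be illustrated with a heavily simplified sketch. It assumes a single head, a single query vector, no learned key, value or output projections, and it omits the feedforward layer entirely, so it is only a stand-in for the full multihead block. The name `attention_pool` and the example query values are hypothetical.

```python
import math


def attention_pool(vectors, query):
    """Single-head attention pooling: softmax(q . v / sqrt(d))-weighted mean of the vectors.

    Simplifications (assumptions): one head, one query, identity projections, no FFN.
    """
    d = len(query)
    scores = [sum(q * v for q, v in zip(query, vec)) / math.sqrt(d)
              for vec in vectors]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]  # shifted for numerical stability
    total = sum(weights)
    return [sum(w * vec[c] for w, vec in zip(weights, vectors)) / total
            for c in range(d)]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention_pool(tokens, query=[0.0, 0.0]))   # uniform weights: the plain average
print(attention_pool(tokens, query=[10.0, 0.0]))  # favours tokens with a large first entry
```

With a zero query every token receives the same weight, so the output reduces to global average pooling. A trained query lets the model weight tokens unequally.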
Later papers demonstrated that GAP and MAP both perform better than BERT-like pooling.[14][16]
Graph neural network pooling
In graph neural networks (GNN), there are also two forms of pooling: global and local. Global pooling can be reduced to a local pooling where the receptive field is the entire output.
Local pooling: a local pooling layer coarsens the graph via downsampling. Local pooling is used to increase the receptive field of a GNN, in a similar fashion to pooling layers in convolutional neural networks. Examples include k-nearest neighbours pooling, top-k pooling,[17] and self-attention pooling.[18]
Global pooling: a global pooling layer, also known as readout layer, provides fixed-size representation of the whole graph. The global pooling layer must be permutation invariant, such that permutations in the ordering of graph nodes and edges do not alter the final output.[19] Examples include element-wise sum, mean or maximum.
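The permutation-invariance requirement on a readout layer can be checked directly. The sketch below assumes node features stored as a list of per-node lists; the function name `readout` and its `op` argument are illustrative.

```python
def readout(node_features, op="mean"):
    """Element-wise sum, mean or max over all nodes, giving one fixed-size vector."""
    columns = list(zip(*node_features))  # one tuple per feature dimension
    if op == "sum":
        return [sum(col) for col in columns]
    if op == "max":
        return [max(col) for col in columns]
    if op == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown op: {op}")


X = [[1, 2], [3, 4], [5, 0]]     # three nodes, two features each
X_permuted = [X[2], X[0], X[1]]  # same nodes listed in a different order

print(readout(X, "sum"))   # [9, 6]
print(readout(X, "mean"))  # [3.0, 2.0]
print(readout(X, "max"))   # [5, 4]
print(readout(X, "sum") == readout(X_permuted, "sum"))  # True
```

The output length equals the number of features regardless of how many nodes the graph has, which is what makes the representation fixed-size.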
Local pooling layers coarsen the graph via downsampling. Several learnable local pooling strategies have been proposed.[19] In each case, the input graph is represented by a matrix $\mathbf{X}$ of node features and the graph adjacency matrix $\mathbf{A}$. The output is the new matrix $\mathbf{X}'$ of node features and the new graph adjacency matrix $\mathbf{A}'$.
Top-k pooling
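We first set
$$\mathbf{y} = \frac{\mathbf{X}\mathbf{p}}{\|\mathbf{p}\|}$$
where $\mathbf{p}$ is a learnable projection vector. The projection vector $\mathbf{p}$ computes a scalar projection value for each graph node.
The top-k pooling layer[17] can then be formalised as follows:
$$\mathbf{X}' = (\mathbf{X} \odot \mathrm{sigmoid}(\mathbf{y}))_{\mathbf{i}}, \qquad \mathbf{A}' = \mathbf{A}_{\mathbf{i},\mathbf{i}}$$
where $\mathbf{i} = \mathrm{top}_k(\mathbf{y})$ is the subset of nodes with the top-k highest projection scores, $\odot$ denotes element-wise matrix multiplication (with $\mathrm{sigmoid}(\mathbf{y})$ broadcast across the feature columns), and $\mathrm{sigmoid}(\cdot)$ is the sigmoid function. In other words, the nodes with the top-k highest projection scores are retained in the new adjacency matrix $\mathbf{A}'$. The $\mathrm{sigmoid}(\cdot)$ operation makes the projection vector $\mathbf{p}$ trainable by backpropagation, which otherwise would produce discrete outputs.[17]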
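The selection and gating steps can be sketched as follows. This assumes node features and the adjacency matrix are stored as nested lists, uses a fixed projection vector in place of a learned one, and breaks ties by node order; the name `top_k_pool` and the toy graph are illustrative.

```python
import math


def top_k_pool(X, A, p, k):
    """Keep the k nodes with the highest normalized projection X p / ||p||.

    Retained features are gated by sigmoid(score); the adjacency matrix is
    restricted to the retained nodes.
    """
    norm = math.sqrt(sum(v * v for v in p))
    y = [sum(x_c * p_c for x_c, p_c in zip(row, p)) / norm for row in X]
    idx = sorted(range(len(X)), key=lambda n: y[n], reverse=True)[:k]
    gate = [1.0 / (1.0 + math.exp(-y[n])) for n in idx]
    X_new = [[g * v for v in X[n]] for n, g in zip(idx, gate)]
    A_new = [[A[a][b] for b in idx] for a in idx]
    return X_new, A_new, idx


X = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.0, 0.0]]  # 4 nodes, 2 features
A = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
p = [1.0, 0.0]  # hypothetical projection vector (learned in practice)

X_new, A_new, idx = top_k_pool(X, A, p, k=2)
print(idx)    # [2, 0]
print(A_new)  # [[0, 1], [1, 0]]
```

Multiplying by the sigmoid gate is what lets gradients reach `p` in a differentiable framework, since the index selection itself has no gradient.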
Self-attention pooling
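We first set
$$\mathbf{y} = \mathrm{GNN}(\mathbf{X}, \mathbf{A})$$
where $\mathrm{GNN}$ is a generic permutation equivariant GNN layer (e.g., GCN, GAT, MPNN).
The self-attention pooling layer[18] can then be formalised as follows:
$$\mathbf{X}' = (\mathbf{X} \odot \mathbf{y})_{\mathbf{i}}, \qquad \mathbf{A}' = \mathbf{A}_{\mathbf{i},\mathbf{i}}$$
where $\mathbf{i} = \mathrm{top}_k(\mathbf{y})$ is the subset of nodes with the top-k highest self-attention scores, and $\odot$ denotes element-wise matrix multiplication.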
The self-attention pooling layer can be seen as an extension of the top-k pooling layer. Differently from top-k pooling, the self-attention scores computed in self-attention pooling account both for the graph features and the graph topology.
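To illustrate how topology enters the scores, the sketch below replaces the generic GNN layer with a deliberately simple stand-in: each node averages a linear score over itself and its neighbours and applies `tanh`. That aggregation rule, the weight vector `theta`, and the function name are all assumptions made for illustration, not the layer used in the cited work.

```python
import math


def self_attention_pool(X, A, theta, k):
    """Score nodes with a one-step neighbourhood average, then keep the top k.

    Assumption: mean over {self and neighbours} followed by tanh stands in
    for a generic permutation-equivariant GNN layer.
    """
    n = len(X)
    h = [sum(x_c * t_c for x_c, t_c in zip(row, theta)) for row in X]
    y = []
    for a in range(n):
        nbrs = [a] + [b for b in range(n) if A[a][b] and b != a]
        y.append(math.tanh(sum(h[b] for b in nbrs) / len(nbrs)))
    idx = sorted(range(n), key=lambda a: y[a], reverse=True)[:k]
    X_new = [[y[a] * v for v in X[a]] for a in idx]  # gate features by score
    A_new = [[A[a][b] for b in idx] for a in idx]
    return X_new, A_new, idx


X = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.0, 0.0]]
A = [  # path graph 0 - 1 - 2 - 3
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
theta = [1.0, 0.0]  # hypothetical weights (learned in practice)

X_new, A_new, idx = self_attention_pool(X, A, theta, k=3)
print(idx)    # [3, 1, 2]
print(A_new)  # [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
```

Node 3 has all-zero features yet ranks first, because its only neighbour carries a large feature value. A score computed from each node's features alone would rank it last.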
History
In the early 20th century, neuroanatomists noticed a certain motif where multiple neurons synapse to the same neuron. This was given a functional explanation as "local pooling", which makes vision translation-invariant. Hartline (1940)[20] gave supporting evidence for the theory through electrophysiological experiments on the receptive fields of retinal ganglion cells. The Hubel and Wiesel experiments showed that the vision system in cats is similar to a convolutional neural network, with some cells summing over inputs from the lower layer.[21]: Fig. 19, 20 See Westheimer (1965)[22] for citations to this early literature.
During the 1970s, to explain the effects of depth perception, some researchers, such as Julesz and Chang (1976),[23] proposed that the vision system implements a disparity-selective mechanism by global pooling, where the outputs from matching pairs of retinal regions in the two eyes are pooled in higher-order cells. See [24] for citations to this early literature.
In artificial neural networks, max pooling was used in 1990 for speech processing (1-dimensional convolution).[25]
References
Zhang, Aston; Lipton, Zachary; Li, Mu; Smola, Alexander J. (2024). "7.5. Pooling". Dive into deep learning. Cambridge New York Port Melbourne New Delhi Singapore: Cambridge University Press. ISBN 978-1-009-38943-3.
Zhang, Aston; Lipton, Zachary; Li, Mu; Smola, Alexander J. (2024). "14.8. Region-based CNNs (R-CNNs)". Dive into deep learning. Cambridge New York Port Melbourne New Delhi Singapore: Cambridge University Press. ISBN 978-1-009-38943-3.