In computer science, an optimal binary search tree (Optimal BST), sometimes called a weight-balanced binary tree,[1] is a binary search tree which provides the smallest possible search time (or expected search time) for a given sequence of accesses (or access probabilities). Optimal BSTs are generally divided into two types: static and dynamic.
In the static optimality problem, the tree cannot be modified after it has been constructed. In this case, there exists some particular layout of the nodes of the tree which provides the smallest expected search time for the given access probabilities. Various algorithms exist to construct or approximate the statically optimal tree given the information on the access probabilities of the elements.
In the dynamic optimality problem, the tree can be modified at any time, typically by permitting tree rotations. The tree is considered to have a cursor starting at the root which it can move or use to perform modifications. In this case, there exists some minimal-cost sequence of these operations which causes the cursor to visit every node in the target access sequence in order. The splay tree is conjectured to have a constant competitive ratio compared to the dynamically optimal tree in all cases, though this has not yet been proven.
In the static optimality problem as defined by Knuth,[2] we are given a set of $n$ ordered elements and a set of $2n+1$ probabilities. We will denote the elements $a_1$ through $a_n$ and the probabilities $A_1$ through $A_n$ and $B_0$ through $B_n$. $A_i$ is the probability of a search being done for element $a_i$ (a successful search).[3] For $1 \leq i < n$, $B_i$ is the probability of a search being done for an element between $a_i$ and $a_{i+1}$ (an unsuccessful search),[3] $B_0$ is the probability of a search being done for an element strictly less than $a_1$, and $B_n$ is the probability of a search being done for an element strictly greater than $a_n$. These $2n+1$ probabilities cover all possible searches, and therefore add up to one.
The static optimality problem is the optimization problem of finding the binary search tree that minimizes the expected search time, given the $2n+1$ probabilities. As the number of possible trees on a set of $n$ elements is $\binom{2n}{n}\frac{1}{n+1}$,[2] which is exponential in $n$, brute-force search is not usually a feasible solution.
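To give a feel for how quickly the search space grows, the count above can be evaluated directly. The following is a minimal sketch; the class and method names (`BstCount`, `countTrees`) are illustrative and not taken from any cited source.

```java
import java.math.BigInteger;

class BstCount {
    // Number of distinct binary search trees on n keys: C(2n, n) / (n + 1).
    static BigInteger countTrees(int n) {
        BigInteger binom = BigInteger.ONE;
        // After step k the running value equals C(n + k, k), which is an
        // integer, so every division below is exact.
        for (int k = 1; k <= n; k++) {
            binom = binom.multiply(BigInteger.valueOf(n + k))
                         .divide(BigInteger.valueOf(k));
        }
        return binom.divide(BigInteger.valueOf(n + 1));
    }
}
```

For example, `BstCount.countTrees(3)` returns 5, the five possible shapes of a three-key binary search tree.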
In 1971, Knuth published a relatively straightforward dynamic programming algorithm capable of constructing the statically optimal tree in only $O(n^2)$ time.[2] In this work, Knuth extended and improved the dynamic programming algorithm introduced by Edgar Gilbert and Edward F. Moore in 1958.[4] Gilbert and Moore's algorithm required $O(n^3)$ time and $O(n^2)$ space and was designed for a particular case of optimal binary search tree construction (known as the optimal alphabetic tree problem[5]) that considers only the probability of unsuccessful searches, that is, $\sum_{i=1}^{n} A_i = 0$. Knuth's work relied upon the following insight: the static optimality problem exhibits optimal substructure; that is, if a certain tree is statically optimal for a given probability distribution, then its left and right subtrees must also be statically optimal for their appropriate subsets of the distribution. (The improvement from $O(n^3)$ to $O(n^2)$ rests on a separate observation, known as the monotonicity property of the roots.)
To see this, consider what Knuth calls the "weighted path length" of a tree. The weighted path length of a tree of $n$ elements is the sum of the lengths of all $2n+1$ possible search paths, weighted by their respective probabilities. The tree with the minimal weighted path length is, by definition, statically optimal.
But weighted path lengths have an interesting property. Let $E$ be the weighted path length of a binary tree, $E_L$ be the weighted path length of its left subtree, and $E_R$ be the weighted path length of its right subtree. Also let $W$ be the sum of all the probabilities in the tree. Observe that when either subtree is attached to the root, the depth of each of its elements (and thus each of its search paths) is increased by one. Also observe that the root itself has a depth of one. This means that the difference in weighted path length between a tree and its two subtrees is exactly the sum of every single probability in the tree, leading to the following recurrence:
$$E = E_L + E_R + W$$
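This recurrence leads to a natural dynamic programming solution. Let $E_{i,j}$ be the weighted path length of the statically optimal search tree for all values between $a_i$ and $a_j$, let $W_{i,j}$ be the total weight of that tree, and let $R_{i,j}$ be the index of its root. The algorithm can be built using the following formulas:
$$E_{i,i-1} = W_{i,i-1} = B_{i-1} \quad \text{for } 1 \leq i \leq n+1$$
$$W_{i,j} = W_{i,j-1} + A_j + B_j$$
$$E_{i,j} = \min_{i \leq r \leq j} \left( E_{i,r-1} + E_{r+1,j} + W_{i,j} \right) \quad \text{for } 1 \leq i \leq j \leq n$$
Here $R_{i,j}$ is a value of $r$ that attains the minimum.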
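As an illustration, the dynamic program can be sketched directly from the recurrence $E = E_L + E_R + W$. The sketch assumes the base case that an empty range between $a_{i-1}$ and $a_i$ costs $B_{i-1}$, and that `a[1..n]` holds the $A_i$ (index 0 unused) while `b[0..n]` holds the $B_i$. The class name, method name, and array layout are illustrative choices, not Knuth's own presentation.

```java
class OptimalBstCubic {
    /**
     * Straightforward O(n^3) evaluation of the recurrence.
     * a[1..n]: successful-search probabilities (a[0] is unused).
     * b[0..n]: unsuccessful-search probabilities.
     * root:    at least (n+1) x (n+1); root[i][j] receives an optimal root
     *          index for the keys i..j.
     * Returns the weighted path length of an optimal tree on all n keys.
     */
    static double solve(double[] a, double[] b, int[][] root) {
        int n = b.length - 1;
        double[][] e = new double[n + 2][n + 1]; // e[i][j]: optimal cost for keys i..j
        double[][] w = new double[n + 2][n + 1]; // w[i][j]: total weight of keys i..j
        for (int i = 1; i <= n + 1; i++) {
            e[i][i - 1] = b[i - 1];              // empty range: a single failed search
            w[i][i - 1] = b[i - 1];
        }
        for (int len = 1; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                w[i][j] = w[i][j - 1] + a[j] + b[j];
                double best = Double.POSITIVE_INFINITY;
                for (int r = i; r <= j; r++) {   // try every key as the root
                    double cost = e[i][r - 1] + e[r + 1][j];
                    if (cost < best) {
                        best = cost;
                        root[i][j] = r;
                    }
                }
                e[i][j] = best + w[i][j];
            }
        }
        return e[1][n];
    }
}
```

The three nested loops make the cubic running time visible: there are $O(n^2)$ ranges and up to $n$ candidate roots per range.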
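The naive implementation of this algorithm actually takes $O(n^3)$ time, but Knuth's paper includes some additional observations which can be used to produce a modified algorithm taking only $O(n^2)$ time. The key observation is that the optimal roots are monotone: an optimal root for the range from $a_i$ to $a_j$ can always be found between the optimal roots of the two ranges obtained by dropping the last and the first element, so only those candidates need to be tried.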
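A sketch of the faster variant follows. It assumes the same recurrence and array layout as the straightforward dynamic program (`a[1..n]` for successful searches, `b[0..n]` for unsuccessful ones) and restricts the candidate roots for keys $i..j$ to the interval between the roots already found for $i..j-1$ and $i+1..j$. The names are illustrative, and the code is a sketch of the idea rather than Knuth's original formulation.

```java
class KnuthQuadraticBst {
    /**
     * O(n^2) variant: candidate roots for keys i..j are limited to
     * root[i][j-1] .. root[i+1][j].
     * a[1..n]: successful-search probabilities (a[0] is unused).
     * b[0..n]: unsuccessful-search probabilities.
     * root:    at least (n+1) x (n+1), filled in by this method.
     */
    static double solve(double[] a, double[] b, int[][] root) {
        int n = b.length - 1;
        double[][] e = new double[n + 2][n + 1];
        double[][] w = new double[n + 2][n + 1];
        for (int i = 1; i <= n + 1; i++) {
            e[i][i - 1] = b[i - 1];
            w[i][i - 1] = b[i - 1];
        }
        for (int len = 1; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                w[i][j] = w[i][j - 1] + a[j] + b[j];
                // A single key is its own root; otherwise use the narrowed window.
                int lo = (len == 1) ? i : root[i][j - 1];
                int hi = (len == 1) ? i : root[i + 1][j];
                hi = Math.max(hi, lo); // defensive: floating-point ties could invert the window
                double best = Double.POSITIVE_INFINITY;
                for (int r = lo; r <= hi; r++) {
                    double cost = e[i][r - 1] + e[r + 1][j];
                    if (cost < best) {
                        best = cost;
                        root[i][j] = r;
                    }
                }
                e[i][j] = best + w[i][j];
            }
        }
        return e[1][n];
    }
}
```

Along each diagonal of the table the candidate windows overlap only at their endpoints, so their total length is $O(n)$; with $n$ diagonals this gives $O(n^2)$ work overall.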
In addition to his dynamic programming algorithm, Knuth proposed two heuristics (or rules) for producing nearly optimal binary search trees, that is, trees that approximate the optimum. Studying nearly optimal binary search trees was necessary since the time and space complexity of Knuth's algorithm can be prohibitive when $n$ is large.[6]
Knuth's rules can be stated as follows:
Rule I (Root-max): Place the most frequently occurring name at the root of the tree, then proceed similarly on the subtrees.
Rule II (Bisection): Choose the root so as to equalize the total weight of the left and right subtrees as much as possible, then proceed similarly on the subtrees.
Knuth's heuristics construct nearly optimal binary search trees in $O(n \log n)$ time and $O(n)$ space. How far from the optimum the trees produced by these heuristics can be was later analyzed by Kurt Mehlhorn.[6]
While the $O(n^2)$ time taken by Knuth's algorithm is substantially better than the exponential time required for a brute-force search, it is still too slow to be practical when the number of elements in the tree is very large.
In 1975, Kurt Mehlhorn published a paper proving important properties regarding Knuth's rules. Mehlhorn's major results state that only one of Knuth's heuristics (Rule II) always produces nearly optimal binary search trees. On the other hand, the root-max rule could often lead to very "bad" search trees based on the following simple argument.[6]
Let
$$n = 2^k - 1, \qquad A_i = 2^{-k} + \varepsilon_i \quad \text{with} \quad \sum_{i=1}^{n} \varepsilon_i = 2^{-k}$$
and
$$\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n > 0 \quad \text{for } 1 \leq i \leq n \quad \text{and} \quad B_j = 0 \quad \text{for } 0 \leq j \leq n.$$
Considering the weighted path length $P$ of the tree constructed based on the previous definition (where, in this formula, $a_i$ and $b_j$ denote the depths of the node for the $i$-th element and of the $j$-th leaf), we have the following:
$$\begin{aligned} P &= \sum_{i=1}^{n} A_i (a_i + 1) + \sum_{j=1}^{n} B_j b_j \\ &= \sum_{i=1}^{n} A_i \, i \\ &\geq 2^{-k} \sum_{i=1}^{n} i = 2^{-k} \frac{n(n+1)}{2} \geq \frac{n}{2}. \end{aligned}$$
Thus, the tree produced by the root-max rule grows only on the right side (except for the deepest level of the tree), and the left side always has terminal nodes. This tree has a path length bounded by $\Omega\left(\frac{n}{2}\right)$ and, when compared with a balanced search tree (with path length bounded by $O(2 \log n)$), will perform substantially worse for the same frequency distribution.[6]
In addition, Mehlhorn improved Knuth's work and introduced a much simpler algorithm that uses Rule II and closely approximates the performance of the statically optimal tree in only $O(n)$ time.[6] The algorithm follows the same idea as the bisection rule: it chooses the tree's root so as to balance the total weight (by probability) of the left and right subtrees as closely as possible. The strategy is then applied recursively on each subtree.
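The balancing idea can be sketched as follows. This is a simplified illustration, not Mehlhorn's linear-time algorithm: it considers only key weights (all unsuccessful-search probabilities are assumed to be zero), uses 0-based key indices, and locates each split with a binary search over prefix sums, which gives $O(n \log n)$ time overall. All names are hypothetical.

```java
class BisectionBst {
    static final class Node {
        final int key;      // index of the key stored at this node
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Builds a tree over keys 0..weights.length-1 from nonnegative weights. */
    static Node build(double[] weights) {
        double[] prefix = new double[weights.length + 1];
        for (int k = 0; k < weights.length; k++) {
            prefix[k + 1] = prefix[k] + weights[k];
        }
        return build(prefix, 0, weights.length - 1);
    }

    private static Node build(double[] prefix, int lo, int hi) {
        if (lo > hi) return null;
        // imbalance(r) never decreases as r moves right, so binary search
        // finds the first root whose left side is at least as heavy as its right.
        int a = lo, b = hi;
        while (a < b) {
            int mid = (a + b) >>> 1;
            if (imbalance(prefix, lo, hi, mid) >= 0) b = mid; else a = mid + 1;
        }
        int r = a;
        // The best split is either this crossing point or its left neighbour.
        if (r > lo && Math.abs(imbalance(prefix, lo, hi, r - 1))
                      <= Math.abs(imbalance(prefix, lo, hi, r))) {
            r--;
        }
        Node node = new Node(r);
        node.left = build(prefix, lo, r - 1);
        node.right = build(prefix, r + 1, hi);
        return node;
    }

    /** Weight left of root r minus weight right of root r, within lo..hi. */
    private static double imbalance(double[] prefix, int lo, int hi, int r) {
        return (prefix[r] - prefix[lo]) - (prefix[hi + 1] - prefix[r + 1]);
    }
}
```

The weights need not be normalized. For weights `{1, 2, 4, 2, 1}` the sketch places key 2 at the root, since it splits the remaining weight into 3 on each side.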
That this strategy produces a good approximation can be seen intuitively by noting that the weights of the subtrees along any path form something very close to a geometrically decreasing sequence. In fact, this strategy generates a tree whose weighted path length is at most
$$2 + \frac{H}{1 - \log(\sqrt{5} - 1)}$$
where $H$ is the entropy of the probability distribution and logarithms are taken to base 2. Since no optimal binary search tree can ever do better than a weighted path length of
$$\frac{H}{\log 3}$$
this approximation is very close.[6]
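To make the gap concrete, the two bounds can be evaluated numerically. The sketch below assumes the bounds in the form usually attributed to Mehlhorn, an upper bound of $2 + H / (1 - \log_2(\sqrt{5} - 1))$ and a lower bound of $H / \log_2 3$, with $H$ the base-2 entropy of all $2n+1$ probabilities. The class and method names are illustrative.

```java
class EntropyBounds {
    /** Base-2 entropy of a probability distribution; zero entries contribute nothing. */
    static double entropy(double[] probabilities) {
        double h = 0.0;
        for (double p : probabilities) {
            if (p > 0.0) h -= p * log2(p);
        }
        return h;
    }

    /** Assumed lower bound on the weighted path length of any search tree. */
    static double lowerBound(double h) {
        return h / log2(3.0);
    }

    /** Assumed upper bound for the tree built by the balancing strategy. */
    static double upperBound(double h) {
        return 2.0 + h / (1.0 - log2(Math.sqrt(5.0) - 1.0));
    }

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }
}
```

The factor multiplying $H$ in the upper bound is about 1.44, while the factor in the lower bound is about 0.63, so under these assumptions the heuristic is within a small constant factor plus an additive 2 of the optimum.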
In the special case that all of the $A_i$ values are zero, the optimal tree can be found in time $O(n \log n)$. This was first proved by T. C. Hu and Alan Tucker in a paper that they published in 1971. A later simplification by Garsia and Wachs, the Garsia–Wachs algorithm, performs the same comparisons in the same order. The algorithm works by using a greedy algorithm to build a tree that has the optimal height for each leaf, but is out of order, and then constructing another binary search tree with the same heights.[7]
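The following code snippet computes the expected search cost of an optimal binary search tree, and records the chosen root for every range of keys, when given a set of keys and, for each key, the probability that it is the search key:
    public static float calculateOptimalSearchTree(int numNodes, float[] probabilities, int[][] roots) {
        // probabilities[1..numNodes] are the key probabilities; index 0 is unused.
        // roots must have at least numNodes + 1 rows and columns.
        float[][] costMatrix = new float[numNodes + 2][numNodes + 1];
        for (int i = 1; i <= numNodes; i++) {
            costMatrix[i][i - 1] = 0;               // empty range costs nothing
            costMatrix[i][i] = probabilities[i];    // single key sits at depth one
            roots[i][i] = i;
            roots[i][i - 1] = 0;
        }
        for (int diagonal = 1; diagonal < numNodes; diagonal++) {
            for (int i = 1; i <= numNodes - diagonal; i++) {
                int j = i + diagonal;
                // Total probability of keys i..j: every key moves one level deeper.
                float weight = 0;
                for (int k = i; k <= j; k++) {
                    weight += probabilities[k];
                }
                // Try each key r in i..j as the root and keep the cheapest split.
                float best = Float.MAX_VALUE;
                for (int r = i; r <= j; r++) {
                    float cost = costMatrix[i][r - 1] + costMatrix[r + 1][j];
                    if (cost < best) {
                        best = cost;
                        roots[i][j] = r;
                    }
                }
                costMatrix[i][j] = best + weight;
            }
        }
        return costMatrix[1][numNodes];
    }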
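Once a table of roots is available, the tree itself can be rebuilt recursively. The sketch below assumes a `roots` table in which `roots[i][j]` holds the index of the root chosen for keys `i..j` (1-based), which the listing above must actually record for this to work. The `Node` type and `build` method are hypothetical helpers.

```java
class TreeFromRoots {
    static final class Node {
        final int key;      // 1-based index of the key at this node
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Rebuilds the subtree for keys i..j from a filled-in roots table. */
    static Node build(int[][] roots, int i, int j) {
        if (i > j) return null;                 // empty range: no node
        int r = roots[i][j];
        if (r < i || r > j) {
            throw new IllegalStateException(
                "roots[" + i + "][" + j + "] was not recorded");
        }
        Node node = new Node(r);
        node.left = build(roots, i, r - 1);     // keys smaller than the root
        node.right = build(roots, r + 1, j);    // keys larger than the root
        return node;
    }
}
```

Calling `TreeFromRoots.build(roots, 1, numNodes)` would return the root of the whole tree. The range check guards against a table whose entries were never filled in.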
There are several different definitions of dynamic optimality, all of which are effectively equivalent to within a constant factor in terms of running-time.[8] The problem was first introduced implicitly by Sleator and Tarjan in their paper on splay trees,[9] but Demaine et al. give a very good formal statement of it.[8]
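In the dynamic optimality problem, we are given a sequence of accesses $x_1, \dots, x_m$ on the keys $1, \dots, n$. For each access, we are given a pointer to the root of our BST and may use the pointer to perform any of the following operations:
1. Move the pointer to the left child of the current node.
2. Move the pointer to the right child of the current node.
3. Move the pointer to the parent of the current node.
4. Perform a single rotation on the current node and its parent.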
(It is the presence of the fourth operation, which rearranges the tree during the accesses, which makes this the dynamic optimality problem.)
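The cost model can be made concrete with a small sketch. It assumes the four unit-cost operations of this model are: move the pointer to the left child, to the right child, or to the parent, and rotate the current node with its parent. The class `BstCursor` and its members are illustrative names only.

```java
class BstCursor {
    static final class Node {
        final int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;
    Node cursor;   // the pointer of the model; starts at the root
    long cost;     // number of unit-cost operations performed so far

    BstCursor(Node root) {
        this.root = root;
        this.cursor = root;
    }

    void moveLeft()   { cursor = require(cursor.left);   cost++; }
    void moveRight()  { cursor = require(cursor.right);  cost++; }
    void moveParent() { cursor = require(cursor.parent); cost++; }

    /** Single rotation that lifts the current node above its parent. */
    void rotateWithParent() {
        Node x = cursor;
        Node p = require(x.parent);
        Node g = p.parent;
        if (p.left == x) {                 // right rotation
            p.left = x.right;
            if (x.right != null) x.right.parent = p;
            x.right = p;
        } else {                           // left rotation
            p.right = x.left;
            if (x.left != null) x.left.parent = p;
            x.left = p;
        }
        p.parent = x;
        x.parent = g;
        if (g == null) root = x;           // x has become the new root
        else if (g.left == p) g.left = x;
        else g.right = x;
        cost++;
    }

    private static Node require(Node n) {
        if (n == null) throw new IllegalStateException("no such node");
        return n;
    }
}
```

Serving an access then amounts to calling these methods until `cursor.key` equals the requested key, and the value of `cost` after the whole sequence is the quantity an algorithm in this model tries to minimize.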
For each access, our BST algorithm may perform any sequence of the above operations as long as the pointer eventually ends up on the node containing the target value xi. The time it takes a given dynamic BST algorithm to perform a sequence of accesses is equivalent to the total number of such operations performed during that sequence. Given any sequence of accesses on any set of elements, there is some minimum total number of operations required to perform those accesses. We would like to come close to this minimum.
While it is impossible to implement this "God's algorithm" without foreknowledge of exactly what the access sequence will be, we can define OPT(X) as the number of operations it would perform for an access sequence X, and we can say that an algorithm is dynamically optimal if, for any X, it performs X in time O(OPT(X)) (that is, it has a constant competitive ratio).[8]
There are several data structures conjectured to have this property, but none proven. It is an open problem whether there exists a dynamically optimal data structure in this model.
The splay tree is a form of binary search tree invented in 1985 by Daniel Sleator and Robert Tarjan on which the standard search tree operations run in $O(\log n)$ amortized time.[10] It is conjectured to be dynamically optimal in the required sense. That is, a splay tree is believed to perform any sufficiently long access sequence $X$ in time $O(\operatorname{OPT}(X))$.[9]
The tango tree is a data structure proposed in 2004 by Erik D. Demaine, Dion Harmon, John Iacono, and Mihai Pătrașcu which has been proven to perform any sufficiently long access sequence $X$ in time $O(\log \log n \cdot \operatorname{OPT}(X))$. While this is not dynamically optimal, the competitive ratio of $\log \log n$ is still very small for reasonable values of $n$.[8]
In 2013, John Iacono published a paper which uses the geometry of binary search trees to provide an algorithm which is dynamically optimal if any binary search tree algorithm is dynamically optimal.[11] Nodes are interpreted as points in two dimensions, and the optimal access sequence is the smallest arborally satisfied superset of those points. Unlike splay trees and tango trees, Iacono's data structure is not known to be implementable in constant time per access sequence step, so even if it is dynamically optimal, it could still be slower than other search tree data structures by a non-constant factor.
The interleave lower bound is an asymptotic lower bound on dynamic optimality.