This article is about a specific type of tree data structure. For tree data structures generally, see tree (data structure). For other uses of "tree", see tree (disambiguation).
In computer science, a trie (/ˈtraɪ/, /ˈtriː/), also known as a digital tree or prefix tree,[1] is a specialized search tree data structure used to store and retrieve strings from a dictionary or set. Unlike a binary search tree, nodes in a trie do not store their associated key. Instead, each node's position within the trie determines its associated key, with the connections between nodes defined by individual characters rather than the entire key.
Tries are particularly effective for tasks such as autocomplete, spell checking, and IP routing, offering advantages over hash tables due to their prefix-based organization and lack of hash collisions. Every child node shares a common prefix with its parent node, and the root node represents the empty string. While basic trie implementations can be memory-intensive, various optimization techniques such as compression and bitwise representations have been developed to improve their efficiency. A notable optimization is the radix tree, which provides more efficient prefix-based storage.
While tries commonly store character strings, they can be adapted to work with any ordered sequence of elements, such as permutations of digits or shapes. A notable variant is the bitwise trie, which uses individual bits from fixed-length binary data (such as integers or memory addresses) as keys.
History, etymology, and pronunciation
The idea of a trie for representing a set of strings was first abstractly described by Axel Thue in 1912.[2][3] Tries were first described in a computer context by René de la Briandais in 1959.[4][3][5]: 336
The idea was independently described in 1960 by Edward Fredkin,[6] who coined the term trie, pronouncing it /ˈtriː/ (as "tree"), after the middle syllable of retrieval.[7][8] However, other authors pronounce it /ˈtraɪ/ (as "try"), in an attempt to distinguish it verbally from "tree".[7][8][3]
Overview
Tries are a form of string-indexed look-up data structure, which is used to store a dictionary list of words that can be searched on in a manner that allows for efficient generation of completion lists.[9][10]: 1 A prefix trie is an ordered tree data structure used in the representation of a set of strings over a finite alphabet set, which allows efficient storage of words with common prefixes.[1]
Tries support various operations: insertion, deletion, and lookup of a string key. Tries are composed of nodes that contain links, which either point to other suffix child nodes or null. As for every tree, each node but the root is pointed to by only one other node, called its parent. Each node contains as many links as the number of characters in the applicable alphabet (although tries tend to have a substantial number of null links). In some cases, the alphabet used is simply that of the character encoding—resulting in, for example, a size of 256 in the case of (unsigned) ASCII.[14]: 732
The null links within the children of a node emphasize the following characteristics:[14]: 734 [5]: 336
Characters and string keys are implicitly stored in the trie, and include a character sentinel value indicating string termination.
Each node contains one possible link to a prefix of string keys of the set.
A basic structure type for nodes in the trie is as follows; Node may contain an optional Value, which is associated with each key stored in the last character of the string, or terminal node.
structure Node
    Children Node[Alphabet-Size]
    Is-Terminal Boolean
    Value Data-Type
end structure
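As a rough illustration, the node structure above could be written in Python as follows. The names ALPHABET_SIZE, children, is_terminal, and value are choices made for this sketch, and the fixed size of 256 assumes one slot per byte value.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

ALPHABET_SIZE = 256  # assumption for this sketch: one slot per byte value


@dataclass
class Node:
    # One link per alphabet symbol; None plays the role of a null link.
    children: List[Optional["Node"]] = field(
        default_factory=lambda: [None] * ALPHABET_SIZE
    )
    is_terminal: bool = False  # True if a stored key ends at this node
    value: Any = None          # optional payload associated with that key


# A root with a single child reached through the symbol "a".
root = Node()
root.children[ord("a")] = Node(is_terminal=True, value=1)
```

Most of the 256 slots stay empty in practice, which is the memory cost that the later implementation strategies try to reduce.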
Searching
Searching for a value in a trie is guided by the characters in the search string key, as each node in the trie contains a corresponding link to each possible character in the given string. Thus, following the string within the trie yields the associated value for the given string key. A null link encountered during the search indicates that the key does not exist.[14]: 732-733
The following pseudocode implements the search procedure for a given string key in a rooted trie x.[15]: 135
Trie-Find(x, key)
    for 0 ≤ i < key.length do
        if x.Children[key[i]] = nil then
            return false
        end if
        x := x.Children[key[i]]
    repeat
    return x.Value
In the above pseudocode, x and key correspond to the pointer to the trie's root node and the string key respectively. The search operation, in a standard trie, takes O(dm) time, where m is the size of the string parameter key, and d corresponds to the alphabet size.[16]: 754 Binary search trees, on the other hand, take O(m log n) in the worst case, since the search depends on the height of the tree (log n) of the BST (in case of balanced trees), where n and m are the number of keys and the length of the keys.[12]: 358
The trie occupies less space in comparison with a BST in the case of a large number of short strings, since nodes share common initial string subsequences and store the keys implicitly.[12]: 358 The terminal node of the tree contains a non-null value, and it is a search hit if the associated value is found in the trie, and search miss if it is not.[14]: 733
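A minimal Python sketch of the search procedure might look like this. It assumes a dictionary of children per node instead of a fixed-size array, and, unlike the pseudocode, it also checks the terminal flag so that a proper prefix of a stored key counts as a miss. The class and function names are illustrative only.

```python
class Node:
    def __init__(self):
        self.children = {}        # symbol -> Node; a missing entry is a null link
        self.is_terminal = False
        self.value = None


def trie_find(root, key):
    """Return the value stored for key, or None on a search miss."""
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:          # null link: the key is not in the trie
            return None
    return node.value if node.is_terminal else None


# Build a tiny trie by hand that holds the single key "to" with value 7.
root = Node()
t = root.children.setdefault("t", Node())
o = t.children.setdefault("o", Node())
o.is_terminal, o.value = True, 7
```

The loop performs one child lookup per character of the key, so the work depends on the key length rather than on the number of stored keys.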
Insertion
Insertion into a trie is guided by using the characters of the string key as indexes into the children array until the last character of the key is reached.[14]: 733-734 Each node in the trie corresponds to one call of the radix sorting routine, as the trie structure reflects the execution pattern of the top-down radix sort.[15]: 135
1  Trie-Insert(x, key, value)
2      for 0 ≤ i < key.length do
3          if x.Children[key[i]] = nil then
4              x.Children[key[i]] := Node()
5          end if
6          x := x.Children[key[i]]
7      repeat
8      x.Value := value
9      x.Is-Terminal := True
If a null link is encountered prior to reaching the last character of the string key, a new node is created (lines 3 and 4).[14]: 745 The input value is assigned to the terminal node; therefore, if that node already held a non-null value at the time of insertion, it is replaced with the new value.
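The insertion procedure can be sketched in Python along the same lines. As an assumption of this sketch, children are kept in a dictionary, so a null link is simply a missing dictionary entry; the names are illustrative.

```python
class Node:
    def __init__(self):
        self.children = {}        # symbol -> Node
        self.is_terminal = False
        self.value = None


def trie_insert(root, key, value):
    """Insert key with value, creating missing nodes along the path."""
    node = root
    for ch in key:
        if ch not in node.children:      # null link: create the node
            node.children[ch] = Node()
        node = node.children[ch]
    node.value = value                   # overwrites any earlier value
    node.is_terminal = True


root = Node()
trie_insert(root, "tea", 1)
trie_insert(root, "ten", 2)
trie_insert(root, "tea", 3)   # replaces the value stored for "tea"
e_node = root.children["t"].children["e"]
```

The keys "tea" and "ten" share the nodes for "t" and "e", so only their last characters need separate nodes.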
Deletion
Deletion of a key–value pair from a trie involves finding the terminal node with the corresponding string key and setting its terminal indicator and value to false and null respectively.[14]: 740
The following is a recursive procedure for removing a string key from a rooted trie (x).
1   Trie-Delete(x, key)
2       if key = nil then
3           if x.Is-Terminal = True then
4               x.Is-Terminal := False
5               x.Value := nil
6           end if
7           for 0 ≤ i < x.Children.length
8               if x.Children[i] != nil
9                   return x
10              end if
11          repeat
12          return nil
13      end if
14      x.Children[key[0]] := Trie-Delete(x.Children[key[0]], key[1:])
15      return x
The procedure begins by examining the key; null denotes the arrival of a terminal node or end of a string key. If the node is terminal and has no children, it is removed from the trie (line 14). However, reaching the end of the string key at a node that is not terminal indicates that the key does not exist, so the procedure does not modify the trie. The recursion proceeds by incrementing the key's index.
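A hedged Python sketch of recursive deletion is given below. It is a variant rather than a transcription of the pseudocode: it passes a depth index instead of slicing the key, tolerates keys that are not present, and also prunes ancestors that are left with no children and no key of their own. All names are illustrative.

```python
class Node:
    def __init__(self):
        self.children = {}
        self.is_terminal = False
        self.value = None


def trie_insert(root, key, value):
    node = root
    for ch in key:
        node = node.children.setdefault(ch, Node())
    node.is_terminal, node.value = True, value


def trie_delete(node, key, depth=0):
    """Remove key below node; return node, or None if it can be discarded."""
    if node is None:                      # path ended early: key not present
        return None
    if depth == len(key):                 # end of key: clear the terminal mark
        node.is_terminal = False
        node.value = None
    else:
        ch = key[depth]
        child = trie_delete(node.children.get(ch), key, depth + 1)
        if child is None:
            node.children.pop(ch, None)   # unlink the removed subtree
        else:
            node.children[ch] = child
    # Keep the node only if it still leads somewhere or ends another key.
    return node if (node.children or node.is_terminal) else None


root = Node()
trie_insert(root, "tea", 1)
trie_insert(root, "ten", 2)
root = trie_delete(root, "tea") or Node()
after_first = sorted(root.children["t"].children["e"].children)
root = trie_delete(root, "ten") or Node()   # the trie is now empty
```

Because an emptied root is reported as None, the caller substitutes a fresh node, which is why the usage lines end with `or Node()`.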
Replacing other data structures
Replacement for hash tables
A trie can be used to replace a hash table, over which it has the following advantages:[12]: 358
Searching for a node with an associated key of size m has the complexity of O(m), whereas an imperfect hash function may have numerous colliding keys, and the worst-case lookup speed of such a table would be O(N), where N denotes the total number of nodes within the table.
Tries do not need a hash function for the operation, unlike a hash table; there are also no collisions of different keys in a trie.
Buckets in a trie, which are analogous to hash table buckets that store key collisions, are necessary only if a single key is associated with more than one value.
String keys within the trie can be sorted using a predetermined alphabetical ordering.
However, tries are less efficient than a hash table when the data is directly accessed on a secondary storage device such as a hard disk drive that has higher random access time than the main memory.[6] Tries are also disadvantageous when the key value cannot be easily represented as a string, such as floating-point numbers where multiple representations are possible (e.g. 1 is equivalent to 1.0, +1.0, 1.00, etc.);[12]: 359 however, such a number can be unambiguously represented as a binary number in IEEE 754, in comparison to two's complement format.[17]
Implementation strategies
Tries can be represented in several ways, corresponding to different trade-offs between memory use and speed of the operations.[5]: 341 Using a vector of pointers for representing a trie consumes enormous space; however, memory space can be reduced at the expense of running time if a singly linked list is used for each node vector, as most entries of the vector contain nil.[3]: 495
Techniques such as alphabet reduction may alleviate the high space complexity by reinterpreting the original string as a longer string over a smaller alphabet; for example, a string of n bytes can alternatively be regarded as a string of 2n four-bit units and stored in a trie with sixteen pointers per node. However, lookups need to visit twice as many nodes in the worst case, although space requirements go down by a factor of eight.[5]: 347–352 Other techniques include storing a vector of 256 ASCII pointers as a bitmap of 256 bits representing the ASCII alphabet, which reduces the size of individual nodes dramatically.[18]
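The alphabet reduction described above can be illustrated with a small Python sketch that re-encodes a byte string as four-bit symbols. The function names to_nibbles and from_nibbles are hypothetical, and the choice of emitting the high half of each byte first is an assumption of this example.

```python
from typing import List


def to_nibbles(data: bytes) -> List[int]:
    """Re-encode n bytes as 2n four-bit symbols (high half of each byte first)."""
    symbols = []
    for byte in data:
        symbols.append(byte >> 4)      # upper four bits
        symbols.append(byte & 0x0F)    # lower four bits
    return symbols


def from_nibbles(symbols: List[int]) -> bytes:
    """Inverse of to_nibbles: recombine pairs of four-bit symbols into bytes."""
    return bytes(
        (symbols[i] << 4) | symbols[i + 1] for i in range(0, len(symbols), 2)
    )


key = to_nibbles(b"trie")   # 8 symbols, each in range(16)
```

A trie keyed on these symbols needs only sixteen links per node, at the cost of paths that are twice as long.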
Bitwise tries are used to address the enormous space requirement for the trie nodes in a naive pointer-vector implementation. Each character in the string key set is represented via individual bits, which are used to traverse the trie over a string key. Implementations of this type of trie use vectorized CPU instructions to find the first set bit in a fixed-length key input (e.g. GCC's __builtin_clz() intrinsic function). Accordingly, the set bit is used to index the first item, or child node, in the 32- or 64-entry based bitwise tree. Search then proceeds by testing each subsequent bit in the key.[19]
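The following Python sketch shows one way such a bitwise trie could be organized for 32-bit unsigned keys. It is an illustration under stated assumptions, not a reference implementation: Python's int.bit_length() stands in for a count-leading-zeros intrinsic, the top-level table has one extra slot for the key 0, and the class names are invented for the example.

```python
WIDTH = 32  # assumption: keys are integers with 0 <= key < 2**32


class BitNode:
    __slots__ = ("child", "value")

    def __init__(self):
        self.child = [None, None]   # links for bit 0 and bit 1
        self.value = None


class BitwiseTrie:
    def __init__(self):
        # Indexed by the position of the highest set bit (0 means key == 0).
        self.top = [None] * (WIDTH + 1)

    def _walk(self, key, create):
        top_bit = key.bit_length()          # plays the role of WIDTH - clz(key)
        node = self.top[top_bit]
        if node is None:
            if not create:
                return None
            node = self.top[top_bit] = BitNode()
        # Test each bit below the leading 1, most significant first.
        for pos in range(top_bit - 2, -1, -1):
            bit = (key >> pos) & 1
            nxt = node.child[bit]
            if nxt is None:
                if not create:
                    return None
                nxt = node.child[bit] = BitNode()
            node = nxt
        return node

    def insert(self, key, value):
        self._walk(key, True).value = value

    def find(self, key):
        node = self._walk(key, False)
        return None if node is None else node.value


trie = BitwiseTrie()
trie.insert(5, "five")
trie.insert(0, "zero")
trie.insert(2**32 - 1, "max")
```

Indexing the first level by the highest set bit skips all leading zero bits in one step, so small keys sit on short paths.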
A radix tree, also known as a compressed trie, is a space-optimized variant of a trie in which any node with only one child gets merged with its parent; elimination of branches of the nodes with a single child results in better metrics in both space and time.[20][21]: 452 This works best when the trie remains static and the set of keys stored is very sparse within its representation space.[22]: 3–16
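The merging step can be sketched in Python by first building an ordinary trie as nested dictionaries and then collapsing chains of single-child nodes. This is a simplified, static construction for illustration; the END marker (an empty-string key meaning that a stored key ends at the node) and the function names are assumptions of the sketch.

```python
END = ""  # hypothetical marker: its presence means a stored key ends here


def build_trie(words):
    """Plain trie as nested dicts, one character per edge."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def compress(node):
    """Merge each non-terminal node that has a single child into its parent edge."""
    out = {}
    for label, child in node.items():
        if label == END:
            out[END] = {}
            continue
        # Extend the edge label while the child has exactly one outgoing
        # edge and no stored key ends at it.
        while len(child) == 1 and END not in child:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        out[label] = compress(child)
    return out


radix = compress(build_trie(["in", "integer", "interval", "string", "structure"]))
```

For this key set the root ends up with just two edges, labelled "in" and "str", instead of a chain of single-character nodes.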
Another approach is to "pack" the trie. A space-efficient implementation of a sparse packed trie has been applied to automatic hyphenation, in which the descendants of each node may be interleaved in memory.[8]
Patricia trees
Patricia tree representation of the string set {in, integer, interval, string, structure}.
Patricia trees are a particular implementation of the compressed binary trie that uses the binary encoding of the string keys in its representation.[23][15]: 140 Every node in a Patricia tree contains an index, known as a "skip number", that stores the node's branching index to avoid empty subtrees during traversal.[15]: 140-141 A naive implementation of a trie consumes immense storage due to the larger number of leaf nodes caused by a sparse distribution of keys; Patricia trees can be efficient for such cases.[15]: 142 [24]: 3
A representation of a Patricia tree is shown in the figure. Each index value adjacent to the nodes represents the "skip number", that is, the index of the bit with which branching is to be decided.[24]: 3 The skip number 1 at node 0 corresponds to the position 1 in the binary encoded ASCII where the leftmost bit differed among the keys.[24]: 3-4 The skip number is crucial for search, insertion, and deletion of nodes in the Patricia tree, and a bit masking operation is performed during every iteration.[15]: 143
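The bit-level operations behind a skip number can be illustrated with two small Python helpers: one finds the first bit position at which two keys differ, the other reads the bit at a given position. The indexing convention used here (bit 0 is the most significant bit of the first byte) is an assumption of this sketch and may differ from the numbering used in the figure; the function names are hypothetical.

```python
def first_differing_bit(a: bytes, b: bytes) -> int:
    """Index of the first bit where a and b differ, or -1 if none is found
    within their common length (bit 0 = most significant bit of byte 0)."""
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            # bit_length() locates the highest set bit of the XOR.
            return i * 8 + (8 - diff.bit_length())
    return -1


def bit_at(key: bytes, index: int) -> int:
    """Return the bit of key at the given index, using a shift and a mask."""
    byte = key[index // 8]
    return (byte >> (7 - index % 8)) & 1


branch = first_differing_bit(b"in", b"string")   # compares 'i' with 's'
```

During a search, only the bit selected by each node's stored index is inspected, and the full key is compared once at the end.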
Applications
Trie data structures are commonly used in predictive text or autocomplete dictionaries, and approximate matching algorithms.[11] Tries enable faster searches and occupy less space, especially when the set contains a large number of short strings, and are therefore used in spell checking, hyphenation applications and longest prefix match algorithms.[8][12]: 358 However, if storing dictionary words is all that is required (i.e. there is no need to store metadata associated with each word), a minimal deterministic acyclic finite state automaton (DAFSA) or radix tree would use less storage space than a trie. This is because DAFSAs and radix trees can compress identical branches from the trie which correspond to the same suffixes (or parts) of different words being stored. String dictionaries are also utilized in natural language processing, such as finding the lexicon of a text corpus.[25]: 73
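As a small illustration of the autocomplete use case, the Python sketch below collects every stored word that starts with a given prefix. It assumes a nested-dictionary trie with an empty-string END marker; both the marker and the function names are choices made for this example.

```python
END = ""  # hypothetical marker key: a stored word ends at this node


def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def completions(root, prefix):
    """Return all stored words that start with prefix, in sorted order."""
    node = root
    for ch in prefix:                 # walk down to the node for the prefix
        node = node.get(ch)
        if node is None:
            return []
    results = []

    def collect(current, word):
        for label in sorted(current):
            if label == END:
                results.append(word)
            else:
                collect(current[label], word + label)

    collect(node, prefix)
    return results


trie = build_trie(["in", "integer", "interval", "string", "structure"])
suggestions = completions(trie, "int")
```

Only the subtree below the prefix node is visited, so the cost depends on the number of matches rather than on the size of the whole dictionary.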
Sorting
Lexicographic sorting of a set of string keys can be implemented by building a trie for the given keys and traversing the tree in pre-order fashion;[26] this is also a form of radix sort.[27] Tries are also fundamental data structures for burstsort, which is notable for being the fastest string sorting algorithm as of 2007,[28] accomplished by its efficient use of CPU cache.[29]
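The pre-order idea can be sketched in Python as follows. The sketch assumes a nested-dictionary trie in which an empty-string key (named END here) stores how many times a word was inserted, so duplicates are preserved; it is meant as an illustration, not as a competitive sorting routine.

```python
END = ""  # hypothetical marker key holding the number of copies of a word


def trie_sort(words):
    """Sort strings by inserting them into a trie and walking it in pre-order."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = node.get(END, 0) + 1

    ordered = []

    def visit(node, prefix):
        # Emit the word ending here before descending (pre-order).
        ordered.extend([prefix] * node.get(END, 0))
        for ch in sorted(label for label in node if label != END):
            visit(node[ch], prefix + ch)

    visit(root, "")
    return ordered


result = trie_sort(["tea", "ten", "to", "a", "tea", "i"])
```

Visiting the children of each node in symbol order is what makes the traversal order coincide with lexicographic order.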
Full-text search
A special kind of trie, called a suffix tree, can be used to index all suffixes in a text to carry out fast full-text searches.[30]
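To make the idea concrete, the Python sketch below builds an uncompressed suffix trie, which is a simpler relative of the suffix tree: it inserts every suffix of the text character by character and therefore takes quadratic time and space, whereas practical suffix trees are compressed and built with specialized algorithms. The function names are illustrative.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a nested-dict trie (quadratic size)."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(root, pattern):
    """True if pattern occurs in the indexed text: every substring is a
    prefix of some suffix, so a successful walk from the root is enough."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


index = build_suffix_trie("banana")
```

Once the index is built, each query costs time proportional to the pattern length, independent of the length of the text.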
Web search engines
A specialized kind of trie, called a compressed trie, is used in web search engines for storing the indexes, which are a collection of all searchable words.[31] Each terminal node is associated with a list of URLs, called an occurrence list, to pages that match the keyword. The trie is stored in the main memory, whereas the occurrence lists are kept in external storage, frequently in large clusters, or the in-memory index points to documents stored in an external location.[32]
Bioinformatics
Tries are used in bioinformatics, notably in sequence alignment software applications such as BLAST, which indexes all the different substrings of length k (called k-mers) of a text by storing the positions of their occurrences in sequence databases in a compressed trie.[25]: 75
Franklin Mark Liang (1983). Word Hy-phen-a-tion By Com-put-er (PDF) (Doctor of Philosophy thesis). Stanford University. Archived (PDF) from the original on 2005-11-11. Retrieved 2010-03-28.
"Trie". School of Arts and Science, Rutgers University. 2022. Archived from the original on 17 April 2022. Retrieved 17 April 2022.
Bellekens, Xavier (2014). "A Highly-Efficient Memory-Compression Scheme for GPU-Accelerated Intrusion Detection Systems". Proceedings of the 7th International Conference on Security of Information and Networks - SIN '14. Glasgow, Scotland, UK: ACM. pp. 302:302–302:309. arXiv:1704.02272. doi:10.1145/2659651.2659723. ISBN 978-1-4503-3033-6. S2CID 12943246.
Kärkkäinen, Juha. "Lecture 2" (PDF). University of Helsinki. The preorder of the nodes in a trie is the same as the lexicographical order of the strings they represent assuming the children of a node are ordered by the edge labels.
J. Kärkkäinen and T. Rantala (2008). "Engineering Radix Sort for Strings". In A. Amir and A. Turpin and A. Moffat (ed.). String Processing and Information Retrieval, Proc. SPIRE. Lecture Notes in Computer Science. Vol. 5280. Springer. pp. 3–14. doi:10.1007/978-3-540-89097-3_3. ISBN 978-3-540-89096-6.