
Automatic taxonomy construction (ATC) is the use of software programs to generate taxonomical classifications from a body of texts called a corpus. ATC is a branch of natural language processing, which in turn is a branch of artificial intelligence.

A taxonomy (or taxonomical classification) is a scheme of classification, especially a hierarchical classification, in which things are organized into groups or types.^[1]^[2]^[3]^[4]^[5]^[6] Among other things, a taxonomy can be used to organize and index knowledge (stored as documents, articles, videos, etc.), for example as a library classification system or a search engine taxonomy, so that users can more easily find the information they are searching for. Many taxonomies are hierarchies (and thus have an intrinsic tree structure), but not all are.

Manually developing and maintaining a taxonomy is a labor-intensive task requiring significant time and resources, including familiarity with or expertise in the taxonomy's domain (scope, subject, or field), which drives up costs and limits the scope of such projects. Also, domain modelers have their own points of view, which inevitably, even if unintentionally, work their way into the taxonomy. ATC uses artificial intelligence techniques to generate a taxonomy for a domain quickly and automatically, in order to avoid these problems and remove these limitations.

Approaches

There are several approaches to ATC. One approach is to use rules to detect patterns in the corpus and use those patterns to infer relations such as hyponymy. Other approaches use machine learning techniques such as Bayesian inference and artificial neural networks.^[7]

Keyword extraction

One approach to building a taxonomy is to automatically gather the keywords from a domain using keyword extraction, then analyze the relationships between them (see Hyponymy, below), and then arrange them as a taxonomy based on those relationships.
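The first step of this pipeline can be illustrated with a minimal sketch. It assumes a very simple notion of "keyword" (frequent non-stopword tokens); the function name `extract_keywords`, the stopword list, and the length filter are illustrative assumptions, not part of any particular ATC system, which would typically use more robust statistical or linguistic methods.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "in", "is", "are",
             "to", "for", "that", "on", "with", "as", "by"}


def extract_keywords(documents, top_n=5):
    """Return the top_n most frequent non-stopword tokens in the corpus."""
    counts = Counter()
    for doc in documents:
        # Lowercase and keep alphabetic runs only.
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]


corpus = [
    "The dog is a mammal.",
    "A cat is a mammal.",
    "The dog chased the cat.",
]
keywords = extract_keywords(corpus, top_n=3)
print(keywords)
```

The extracted keywords would then be passed to a relation-discovery step, which decides how they are arranged in the taxonomy.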

Hyponymy and "is-a" relations

In ATC programs, one of the most important tasks is the discovery of hypernym and hyponym relations among words. One way to do that from a body of text is to search for certain phrases like "is a" and "such as".
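As a rough sketch of this phrase-based search, the following uses two regular expressions, one for "is a" and one for "such as". The patterns, the function name `extract_is_a_pairs`, and the single-word matching are simplifying assumptions; practical systems use many more patterns and usually operate on parsed noun phrases, with lemmatization so that "dogs" and "dog" are treated as the same term.

```python
import re

# "X is a Y" / "An X is an Y": X is the hyponym, Y the hypernym.
IS_A = re.compile(r"\b(?:an? )?(\w+) is an? (\w+)", re.IGNORECASE)

# "Y such as X1, X2 and X3": Y is the hypernym, each Xi a hyponym.
SUCH_AS = re.compile(
    r"\b(\w+) such as (\w+(?:(?:, and |, or |, | and | or )\w+)*)",
    re.IGNORECASE,
)


def extract_is_a_pairs(text):
    """Return a set of (hyponym, hypernym) pairs found in the text."""
    pairs = set()
    for hypo, hyper in IS_A.findall(text):
        pairs.add((hypo.lower(), hyper.lower()))
    for hyper, items in SUCH_AS.findall(text):
        # Split the enumeration "x1, x2 and x3" into individual terms.
        for hypo in re.split(r",? and |,? or |, ", items):
            pairs.add((hypo.lower(), hyper.lower()))
    return pairs


text = "A dog is a mammal. Mammals such as dogs, cats and whales nurse their young."
pairs = extract_is_a_pairs(text)
print(sorted(pairs))
```

Because the sketch does no lemmatization, the singular and plural forms ("mammal" and "mammals") come out as distinct terms; merging them is left to a later normalization step.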

In linguistics, is-a relations are called hyponymy. Words that describe categories are called hypernyms and words that are examples of categories are hyponyms. For example, dog is a hypernym and Fido is one of its hyponyms. A word can be both a hyponym and a hypernym. So, dog is a hyponym of mammal and also a hypernym of Fido.

Taxonomies are often represented as is-a hierarchies where each level is more specific than (in mathematical language, "a subset of") the level above it. For example, a basic biology taxonomy would have concepts such as mammal, which is a subset of animal, and dogs and cats, which are subsets of mammal. This kind of taxonomy is called an is-a model because the specific objects are considered instances of a concept. For example, Fido is-a dog (an instance of the concept dog) and Fluffy is-a cat.^[8]
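The biology example above can be written down as a small data structure. The following is a minimal sketch that assumes a strict tree, where each term has at most one direct parent and there are no cycles; the names `PARENT`, `ancestors`, and `is_a` are illustrative.

```python
# Each concept or instance points to its direct parent (its hypernym).
PARENT = {
    "mammal": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "Fido": "dog",
    "Fluffy": "cat",
}


def ancestors(term, parent=PARENT):
    """Walk up the hierarchy and return all categories above a term."""
    chain = []
    while term in parent:  # assumes the hierarchy has no cycles
        term = parent[term]
        chain.append(term)
    return chain


def is_a(term, category, parent=PARENT):
    """True if term is-a category, directly or through intermediate levels."""
    return category in ancestors(term, parent)


print(ancestors("Fido"))
print(is_a("Fluffy", "animal"))
print(is_a("Fido", "cat"))
```

Storing only the direct parent keeps the structure compact, and the is-a relation between distant levels (Fido is-a animal) follows by walking up the chain.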

Applications

ATC can be used to build taxonomies for search engines, to improve search results.

ATC systems are a key component of ontology learning (also known as automatic ontology construction), and have been used to automatically generate large ontologies for domains such as insurance and finance. They have also been used to enhance existing large networks such as WordNet to make them more complete and consistent.^[9]^[10]^[11]

ATC software

Other names

Other names for automatic taxonomy construction include:

Automated outline building
Automated outline construction
Automated outline creation
Automated outline extraction
Automated outline generation
Automated outline induction
Automated outline learning
Automated outlining
Automated taxonomy building
Automated taxonomy construction
Automated taxonomy creation
Automated taxonomy extraction
Automated taxonomy generation
Automated taxonomy induction
Automated taxonomy learning
Automatic outline building
Automatic outline construction
Automatic outline creation
Automatic outline extraction
Automatic outline generation
Automatic outline induction
Automatic outline learning
Automatic taxonomy building
Automatic taxonomy creation
Automatic taxonomy extraction
Automatic taxonomy generation
Automatic taxonomy induction
Automatic taxonomy learning
Outline automation
Outline building
Outline construction
Outline creation
Outline extraction
Outline generation
Outline induction
Outline learning
Semantic taxonomy building
Semantic taxonomy construction
Semantic taxonomy creation
Semantic taxonomy extraction
Semantic taxonomy generation
Semantic taxonomy induction
Semantic taxonomy learning
Taxonomy automation
Taxonomy building
Taxonomy construction
Taxonomy creation
Taxonomy extraction
Taxonomy generation
Taxonomy induction
Taxonomy learning

References

[1] "Taxonomy". 10 October 2021.
[2] "Taxonomy Definition & Meaning". Dictionary.com. Retrieved 2022-05-13.
[3] "What is Taxonomy?". 14 August 2017.
[4] "TAXONOMY | Meaning & Definition for UK English". Lexico.com. Archived from the original on March 2, 2021. Retrieved 2022-05-13.
[5] "What is Taxonomy?". 20 August 2003.
[6] "TAXONOMY (Noun) definition and synonyms | Macmillan Dictionary".
[7] Neshati, Mahmood; Alijamaat, Ali; Abolhassani, Hassan; Rahimi, Afshin; Hoseini, Mehdi (2007). "Taxonomy Learning Using Compound Similarity Measure". IEEE/WIC/ACM International Conference on Web Intelligence (WI'07). pp. 487–490. doi:10.1109/WI.2007.135. ISBN 978-0-7695-3026-0. S2CID 14206314.
[8] Brachman, Ronald (October 1983). "What IS-A is and isn't. An Analysis of Taxonomic Links in Semantic Networks". IEEE Computer. 16 (10): 30–36. doi:10.1109/MC.1983.1654194. OSTI 5363562. S2CID 16650410.
[9] Velardi, Paola; Faralli, Stefano; Navigli, Roberto (10 October 2012). "OntoLearn Reloaded: A Graph-based Algorithm for Taxonomy Induction". Computational Linguistics. Association for Computational Linguistics. CiteSeerX 10.1.1.278.5674.
[10] Liu, Xueqing; Song, Yangqiu; Liu, Shixia; Wang, Haixun (12–16 August 2012). "Automatic taxonomy construction from keywords". Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (PDF). ACM. p. 1433. doi:10.1145/2339530.2339754. ISBN 9781450314626. S2CID 9100603. Retrieved 7 March 2017.
[11] Snow, Rion; Jurafsky, Daniel; Ng, Andrew. "Semantic Taxonomy Induction from Heterogenous Evidence" (PDF). Stanford University. Retrieved 8 March 2017.

External links

Taxonomy 101: The Basics and Getting Started with Taxonomies – shows where ATC fits into the general activity of managing taxonomies for a business enterprise in need of knowledge management.
