A symbolic linguistic representation is a representation of an utterance that uses symbols to encode linguistic information about it, such as its phonetics, phonology, morphology, syntax, or semantics. Such representations differ from non-symbolic ones, such as recordings, in that they encode linguistic information with symbols rather than with measurements.
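As an illustration, the levels named above can be held together in a small data structure. The following is a minimal sketch, not a standard format: the class name `SymbolicUtterance`, its field names, and the example analysis of "cats sleep" are assumptions chosen for illustration, with the syntax written in Penn Treebank-style bracketing.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SymbolicUtterance:
    """Hypothetical container for several symbolic levels of one utterance."""
    orthography: str                   # written form
    phonemes: List[str]                # phonemic segments (IPA symbols)
    morphemes: List[Tuple[str, ...]]   # one tuple of morphemes per word
    syntax: str                        # bracketed constituency tree


# Illustrative analysis only; a different theory could segment or label it differently.
utterance = SymbolicUtterance(
    orthography="cats sleep",
    phonemes=["k", "æ", "t", "s", "s", "l", "iː", "p"],
    morphemes=[("cat", "-s"), ("sleep",)],
    syntax="(S (NP (NNS cats)) (VP (VBP sleep)))",
)

print(utterance.morphemes[0])
```

Every field holds discrete symbols; a recording of the same utterance would instead be a sequence of sampled amplitude values.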
In natural language processing, linguistic representations, such as syntactic representations, have long been used to improve the output of information retrieval systems, such as search engines, and of machine translation systems.[3] More recently, in span-based neural constituency parsing, lexical items begin as wordpiece tokens or BPE tokens before being transformed into several other representations: word vectors (word encoder), terminal nodes (span vectors, fenceposts), non-terminal nodes (span classifier), and a parse tree (neural CKY). It has been suggested that the mapping from terminals to non-terminals learns which constructions the language permits.[4]
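The final step of that pipeline, turning per-span label scores into a tree, can be sketched with a CKY-style dynamic program over fencepost indices. This is a simplified illustration under stated assumptions, not the implementation of any particular parser: the function name `cky_decode` is hypothetical, the scores are hand-written toy numbers standing in for the output of a neural span classifier, and the label `X` stands in for a "no constituent" label.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (i, j): fenceposts delimiting words[i:j]


def cky_decode(words: List[str], span_scores: Dict[Span, Dict[str, float]]):
    """Return (score, tree) for the highest-scoring binary tree.

    span_scores[(i, j)] maps each candidate label to a score. In a neural
    parser these scores would come from a span classifier; here the caller
    supplies them directly.
    """
    n = len(words)
    best = {}  # (i, j) -> (total score, best label, split point or None)
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Best label for this span, chosen independently of its children.
            label, label_score = max(span_scores[(i, j)].items(),
                                     key=lambda kv: kv[1])
            if length == 1:
                best[(i, j)] = (label_score, label, None)
            else:
                # Best split point k, using already-computed sub-span scores.
                split, split_score = max(
                    ((k, best[(i, k)][0] + best[(k, j)][0])
                     for k in range(i + 1, j)),
                    key=lambda ks: ks[1],
                )
                best[(i, j)] = (label_score + split_score, label, split)

    def build(i: int, j: int):
        _, label, split = best[(i, j)]
        if split is None:
            return (label, words[i])
        return (label, build(i, split), build(split, j))

    return best[(0, n)][0], build(0, n)


# Toy scores (assumed values, not model output).
toy_words = ["the", "cat", "sleeps"]
toy_scores = {
    (0, 1): {"DT": 1.0},
    (1, 2): {"NN": 1.0},
    (2, 3): {"VP": 1.0},
    (0, 2): {"NP": 2.0, "X": 0.0},
    (1, 3): {"X": -1.0},
    (0, 3): {"S": 3.0},
}
score, tree = cky_decode(toy_words, toy_scores)
print(score, tree)
```

With these toy scores the decoder prefers grouping "the cat" as an `NP` before attaching "sleeps", because the span (0, 2) scores higher than the span (1, 3).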
Sells, Peter (1985). Lectures on Contemporary Syntactic Theories: An Introduction to Government-Binding Theory, Generalized Phrase Structure Grammar, and Lexical-Functional Grammar. CSLI.
Pustejovsky, James (1995). The Generative Lexicon. MIT Press. ISBN 9780262661409.
Watanabe et al. (2000). Improving Natural Language Processing by Linguistic Document Annotation. In Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content, pages 20–27, Centre Universitaire, Luxembourg. International Committee on Computational Linguistics.
Jurafsky, Daniel; Martin, James H. (2024). Speech and Language Processing. Draft of February 3, 2024.
US patent 10133724, Sean L. Bethard; Edward G. Katz & Christopher Phipps, "Syntactic classification of natural language sentences with respect to a targeted element", published 2018-11-20, assigned to International Business Machines Corp