Symbolic linguistic representation

A symbolic linguistic representation is a representation of an utterance that uses symbols to represent linguistic information about the utterance, such as information about phonetics, phonology, morphology, syntax, or semantics. Symbolic linguistic representations are different from non-symbolic representations, such as recordings, because they use symbols to represent linguistic information rather than measurements.

Symbolic representations are widely used in linguistics. In syntactic representations, atomic category symbols often refer to the syntactic category of a lexical item. Examples include lexical categories such as auxiliary verbs (INFL),[1] phrasal categories such as relative clauses (SRel), and empty categories such as wh-traces (tWH).[US patent 10133724] In some formalisms, such as Lexical Functional Grammar, these symbols can refer to both grammatical functions and values of grammatical categories. In linguistics, empty categories are conventionally represented with the symbol ∅.
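The idea of a syntactic representation built from atomic category symbols can be illustrated with a minimal Python sketch. The Node class, the helper functions, and the example sentence are hypothetical and chosen only for illustration; the analysis shown is not taken from any of the cited sources, and only the label INFL comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A tree node labelled with an atomic category symbol."""
    category: str  # e.g. "NP" or "INFL"
    children: List[Union["Node", str]] = field(default_factory=list)


def to_brackets(node: Node) -> str:
    """Render the tree in labelled-bracket notation."""
    parts = [c if isinstance(c, str) else to_brackets(c) for c in node.children]
    return "(" + " ".join([node.category] + parts) + ")"


def categories(node: Node) -> List[str]:
    """Collect the category symbols in the tree, parents before children."""
    found = [node.category]
    for child in node.children:
        if isinstance(child, Node):
            found.extend(categories(child))
    return found


# Illustrative analysis of "the dog will bark" (an assumption, not a cited example).
tree = Node("S", [
    Node("NP", [Node("Det", ["the"]), Node("N", ["dog"])]),
    Node("INFL", ["will"]),
    Node("VP", [Node("V", ["bark"])]),
])

print(to_brackets(tree))
# (S (NP (Det the) (N dog)) (INFL will) (VP (V bark)))
```

Here every label is an uninterpreted symbol: nothing in the structure measures the acoustic signal, which is the contrast with non-symbolic representations such as recordings.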

Symbolic representations also appear in phonetic transcription and in descriptions of phonological processes, and include trochees, phonemes, morphophonemes, natural classes, semantic features such as animacy, and the qualia structures of Generative Lexicon Theory.[2]

In natural language processing, linguistic representations, such as syntactic representations, have long been used to improve the output of information retrieval systems, such as search engines, and machine translation systems.[3] More recently, in span-based neural constituency parsing, lexical items begin as wordpiece or BPE tokens and are then transformed into several other representations: word vectors (word encoder), terminal nodes (span vectors, fenceposts), non-terminal nodes (span classifier), and a parse tree (neural CKY). It has been suggested that the mapping from terminals to non-terminals learns which constructions the language permits.[4]
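The final step of such a pipeline, choosing a parse tree from scored spans, can be sketched as a CKY-style chart search over fencepost positions. The following Python sketch is a simplification under stated assumptions: the function names and the hand-written scores are hypothetical stand-ins for the output of a neural span classifier, every span is forced to take a label, and unary chains and the empty label used by real parsers are omitted.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]                      # fencepost positions (i, j), i < j
Cell = Tuple[float, str, Optional[int]]     # (total score, label, split point)


def cky_decode(n: int, span_scores: Dict[Span, Dict[str, float]]) -> Dict[Span, Cell]:
    """Fill a chart with the best-scoring binary tree over each span."""
    best: Dict[Span, Cell] = {}
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Pick the highest-scoring label for this span.
            label, score = max(span_scores[(i, j)].items(), key=lambda kv: kv[1])
            if length == 1:
                best[(i, j)] = (score, label, None)
            else:
                # Pick the split whose two sub-trees score best together.
                split = max(range(i + 1, j),
                            key=lambda k: best[(i, k)][0] + best[(k, j)][0])
                total = score + best[(i, split)][0] + best[(split, j)][0]
                best[(i, j)] = (total, label, split)
    return best


def build(best: Dict[Span, Cell], words: List[str], i: int, j: int) -> str:
    """Read the best tree out of the chart as a bracketed string."""
    _, label, split = best[(i, j)]
    if split is None:
        return f"({label} {words[i]})"
    return f"({label} {build(best, words, i, split)} {build(best, words, split, j)})"


# Hypothetical scores; a real parser would compute these from span vectors.
words = ["the", "dog", "barks"]
scores = {
    (0, 1): {"Det": 2.0, "N": 0.1},
    (1, 2): {"N": 2.0, "V": 0.5},
    (2, 3): {"V": 2.0, "N": 0.3},
    (0, 2): {"NP": 1.5, "VP": -1.0},
    (1, 3): {"NP": -1.0, "VP": -0.5},
    (0, 3): {"S": 1.0, "NP": -2.0},
}
best = cky_decode(len(words), scores)
print(build(best, words, 0, 3))
# (S (NP (Det the) (N dog)) (V barks))
```

In this toy example the span (0, 2) outscores (1, 3), so the search groups "the dog" before attaching "barks", which mirrors how span scores determine the bracketing.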

Symbolic linguistic representations are frequently used in computational linguistics.

Other representations in linguistics that are not symbols or measurements include rules and rankings.

Notes

  1. Sells 1985, p. 20.
  2. Pustejovsky 1995.
  3. Watanabe et al. 2000.
  4. Jurafsky & Martin 2024, section 17.7, p. 17.

References

  • Sells, Peter (1985). Lectures on Contemporary Syntactic Theories: An Introduction to Government-Binding Theory, Generalized Phrase Structure Grammar, and Lexical-Functional Grammar. CSLI.
  • Pustejovsky, James (1995). The Generative Lexicon. MIT Press. ISBN 9780262661409.
  • Watanabe et al. (2000). Improving Natural Language Processing by Linguistic Document Annotation. In Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content, pages 20–27, Centre Universitaire, Luxembourg. International Committee on Computational Linguistics.
  • Jurafsky, Daniel; Martin, James H. (2024). Speech and Language Processing. Draft of February 3, 2024. https://web.stanford.edu/~jurafsky/slp3/17.pdf#section.17.7

  • US patent 10133724, Sean L. Bethard; Edward G. Katz & Christopher Phipps, "Syntactic classification of natural language sentences with respect to a targeted element", published 2018-11-20, assigned to International Business Machines Corp.