Indexed languages are a class of formal languages discovered by Alfred Aho ;[ 1] they are described by indexed grammars and can be recognized by nested stack automata .[ 2]
Indexed languages are a proper subset of context-sensitive languages .[ 1] They qualify as an abstract family of languages (furthermore a full AFL) and hence satisfy many closure properties. However, they are not closed under intersection or complement.[ 1]
The class of indexed languages has practical importance in natural language processing as a computationally affordable [citation needed ] generalization of context-free languages , since indexed grammars can describe many of the nonlocal constraints occurring in natural languages.
Gerald Gazdar (1988)[ 3] and Vijay-Shanker (1987)[ 4] introduced a mildly context-sensitive language class now known as linear indexed grammars (LIG).[ 5] Linear indexed grammars have additional restrictions relative to IG. LIGs are weakly equivalent (generate the same language class) as tree adjoining grammars .[ 6]
Examples
The following languages are indexed, but are not context-free:
{
a
n
b
n
c
n
d
n
|
n
≥
1
}
{\displaystyle \{a^{n}b^{n}c^{n}d^{n}|n\geq 1\}}
[ 3]
{
a
n
b
m
c
n
d
m
|
m
,
n
≥
0
}
{\displaystyle \{a^{n}b^{m}c^{n}d^{m}|m,n\geq 0\}}
[ 2]
These two languages are also indexed, but are not even mildly context sensitive under Gazdar's characterization:
{
a
2
n
|
n
≥
0
}
{\displaystyle \{a^{2^{n}}|n\geq 0\}}
[ 2]
{
w
w
w
|
w
∈
{
a
,
b
}
+
}
{\displaystyle \{www|w\in \{a,b\}^{+}\}}
[ 3]
On the other hand, the following language is not indexed:[ 7]
{
(
a
b
n
)
n
|
n
≥
0
}
{\displaystyle \{(ab^{n})^{n}|n\geq 0\}}
Properties
Hopcroft and Ullman tend to consider indexed languages as a "natural" class, since they are generated by several formalisms, such as:[ 9]
Hayashi[ 14] generalized the pumping lemma to indexed grammars.
Conversely, Gilman[ 7] gives a "shrinking lemma" for indexed languages.
See also
References
^ a b c d Aho, Alfred (1968). "Indexed grammars—an extension of context-free grammars" . Journal of the ACM . 15 (4): 647– 671. doi :10.1145/321479.321488 . S2CID 9539666 .
^ a b c Partee, Barbara ; ter Meulen, Alice ; Wall, Robert E. (1990). Mathematical Methods in Linguistics . Kluwer Academic Publishers. pp. 536– 542. ISBN 978-90-277-2245-4 .
^ a b c Gazdar, Gerald (1988). "Applicability of Indexed Grammars to Natural Languages". In Reyle, U.; Rohrer, C. (eds.). Natural Language Parsing and Linguistic Theories . Studies in Linguistics and Philosophy. Vol. 35. Springer Netherlands. pp. 69– 94. doi :10.1007/978-94-009-1337-0_3 . ISBN 978-94-009-1337-0 .
^ Vijayashanker, K. (1987). A study of tree adjoining grammars (Thesis). ProQuest 303610666 .
^ Kallmeyer, Laura (2010). Parsing Beyond Context-Free Grammars . Springer. p. 31. ISBN 978-3-642-14846-0 .
^ Kallmeyer, Laura (16 August 2010). Parsing Beyond Context-Free Grammars . Springer. p. 32. ISBN 978-3-642-14846-0 .
^ a b Gilman, Robert H. (1996). "A Shrinking Lemma for Indexed Languages". Theoretical Computer Science . 163 (1– 2): 277– 281. arXiv :math/9509205 . doi :10.1016/0304-3975(96)00244-7 . S2CID 14479068 .
^ Hopcroft, John ; Ullman, Jeffrey (1979). Introduction to automata theory, languages, and computation . Addison-Wesley. p. 390. ISBN 978-0-201-02988-8 .
^ Introduction to automata theory, languages, and computation,[ 8] Bibliographic notes, p.394-395
^ Aho, Alfred V. (July 1969). "Nested Stack Automata" . Journal of the ACM . 16 (3): 383– 406. doi :10.1145/321526.321529 . S2CID 685569 .
^ Fischer, Michael J. (October 1968). "Grammars with macro-like productions". 9th Annual Symposium on Switching and Automata Theory (Swat 1968) . pp. 131– 142. doi :10.1109/SWAT.1968.12 .
^ Greibach, Sheila A. (March 1970). "Full AFLs and nested iterated substitution". Information and Control . 16 (1): 7– 35. doi :10.1016/s0019-9958(70)80039-0 .
^ Maibaum, T.S.E. (June 1974). "A generalized approach to formal languages" . Journal of Computer and System Sciences . 8 (3): 409– 439. doi :10.1016/s0022-0000(74)80031-0 .
^ Hayashi, Takeshi (1973). "On derivation trees of indexed grammars: an extension of the {$uvwxy$}-theorem" . Publications of the Research Institute for Mathematical Sciences . 9 (1): 61– 92. doi :10.2977/prims/1195192738 .
External links
Each category of languages, except those marked by a * , is a proper subset of the category directly above it. Any language in each category is generated by a grammar and by an automaton in the category in the same line.