Definite clause grammar

A definite clause grammar (DCG) is a way of expressing grammar, either for natural or formal languages, in a logic programming language such as Prolog. It is closely related to the concept of attribute grammars / affix grammars. DCGs are usually associated with Prolog, but similar languages such as Mercury also include DCGs. They are called definite clause grammars because they represent a grammar as a set of definite clauses in first-order logic.

The term DCG refers to the specific type of expression in Prolog and other similar languages; not all ways of expressing grammars using definite clauses are considered DCGs. However, all of the capabilities or properties of DCGs will be the same for any grammar that is represented with definite clauses in essentially the same way as in Prolog.

The definite clauses of a DCG can be considered a set of axioms where the validity of a sentence, and the fact that it has a certain parse tree can be considered theorems that follow from these axioms.[1] This has the advantage of making it so that recognition and parsing of expressions in a language becomes a general matter of proving statements, such as statements in a logic programming language.

History

The history of DCGs is closely tied to the history of Prolog, and the history of Prolog revolves around several researchers in both Marseille, France, and Edinburgh, Scotland. According to Robert Kowalski, an early developer of Prolog, the first Prolog system was developed in 1972 by Alain Colmerauer and Phillipe Roussel.[2] The first program written in the language was a large natural-language processing system. Fernando Pereira and David Warren at the University of Edinburgh were also involved in the early development of Prolog.

Colmerauer had previously worked on a language processing system called Q-systems that was used to translate between English and French.[3] In 1978, Colmerauer wrote a paper about a way of representing grammars called metamorphosis grammars which were part of the early version of Prolog called Marseille Prolog. In this paper, he gave a formal description of metamorphosis grammars and some examples of programs that use them.

Fernando Pereira and David Warren, two other early architects of Prolog, coined the term "definite clause grammar" and created the notation for DCGs that is used in Prolog today. They gave credit for the idea to Colmerauer and Kowalski, and they note that DCGs are a special case of Colmerauer's metamorphosis grammars. They introduced the idea in an article called "Definite Clause Grammars for Language Analysis", where they describe DCGs as a "formalism ... in which grammars are expressed clauses of first-order predicate logic" that "constitute effective programs of the programming language Prolog".[4]

Pereira, Warren, and other pioneers of Prolog later wrote about several other aspects of DCGs. Pereira and Warren wrote an article called "Parsing as Deduction", describing things such as how the Earley Deduction proof procedure is used for parsing.[5] Pereira also collaborated with Stuart M. Shieber on a book called "Prolog and Natural Language Analysis", that was intended as a general introduction to computational linguistics using logic programming.[6]

Example

A basic example of DCGs helps to illustrate what they are and what they look like.

 sentence --> noun_phrase, verb_phrase.
 noun_phrase --> det, noun.
 verb_phrase --> verb, noun_phrase.
 det --> [the].
 det --> [a].
 noun --> [cat].
 noun --> [bat].
 verb --> [eats].

This generates sentences such as "the cat eats the bat", "a bat eats the cat". One can generate all of the valid expressions in the language generated by this grammar at a Prolog interpreter by typing sentence(X,[]). Similarly, one can test whether a sentence is valid in the language by typing something like sentence([the,bat,eats,the,bat],[]).

Translation into definite clauses

DCG notation is just syntactic sugar for normal definite clauses in Prolog. For example, the previous example could be translated into the following:

 sentence(A,Z) :- noun_phrase(A,B), verb_phrase(B,Z).
 noun_phrase(A,Z) :- det(A,B), noun(B,Z).
 verb_phrase(A,Z) :- verb(A,B), noun_phrase(B,Z).
 det([the|X], X).
 det([a|X], X).
 noun([cat|X], X).
 noun([bat|X], X).
 verb([eats|X], X).

Difference lists

The arguments to each functor, such as (A,B) and (B,Z) are difference lists; difference lists are a way of representing a prefix of a list as the difference between its two suffixes (the bigger including the smaller one). Using Prolog's notation for lists, a singleton list prefix P = [H] can be seen as the difference between [H|X] and X, and thus represented with the pair ([H|X],X), for instance.

Saying that P is the difference between A and B is the same as saying that append(P,B,A) holds. Or in the case of the previous example, append([H],X,[H|X]).

Difference lists are used to represent lists with DCGs for reasons of efficiency. It is much more efficient to concatenate list differences (prefixes), in the circumstances that they can be used, because the concatenation of (A,B) and (B,Z) is just (A,Z).[7]

Indeed, append(P,B,A), append(Q,Z,B) entails append(P,Q,S), append(S,Z,A). This is the same as saying that list concatenation is associative:

A = P + B = P + (Q + Z) = (P + Q) + Z = S + Z = A

Non-context-free grammars

In pure Prolog, normal DCG rules with no extra arguments on the functors, such as the previous example, can only express context-free grammars; there is only one argument on the left side of the production. However, context-sensitive grammars can also be expressed with DCGs, by providing extra arguments, such as in the following example:

 s --> a(N), b(N), c(N).
 a(0) --> [].
 a(M) --> [a], a(N), {M is N + 1}.
 b(0) --> [].
 b(M) --> [b], b(N), {M is N + 1}.
 c(0) --> [].
 c(M) --> [c], c(N), {M is N + 1}.

This set of DCG rules describes the grammar which generates the language that consists of strings of the form .[8]

 s --> symbols(Sem,a), symbols(Sem,b), symbols(Sem,c).
 symbols(end,_) --> [].
 symbols(s(Sem),S) --> [S], symbols(Sem,S).

This set of DCG rules describes the grammar which generates the language that consists of strings of the form , by structurally representing n [citation needed]

Representing features

Various linguistic features can also be represented fairly concisely with DCGs by providing extra arguments to the functors.[9] For example, consider the following set of DCG rules:

 sentence --> pronoun(subject), verb_phrase.
 verb_phrase --> verb, pronoun(object).
 pronoun(subject) --> [he].
 pronoun(subject) --> [she].
 pronoun(object) --> [him].
 pronoun(object) --> [her].
 verb --> [likes].

This grammar allows sentences like "he likes her" and "he likes him", but not "her likes he" and "him likes him".

Parsing with DCGs

An example parse tree for this grammar.

The main practical use of a DCG is to parse sentences of the given grammar, i.e. to construct a parse tree. This can be done by providing "extra arguments" to the functors in the DCG, like in the following rules:

 sentence(s(NP,VP)) --> noun_phrase(NP), verb_phrase(VP).
 noun_phrase(np(D,N)) --> det(D), noun(N).
 verb_phrase(vp(V,NP)) --> verb(V), noun_phrase(NP).
 det(d(the)) --> [the].
 det(d(a)) --> [a].
 noun(n(bat)) --> [bat].
 noun(n(cat)) --> [cat].
 verb(v(eats)) --> [eats].

One can now query the interpreter to yield a parse tree of any given sentence:

 | ?- sentence(Parse_tree, [the,bat,eats,a,cat], []).
 Parse_tree = s(np(d(the),n(bat)),vp(v(eats),np(d(a),n(cat)))) ? ;

Other uses

DCGs can serve as a convenient syntactic sugar to hide certain parameters in code in other places besides parsing applications. In the declaratively pure programming language Mercury I/O must be represented by a pair of io.state arguments. DCG notation can be used to make using I/O more convenient,[10] although state variable notation is usually preferred.[11] DCG notation is also used for parsing and similar things in Mercury as it is in Prolog.

Extensions

Since DCGs were introduced by Pereira and Warren, several extensions have been proposed. Pereira himself proposed an extension called extraposition grammars (XGs).[12] This formalism was intended in part to make it easier to express certain grammatical phenomena, such as left-extraposition. Pereira states, "The difference between XG rules and DCG rules is then that the left-hand side of an XG rule may contain several symbols." This makes it easier to express rules for context-sensitive grammars.

Peter Van Roy extended DCGs to allow multiple accumulators.[13][14]

Another, more recent, extension was made by researchers at NEC Corporation called Multi-Modal Definite Clause Grammars (MM-DCGs) in 1995. Their extensions were intended to allow the recognizing and parsing expressions that include non-textual parts such as pictures.[15]

Another extension, called definite clause translation grammars (DCTGs) was described by Harvey Abramson in 1984.[16] DCTG notation looks very similar to DCG notation; the major difference is that one uses ::= instead of --> in the rules. It was devised to handle grammatical attributes conveniently.[17] The translation of DCTGs into normal Prolog clauses is like that of DCGs, but 3 arguments are added instead of 2.

See also

Notes

  1. ^ Johnson, M. (1994). "Two ways of formalizing grammars". Linguistics and Philosophy. 17 (3): 221–240. doi:10.1007/BF00985036. S2CID 62165766.
  2. ^ Kowalski, Robert A. (January 1988). "The Early Years of Logic Programming" (PDF). Communications of the ACM. 31 (1): 38–43. doi:10.1145/35043.35046. S2CID 12259230.
  3. ^ Colmerauer, A. (1978). "Metamorphosis grammars". Natural Language Communication with Computers. Lecture Notes in Computer Science. Vol. 63. pp. 133–189. doi:10.1007/BFb0031371. ISBN 3-540-08911-X.
  4. ^ Pereira, Fernando C. N.; Warren, David H. D. (1980). "Definite Clause Grammars for Language Analysis—A Survey of the Formalism and a Comparison with Augmented Transition Networks" (PDF). Artificial Intelligence. 13 (3): 231–278. doi:10.1016/0004-3702(80)90003-X.
  5. ^ Pereira, F. C. N.; D. H. D. Warren (1983). "Parsing as deduction" (PDF). Proceedings of the 21st annual meeting on Association for Computational Linguistics. Association for Computational Linguistics Morristown, NJ, USA. pp. 137–144.
  6. ^ Pereira, F. C. N.; S. M. Shieber (2002). Prolog and natural-language analysis. Microtome Publishing. ISBN 9780971977709.
  7. ^ Fleck, Arthur. "Definite Clause Grammar Translation". Retrieved 2009-04-16.
  8. ^ Fisher, J. R. "Prolog Tutorial -- 7.1". Retrieved 2016-05-17.
  9. ^ "DCGs give us a Natural Notation for Features". Retrieved 2009-04-21.
  10. ^ "Prolog to Mercury Transition Guide: Input/Output". Retrieved 2015-03-26.
  11. ^ "The Mercury Language Reference Manual: DCG-rules". www.mercurylang.org. Retrieved 2023-04-07.
  12. ^ Pereira, F. (1981). "Extraposition grammars" (PDF). Computational Linguistics. 7 (4): 243–256.
  13. ^ Van Roy, Peter (1990). "Extended DCG Notation: A Tool for Applicative Programming in Prolog". UCB Technical Report. 90 (583).
  14. ^ Source code is available at [1].
  15. ^ Shimazu, H.; Y. Takashima (1995). "Multimodal definite clause grammar" (PDF). Systems and Computers in Japan. 26 (3): 93–102. doi:10.1002/scj.4690260309. S2CID 8199913.
  16. ^ Abramson, H. (April 1984). Definite clause translation grammars (PDF) (Technical report). 84-3.
  17. ^ Sperberg-McQueen, C. M. "A brief introduction to definite clause grammars and definite clause translation grammars". Retrieved 2009-04-21.

Read other articles:

Université de Pau et des pays de l’Adour (UPPA)Logo de l'université de Pau et des pays de l'Adour.HistoireFondation 17 décembre 1970StatutType Université en FranceForme juridique Établissement public national à caractère scientifique culturel et professionnel (d)Président Laurent Bordes (depuis 2021)Devise « L'université est une chance. Saisissons-la ! »Membre de Aerospace Valley, membre associé de la Communauté d'universités et établissements d'AquitaineSite w...

 

 

Artikel ini bukan mengenai katalisis atau sosialisme. Kapitalisme adalah sistem ekonomi di mana perdagangan, industri dan alat-alat produksi dikendalikan oleh pemilik swasta dengan tujuan memperoleh keuntungan dalam ekonomi pasar.[1][2] Pemilik modal dalam melakukan usahanya berusaha untuk meraih keuntungan sebesar-besarnya. Dengan prinsip tersebut, pemerintah tidak dapat melakukan intervensi pasar guna memperoleh keuntungan bersama, tetapi intervensi pemerintah dilakukan seca...

 

 

Akar-Akar PohonSenimanVincent van GoghTahun1890F816, JH2113MediumMinyak di kanvasUkuran50.0 cm × 100.0 cm (19.7 in × 40.6 in)LokasiMuseum Van Gogh, Amsterdam Akar-Akar Pohon adalah sebuah lukisan minyak karya Vincent van Gogh yang ia lukis pada Juli 1890 saat ia tinggal di Auvers-sur-Oise, Prancis.[1][2] Lukisan tersebut adalah contoh kanvas dua persegi yang ia pakar dalam lanskap-lanskap terakhir buatannya.[3] Referensi ^ va...

State electoral district of New South Wales, Australia St Leonards was an electoral district of the Legislative Assembly in the Australian state of New South Wales, created in 1859, partly replacing Sydney Hamlets, and named after the Sydney suburb of St Leonards, which then included North Sydney, its main settlement. It extended from North Sydney to Broken Bay, including the Northern Beaches. It elected one member from 1859 to 1882, two members from 1882 to 1889 and three members from 1889 t...

 

 

1981 American horror film by Lawrence D. Foldes Don't Go Near the ParkPoster artDirected byLawrence D. FoldesWritten byLawrence D. FoldesStory byLinwood ChaseProduced byLawrence D. FoldesStarring Aldo Ray Meeno Peluce Tamara Taylor Robert Gribbin Barbara Monker Linnea Quigley Chris Riley CinematographyWilliam DeDiegoEdited byDan PerryMusic byChris LedesmaProductioncompanyStar CinemaDistributed byCardinal IV Film DistributorsCannon FilmsRelease date September 11, 1981 (1981-09-1...

 

 

Malayalam cinema Before 1960 1960s 1960 1961 1962 1963 19641965 1966 1967 1968 1969 1970s 1970 1971 1972 1973 19741975 1976 1977 1978 1979 1980s 1980 1981 1982 1983 19841985 1986 1987 1988 1989 1990s 1990 1991 1992 1993 19941995 1996 1997 1998 1999 2000s 2000 2001 2002 2003 20042005 2006 2007 2008 2009 2010s 2010 2011 2012 2013 20142015 2016 2017 2018 2019 2020s 2020 2021 2022 2023 2024 vte The following is a list of Malayalam films released in the year 1992. Title Director Screenplay Cast A...

Country music radio station in Bar Harbor, Maine WBFESimulcasts WBFB BangorBar Harbor, MaineBroadcast areaMount Desert Island, Downeast MaineFrequency99.1 MHzBrandingThe BearProgrammingFormatCountryAffiliationsMotor Racing NetworkPremiere NetworksOwnershipOwnerBlueberry Broadcasting(Blueberry Broadcasting, LLC)HistoryFirst air dateJune 1992 (1992-06)Former call signsWPRG (1988–1992)WLKE (1992–2013)Call sign meaningsimilar to WBFBTechnical information[1]Licensing authorit...

 

 

Musée de GrenobleParvis du musée formant l'esplanade François-Mitterrand.Informations généralesType Musée d'art, collection (en)Ouverture 1798 (il y a 226 ans)Dirigeant Guy TosattoSurface 18 270 m2 dont 7 500 m2 d'espaces d'exposition [3] + Tour de l'IsleVisiteurs par an 228 689 (2019) (détails)Site web Site officielCollectionsCollections peintures, sculptures, dessins, antiquités, objets d'artNombre d'objets 25 000 entreposés[1],[2] dont 900 e...

 

 

هنري الثالث (بالفرنسية: Henri III de Valois)‏  ملك بولندا ودوق لتوانيا الأكبر فترة الحكم16 مايو 1573 – 12 مايو 1575 نوع الحكم ملكي زغمونت الثاني أوغست آنا ججلون ملك فرنسا فترة الحكم30 مايو 1574 – 2 أغسطس 1589 تشارلز التاسع هنري الرابع معلومات شخصية الميلاد 19 سبتمبر 1551(1551-09-19)قصر فونتينبلو، ف...

Heroes of the StormDéveloppeur Blizzard EntertainmentÉditeur Blizzard EntertainmentCompositeur Glenn StaffordJason HayesDébut du projet 2011Date de sortie INT : 2 juin 2015 Genre MOBAMode de jeu MultijoueurPlate-forme Windows, Mac OS XLangue MultilingueMoteur HavokVersion 2.33.0.65285Site web www.heroesofthestorm.commodifier - modifier le code - modifier Wikidata Heroes of the Storm (abrégé HotS) est un jeu vidéo de type arène de bataille en ligne multijoueur (MOBA), édité et d�...

 

 

Languages indigenous to Mesoamerica Maya glyphs in stucco at the Museo de sitio in Palenque, Mexico. An example of text in a Mesoamerican language written in an indigenous Mesoamerican writing system. Mesoamerican languages are the languages indigenous to the Mesoamerican cultural area, which covers southern Mexico, all of Guatemala, Belize, El Salvador, and parts of Honduras, Nicaragua and Costa Rica.[1][2] The area is characterized by extensive linguistic diversity containin...

 

 

Disambiguazione – Se stai cercando altri significati, vedi Macerata (disambigua). Questa voce o sezione sull'argomento centri abitati delle Marche non cita le fonti necessarie o quelle presenti sono insufficienti. Puoi migliorare questa voce aggiungendo citazioni da fonti attendibili secondo le linee guida sull'uso delle fonti. Questa voce o sezione sull'argomento geografia ha problemi di struttura e di organizzazione delle informazioni. Motivo: La voce, povera di fonti in rappor...

Swedish musician Ulf EkbergEkberg in 2012Background informationBirth nameUlf Gunnar EkbergAlso known asBuddhaBorn (1970-12-06) 6 December 1970 (age 53)Gothenburg, SwedenGenresPopOccupation(s)Musician, singer, songwriter, businessman, television producer, film producerInstrument(s)Keyboards, vocalsYears active1983–presentMusical artist Ulf Gunnar Ekberg (born 6 December 1970), also known as Buddha, is a Swedish musician, best known as a founding member of the pop group Ace of Base, alon...

 

 

Study of ethical conduct in sexual behavior Sexual morality redirects here. For the book, see Sexual Morality (book). This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: Sexual ethics – news · newspapers · books · scholar · JSTOR (January 2008) (Learn how and when to remove this message) This article is written ...

 

 

Italian draughtsman and printmaker Stefano Della BellaPortrait by Carlo DolciBorn18 May 1610Florence, Grand Duchy of TuscanyDied12 July 1664 (aged 54)Florence, Grand Duchy of TuscanyNationalityItalianKnown forEngraverMovementBaroque Stefano della Bella (18 May 1610[1] – 12 July 1664) was an Italian draughtsman and printmaker known for etchings of a great variety of subjects, including military and court scenes, landscapes, and lively genre scenes. He left 1052 prints, and sever...

Indian actress, playback singer, television personality (born 2000) Sivaangi KrishnakumarSivaangi Krishnakumar in Super SingerBorn (2000-05-25) 25 May 2000 (age 24)Trivandrum, Kerala, IndiaEducationM.O.P. Vaishnav College for Women ChennaiOccupationsActressplayback singertelevision personalityComedianYears active2019–presentParentsK. Krishnakumar (father)Binni Krishnakumar (mother)YouTube informationChannel Sivaangi Krishnakumar Years active2005; 2009;2019–presentSubscriber...

 

 

You can help expand this article with text translated from the corresponding article in Russian. (April 2013) Click [show] for important translation instructions. View a machine-translated version of the Russian article. Machine translation, like DeepL or Google Translate, is a useful starting point for translations, but translators must revise errors as necessary and confirm that the translation is accurate, rather than simply copy-pasting machine-translated text into the English Wikip...

 

 

Îles TuamotuArchipel des Tuamotu (mul) Les Tuamotu (au milieu et en violet) sur la carte de la Polynésie française Géographie Pays France Archipel Tuamotu Localisation Océan Pacifique Coordonnées 18° 47′ 47″ S, 141° 35′ 02″ O Superficie 850 km2 Nombre d'îles 76 atolls Île(s) principale(s) Anaa, Fakarava, Hao, Makemo, Manihi, Rangiroa, Tikehau Géologie Atoll Administration Statut Forme un district Collectivité d'outre-mer Polynésie ...

« Boire » redirige ici. Pour les autres significations, voir Boire (homonymie) et Hydratation. Boire Cet article est une ébauche concernant la physiologie. Vous pouvez partager vos connaissances en l’améliorant (comment ?) selon les recommandations des projets correspondants. Hydratation d'une petite fille lors d'une opération humanitaire au Pakistan L’hydratation, en physiologie, est l'absorption d'eau par un être, qui se fait en buvant et en mangeant, ou la réduct...

 

 

United States Army general Lightning Joe redirects here. For the boxer, see Joe Gatti. J. Lawton CollinsBirth nameJoseph Lawton CollinsNickname(s)Lightning JoeBorn(1896-05-01)1 May 1896New Orleans, Louisiana, U.S.Died12 September 1987(1987-09-12) (aged 91)Washington, D.C., U.S.BuriedArlington National CemeteryAllegianceUnited StatesService/branchUnited States ArmyYears of service1917–1956RankGeneralService number0-5247UnitInfantry BranchCommandsChief of Staff of the United States ...