Language Parser in Prolog
(1) To build a parser, we need to rewrite CFG (context free grammar) as a set of Prolog clauses. In fact, the conversion of CFG into Prolog clauses is so simple that it can be done automatically. The clauses which are written in Definite Clause Grammar (DFG) notation looks like this:
s --> np, vp.
np --> det, n.
n --> [leaf node].
When these clauses are loaded from a file into memory, they are automatically converted into:
S(L1,L) :- np(L1,L2),vp(L2,L).
np(L1,L):- d(L1,L2),n(L2,L).
n([dog|L],L).
or equivalent. The translated rules are implemented based on the idea of so-called difference lists.
a) Enter the above grammar in a file and then load it into Prolog. Type listing to see the translated rules.
b) Write a grammar which can recognize the sentences “kim ate the egg”, “sandy saw the duck” etc. After writing the grammar below, save it and run it in prolog as s[[kim,ate,the,egg],[]]. and s[[sandy,saw,the,duck],[]].
s --> np, vp.
vp --> v, np.
np --> det, n.
det --> [].
det --> [the].
n --> [sandy].
n --> [egg].
n --> [duck].
v --> [ate].
v --> [saw].
c) Extend the above grammar to allow the recognization of di-transitive verbs such as “kim gave the egg to sandy”.
s --> np, vp.
vp --> v, np, pp.
np --> det, n.
pp --> p, np.
pp --> [].
det --> [].
det --> [the].
n --> [sandy].
n --> [egg].
n --> [duck].
v --> [ate].
v --> [saw].
v --> [gave].
p -->[to].
(2) The programs above is able to recognize grammatical structure (that is, they could correctly answer ``yes'' or ``no'' when asked whether the input was a sentence, a noun phrase, and so on). A practical parser should also report the structure of the sentence. In particular, we want to get the parser to produce the tree diagram for sentences. Representing the tree structure is straightforward; they correspond to Prolog clauses. For example, the tree
can be represented as the structures:
s(np(det(a),n(woman)),vp(v(shoots))).
To produce this representation, we add extra arguments to on the above grammars, for example:
s(s(NP,VP)) --> np(NP),vp(VP).
np(np(DET,N)) --> det(DET),n(N).
vp(vp(V,NP)) --> v(V),np(NP).
vp(vp(V)) --> v(V).
det(det(the)) --> [the].
det(det(a)) --> [a].
n(n(woman)) --> [woman].
n(n(man)) --> [man].
v(v(shoots)) --> [shoots].
np(np(DET,N)) --> det(DET),n(N).
vp(vp(V,NP)) --> v(V),np(NP).
vp(vp(V)) --> v(V).
det(det(the)) --> [the].
det(det(a)) --> [a].
n(n(woman)) --> [woman].
n(n(man)) --> [man].
v(v(shoots)) --> [shoots].
Extend the grammar you develop so that it can generate a correct tree for “sandy ate the egg”.
(3) In English, pronouns are marked for CASE. That is, the pronoun has a different form before the verb than after it. For example, the sentence him saw he is wrong because it violates this basic fact. To extend the DCG to account for this information, we also add extra argument to the grammars, such as:
s --> np(sub),vp.
np(sub) --> det,n.
np(obj) --> det,nobj.
vp --> v,np(obj).
nobj -->probj.
n --> pr.
pr --> [i].
pr --> [we].
pr --> [you].
pr --> [they].
v --> [see].
probj --> [me].
probj --> [you].
probj --> [us].
probj --> [them].
det --> [];[the].
Get this program working and see whether it is able to accept “I see them”” and “We see them”, but not “them see they” or “me see i”.
Other features in English such as AGREEMENT can also be encoded into the grammar in this way. AGREEMENT means that the verb and its subject have to agree in number; that is, they are either either singular or plural.
Build a parser which should accept the “the dog chases the cats” but not “the dog chase the cats”.
No comments:
Post a Comment