In this exercise, you will integrate a morphological analyzer into your grammar. Up to now you have been working with a full form lexicon. This means you have full control over the lexical entries within your grammar, but it is also very tedious as you have to write a separate lexical entry for each inflected (or derived) form.
For this exercise you can work with a version of the finite-state morphological analyzer that is part of the English ParGram grammar. It is automatically available to you if you have an XLE license and have downloaded the English grammar. It is called english.infl.patch.full.fst and can be found in the "prelex" folder of the English ParGram grammar. Alternatively, you can work with the morphological analyzer that is available in the Starter Grammar section of the XLE documentation, which is called eng-pargram-morph.fst.
The slides already show you how to integrate the verbs and the unknown entry.
Make sure your grammar can do this. You need to make sure your grammar is set up correctly in terms of the following:
----
DEMO ENGLISH MORPHOLOGY (1.0)

TOKENIZE:
P!basic-parse-tok.fst G!default-gen-tokenizer.fst

ANALYZE:
english.infl.patch.full.fst

----
----
MORPH ENGLISH LEXICON (1.0)

"this guesses words that are unknown to your lexicon to be either adjectives or nouns"

-unknown  ADJ-S XLE (^ PRED) = '%stem';
          N-S XLE (^ PRED) = '%stem'.

"lexical entries for tags coming out of the morphological analyzer"

+Verb     V-POS XLE .
+Pres     TNS XLE @VPRES.
+3sg      PERS XLE @S-AGR.
+PastBoth TNS XLE "past tense or past participle"
              { @VPAST | @PASTP }.
+123SP    PERS XLE .
+Non3sg   PERS XLE @BARE-AGR.
+Prog     ASP XLE @VPROG.

+Noun     N-POS XLE .
+Pl       N-NUM XLE (^ NUM) = pl.
+Sg       N-NUM XLE (^ NUM) = sg.

----
----
MORPH ENGLISH RULES (1.0)

"sample rule"

"this deals with verbs. The sublexical items are the POS assigned to the various tags in morph-lex.lfg"

V --> V-S_BASE "verb stem, e.g. run"
      V-POS_BASE "suffix saying that this is a verb: +Verb"
      { TNS_BASE "tense suffix, e.g. +Pres"
        PERS_BASE "person suffix, e.g. +3sg"
      | ASP_BASE}. "aspectual information"

N --> N-S_BASE "noun stem"
      N-POS_BASE
      N-NUM_BASE.

----
"all verbs here"

hate V-S XLE @(TRANS %stem). "using the morphological analyzer"
eat  V-S XLE @(OPT-TRANS %stem).
Now expand the grammar so that nouns and adjectives are also coming out of the morphological analyzer.
Generally proceed in the following way:
% morphemes bananas
analyzing {bananas}
{bananas "+Token"|banana "+Noun" "+Pl"}
In the MORPH LEXICON section there is an "unknown" entry which allows the morphological analyzer to pass its knowledge about lexical items into the grammar.
-unknown  ADJ-S XLE (^ PRED) = '%stem';
          N-S XLE (^ PRED) = '%stem'.
This entry has the effect that any word not present in the Lexicon section of your grammar will be guessed to be either a noun or an adjective. If this word follows the sublexical rules specified in the grammar, then it can be parsed by the grammar.
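As a rough illustration of how the guesser and the sublexical rules work together, the following Python sketch models the process for the analysis banana +Noun +Pl. It is a toy model and not XLE's actual implementation: the function name parses and the dictionary layout are invented for illustration, and only the noun rule is included.

```python
# Toy model of sublexical parsing; NOT XLE's actual algorithm.
# All names below are hypothetical and exist only for this illustration.

# Sublexical categories assigned to analyzer tags
# (mirrors the tag entries in the MORPH LEXICON section).
TAG_CATEGORIES = {
    "+Noun": "N-POS",
    "+Pl": "N-NUM",
    "+Sg": "N-NUM",
    "+Verb": "V-POS",
}

# Categories the -unknown entry offers for a stem with no lexical entry.
UNKNOWN_STEM_CATEGORIES = ["ADJ-S", "N-S"]

# Simplified sublexical rules: category -> sequence of sublexical categories.
SUBLEXICAL_RULES = {
    "N": ["N-S", "N-POS", "N-NUM"],
}


def parses(stem, tags, lexicon):
    """Return the categories under which the analysis 'stem tag...' parses."""
    # A stem without a lexical entry falls back to the -unknown guesses.
    stem_cats = lexicon.get(stem, UNKNOWN_STEM_CATEGORIES)
    # A tag without an entry maps to None, which matches no rule.
    tag_cats = [TAG_CATEGORIES.get(tag) for tag in tags]
    results = []
    for category, rhs in SUBLEXICAL_RULES.items():
        for stem_cat in stem_cats:
            if [stem_cat] + tag_cats == rhs:
                results.append(category)
    return results


lexicon = {"hate": ["V-S"]}
print(parses("banana", ["+Noun", "+Pl"], lexicon))  # ['N']
print(parses("banana", ["+Verb"], lexicon))         # []
```

In this sketch the guessed N-S category lets "banana" match the noun rule, while the second call fails because no rule accepts the guessed stem categories followed by V-POS.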
You do need to write lexical entries for those words which do not have guessable functional annotations. For example, the morphological analyzer does not know whether a verb is intransitive, transitive, ditransitive, etc. So for verbs you need to specify the head word (lemma) and the relevant subcategorization information in the lexicon.
For nouns and adjectives, this is not necessary, as the "unknown" guesser guesses words not in the lexicon to be either nouns or adjectives. So, unless you want to specify extra information for a particular lemma, you do not need to have extra entries for nouns and adjectives in your lexicon. Try deleting (or commenting out) all the ones you have entered and see if your testsuite still works.
If you want entries in your grammar that could be several different parts of speech (for example, "dog" could be encoded as a verb and as a noun), you need to make sure that the entries interact with one another properly. One way to do this is to mark the verb entry with ETC, which specifies that it is not the only entry for the word. For example:
dog +V-S XLE @(TRANS dog); ETC.
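The effect of ETC can be pictured with a small Python sketch. This is a simplified model based on the description above, not XLE's lookup algorithm; the function stem_categories and the data layout are hypothetical. The sketch assumes that an entry marked with ETC leaves the categories from the "unknown" guesser available, while an entry without it is treated as the only entry for that stem.

```python
# Toy model of how ETC lets a lexical entry coexist with guessed entries.
# ASSUMPTION: without ETC, the explicit entry is the only one for that stem.
# Names and data layout are hypothetical, for illustration only.

UNKNOWN_STEM_CATEGORIES = ["ADJ-S", "N-S"]  # from the -unknown entry


def stem_categories(stem, lexicon):
    """Return the sublexical stem categories available for `stem`."""
    entry = lexicon.get(stem)
    if entry is None:
        # No lexical entry at all: only the guesser applies.
        return list(UNKNOWN_STEM_CATEGORIES)
    categories = list(entry["categories"])
    if entry["etc"]:
        # ETC: keep the guessed categories alongside the explicit ones.
        categories += [c for c in UNKNOWN_STEM_CATEGORIES if c not in categories]
    return categories


lexicon = {
    "dog": {"categories": ["V-S"], "etc": True},
    "hate": {"categories": ["V-S"], "etc": False},
}
print(stem_categories("dog", lexicon))     # ['V-S', 'ADJ-S', 'N-S']
print(stem_categories("hate", lexicon))    # ['V-S']
print(stem_categories("banana", lexicon))  # ['ADJ-S', 'N-S']
```

Under these assumptions "dog" can be parsed both as a verb stem and as a guessed noun stem, whereas "hate" is only available as a verb stem.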
Read more about the interactions between lexical entries and lexicons in XLE Lexicon Entries and Lookup Model.
Note that as your grammar expands, you need to ensure backwards compatibility. So, when you hand in the grammar that solves this exercise, you also need to hand in a testsuite that includes: a) the sentences from this exercise; b) the sentences from all the previous exercises; c) the sentences you came up with yourself in previous exercises to test your grammar. You should run your testsuite (we will also do this and check your grammar for backwards compatibility) and make sure that everything still works as expected.
The Grammar Writer's Cookbook, Ch. 12 (finite-state morphologies)
Kaplan, Ron, John T. Maxwell III, Tracy Holloway King and Richard Crouch. 2004. Integrating Finite-state Technology with Deep LFG Grammars. In Proceedings of the ESSLLI04 Workshop on Combining Shallow and Deep Processing for NLP.