Arboretum implementation status as of 10-May-04

To invoke the arboretum machinery, do
  (pushnew :arboretum *features*)
at a fresh Lisp prompt, then recompile the LKB and load the ERG.  If using
the LexDB, then after loading, click on LexDB--Load TDL Entries and load the
file erg/arboretum/mal-lex.tdl.  Then index for generation, and process
sentences using the command lkb::grammar-check() instead of do-parse-tty(),
as for example
  (lkb::grammar-check "dog barks")

Note that error descriptions are now defined in the file "errordesc.lsp" in
this directory, to avoid having to recompile the LKB file "arboretum.lsp"
during mal-rule development.

Error types handled or in development:

1. Determiner-noun number mismatch
   Example: 'two dog bark' => 'two dogs bark'
   Device:  mal-infl rule for making plural nouns for unmarked form.
   Notes:   Multiple outputs possible for "dog bark": "dogs bark" or
            "a/the dog barks".  Can sometimes disambiguate for subject NPs,
            assuming that 3sg marking on verb is correct, so "dog barks"
            can (always?) correct to "a/the dog barks".  But this ambiguity
            is always present with non-subject NPs: "they chased cat"
            (either "a/the cat" or "cats").  We'll hope a tree bank can get
            the choice to look okay.

2. Subject-verb number mismatch
   Example: 'dogs barks' => 'dogs bark'
   Device:  mal-infl rule for making non-3sg verbs for 3sg-marked form.
   Notes:   Since we don't supply a rule for removing the plural marking
            from a noun, we only supply one possible correction for
            'dogs barks'.  But as noted above, we will in principle offer
            two corrected alternatives for 'dog bark', where the
            'bark' => 'barks' is what we want for 'this dog bark'.
            This approach may run into trouble for the (arguably rare?)
            cases where the determiner and the verb are both marked
            singular, but disagree with the noun, as in 'this dogs barks'.
            Lacking a mechanism for undoing plural marking on a noun keeps
            ambiguity down at some possible cost in coverage.
            Note that we will do okay for the converse: 'these dog bark'.

3. Missing determiner
   Example: 'dog barks' => 'a/the dog barks'
   Device:  Unary phrasal mal-rule which makes an NP for a singular count
            noun.
   Notes:   Succeeds for subject NPs where the verb is marked for 3sg.
            Multiple outputs possible for "dog bark"; see discussion of
            determiner-noun number mismatch.

4. Extraneous (duplicate) determiner
   Example: 'the my dog barks' => 'my dog barks'
   Device:  Binary mal-rule which combines two determiners, treating the
            second one as the head, and adding placeholder semantic
            relations for the restrictor and scope of the first determiner.
   Notes:   FIX - Not yet generating.
            This approach requires munging before generation, since our
            strong assumption of monotonicity for the RELS list in the
            grammar means there will be semantics supplied by the first
            determiner that we don't want in the input to the generator.
            Maybe useful, since the munging rules might be more systematic
            about choosing which of the two determiners is more contentful:
            on the current approach, we'll lose information for
            'our the dog barks'.  Maybe we even want to provide richer
            paraphrasing eventually, to correct 'our that dog barks' to
            'that dog of ours barks' where we preserve both dets.

5. Extraneous determiner for strictly mass nouns
   Example: 'I need an information' => 'I need information'
   Device:  Mal lexical entries for "a/an" which have the properties of
            "some".
   Notes:   Works only for mass/abstract nouns that do not also have a life
            as a singular count noun.

6. Missing subject for finite VP
   Example: 'looks good' => 'they/that looks good'
   Device:  Unary phrasal mal-rule converting finite VP to sentence, adding
            semantics for a proposition and the third-person subject
            pronoun.
   Notes:   It's not obvious how to supply a single overt subject when the
            missing one could be either 1sg ("saw myself") or 3per
            ("looks good", "aren't here").  For now, we opt for the vague
            demonstrative/pronoun "they/that".
            FIX - we currently restrict the daughter VP to be [ROBUST -]
            in order to prevent mal-inflected verbs from heading the VP,
            which limits interactions with other error types in the same
            example.

7. Inversion of subject and main verb
   Example: 'hired you Kim?' => 'did you hire Kim'
   Device:  Lexical mal-rule like the one for inverted auxiliary verbs.
   Notes:

8. Negation of main verb
   Example: 'we hired not him' => 'we did not hire him'
   Device:  Lexical mal-rule like adverb-addition for auxiliary verbs.
   Notes:   Only works if the "not" follows the main verb, so we don't
            handle 'we not hired him'.

9. VP complement mismatch [Not yet implemented]
   Example: 'this allows to stay' => 'this allows one to stay'
   Device:  Proposed - unary phrasal rules converting each of INF, PRP, BSE
            VPs to underspecified VFORM.  Could also do lexical rules, but
            better control and perhaps more efficiency with phrasal rules.

10. Perfective aspect/tense mismatch
   Example: 'last night he has arrived' => 'last night he arrived'
   Device:  Proposed - munging rules to detect presence of closed-class
            semantic predicates for "last", "yesterday", etc. which exclude
            present-perfect, and substitute corrected semantics along with
            error detection flag.
   Notes:   Will be a bit subtle: contrast "since last week I have
            improved", where semantics using present-perfect is okay, with
            "*before last week I have improved".  This approach will
            require alterations to the control structure, always checking
            parses (well-formed or not) against these munging rules, which
            will also be cleaning up semantics for generation in some
            cases, as with double-determiner treatment above.
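
Illustration (toy sketch, not part of the LKB or the ERG):

The interaction among error types 1-3 is easier to see with a small
stand-alone model.  The Python sketch below is only that: it does not parse
anything, and it is not how the mal-rules are implemented.  The lexicon, the
function names (grammatical, corrections) and the tie-breaking heuristic
(keep overt 3sg marking on the verb whenever some correction allows it) are
assumptions made for this illustration.  The heuristic is one possible
reading of "assuming that 3sg marking on verb is correct" in the notes for
item 1.  As in item 2, the sketch has no way to remove plural marking from
a noun.

```python
from itertools import product

# Hypothetical toy lexicon: surface form -> number ("any" agrees with both).
NOUNS = {"dog": "sg", "dogs": "pl", "cat": "sg", "cats": "pl"}
VERBS = {"bark": "pl", "barks": "sg"}  # "sg" = 3sg-marked, "pl" = non-3sg
DETS = {"this": "sg", "these": "pl", "two": "pl", "the": "any", "a/the": "sg"}

PLURAL = {"dog": "dogs", "cat": "cats"}   # item 1: unmarked noun -> plural
VERB_3SG = {"bark": "barks"}              # add 3sg marking to the verb
VERB_NON3SG = {"barks": "bark"}           # item 2: drop 3sg marking


def grammatical(det, noun, verb):
    """Agreement check for a [det] noun verb string in the toy lexicon."""
    number = NOUNS[noun]
    if det is None:
        if number == "sg":
            return False  # a singular count noun needs a determiner
    elif DETS[det] not in (number, "any"):
        return False      # determiner-noun number mismatch
    return VERBS[verb] == number


def corrections(sentence):
    """Return the preferred grammatical candidates, sorted alphabetically.

    Tier 0: the input is already fine.  Tier 1: only marking or a
    determiner was added.  Tier 2: overt 3sg marking had to be removed.
    Only the best (lowest) non-empty tier is returned.
    """
    words = sentence.split()
    det = words[0] if len(words) == 3 else None
    noun, verb = words[-2], words[-1]

    noun_opts = [noun] + ([PLURAL[noun]] if noun in PLURAL else [])
    verb_opts = [(verb, False)]
    if verb in VERB_3SG:
        verb_opts.append((VERB_3SG[verb], False))
    if verb in VERB_NON3SG:
        verb_opts.append((VERB_NON3SG[verb], True))  # removes overt marking
    det_opts = [det] if det else [None, "a/the"]     # item 3: supply a det

    tiers = {}
    for d, n, (v, removed) in product(det_opts, noun_opts, verb_opts):
        if not grammatical(d, n, v):
            continue
        out = " ".join(w for w in (d, n, v) if w)
        tier = 0 if out == " ".join(words) else (2 if removed else 1)
        tiers.setdefault(tier, set()).add(out)
    return sorted(tiers[min(tiers)]) if tiers else []


for s in ("dog bark", "dog barks", "dogs barks", "this dogs barks"):
    print(s, "=>", corrections(s))
```

On the examples discussed above, the sketch offers two corrections for
'dog bark', one each for 'dog barks' and 'dogs barks', and none for
'this dogs barks', which is the coverage gap described in the notes for
item 2.  Non-subject NPs such as "they chased cat" are outside this toy model.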