Lab 4 Writeup

The facts:

In Armenian, the sentence "I can eat glass" is:

    es     apaki          krnam    utel
    I-NOM  glass-NOM/ACC  can-1SG  eat-INF

"es" is the first person singular nominative pronoun. "apaki" is the nominative/accusative form of "glass". "krnam" is an auxiliary verb meaning "to be able" that gets the inflection, and "utel" is the infinitive.

The sentence "It doesn't hurt me" is:

    an      intsi   chhi        vnaser
    it-NOM  me-DAT  not-is-3SG  hurt-NEGPART

"an" is the third person singular nominative pronoun. "intsi" is the first person singular dative pronoun, because the verb "vnasel" takes a dative object (which is nice, because it lets us make sure that case agreement is being passed up correctly). "chhi" is the third person singular of the negative form of the copula, and it takes "vnaser", a special negative participle form (which is very close to the infinitive: -l --> -r), as an argument.

Before I started work on the lab, I had to replumb some morphology that I'd faked up in the previous labs. As you may recall, most Armenian verbs in the present and imperfect tenses take the prefix particle "ke-/k-". I had previously just put this particle directly in the lexicon. There are a few exceptional verbs that don't take "ke-/k-", including the copula and "to be able" above, and I could have just entered those in the lexicon without the particle. However, that wasn't going to be sufficient: I need the infinitive and negative participle forms of the verbs, and neither of those forms takes "ke-/k-".

To address this, I first replaced all the verb lexical entries, which had been in the "-l" infinitive form, with a bare stem. Then I wrote lexical rules to produce the infinitive and negative participle forms, adding a feature FORM to VERB as described in the lab. At first, the only values of FORM were "inf" and "neg", but this turned out to be insufficient, as we'll see below.

Getting all the forms of inflection to play together was tricky.
I went through many iterations before settling on the final system. I overrode the definition of SIGN to add two new boolean features: KE-MARKED (which records whether a verb has had the "ke-/k-" prefix added yet) and PN-MARKED (which records whether a verb has been inflected for person and number). I did this so that I could force the lexical rules to apply in precisely one order. For the infinitive and negative participle forms, only the lexeme-to-word rules that build those forms (inf_verb-lex-rule and neg_verb-lex-rule) apply, leaving the verb marked [FORM inf] or [FORM neg]. For finite verbs, first the lexeme-to-lexeme ke-marking rule (ke_verb-lex-rule) applies, marking the sign [KE-MARKED +]; then (and only then) the lexeme-to-word person-and-number rules apply, marking the verb [PN-MARKED +]. I also enhanced the various phrase structure rules so that they require [KE-MARKED +, PN-MARKED +] as well as [INFLECTED +], because otherwise the infinitive and negative participle forms could be used as main verbs.

That brought me back to where I was at the end of the last lab (although I had to tweak all the test sentences in test.all to replace "ke VERB" with "ke-VERB", since it was now an affix instead of a separate word).

[Aside: the work described up to here was actually interleaved with the rest of the lab described below, but I didn't keep track of all the twisty little passages, and separating them makes for a more straightforward writeup.]

To implement the potential form, I created potential-aux-verb-lex, which takes an infinitive verb as its first complement, and constrains its subject and other complements to be the same as the subject and complements of the verbal complement. It is also [KE-MARKED +], since the verb "krnal" does not take the "ke-" prefix.

While NP complements come before the V in the unmarked sentence order, V complements come *after* the verb.
To deal with this, I created a new phrase structure rule, aux-head-comp-phrase, that is head-initial instead of head-final. I also constrained the old head-comp-phrase rule to only work with [HEAD noun] phrases, and the new aux rule to only work with [HEAD verb] phrases, to prevent accepting many odd word orders. The one that should work is SOAV (subject, object, aux, verb), but without those constraints I also allowed (and over-generated) SAVO, SOVA, and several others.

The semantics of the potential fell out nicely from the example provided in the lab. Although the handles are in a different order, they encode an equivalent semantic structure.

To implement the negative form, I added a new lexical item for the negative copula (which will have to be broken down into the positive copula and a negative inflection if we need the positive copula later). This has an irregular form in the third-person singular -- see irregs.tab. This copula also behaves as an auxiliary verb, so I was able to make another new verb lex-item called negative-aux-verb-lex based on the potential verb one above. The same new phrase structure rule worked for this negative auxiliary.

However, the semantics of this new lex-item turned out wrong. Rather than being some form of adverb, it was a verb, so its PRED became associated with an event instead of the handle of the hurt relation. To fix this, I stopped deriving negative-aux-verb-lex from basic-verb-lex, and instead derive it from basic-scopal-adjective-lex, adding the additional CONT.HCONS constraint to make a qeq associating the negative relation with the hurt relation. It still has [HEAD verb], though, so that it will be inflected properly. Deriving a verb from an adverb but still subjecting it to verbal inflection in this way is either a crime against nature or an elegant solution.

At this point, I could parse all the correct sentences, but I had a serious overgeneration problem.
If I parsed a sentence with an auxiliary verb and then generated, I also got three spurious versions of the sentence: one with a finite verb as the verbal complement, and two with a finite or non-finite form with "ke-" added. This was because pn-marking and ke-marking did not add anything to the SYNSEM of the verb, so ke-marked verbs were compatible with all FORM values, and so would unify with lexical items that wanted either [FORM neg] or [FORM inf]. The solution to this was to add one more value of FORM, "fin" for finite, which is added by the ke-marking rule.

I've made a new test.items that includes correct test sentences and incorrect variations of them, including:

    mard e krnay khnanal
    man the can to-sleep
    "the man can sleep"

    mard e kov e krnay utel
    man the cow the can to-eat
    "the man can eat the cow"

    es krnam phrrshtal
    I can to-sneeze
    "I can sneeze"

    es kov e krnam utel
    I cow the can to-eat
    "I can eat the cow"

and, of course,

    es apaki krnam utel
    "I can eat glass"

    an intsi chhi vnaser
    "It doesn't hurt me"

These new sentences are included in the revised test.all as well. All the right ones parse, all the wrong ones don't parse, and the glass/hurt sentences don't overgenerate.
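
For readers who want to see why the extra FORM value matters, here is a rough Python toy of the rule ordering and the overgeneration fix described above. It is not the TDL in the grammar: the class Verb, the helper unifies, the sets_fin switch, the stem "ute", and the suffix string "m" are all placeholders made up for the illustration, and an unspecified FORM is modelled as None. It shows the mechanism only, not the exact three spurious outputs.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Verb:
    surface: str
    form: Optional[str] = None  # None models an unspecified FORM value
    ke_marked: bool = False     # stands in for KE-MARKED
    pn_marked: bool = False     # stands in for PN-MARKED


def unifies(value: Optional[str], wanted: str) -> bool:
    """An unspecified value is compatible with anything."""
    return value is None or value == wanted


def inf_rule(v: Verb) -> Optional[Verb]:
    """Build the infinitive (-l); fails if FORM already clashes with inf."""
    if v.pn_marked or not unifies(v.form, "inf"):
        return None
    return replace(v, surface=v.surface + "l", form="inf")


def ke_rule(v: Verb, sets_fin: bool) -> Optional[Verb]:
    """Add the ke- prefix. sets_fin=False models the grammar before the fix."""
    if v.ke_marked or v.pn_marked:
        return None
    if sets_fin and not unifies(v.form, "fin"):
        return None
    return replace(v, surface="ke-" + v.surface, ke_marked=True,
                   form="fin" if sets_fin else v.form)


def pn_rule(v: Verb, suffix: str) -> Optional[Verb]:
    """Person/number marking applies only after ke-marking."""
    if not v.ke_marked or v.pn_marked:
        return None
    return replace(v, surface=v.surface + suffix, pn_marked=True)


def aux_accepts(comp: Verb) -> bool:
    """The potential auxiliary wants a [FORM inf] complement."""
    return unifies(comp.form, "inf")


stem = Verb("ute")  # placeholder bare stem

# Before the fix: ke-marking leaves FORM unspecified, so a finite verb
# slips through as the auxiliary's complement, and ke- can feed inf_rule.
finite_before = pn_rule(ke_rule(stem, sets_fin=False), "m")
ke_inf_before = inf_rule(ke_rule(stem, sets_fin=False))

# After the fix: ke-marking sets FORM to fin, which blocks both.
finite_after = pn_rule(ke_rule(stem, sets_fin=True), "m")
ke_inf_after = inf_rule(ke_rule(stem, sets_fin=True))
```

With sets_fin=False the finite form is accepted by aux_accepts and ke_inf_before is a ke-prefixed infinitive; with sets_fin=True the finite form is rejected and ke_inf_after is None, while inf_rule(stem) still yields an acceptable complement.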