Kelly O'Hara LING 567 HW #5 4 February 2007 Zulu has agreement but no case, so I chose the corresponding package of two kinds of agreement. In this case I chose subject-verb and object-verb agreement. There are some elements in my test suite that I put in as determiner-noun agreement, but I think we have agreed these are not determiners, so I stuck with the for-sure verb agreement. The nature Zulu's noun class system meant that I ended up with about a bazillion lexical rules. I put in sets of rules for subject prefixes, subject concords, and object concords. Each of those categories over 20 lexical rules apiece. Most have corresponding i-rules as well. I also included a rule for verb tense, just so I could make sure the prefix attaches in the right place. That was a relatively painless addition. Before I get into the actual agreement issues, I'll talk some about the noun class prefixes. At the end of lab 4, I was tempted to just hard-code the noun class prefixes onto the nouns, as my lexicon is not that large. But I came away from class on Wednesday determined to apply the noun classes by lexical rules. I added a "gender" feature (which I called NCL for nounclass) to the PNG feature, and then started in making subclasses for all the noun classes. Because most of the noun classes are in singular/plural pairs, I created several supertypes. The supertypes have names like c1-or-c2, which then has c1 and c2 as subtypes. One tricky bit is that class 10 is the plural of both classes 9 and 11. For this I created two supertypes c9-or-c10 and c11-or-c10 (the eleven is first so I'll remember it's singular). Class 9 inherits from the first one, class 11 from the second one, and class 10 from both. Some classes are not in singular/plural pairs (mostly abstract concepts, mass nouns, and infinitive forms) and these inherit directly from nclass. Individual lexical entries' NCL values are specified as the class supertypes (where appropriate). The class (and therefore prefix ) is determined by a class-prefix-lex-rule. For each noun class, I wrote a separate subtype of this rule, where I specify the PNG.NUM value and further constrain the PNG.NCL value. I wrote these as infl-ltow-lex-rules, as that's what I had written down in my notes from class, but I wonder now why that works, because I am usually changing the NCL value. Perhaps because I am further constraining the value, rather than changing it entirely? At any rate, every lexical rule has a corresponding i-rule, to provide the correct surface forms. On to agreement! Non-interogative Zulu verbs must agree with their subject. If the subject is first or second person, the verb agrees in person and number. If the subject is third person, it agrees with the noun class (by which you get number for free). Zulu noun classes are similar to the concept of grammatical gender, but on a much larger scale. Agreement is indicated by a prefix on the verb, known as a subject concord. There is a subject concord for each of the 17 noun classes, plus for each of the 4 person/number pairs for non-third person subjects, for a total of 21 concords. Transitive verbs can optionally agree with their object as well, also by a prefix on the verb. I suspect the differences in form between the subject and object concords are morphophonological, relating to where in the word each appears. However, in my grammars they are presented as separate, and for the purposes of this assignment I treated them as unrelated. In building my test suite, I encountered some strange variations of the agreement system. I was able to avoid these completely in my test suite, and so am not worrying about implementing them. I have instead implemented all of the standard subject-verb and object-verb agreement. Once I figured out how a rule needed to work, it was mostly copy-and-paste to provide coverage for all of the noun classes. To give a big-picture view, the order of application of the present verb rules is: (object concord) - ltol verb tense -ltol subject concord -ltow (negation) -ltow(?) Object concords join closest to the verb. The DTR value for obj-concord-lex-rule is transitive-verb-lex, because intransitive verbs shouldn't have objects. Then, similar to the noun class prefixes, I implemented subtype rules for all the noun classes, plus the first and second person concords. The rules look at the PNG values of the (one element on the) COMPS list of the input verb. Also like the noun class prefixes, I made a corresponding list of i-rules for the object concord prefixes. Next up, verb tense. Tense doesn't have anything to do with agreement, but I wanted to implement it because it gets sandwiched between the subject and object concords. Putting in a lexical rule seemed like the most elegant way to handle this. All the sentences I am currently parsing are in future tense, so that is the only tense I implemented. The rule sets the INDEX.E.TENSE value to `fut'. It also has a corresponding i-rule to add the tense prefix. One side effect of implementing tense is that I can no longer parse imperatives, which don't have tense markers. I felt this was an acceptable trade off for now. The trickiest part of getting tense working was setting up the input. The future tense prefix attaches outside the object concord -- if there is one. Object-verb agreement is optional, so I needed the rule to accept DTR values of types obj-concord-lex-rule and verb-lex. I accomplished this by creating a supertype verb-tense-rule-dtr that was a supertype to both input types. Subject concords are implemented pretty much the same as object concords. The DTR value of subj-concord-lex-rule is verb-tense-lex-rule, because every verb form has both a subject concord and a tense marker, or neither. The agreement information is also pulled from the PNG values of the (one element on the ) SUBJ list of the input verb. The subj-concord-lex-rule is a lexeme to word rule. However, it is also possible to add a negative prefix onto the front of the whole affair. I only mention this because I had to edit the DTR value of the neg-infl-lex-rule. The first time I ran tsdb with subject concords, negation (which was working previously) did not parse. It took me a minute to realize that it was because the input value was now subj-concord-lex-rule, rather than lex-item (or whatever it had been before, I forget now). I mentioned earlier that not all verbs forms take subject concords, but the ones that don't have different means of negation. For now this is all I need. I am very happy with how my grammar is working now. Although I'm no longer parsing imperatives, the total number of parses went up considerably. My baseline lab 4 grammar (with a cleaned up test suite) was 6 parses, with one overgeneration, My grammar now does 19 parses, with no overgeneration. This is everything that I should be able to parse at this point, so I'm doing as well as I can. My grammar also generates as I would expect it to. A sentence with an intransitive verb generates two sentences: one declarative and one with the question marker. This isn't surprising, as we haven't provided any means of differentiating the two yet. With a transitive verb, there are 4 generations: the declarative/interrogative pairs as above, for verbs with and without the object concord.