eLinguistics 567, Spring 2005, Jonathan Pool Lab 6 (Esperanto) Preparation Redundancy In my prior report, I described the semantically identical multiple parses yielded by the grammar in the case of adverbs modifying transitive verbs. I showed that constraining adverbs' MOD values so as to prevent adverbs from modifying nodes of particular types couldn't stop all such redundancy without also causing under-acceptance. In this section I describe how I eliminated the redundant parses arising from adverbial modification. Emily Bender suggested investigating the effect of preventing the head-opt-comp rule's daughter from having a [ MODIFIED hasmod ] value. Following this idea for the phrase rules licensing the redundant parses, I added the following constraints: head-opt-comp-phrase: [ HEAD-DTR.SYNSEM.MODIFIED notmod ] head-comp-phrase: [ HEAD-DTR.SYNSEM.MODIFIED notmod ] This decreased the redundant parses, but did not eliminate them. Some parses that couldn't be eliminated in this manner remained, as illustrated by: precize tondas harojn ili precisely shear-PRES hair-PL-ACC they They cut hair precisely This sentence previously had three semantically identical parses, and blocking the complementation of a modified verb decreased them to two. Those remaining were parses in which the adverb "precize" modified the verb phrase "tondas harojn" in one case and modified the sentence "tondas harojn ili" in the other case. The rule licensing both of these modifications was adj-head-int. I investigated dealing with this surviving redundancy by requiring or prohibiting the presence of a subject in the head daughter of an adj-head-int phrase. I found that this would work, but would have adverse side-effects. Requiring a subject (i.e. constraining HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.SUBJ to have the value "null") would eliminate the sole parse of ili precize tondas in which the adverb modifies the verb before the subject is realized (adverbs attach only to their right). And prohibiting a subject (i.e. constraining HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.SUBJ to have the value "cons") would eliminate a semantically unique (VP-modified) parse of precize ili tondas leaving only a parse in which the adverb modifies the subject. Finally, I succeeded in eliminating the remaining redundancies by barring the head daughter of an adj-head-int phrase from having a post-head subject descendant. To accomplish this, I first investigated the possibility of requiring the adj-head-int type to have a value of "-" on HEAD-DTR.SYNSEM.LOCAL.CAT.POSTHEAD. But the POSTHEAD value of the subj-head type is likewise "+", so this method would eliminate a semantically unique parse in sentences such as precize ili tondas So, I resorted to defining a Boolean feature, "PREMODIFIABLE", that permits or prohibits adverbial modification. I made it appropriate for the type "valence", constrained the "head-subj-phrase" type to have a value "-" on SYNSEM.LOCAL.CAT.VAL.PREMODIFIABLE, and constrained the "all-adverb-lex" type to make the first item of its SYNSEM.LOCAL.CAT.HEAD.MOD list have a "+" value on LOCAL.CAT.VAL.PREMODIFIABLE. Since the adj-head-int phrase requires these two values to be identical, adverbial modification is now prevented once a post-head subject is realized. Imperatives Esperanto has a verbal suffix that expresses imperativity, exhortation, and volition. Occasionally it is used even to express states and actions considered but not realized, as in: La akvo tro malprofundas ke vi plonĝu the water-NOM too shallow-PRES that you dive-IMP The water is too shallow for you to dive Ili permesis ke mi parolu they permit-PAST that I speak-IMP They permitted me to speak This -u suffix alternates with the suffixes expressing declarative tenses and the suffix expressing conditionality. For example, the noun "vintro" ('winter') and the verb stem "ven" ('come') can be combined in any of these ways (and equivalently in the noun-verb order): venis vintro (Winter came) venas vintro (Winter is coming) venos vintro (Winter will come) venus vintro (Winter would come) venu vintro (Let winter come) I am calling any verb with the -u suffix and "imperative" verb. Imperative verbs are distributionally almost identical to declarative and conditional verbs, in that, like them, imperative verbs may have subjects of all persons and numbers, may be embedded in both declarative and interrogative complementizer clauses, and may be the verbs of wh-questions. Examples: mi pensu I think-IMP Let me think cxu mi atendu TF I wait-IMP Shall I wait? mi ne scias cxu mi atendu I not know-PRES TF I wait-IMP I don't know whether (I ought) to wait kial ili tion kredu why they that-N-ACC believe-IMP Why should they believe that? mi petas ke vi rapidu I request-PRES that you-NOM fast-IMP Quickly, please However, when an imperative verb that would require a subject if declarative or conditional has no overt subject and is the verb of a declarative sentence, then the sentence is grammatical and the implied subject has the second person and an underspecified number. The number may be revealed if the verb has an adjectival predicative complement, as in: estu saĝa be-IMP wise-NOM-SG Be wise (singular) estu saĝaj be-IMP wise-NOM-PL Be wise (plural) The grammaticality of second-person imperative verbs with no overt subjects does not extend to other contexts and does not extend to non-imperative verbs in any context. Thus: *kial tion kredu why that-N-ACC believe-IMP *mi petas ke rapidu I request-PRES that fast-IMP *respondos baldaŭ answer-FUT soon Some verbs lexically license null subjects, and these may be inflected imperatively, as in: mi preĝas ke pluvu I pray-PRES that rain-IMP I pray that it rain varmiĝu become-hot-IMP May it warm up Imperative verbs of this class that are the verbs of declarative sentences and have no expressed subjects can produce ambiguities, arising from the two licenses. Thus, "varmiĝu", above, may also mean "(You) warm up". This kind of ambiguity is resolved in the event of a non-nominal predicative complement, since it will be adjectival if the subject is merely covert, but adverbial if null, as in: estu trankvila be-IMP quiet Be quiet estu trankvile be-IMP quietly Let it be quiet Interrogatives Esperanto applies interrogativity to any constituent by preceding it with the question word "ĉu". This word may appear at the beginning of a verb clause. If it is to be attached to a subclausal constituent, that constituent appears immediately after it, though because of the free order of clausal constituents there are typically both clausal and subclausal interpretations of such an attachment. For example: ĉu pluvas TF rain-PRES Is it raining? ĉu min vi kritikas TF me you-NOM criticize-PRES Is it I that you're criticizing? Are you criticizing me? ĉu matene vi plej komforte verkas TF morning-ADV you-NOM most comfort-ADV write-as-compose-PRES Is it in the morning that you're most comfortable writing? Are you most comfortable writing in the morning? When a speaker wants to disambiguate the attachment, the usual device is to ask a wh-question and follow it with an interrogative sentence fragment containing only the questioned constituent, such as: kiam vi plej komforte verkas; ĉu matene when you-NOM most comfort-ADV write-as-compose-PRES TF morning-ADV When are you most comfortable writing; in the morning? Any verb form may be subjected to interrogative treatment: ĉu pluvas TF rain-PRES Is it raining? se mi eksiĝus ĉu vi plendus if I ex-MED-COND TF you-NOM complain-COND If I resigned, would you complain? ĉu ni dancu TF we dance-IMP Shall we dance? mi ne scias ĉu plori I not know-PRES TF weep-INF I don't know whether to cry It appears to me, then, that Esperanto syntax treats interrogativity as a higher-level attribute than imperativity. A clause can be interrogative or non-interrogative. In either case, its verb can be indicative (tensed), conditional, or imperative. Clausal Embedding Esperanto exhibits embedded clauses introduced by both relative words and complementizers. This discussion is limited to the latter. A complementizer-embedded clause's valence depends on the complementizer. If the complementizer is the interrogative one ("ĉu"), the clause has an interrogative character, which does not depend on whether its verb is indicative, conditional, or imperative. However, if the complementizer is the non-interrogative one ("ke"), the clause's character depends on the verb, with an imperative verb making the clause imperative and an indicative or conditional verb making the clause declarative. This declarative, imperative, or interrogative character of the clause determines its ability to be selected as a complement by a matrix verb. Some verbs can and others cannot be complemented by a complementizer clause. Of those that can be, for some the complement clause must be interrogative, for others it must be non-interrogative, and for others it may be either. When it must be non-interrogative, in some cases it must further be imperative. In all of these cases, however, the embedded clause's verb may be imperative (even though that doesn't make the clause itself imperative when the clause's complementizer is interrogative). For example, the verb "postul" ('demand') requires that a complementizer-clausal complement be imperative (thus with the "ke" complement and an imperative verb), as in: mi postulas ke vi silentu I demand-PRES that you-NOM silent-IMP I demand that you be silent The verb "esper" ('hope') requires that a complementizer-clausal complement be non-interrogative (with "ke") but not necessarily imperative, as in: mi esperas ke infanoj silentu dum koncertoj I hope-PRES that children-NOM silent-IMP during concerts-NOM I hope children are expected to be silent during concerts iuj esperas ke ili neniam mortos some-N-PL hope-PRES that they never die-FUT Some hope that they'll never die The verb "demand" ('ask to know') requires that a clausal complement be interrogative (with "ĉu"), optionally with an imperative verb, as in: demandu ĉu vi restu hejme ask-IMP TF you-NOM remain-IMP home-ADV Ask whether you should stay home mi demandos ĉu vi rajtas kunveturi I ask-FUT TF you-NOM right-PRES with-travel-INF I'll ask whether you can come along Other verbs, including "sci" ('know') and "dub" ('doubt'), permit the full range of complementizer-clause complement types illustrated above. Syntactic Coverage The current grammar covers both matrix and embedded clauses with the full range of indicative, conditional, and imperative verb forms, and covers embedded clauses with both interrogative and non-interrogative complementizers. Verbs correctly select for complement clause types, including the permitted combinations of complementizers and verb forms. The correct range of free word orders within clauses is recognized. As a result, bizarre and human-challenging center-embedded sentences are treated as grammatical, such as: dubas ke dubas ke dubas ke dubas mi mi mi mi doubt-PRES that doubt-PRES that doubt-PRES that doubt-PRES I I I I I doubt whether I doubt whether I doubt whether I doubt Realistically complex sentences combining verb forms, the forms of clausal embedding, and adjectival and adverbial modifiers are correctly parsed, such as: postulis nur ili ke mi demandu cxu flavaj hundoj frenezaj timu ke iliaj kadukaj estroj nepre tondos cxies harojn ;demand-PAST only they that I ask-IMP TF yellow-NOM-PL dog-NOM-PL ;crazy-NOM-PL fear-IMP that their-NOM-PL decrepit-NOM-PL masters-NOM ;definitely shear-FUT everybody's hairs-ACC ;'They alone demanded that I ask whether crazy yellow dogs should fear ;that their decrepit masters would definitely cut everybody's hair.' Individual violations introduced into such sentences prevent them from being parsed. The grammar correctly accepts covert-subject imperative sentences and correctly rejects attempts to embed such clauses. Multiple parses that are semantically indistinguishable have been eliminated. The second-person pronoun "vi" has singular and plural interpretations, so any appearance of it in a sentence gives rise to dual parses. Possessive pronouns are treated as existing in determiner and adjective versions, so their use produces dual parses, whose MRSs differ but not in a way that seems entirely correct or useful, so I intend to seek a better treatment of these pronouns. Various constructions not intended to be covered have not been covered. These include infinitive verbs, predicative complements, and categorial (noun-to-verb etc.) derivations. Syntactic Analysis The main device used for the correct treatment of matrix and embedded clauses is a set of additional Boolean "cat" features: IMPER marks a phrase as imperative or not. QUESTION marks a phrase as interrogative or not. SENTENCE marks a phrase as a semantically annotated clause or not. ROOTMOM marks a phrase as mandatorily having the root node as its mother or not. I have constrained the "headed-phrase" type to require it to copy its head daughter's values of IMPER and QUESTION to itself. The imperative verb lexical rule gives a word an [ IMPER + ] value, and the other verb lexical rules give a word an [ IMPER - ] value. This value is copied to the root node by virtue of the just-mentioned constraint, except that the copying is stopped if the VP is a complement. In that case, the complementizer decides what its own IMPER value is. If it is the interrogative complementizer, it makes itself [ IMPER - ]. As for the "ke" complementizer, the grammar treats it as two separate lexemes with the same stem, a declarative complementizer and an imperative complementizer. The declarative "ke" requires that its complement be [ IMPER - ] and gives itself that value, too. The imperative "ke" requires that its complement be [ IMPER + ] and gives itself that value, too. This bifurcation seems to facilitate assigning two different MSG values to the complementizer depending on the IMPER value of its complement. The QUESTION feature's value is set by complementizers. It is "+" in the interrogative and "-" in the declarative and imperative complementizers. The headed-phrase constraint copies this value to the CP mother and, if there is a CP grandmother, to it, too. The grammar uses the SENTENCE feature to assure that a phrase is semantically annotated before being used as a complement or as a sentence. The grammar adds to the constraints on the "head-nexus-phrase" type by making it [ SENTENCE - ]. This value is inherited by the various subject-head and complement-head phrase types. The complementizers require their complement to be [ SENTENCE + ] and give themselves a [ SENTENCE - ] value. Four grammar rules, for declarative, imperative, command, and interrogative clauses, require their head daughters to be [ SENTENCE - ] (as phrases generally are) and give themselves a [ SENTENCE + ] value. The ROOTMOM feature assures that imperative clauses with covert subjects appear only as sentences and not as embedded clauses. Phrases generally inherit [ ROOTMOM - ] from three supertypes, "basic-head-subj-phrase", "basic-head-comp-phrase", and "adj-head-int-phrase". The complementizer types require their complement clause to be [ ROOTMOM - ]. The "bare-imperative-phrase" type, which licenses covert-subject imperative phrases, makes itself [ ROOTMOM + ]. This prevents it from then being used as a complement by a complementizer. All clausal semantic types copy their daughters' ROOTMOM value to themselves. The "command" clause type requires itself (and thus its daughter) to be [ ROOTMOM + ], so this type is used for a covert-subject imperative sentence. The other clause types require themselves and their daughters to be [ ROOTMOM - ], so they are used for all other clauses. Verb lexeme types use these features for complement selection. Those selecting imperative complementizer-clause complements require their complements to be [ SENTENCE - ], [ QUESTION - ], and [ IMPER + ]. Those selecting non-interrogative complements that may be declarative or imperative require their complements to be [ SENTENCE - ] and [ QUESTION - ] but don't specify IMPER. Those selecting interrogative complements require them to be [ SENTENCE + ], [ QUESTION + ], and [ IMPER - ]. The effect is to require interrogative complements to have semantic annotations added, reflecting the contribution of the interrogative complementizer, but not to require that of other complements, since "ke" doesn't require any additional semantic annotation. This regime is compatible with the fact that complementizer phrases with the interrogative "ĉu", when semantically annotated, are qualified to be both stand-alone sentences and embedded clauses, while those with the non-interrogative "ke" may be used as complements but cannot become stand-alone sentences. I looked for existing features that would have the same functionality but didn't find any that seemed practical. I considered trying to use constraints on semantic features, and in fact some of those do seem to have desired effects, but I wondered whether reliance on semantic constraints for syntactic modeling could cause future problems arising from the tendency for many syntactic forms to have the same semantic value and vice-versa. So I surmised that adding "cat" features would be the most robust strategy. Semantic Analysis Lexical rules that inflect nouns and determiners for number constrain the lexemes' values of CONT.HOOK.INDEX.PNG.NUM accordingly. Lexical entries for pronouns do so to NUM and PER, with a lexical type doing this batchwise for the numerous third-person singular pronouns. Several lexical and phrasal types constrain their CONT.MSG.PRED values, and verb types constrain those of their complements. However, these verb types also constrain their complements' CAT.IMPER, CAT.SENTENCE, CAT.QUESTION values, making it unclear, pending further investigation, whether the PRED constraints are superfluous. Other semantic constraints are introduced in various type definitions, but I cannot explain their purposes and effects without further study. The semantic contribution of the interrogative complementizer is not yet being correctly reflected in indexed MRS results. All proposition_m_rel elements above such a question's question_m_rel element are missing, and the question_m_rel's argument is an indirect rather than a direct reference to the embedded proposition. Other than that problem, MRS results appear to be as expected for declarative and interrogative matrix clauses with any level of embedding of declarative clauses. The grammar's definition of "complementizer-lex-item" does not include the recommended identification of the complementizer's MSG with its complement's MSG value. This is because this identification appeared to prevent the correct use of this type as a supertype for the interrogative complementizer type, and because the omission of this constraint didn't seem to cause any errors. Testing My testing has been confined to single-sentence and batch parses within LKB. I intend to begin deploying the itsdb tool once I have corrected the semantics of the existing grammar.