;;; -*- Mode: tdl; Coding: utf-8; indent-tabs-mode: nil; -*-

;;;
;;; upon completion of `lexical parsing' (i.e. application of lexical rules
;;; until a fix-point is reached), we can now filter lexical entries. there is
;;; little point attempting to do that earlier (as PET used to in its original
;;; `-default-les' mode, where generics were only activated where there seemed
;;; to be `gaps' in the _initial_ lexical chart, i.e. after lexical lookup).
;;;
;;; the main problem in this approach is the interaction with orthographemics:
;;; in the initial lexical chart, there will be an edge analysing |UPS| as the
;;; plural or 3sg present tense form of the preposition |up|. it is only once
;;; lexical rules have been processed that we know such hypotheses have turned
;;; out invalid. thus, lexical filtering rules below operate on lexical edges,
;;; lexical entries that have gone through any number of lexical rules, i.e.
;;; everything that would ordinarily feed into syntactic rules.
;;;
;;; initially, our strategy is conservative: whenever there is a native entry,
;;; purge all generic entries in the same chart cell, unless there is a good
;;; reason to keep some. for now, only capitalization is considered a reason,
;;; and even there (i.e. for generic names), certain types of native entries
;;; will filter.
;;;
;;; both on tokens and signs, the `native' vs. `generic' distinction is made in
;;; ONSET values: `con_or_voc' vs. `unk_onset'.
;;;

;;
;; throw out generic whenever a native entry is available, unless the token is
;; a named entity (which now includes names activated because of mixed case or
;; non-sentence-initial capitalization).
;;
generic_non_ne+native_lfr := lexical_filtering_rule &
[ +CONTEXT < [ SYNSEM.PHON.ONSET con_or_voc ] >,
  +INPUT < [ SYNSEM.PHON.ONSET unk_onset, ORTH.CLASS non_ne ] >,
  +OUTPUT < >,
  +POSITION "I1@C1" ].

;;
;; a native name, however, should suppress generic names, even NE ones.
;;
proper_ne+name_lfr := lexical_filtering_rule &
[ +CONTEXT < [ SYNSEM [ PHON.ONSET con_or_voc,
                        LOCAL.CAT.HEAD noun,
                        LKEYS.KEYREL.PRED abstr_named_rel ] ] >,
  +INPUT < [ SYNSEM [ PHON.ONSET unk_onset,
                      LKEYS.KEYREL.PRED named_unk_rel ] ] >,
  +OUTPUT < >,
  +POSITION "I1@C1" ].

;;
;; discard generic names (even NE ones) for |I| (and possibly other pronouns),
;; as the rule stands now.
;; _fix_me_
;; seeing that |I| is the only pronoun that is standardly capitalized, should
;; we maybe make the +CONTEXT more specific. what are the chances of someone
;; launching a |Live You| product? (13-nov-08; oe)
;;
proper_ne+pronoun_lfr := lexical_filtering_rule &
[ +CONTEXT < [ SYNSEM [ PHON.ONSET con_or_voc,
                        LKEYS.KEYREL.PRED pron_rel ] ] >,
  +INPUT < [ SYNSEM [ PHON.ONSET unk_onset,
                      LKEYS.KEYREL.PRED named_unk_rel ] ] >,
  +OUTPUT < >,
  +POSITION "I1@C1" ].

;;
;; likewise, discard a generic name for tokens like |Jr.| (post-head titles)
;;
;; _fix_me_
;; but |The Church -- Turing thesis| makes this rule doubtful. why did we
;; think it would be justified in the first place? (23-oct-08; oe)
;;
#|
proper_ne+title_lfr := lexical_filtering_rule &
[ +CONTEXT < [ SYNSEM [ PHON.ONSET con_or_voc,
                        LOCAL.CAT.HEAD ttl ] ] >,
  +INPUT < [ SYNSEM [ PHON.ONSET unk_onset,
                      LKEYS.KEYREL.PRED named_unk_rel ] ] >,
  +OUTPUT < >,
  +POSITION "I1=C1" ].
|#

;;
;; a named entity corresponding to a name kills a PoS-activated generic name,
;; unless that is a named entity itself.
;;
generic_name+ne_name_lfr := lexical_filtering_rule &
[ +CONTEXT < [ SYNSEM.PHON.ONSET unk_onset,
               ORTH.CLASS namedentity ] >,
  +INPUT < [ SYNSEM [ PHON.ONSET unk_onset,
                      LKEYS.KEYREL.PRED named_unk_rel ],
             ORTH.CLASS non_ne ] >,
  +OUTPUT < >,
  +POSITION "I1@C1" ].