;;; Hey, emacs(1), this is -*- Mode: fundamental; Coding: utf-8; -*- got it?

;;;
;;; first shot at collecting the various parameters needed for the extraction
;;; of bi-lexical (semantic) dependencies from ERG analyses.
;;;

;;;
;;; some relations are made transparent, in the sense of equating themselves
;;; with one of their arguments. for nominalizations, for example, assume the
;;; following:
;;;
;;;   x1:nominalization[ARG1 e2]
;;;
;;; flagging ‘nominalization’ as transparent conceptually means that nodes ‘x1’
;;; and ‘e2’ are unified, i.e. treated as a single node. practically, the two
;;; node labels ‘x1’ and ‘e2’ become interchangeable.
;;;
;;; for coordinate structures, one and the same predicate at times uses L-HNDL
;;; and R-HNDL, or L-INDEX and R-INDEX, or both. the section on [redundant]
;;; arguments below will eliminate the index arguments when handles are present
;;; too, so redundancy elimination must be performed prior to making relations
;;; [transparent]. still, when making transparent conjunction relations, we
;;; need to be able to equate either L-HNDL or L-INDEX with the conjunction,
;;; depending on which of the two is actually present.
;;;
[transparent]
comp_equal ARG1
;;
;; _fix_me_
;; the ‘eventuality’ of the quasi-modal construction (‘the train is to leave.’)
;; is introduced over the span of the VP, such that we currently have no way of
;; keeping it as a dependency (one would have to determine the ‘external head’
;; (‘leave’, in this case) of that span, which may or may not be a well-defined
;; notion). thus, for now, make it transparent. (12-jan-14; oe)
;;
eventuality ARG1
nominalization ARG1
unknown ARG

;;;
;;; some grammatical constructions introduce relations that can be projected
;;; into word-to-word dependencies. N-N compounds, for example, introduce a
;;; two-place relation (holding between two entities) quite similar to that of
;;; a preposition. thus, ‘hospital arrival’ vs. ‘patient arrival’ can have a
;;; semantic structure isomorphic to ‘arrival at the hospital’ vs. ‘arrival
;;; by the patient’, only the underspecified two-place ‘compound’ predicate
;;; symbol does not attempt to make explicit the exact nature of the relation.
;;; in the following, each predicate symbol is followed by two role names, of
;;; which the first should be the head, the second the argument in the word-to-
;;; word relation. note that the ‘special’ role ARG0, below, means that the
;;; node itself (e.g. the one labelled with the predicate ‘part_of’) is to be
;;; taken as the head of the dependency, i.e. effectively (for the ‘part_of’
;;; case) its ARG1 dependency is renamed. finally, note that it can of course
;;; be desirable, in principle, to classify a relation as both transparent and
;;; relational. for now, the dependency label introduced here will be taken
;;; from the predicate symbol matched as relational, e.g. ‘and_c’.
;;;
;;;
;;; _fix_me_
;;; just now, it would seem that making a dependency relational should always
;;; imply transparency of the dependency, i.e. its own identity (ARG0) should
;;; be equated with the identity of the head (which, transitively, may lead to
;;; clusters of equated node identities). on this assumption, further naming
;;; any of the relational dependencies as transparent should either overwrite
;;; or augment the identity equations; discuss with angelina. (28-dec-13; oe)
;;;
[relational]
/_c$/ L-HNDL R-HNDL
/_c$/ L-INDEX R-INDEX
appos ARG2 ARG1
;;
;; as of late summer 2014, an optional fourth argument in this section can be
;; used to rename dependencies, e.g. the ARG1 and ARG2 of the ‘comp’(arative)
;; relation to something like ‘comp’ and ‘ref’(erence), so that, for example,
;; ‘shorter’ and ‘more alert’ both can wear the comparative construction on
;; their sleeves. note, though, that in morphological comparatives (‘shorter’)
;; the ‘comp’ dependency conceptually would be a cycle, from ‘short’ to itself,
;; and thus is not included in the DM bi-lexical projection.
;;
/^comp(?:_|$)/ ARG0 ARG1
/^comp(?:_|$)/ ARG0 ARG2 than
compound ARG2 ARG1
compound_name ARG2 ARG1
discourse L-HNDL R-HNDL
discourse L-INDEX R-INDEX
implicit_conj L-HNDL R-HNDL
implicit_conj L-INDEX R-INDEX
loc_nonsp ARG2 ARG1
loc_sp ARG2 ARG1
measure ARG2 ARG1
;;
;; _fix_me_
;; it would be tempting to make ‘neg’ a dependency label in its own right, i.e.
;; pull the same trick as we do for ‘part_of’, for example. however, there is
;; more to negation, as there is some special treatment in the converter for
;; contracted auxiliaries, e.g.
;;
;;   he didn't arrive. -- he did not arrive.
;;   isn't abrams tall? -- is abrams not tall?
;;   isn't abrams the chair? -- is abrams not the chair?
;;
;; in the last of these, the identity copula actually needs to be a node in the
;; DM graph, as its relation can be modified (e.g. negated), which probably
;; means our choice of head in pulling apart the contraction (when going into
;; PTB-style tokenization) needs to be reconsidered. (13-jan-14; oe)
;;
neg ARG0 ARG1
nonsp ARG2 ARG1
of_p ARG2 ARG1
parenthetical ARG2 ARG1
;;
;; _fix_me_
;; upon closer inspection, this seems to currently not have any effect. check
;; with angelina. (13-jan-14; oe)
;;
part_of ARG0 ARG1
plus ARG2 ARG3
poss ARG2 ARG1
subord ARG2 ARG1
/^temp_loc.*/ ARG2 ARG1
times ARG2 ARG3
unspec_manner ARG2 ARG1
with_p ARG2 ARG1 instrument

;;;
;;; some lexical entries, notably contracted negations like |couldn't|, carry
;;; multiple relations that it is desirable to tease apart, i.e. distribute
;;; over multiple tokens. seeing that the actual analysis only used a single
;;; lexical token (i.e. leaf node in the derivation), we need to break that up
;;; into multiple tokens. for the time being, at least, we assume that these
;;; cases are limited to instances of more fine-grained initial tokenization.
;;; for |can't|, for example, the initial (PTB-compliant) tokenization would
;;; have been |ca| |n't|, and the leaf node in the derivation will actually
;;; carry unique pointers (in its +ID list) to the underlying initial tokens.
;;; thus, it is sufficient to (a) identify contracted leaf nodes by lexical
;;; entry identifier, (b) extract the corresponding initial tokens and their
;;; characterization, (c) align these tokens with an equal number of relations
;;; in the EDS (within the same characterization, and bearing predicate names
;;; as in the list following the lexical entry name); and finally (d) adjust
;;; the characterization of these relations according to the initial tokens.
;;;
;;; _fix_me_
;;; but what about additional initial tokens, say punctuation, that can be
;;; attached to the same lexical token? the only way of discounting these in
;;; the alignment of initial tokens to EDS relations, presumably, would be to
;;; detect orthographemic rules in the derivation tree (and distinguish which
;;; are prefix vs. suffix ones), to discard these from the +ID list.
;;; (27-feb-12; oe)
;;;
[contracted]
can_aux_neg_1 _can_v_modal neg
does1_neg_1 _ neg
do1_neg_1 _ neg

;;;
;;; in coordinate structures, there will often (but not always) be pairs of
;;; (right and left) index and handle. at the EDS level, however, there is no
;;; remaining distinction between non-scopal and scopal arguments, hence purge
;;; one of the two pointers (typically to the same thing, when both index and
;;; handle are present). to mirror our decision about root nodes in clauses,
;;; prefer handle links over index ones. in the following, we specify triples
;;; of (a) a predicate symbol (which can be a regular expression, as always);
;;; (b) an argument link role to be tested for existence; and (c) an argument
;;; link to be purged, when the argument of (b) is present.
;;;
[redundant]
/.*/ L-HNDL L-INDEX
/.*/ R-HNDL R-INDEX

;;;
;;; by and large, relations corresponding to actual words are characterized by
;;; so-called ‘lexical’ predicate symbols, ones starting with an underscore.
;;; there are, however, a number of predicate symbols that do not obey this
;;; convention and yet can often be associated with a surface token. for the
;;; time being, our approach is to enumerate all such symbols, i.e. only the
;;; relations on this list will be considered for word-to-word dependencies.
;;;
[lexical]
/^_.*/
abstr_deg
basic_card
card
cop_id
dofm
dofw
ellipsis_expl
ellipsis_ref
elliptical_n
excl
fraction
generic_entity
id
interval
little-few_a
manner
measure
mofy
much-many_a
named
named_n
ne_x
neg
numbered_hour
ord
part_of
person
person_n
place_n
polite
pron
season
superl
temp
thing
time
time_n
year_range
yofc

;;;
;;; at the end of the day, allow some renaming of roles and predicates; assume
;;; these are applied after all of the above are evaluated, so as to not have
;;; to think about interference of rule sections.
;;;
[roles]
L-HNDL ARG1
L-INDEX ARG1
R-HNDL ARG2
R-INDEX ARG2

[predicates]
compound_name compound
implicit_conj conj
/^loc_.*/ loc
part_of part
/^temp_.*/ temp
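;;;
;;; a worked illustration of how the sections above are meant to interact.
;;; this is a sketch only: the node identifiers are made up, and that the
;;; [predicates] renaming also applies to dependency labels is an assumption,
;;; not something the rules above state. take a hypothetical coordination node
;;; carrying both handle and index arguments:
;;;
;;;   e1:implicit_conj[L-INDEX e2, R-INDEX e3, L-HNDL e2, R-HNDL e3]
;;;
;;; (1) [redundant]: L-HNDL and R-HNDL are present, so L-INDEX and R-INDEX
;;;     are purged, leaving e1:implicit_conj[L-HNDL e2, R-HNDL e3];
;;; (2) [relational]: ‘implicit_conj L-HNDL R-HNDL’ matches, with ‘e2’ as the
;;;     head and ‘e3’ as the argument of a dependency labelled ‘implicit_conj’;
;;; (3) [predicates]: ‘implicit_conj conj’ would then rename that label to
;;;     ‘conj’.
;;;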