Gpred is for `grammar' predicates: these have to be enumerated in the SEM-I,
but are much too frequent to list in the DTD.

I am not sure that Realpred should be empty - possibly the lemma
should be an element rather than an attribute

Note that the LKB code does not treat pred case as significant - the only
place where case is respected is string-valued arguments (CARG and ??)

POS - now lowercase and obligatory
(v|n|j|r|p|q|c|x|u|a|s)
v - verb
n - noun
a - adjective or adverb (i.e. supertype of j and r)
j - adjective
r - adverb 
s - verbal noun (used in Japanese and Korean)
p - preposition
c - conjunction
q - quantifier (needs to be distinguished for scoping code)
x - other closed class
u - unknown

The implicit hierarchy is 
 n := u
 v := u
 a := u
 j := a
 r := a
 s := n, s := v
 p := u
 q := u
 c := u
 x := u
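
As a rough illustration only, the hierarchy above can be written down as
a parent table with a subsumption test.  This is a sketch, not code from
the LKB: the names PARENTS, ancestors and subsumes are made up here, and
`x := y' is read as `x is a subtype of y'.

```python
# Immediate supertypes, copied from the implicit hierarchy listed above.
# The table and function names are hypothetical, for illustration only.
PARENTS = {
    "n": ["u"], "v": ["u"], "a": ["u"],
    "j": ["a"], "r": ["a"],
    "s": ["n", "v"],          # s has two parents
    "p": ["u"], "q": ["u"], "c": ["u"], "x": ["u"],
    "u": [],                  # u is the top of the hierarchy
}


def ancestors(pos):
    """Return the set of all strict supertypes of pos."""
    seen = set()
    stack = list(PARENTS[pos])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(PARENTS[parent])
    return seen


def subsumes(general, specific):
    """True if general is specific itself or one of its supertypes."""
    return general == specific or general in ancestors(specific)
```

With this reading, subsumes("a", "j") holds but subsumes("j", "a") does
not, and s is subsumed by both n and v.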

Labels are always implicitly of the same sort (i.e., label)

Var is the old arg: the list of attributes must be determined by the
Matrix/SEM-I, so the current list is still preliminary

Note that the reason for not making vids be IDs in the XML sense is two-fold:
a) it's too much hassle to ensure uniqueness in a document
b) this isn't supported in RELAX NG - the argument is that this is the sort
of thing that's part of the semantics, not the syntax, which I tend to agree
with - there are lots of things that have to be checked outside the DTD anyway

Rargnames should be enumerated in the SEM-I, but I don't propose to do this in
the DTD

cfrom and cto are on eps and rmrs - they might also end up on the other
main elements (in-groups, hcons, rargs).  Having them on preds seems
too restrictive/cumbersome - they should be associated with minimal
information units.

cfrom and cto default to -1 if unknown
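
A minimal sketch of how a consumer might read these attributes.  The
XML fragment is cut down for illustration and is not valid against the
DTD, and the helper name char_span is made up.  Treating a missing
attribute the same way as an explicit -1 is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

UNKNOWN = -1  # value used when the character position is not known


def char_span(elem):
    """Return (cfrom, cto) for an element, using -1 where not given."""
    cfrom = int(elem.get("cfrom", UNKNOWN))
    cto = int(elem.get("cto", UNKNOWN))
    return cfrom, cto


# Illustrative fragment only: real eps have child elements.
root = ET.fromstring(
    '<rmrs cfrom="0" cto="8"><ep cfrom="0" cto="3"/><ep/></rmrs>'
)
spans = [char_span(ep) for ep in root.iter("ep")]
```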

In-groups are used for predicates which have the same handle in the
MRS - they are grouped in one in-group <ing> but given unique handles
for the RMRS.  I [AAC] am still not entirely happy with in-groups.

The surface attribute on rmrs and ep is used to give the original
surface string.  It is optional.  

The base attribute is used for languages where the current parsing
engines cannot represent the lemma properly.  For example, in Jacy the
lemma is in ASCII, but the base form should often be Chinese
characters.  This attribute is deprecated - as the lemma attribute in
realpred is a string, in theory it can handle the real base form (we
can use all of Unicode).  When all the tools can handle this, base can
be removed.  For the moment, base should only be added if it is
different from both lemma and string.
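
The rule for when to add base can be sketched as below.  This assumes
that `string' here means the surface string; the function name
base_attribute is hypothetical, and the romanised lemma in the usage
example is invented for illustration.

```python
def base_attribute(base, lemma, surface=None):
    """Return base if it should be written out, otherwise None.

    base is only worth adding when it differs from both the lemma and
    the surface string (assumed reading of `string' in the note above).
    """
    if base is None or base == lemma or base == surface:
        return None
    return base


# Base form differs from an ASCII lemma and from the inflected surface.
keep = base_attribute("食べる", "taberu", "食べた")
# Base form is identical to the lemma, so it adds nothing.
drop = base_attribute("dog", "dog", "dogs")
```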

ident is an attribute on rmrs elements that identifies which utterance
they belong to.  The HoG currently uses a wrapper around the RMRS, with
identifying information there instead.  Hinoki uses the ident
attribute but may switch to a wrapper, in which case ident may be
removed.  In any case it is optional.
 
04/09 changed the DTD to conform with 09/02 ERG SEM-I VPM
though note the use of plus and minus for + and -