resa is a tool for evaluating syntactic analysis at multiple levels.
Evaluation is over triples denoting labelled spans, with the spans described
using inter-character positions. Spans are calculated using an automatic
alignment between annotations and the original raw text, so mismatches in
sentence or token segmentation do not preclude evaluation.

For more details, see:

Rebecca Dridan & Stephan Oepen. 2013. Document parsing: Towards realistic
syntactic analysis. In Proceedings of the 13th International Conference on
Parsing Technologies, Nara, Japan.

Added 30/7/2014
Important note: evalb .prm files can be used almost without modification.
However, not all annotation schemes consider everything after a hyphen in a
label to be irrelevant, so to replicate evalb's discarding of function tags
etc., you should add:

    LABEL_DELIM -

to your .prm file. The value of LABEL_DELIM should be a string containing
any character you consider an indication of the end of a label, e.g.

    LABEL_DELIM -=:/

INSTALL

    autoreconf -i
    ./configure
    make

USAGE

Usage: resa
Options:
  -h [ --help ]               This usage information.
  -r [ --raw ] arg            Raw text file, required unless gformat and
                              tformat are CHAR.
  -g [ --gold ] arg           Gold annotation file.
  -t [ --test ] arg           Test annotation file.
  -v [ --verbose ]            Unmatched tuples printed to STDERR.
  -s [ --stats ]              Print output in tab-delimited form.
  -f [ --fuzzy ]              Allow fuzziness in spans around punctuation.
  -u [ --unlabelled ]         Unlabelled evaluation for phrase structure
                              labels or dependencies.
  -m [ --multi ]              Allow multiple tags (only valid for POS tags
                              in LINE format).
  -b [ --boundary ]           Use sentence boundary end point, rather than
                              span.
  -i [ --interim ]            Output interim characterised files. Files
                              will be of form .{gold,test}.tuples
  -p [ --param ] arg          evalb-style parameter file.
  -G [ --gformat ] arg (=MRG) Annotation format of gold file:
                                LINE:   1 sentence per line
                                TAB:    1 token per line, second+ column(s)
                                        (if present) considered to be POS.
                                        Empty lines considered sentence
                                        breaks.
                                MRG:    .mrg format as in Penn Treebank.
                                CONLLX: CONLL-X format.
                                CHAR:   Characterised tuples as from option
                                        --interim
  -T [ --tformat ] arg (=MRG) Annotation format of test file, options as
                              for gold format.
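
EXAMPLE (ILLUSTRATIVE SKETCH)

The idea of scoring labelled spans given as inter-character positions, and
the effect of LABEL_DELIM, can be illustrated with a small Python sketch.
This is not resa's implementation: the function names (strip_label, score),
the (start, end, label) tuple layout, and the precision/recall/F1 scoring
are assumptions made for illustration only. The sketch assumes the spans
have already been aligned to the raw text "The cat sat.", where position 0
is before "T" and position 12 is after ".".

```python
from collections import Counter

def strip_label(label, delims):
    """Cut a label at the first delimiter character (hypothetical helper).

    The i > 0 guard keeps the result from ever being empty; how resa treats
    labels that begin with a delimiter is not specified in this file.
    """
    for i, ch in enumerate(label):
        if i > 0 and ch in delims:
            return label[:i]
    return label

def score(gold, test, delims=""):
    """Return (precision, recall, f1) over (start, end, label) triples."""
    gold_bag = Counter((s, e, strip_label(l, delims)) for s, e, l in gold)
    test_bag = Counter((s, e, strip_label(l, delims)) for s, e, l in test)
    # Multiset intersection: each gold triple can match at most one test triple.
    matched = sum((gold_bag & test_bag).values())
    precision = matched / sum(test_bag.values()) if test_bag else 0.0
    recall = matched / sum(gold_bag.values()) if gold_bag else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Spans over the raw text "The cat sat." as inter-character positions.
gold = [(0, 12, "S"), (0, 7, "NP-SBJ"), (8, 11, "VP")]
test = [(0, 12, "S"), (0, 7, "NP"), (8, 12, "VP")]

print(score(gold, test))              # labels compared exactly as written
print(score(gold, test, delims="-"))  # function tags after "-" discarded
```

With no delimiters, NP-SBJ and NP count as different labels, so only the S
span matches. With delims="-", the NP span matches as well, while the VP
span still fails because its end position differs (11 versus 12).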