Introduction:

Elementary Dependency Match (EDM) is a granular evaluation metric based on the 'ltriples' format that can be exported from a [incr tsdb()] profile. These triples are derived from the variable-free reduction of MRSs known as ''Elementary Dependencies'', described in:

Stephan Oepen and Jan Tore Lønning. 2006. Discriminant-Based MRS Banking. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006), pages 1250-1255, Genoa, Italy. http://www.emmtee.net/bib/Oep:Lon:06.pdf

Elementary Dependencies describe almost all of the semantics contained in an MRS (everything except scopal information), and can be divided into three types:

  NAMES: predicate name to char span
  ARGS: ARG-type relations between char spans
  PROPS: features of predicates, such as TENSE and GENDER

An EDM evaluation is measured over all three types of ED, but other combinations are possible. EDM_NA evaluates predicate names and arguments, and is closest to other metrics such as GR, CCG dependencies etc. The default output of the evaluation script shows precision, recall and f-score over each relation type separately, as well as typical aggregations.

To use this evaluation, you first need to set the data up as described below.

Set up:

1. Export gold:

    $LOGONROOT/lingo/lkb/src/tsdb/home/export --binary --format ltriples \

2. Export test:

    $LOGONROOT/lingo/lkb/src/tsdb/home/export --binary --format ltriples \
      --active=all

This should produce directories containing one gzipped file per item parsed. The ltriples should look like:

    _treat_v_1<10:17> ARG2 _user_n_of<23:27>

The links (e.g. <10:17>) are necessary to the evaluation, and if your output doesn't have them, ask Stephan why.
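To make the triple format and the reported figures concrete, the following Python sketch parses lines of the one shape shown above and computes precision, recall and f-score over two sets of triples. It is an illustration only, not the logic of edm_eval.pl: the names parse_arg_triple and prf are hypothetical, the pattern covers only ARG-style lines with a character span on both sides, and matching is assumed to be plain set intersection over (span, role, span).

```python
import re
from typing import NamedTuple, Set, Tuple

# Matches only the line shape shown in the text, e.g.
#   _treat_v_1<10:17> ARG2 _user_n_of<23:27>
# Other ltriples line shapes (NAMES, PROPS) are not handled here.
TRIPLE_RE = re.compile(
    r"^(?P<head>\S+?)<(?P<hfrom>\d+):(?P<hto>\d+)>\s+"
    r"(?P<role>\S+)\s+"
    r"(?P<dep>\S+?)<(?P<dfrom>\d+):(?P<dto>\d+)>$"
)


class Triple(NamedTuple):
    """An ARG-type relation between two character spans."""
    head_span: Tuple[int, int]
    role: str
    dep_span: Tuple[int, int]


def parse_arg_triple(line: str) -> Triple:
    """Parse one ARG-style ltriples line into a span-based Triple."""
    m = TRIPLE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an ARG-style triple with links: {line!r}")
    return Triple(
        (int(m["hfrom"]), int(m["hto"])),
        m["role"],
        (int(m["dfrom"]), int(m["dto"])),
    )


def prf(gold: Set[Triple], test: Set[Triple]) -> Tuple[float, float, float]:
    """Precision, recall and f-score, assuming set-based matching."""
    matched = len(gold & test)
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    gold = {
        parse_arg_triple("_treat_v_1<10:17> ARG2 _user_n_of<23:27>"),
        parse_arg_triple("_treat_v_1<10:17> ARG1 _doctor_n_1<4:9>"),
    }
    test = {
        parse_arg_triple("_treat_v_1<10:17> ARG2 _user_n_of<23:27>"),
        parse_arg_triple("_treat_v_1<10:17> ARG1 _user_n_of<23:27>"),
    }
    print(prf(gold, test))  # one of two triples matches on each side
```

Keying the triple on spans rather than predicate names follows the description of ARGS as relations between char spans; a line without links fails to parse, which reflects why the links are required.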
Evaluate:

Usage: cat |./edm_eval.pl [-i] [-v] [-p ] [-s]

  -i: ignore gold where parse failed
  -v: verbose output
  -p : parse number
  -s: raw figures for statistical significance calculations

In the examples below, the names of the gold files are supplied on standard input and the test directory is given as the argument.

To evaluate a profile:

    ls -1 jhk.gold/* | edm_eval.pl jhk.test

To evaluate a profile, only over files that received a parse:

    ls -1 jhk.gold/* | edm_eval.pl -i jhk.test

To evaluate a single item:

    echo jhk.gold/3025231.gz | edm_eval.pl jhk.test

To evaluate a specific analysis of a single item:

    echo jhk.gold/3025231.gz | edm_eval.pl -p 100 jhk.test

To examine the errors in a single item:

    echo jhk.gold/3025231.gz | edm_eval.pl -v jhk.test

To produce the files needed for statistical significance testing:

    for file in jhk.gold/*; do echo $file | edm_eval.pl -s jhk.test; done > jhk.test.stats

Significance Testing:

statsig_shuffle.pl is an implementation of the computationally-intensive randomisation test described in:

Alexander Yeh. 2000. More accurate tests for the statistical significance of result differences. In Proceedings of the 18th International Conference on Computational Linguistics (COLING 2000), pages 947–953, Saarbruecken, Germany.

Usage: statsig_shuffle.pl [iterations]

    statsig_shuffle.pl jhk.gold.stats jhk.test.stats 10000
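The idea behind a randomisation (shuffle) test of this kind can be sketched in Python as follows. This is not the code of statsig_shuffle.pl, and the layout of the .stats files is not described here, so the sketch assumes each system is already represented as a list of per-item counts (matched, gold_total, test_total), aligned item by item. The function names, the choice of f-score as the test statistic and the (count + 1) / (iterations + 1) estimate are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Assumed per-item record: (matched, gold_total, test_total).
Counts = Tuple[int, int, int]


def f_score(items: List[Counts]) -> float:
    """F-score over counts summed across all items."""
    matched = sum(m for m, _, _ in items)
    gold = sum(g for _, g, _ in items)
    test = sum(t for _, _, t in items)
    precision = matched / test if test else 0.0
    recall = matched / gold if gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def shuffle_test(sys_a: List[Counts], sys_b: List[Counts],
                 iterations: int = 10000, seed: int = 0) -> float:
    """Estimate a two-sided p-value for the f-score difference.

    In each iteration the two systems' results are swapped on every
    item with probability 0.5, and the difference is recomputed.
    """
    if len(sys_a) != len(sys_b):
        raise ValueError("both systems must cover the same items")
    rng = random.Random(seed)
    observed = abs(f_score(sys_a) - f_score(sys_b))
    at_least_as_large = 0
    for _ in range(iterations):
        shuf_a, shuf_b = [], []
        for a, b in zip(sys_a, sys_b):
            if rng.random() < 0.5:
                a, b = b, a  # exchange the two systems' results for this item
            shuf_a.append(a)
            shuf_b.append(b)
        if abs(f_score(shuf_a) - f_score(shuf_b)) >= observed:
            at_least_as_large += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_large + 1) / (iterations + 1)
```

A small p-value suggests that the observed difference between the two systems is unlikely to arise from randomly reassigning per-item results; more iterations give a finer-grained estimate at a higher computational cost.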