Dependency-based target representations have received much attention in parsing research over at least the past decade, in part because they offer a comparatively easy-to-use interface to grammatical structure.  Over an even longer period, the formal and linguistic foundations of syntactico-semantic dependency analysis have continuously evolved, and there is considerable variation across the representation schemes in use today—even within a single language.

For English, for example, the so-called LTH format (named after the Faculty of Engineering at Lund University) defined by Johansson & Nugues (2007) was used for the 2007, 2008, and 2009 shared tasks of the Conference on Computational Natural Language Learning (CoNLL).  Subsequently, the family of Stanford Dependencies (SD) proposed by de Marneffe and Manning (2008) enjoyed wide popularity.  More recently, the Semantic Dependency Parsing (SDP) and Universal Dependencies (UD) representations, summarized by Oepen et al. (2016) and Nivre et al. (2016), respectively, have further increased this diversity, serving as target representations in parsing tasks at the 2014 and 2015 Semantic Evaluation Exercises (for SDP) and at the 2017 CoNLL shared task (for UD).

For each of these representations (and others), detailed intrinsic evaluation reports are available that allow one to estimate parser performance, for example in terms of average dependency accuracy and speed, for different types of input text.  These reports, however, are difficult to compare across types of representations (and sometimes across different selections of test data), and they fail to provide insight into the actual utility of the various representations for downstream tasks that use grammatical analysis as a pre-processing step.  Two notable exceptions are the extrinsic evaluation studies by Miyao et al. (2009) and Elming et al. (2013), which seek to determine the contributions of different types of dependency representations to a variety of downstream tasks.

The purpose of the First Shared Task on Extrinsic Parser Evaluation (EPE 2017) is to shed more light on the downstream utility of these representations, at the levels of accuracy currently available for different parsers.  The task seeks to contrastively isolate the relative contribution of each type of representation (and the corresponding parsing systems) to a selection of state-of-the-art downstream systems, which process different types of text and thus exhibit broad domain and genre variation.  Please see the 2017 task page for further information on the task set-up and call for participation.

A Few Examples

Following are some example graphs (for a simplification of sentence #20209013 from the venerable Wall Street Journal Corpus), first in the LTH representation.  Formally, this graph is a rooted tree, i.e. every node is reachable from a distinguished root node by exactly one directed path.  Function words frequently serve as heads (e.g. the copula and the infinitival particle), and dependency relations are syntactic in nature, for example distinguishing the subject from the predicative complement of the copula.
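The rooted-tree property can be verified mechanically: every non-root node must have exactly one head, and a traversal from the root must reach all nodes without revisiting any. The following sketch checks this over a toy edge list (the node labels and edges here are illustrative, not the actual LTH annotation of the example sentence):

```python
from collections import defaultdict

def is_rooted_tree(nodes, edges, root):
    """Check that every node is reachable from `root` by exactly one
    directed path: each non-root node has exactly one incoming edge,
    and a depth-first traversal from the root visits every node once."""
    heads = defaultdict(list)      # dependent -> list of heads
    children = defaultdict(list)   # head -> list of dependents
    for head, dep in edges:
        heads[dep].append(head)
        children[head].append(dep)
    # the root must have no head; every other node exactly one
    if heads[root]:
        return False
    if any(len(heads[n]) != 1 for n in nodes if n != root):
        return False
    # traversal from the root must reach all nodes, none twice
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if n in seen:
            return False           # a cycle
        seen.add(n)
        stack.extend(children[n])
    return seen == set(nodes)

# toy tree with the copula as head of both its arguments
nodes = {"root", "was", "technique", "impossible"}
edges = [("root", "was"), ("was", "technique"), ("was", "impossible")]
print(is_rooted_tree(nodes, edges, "root"))   # True
```

Adding a second incoming edge to any node, or dropping a node from the edge list entirely, makes the check fail; those are exactly the relaxations the semantic dependency graphs below permit.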

In contrast, the UD representation, while also syntactic in nature, shows a preference for content words (rather than function words) as heads, e.g. the predicative adjective impossible rather than the copula.  Following is an analysis in UD version 2.0, taking advantage of so-called ‘enhanced’ dependencies, which can introduce reentrancies (nodes with more than one incoming edge) into the graph, e.g. at technique, which is annotated both as the subject of the adjective and as the object of apply.
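In the CoNLL-U file format, such enhanced dependencies are recorded in a separate DEPS column as `head:relation` pairs separated by `|`, so a reentrant node simply lists two pairs. A fragment along the following lines could encode the reentrancy at technique described above (token indices and the surrounding analysis are illustrative, not the official annotation of this sentence):

```
# ID  FORM        HEAD  DEPREL  DEPS (enhanced)
3     technique   6     nsubj   6:nsubj|8:obj
6     impossible  0     root    0:root
8     apply       6     xcomp   6:xcomp
```

The basic tree (HEAD and DEPREL columns) stays single-headed; only the enhanced layer gives technique its second incoming edge, from apply.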

In contrast, the following two semantic dependency graphs (in the so-called DM and PSD variants, respectively, from the SDP collection) are neither connected nor rooted trees, i.e. some of the surface nodes are structurally isolated, while other nodes have multiple incoming edges.  Semantically, technique is arguably dependent on the determiner (the quantificational locus), the modifier similar, and the predicate apply.  Conversely, the predicative copula, infinitival to, and the vacuous preposition marking the deep object of apply can be argued to make no semantic contribution of their own.
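Both departures from treehood are easy to detect from an edge list: isolated tokens touch no edge at all, and reentrant tokens have more than one incoming edge. The sketch below runs this check over an illustrative DM-style edge list for the example sentence (token ids, forms, and edges are assumptions for exposition, not the actual SDP annotation):

```python
def graph_properties(tokens, edges):
    """Report the properties that distinguish semantic dependency
    graphs from rooted trees: structurally isolated tokens (no
    incident edge) and reentrant tokens (multiple incoming edges)."""
    incoming = {i: 0 for i in tokens}
    touched = set()
    for head, dep in edges:
        incoming[dep] += 1
        touched.update((head, dep))
    isolated = sorted(i for i in tokens if i not in touched)
    reentrant = sorted(i for i, n in incoming.items() if n > 1)
    return isolated, reentrant

# illustrative DM-style edges (head, dependent) over token ids: the
# copula and both instances of `to` carry no edges, while `technique`
# receives edges from the determiner, `similar`, and `apply`
tokens = {1: "A", 2: "similar", 3: "technique", 4: "is", 5: "almost",
          6: "impossible", 7: "to", 8: "apply", 9: "to", 10: "crops"}
edges = [(1, 3), (2, 3), (8, 3), (5, 6), (6, 8), (8, 10)]

isolated, reentrant = graph_properties(tokens, edges)
print([tokens[i] for i in isolated])    # ['is', 'to', 'to']
print([tokens[i] for i in reentrant])   # ['technique']
```

Using token ids rather than forms as node identities keeps the two occurrences of to distinct, which matters precisely because both are argued above to be semantically vacuous.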

For the same sentence, the PSD graph shares many of the same dependency edges (albeit with a different labeling scheme and, in a few cases, inverted edge direction), but it analyzes the predicative copula as semantically contentful and does not treat almost as scoping over the entire graph.