Publication of Results

Results and experiences from the shared task will be presented on Wednesday, September 20, 2017, as part of the joint programme of the DepLing 2017 and IWPT 2017 conferences. EPE 2017 will produce a peer-reviewed proceedings volume that will be published by the Association for Computational Linguistics (ACL).

All participating teams are invited to contribute a system description paper to the proceedings, of up to six pages of content, not counting bibliographical references. These articles should provide in-depth technical background on all relevant properties of each submission (or, specifically, each submitted run), to help put results into perspective and allow replication. Such information could comprise, for example, specifics of the dependency representation used, pre-processing of parser inputs, parsing approach and implementation, training data, additional resources, hyper-parameters, and other relevant configuration options.

Trying to tease apart which dimensions of parser-internal variation correlate with observed end-to-end results will arguably be the most scientifically interesting outcome of the EPE 2017 shared task. The system descriptions will ideally not only provide the necessary information for this analysis but also discuss observable downstream effects across different configurations (and possibly other participating systems). Where available, they might also seek to relate extrinsic results from the EPE 2017 downstream systems to previously published intrinsic evaluation results, for example from the 2017 Universal Dependencies Parsing Task.

System descriptions must be submitted on-line no later than Sunday, August 13, 2017 (anywhere in the world). Given the particular nature of these papers, system descriptions shall not be anonymous. They must be formatted using the standard templates of the 2017 ACL Conference. All submissions will receive three reviews from the EPE 2017 programme committee, which will be composed of the task co-organizers, a few external experts, and at least one member from each participating team. Reviewer comments will be available by August 28, 2017, and camera-ready final manuscripts will be due one week later. Participants will be given access to the complete draft proceedings in early September, and it will be possible to fine-tune and update system descriptions until Friday, September 15.

System Submissions

Participation in the shared task requires submission of parser outputs for the ‘raw’ texts comprising the evaluation data. To generalize over a broad variety of different dependency representations and to provide a uniform interface to the various downstream applications, EPE 2017 defines its own interchange format for morpho-syntactico-semantic dependency graphs. An example file (providing UD-like analyses for the development text from the negation application) demonstrates the required format for system submissions. Unlike the venerable line of tab-separated (CoNLL-like) file formats, the EPE serialization of dependency representations is tokenization-agnostic (nodes can correspond to arbitrary and potentially overlapping or empty sub-strings of the underlying document), has no hard-wired assumptions about the range of admissible annotations on nodes, naturally lends itself to graphs transcending rooted trees (including different notions of ‘roots’ or top-level ‘heads’), and straightforwardly allows framework-specific extensions.

The EPE interchange format serializes a sequence of dependency graphs as a stream of JSON objects, using the newline-separated so-called JSON Lines convention. Each dependency graph has the top-level properties id (an integer) and nodes, with the latter being an array of node objects. Each node, in turn, bears its own (unique) id (an integer), form (a string, the surface form), and start and end character ranges (integers); all but the id property are optional (e.g. to be able to represent ‘empty’ or elided nodes). Furthermore, nodes can have properties and edges, where the former is a JSON object representing an arbitrary attribute–value matrix, for example containing properties like pos, lemma, or more specific morpho-syntactic features.
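As an illustration of the node and graph properties just described, the following minimal Python sketch builds one graph and serializes it in the JSON Lines convention. The helper names (make_node, serialize_graphs), the sample sentence, and the tag values are hypothetical; the sketch also assumes zero-based, end-exclusive character offsets into the underlying document, which the description above does not spell out.

```python
import json

def make_node(node_id, form=None, start=None, end=None, properties=None):
    """Build a node object; only `id` is required, all other fields are optional."""
    node = {"id": node_id}
    if form is not None:
        node["form"] = form
    if start is not None:
        node["start"] = start
    if end is not None:
        node["end"] = end
    if properties:
        node["properties"] = properties
    return node

def serialize_graphs(graphs):
    """Serialize graphs as JSON Lines: exactly one JSON object per line."""
    return "".join(json.dumps(graph, ensure_ascii=False) + "\n" for graph in graphs)

# Hypothetical one-sentence document; offsets are assumed to be
# zero-based and end-exclusive (an assumption, not part of the description).
text = "Dogs bark."
graph = {
    "id": 1,
    "nodes": [
        make_node(1, "Dogs", 0, 4, {"pos": "NNS", "lemma": "dog"}),
        make_node(2, "bark", 5, 9, {"pos": "VBP", "lemma": "bark"}),
        make_node(3, ".", 9, 10, {"pos": "."}),
    ],
}
epe_lines = serialize_graphs([graph])
```

An ‘empty’ or elided node would simply be make_node(4), i.e. an object carrying nothing but its id.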

The encoding of graph structure in the EPE interchange format is by virtue of the edges property on nodes, whose value is an array of edge objects, each with at least the following properties: label (a string, the dependency type) and target (an integer, the target node). Thus, edges in the EPE encoding are directed from the head (or predicate) to the dependent (or argument). Unlike for nodes, there is no meaningful ordering information among edges, i.e. the value of the edges property is interpreted as a set. At the same time, encoding each edge as its own JSON object makes framework-specific extensions possible; for example, a future UD parser could output an additional boolean basic property, to distinguish so-called ‘basic’ and ‘enhanced’ dependencies.
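The edge encoding can be sketched in Python as follows. The sample line, the dependency labels, and the helper names (read_graphs, edge_set, dangling_targets) are hypothetical, and the check for edges pointing at unknown node identifiers is our own sanity test rather than a documented requirement of the format. The basic property on edges mirrors the hypothetical extension mentioned above.

```python
import json

def read_graphs(stream):
    """Parse a JSON Lines string into a list of graph objects, skipping blank lines."""
    return [json.loads(line) for line in stream.splitlines() if line.strip()]

def edge_set(node):
    """Return a node's outgoing edges as an unordered set of (label, target) pairs."""
    return {(edge["label"], edge["target"]) for edge in node.get("edges", [])}

def dangling_targets(graph):
    """Return the sorted edge targets that match no node id in the graph."""
    ids = {node["id"] for node in graph["nodes"]}
    return sorted(
        {
            edge["target"]
            for node in graph["nodes"]
            for edge in node.get("edges", [])
            if edge["target"] not in ids
        }
    )

# Hypothetical single-graph stream; edges live on the head node and
# point at their dependents through the `target` property.
sample = (
    '{"id": 1, "nodes": ['
    '{"id": 1, "form": "Dogs"}, '
    '{"id": 2, "form": "bark", "edges": ['
    '{"label": "nsubj", "target": 1, "basic": true}, '
    '{"label": "punct", "target": 3, "basic": true}]}, '
    '{"id": 3, "form": "."}]}\n'
)
graphs = read_graphs(sample)
head = graphs[0]["nodes"][1]
```

Comparing edge_set(head) across two parser outputs ignores the order in which edges happen to be listed, in line with the set interpretation of the edges property.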

Finally, adopting the terminology of Kuhlmann & Oepen (2016), the EPE interchange format supports the optional designation of one or more ‘top’ nodes. In classic syntactic dependency trees, these would correspond to a (unique and obligatory) root, while in the SDP semantic dependencies, for example, top nodes correspond to a semantic head or highest-scoping predicate and can have incoming edges. In the JSON encoding, nodes can bear a boolean top property (where absence of the property is considered equivalent to a false value).
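A consumer of the format might collect top nodes along the following lines. This is a minimal sketch; the function name top_nodes and the sample graph (including its ARG1 label) are hypothetical.

```python
def top_nodes(graph):
    """Return the ids of all nodes marked as top; a missing property counts as false."""
    return [node["id"] for node in graph.get("nodes", []) if node.get("top", False)]

# Hypothetical graph: node 2 is designated as top, node 3 says so
# explicitly that it is not, and node 1 omits the property altogether.
graph = {
    "id": 1,
    "nodes": [
        {"id": 1, "form": "Dogs"},
        {"id": 2, "form": "bark", "top": True,
         "edges": [{"label": "ARG1", "target": 1}]},
        {"id": 3, "form": ".", "top": False},
    ],
}
tops = top_nodes(graph)
```

Because the designation is optional and not necessarily unique, the result is a list that may be empty or hold several identifiers.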

Evaluation Period

The ‘official’ evaluation period of the task will run from Tuesday, June 6, to Thursday, June 15, 2017. At the start of the evaluation period, we will make available the test data for the three downstream tasks (while the training and development data will remain unchanged, of course). Participants will be expected to prepare their submission by processing all files (training, development, and evaluation) using the same configuration of their parser. Parser outputs have to be submitted in the EPE interchange format, but the task co-organizers might be able to offer advice or assistance on converting from other common formats (see the documentation in the EPE utility package). For parsing systems supporting multiple output formats or otherwise interestingly different configurations, multiple ‘runs’ can be accepted from the same team.

It will be important that all submissions of parser outputs follow a uniform directory and file naming regime. Each team can submit between one and five distinct runs, where each ‘run’ can correspond to a different type of dependency representation or to differences in training data, parsing approach, parser parameterization, or other relevant variation. By prior agreement with the co-organizers, it can be legitimate for a team to provide more than five runs. All runs from one team need to be submitted as a single compressed archive file; in case a team makes multiple submissions before the end of the evaluation period, only the most recent archive file will be used in evaluation. Thus, the top-level directory structure should contain zero-based, two-digit run identifiers, e.g. 00, 01, etc.

Within each run, the directory structure and file naming should mirror the structure of the parser inputs, i.e. the topmost directory names below the run identifier should be events/, negation/, and opinion/. In particular, please (a) parse all the ‘.txt’ files in the ‘training/’, ‘development/’, and ‘evaluation/’ sub-directories of our most recent parser input package (version 1.5); (b) put parser outputs into a parallel directory structure, using the exact same basic file names but replacing the ‘.txt’ suffix with ‘.epe’; (c) within each run directory provide a file ‘README.txt’ with high-level information on the type of dependency representation and parser; and (d) package all the directories and files up into a compressed archive (of type ‘.tgz’ or ‘.zip’) and email epe-organizers @ nlpl.eu a download link.
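The naming regime can be sketched as a small Python helper that maps a parser input path to the corresponding output path inside a run directory. The function name output_path and the sample file names are hypothetical, and the sketch assumes that input paths are given relative to the root of the parser input package, starting with the application directory.

```python
from pathlib import PurePosixPath

APPLICATIONS = {"events", "negation", "opinion"}

def output_path(run_index, input_path):
    """Map an input path like 'negation/training/x.txt' to '<run>/negation/training/x.epe'."""
    if not 0 <= run_index <= 99:
        raise ValueError("run identifiers are zero-based and two digits wide")
    path = PurePosixPath(input_path)
    if path.suffix != ".txt":
        raise ValueError(f"not a parser input file: {input_path}")
    if not path.parts or path.parts[0] not in APPLICATIONS:
        raise ValueError(f"unexpected top-level directory in: {input_path}")
    # Keep the directory structure and base name; only the suffix changes.
    return str(PurePosixPath(f"{run_index:02d}") / path.with_suffix(".epe"))

# Hypothetical file names, for illustration only.
first = output_path(0, "negation/training/sample.txt")
second = output_path(1, "events/evaluation/sample.txt")
```

Here first is '00/negation/training/sample.epe' and second is '01/events/evaluation/sample.epe'; the ‘README.txt’ file for each run would sit directly inside the 00/ and 01/ directories.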

Pre-Evaluation Trial Run

To generalize the downstream applications to work with different types of dependency representations, the task co-organizers depend on the availability of the broadest possible range of different parser outputs (compatible with the EPE definition of dependency representations) and packaged in the EPE interchange format. To initiate a working relationship with parser developers and facilitate mutual feedback, the task schedule foresees a ‘trial run’ period in early to mid-April. Candidate participants are asked to run the training and development data for all downstream applications (in total, about half a million whitespace-separated tokens) through their parsers, serialize parsing results in the EPE interchange format, and make parser outputs available to the task organizers. Where parsers support interestingly different dependency outputs (e.g. propagating dependencies into coordinate structures, or pushing some lexical information directly onto dependency edges), multiple submissions will be very welcome; these should be packaged separately, each into a directory tree (parallel to the parser inputs) of its own.

Registration of Intent

To make your interest in the EPE 2017 task known to the organizers, and to receive updates on data and infrastructure availability, please self-subscribe to the mailing list for (infrequent) EPE announcements. The mailing list archives are available publicly. We may ask for a mildly more formal registration of candidate participants in connection with the trial run in late April (see the task schedule and below; more information to come).

Access Information

The ‘raw’ parser inputs representing the training and development data for the various downstream applications have been available since mid-March 2017. Please see the infrastructure overview for download links.