The GENIA data used for the segmentation experiments reported in Read et al. (2012) was created from the file GENIAcorpus3.02.pos.xml, available from the GENIA Project. We extracted the sentence elements from within the abstract elements (ignoring titles, since they were generally just one-sentence paragraphs), and stripped all the sentence and word tags. In the unsegmented.txt file given to the segmenting tools, each abstract is presented as a single paragraph, with paragraph breaks (blank lines) between abstracts. Corpus annotations (c) GENIA Project, used under the CC-BY license.
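
For readers who want to reproduce a file of this shape, the extraction described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used for the experiments. It assumes the elements are named "abstract" and "sentence", that sentences within an abstract are joined with a single space, and that runs of whitespace are collapsed; the function name abstracts_to_paragraphs is hypothetical.

```python
import xml.etree.ElementTree as ET


def abstracts_to_paragraphs(xml_text):
    """Return one plain-text paragraph per abstract, separated by blank lines.

    Assumptions (not confirmed by the original description): element names
    "abstract" and "sentence", single-space joining of sentences, and
    whitespace normalization inside each sentence.
    """
    root = ET.fromstring(xml_text)
    paragraphs = []
    # Only abstracts are visited, so sentences inside titles are ignored.
    for abstract in root.iter("abstract"):
        sentences = []
        for sentence in abstract.iter("sentence"):
            # itertext() yields the text content only, which drops the
            # sentence and word tags; split/join collapses whitespace.
            text = " ".join("".join(sentence.itertext()).split())
            if text:
                sentences.append(text)
        if sentences:
            paragraphs.append(" ".join(sentences))
    # A blank line between abstracts marks the paragraph break.
    return "\n\n".join(paragraphs) + "\n"
```

To apply it to the corpus, read GENIAcorpus3.02.pos.xml into a string, pass it to the function, and write the result to unsegmented.txt. If the file declares a DTD or entities, the parser may need extra handling that this sketch does not cover.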