This program extracts tagged tokens from a tsdb profile (virtual profiles
included; item numbers must be unique). It works by tokenising the items from
the item file with whichever tokeniser option is given, extracting the leaves
with their lexical types (and optionally morphological rules) from the parse
tree, and trying to match the tokens to the leaves.

Install directions:

  autoreconf -i
  ./configure
  make

Usage: ./tagextract [options] grammar-file profile

Options:
  -h [ --help ]             This usage information.
  -t [ --tok ] arg (=none)  Tokeniser: none, repp, chasen, yy (default: none).
                            The tokeniser is applied to the item string in the
                            item file. (YY and ChaSen are unimplemented as yet.
                            Let Rebecca know if you want them.)
  -r [ --rpp ] arg          Tokeniser .rpp file.
  -c [ --call ] arg         rpp calls; multiple options are valid.
  -p [ --pos ]              Use TnT POS tags.
  -m [ --model ] arg        TnT model; defaults to the WSJ model in the LOGON
                            tree.
  -i [ --infl ]             Tags include morphological inflection rules.
  --format arg (=TNT)       Token format: TNT, CANDC, FSC (default: TNT).
  -n [ --num ]              Output item and parse number.
  -l [ --limit ] arg (=0)   Number of readings at which a context is ignored.
                            Set to nbest to negate the effect of using a model
                            during parsing.

e.g.:

  ./tagextract -t repp -c ascii -c xml -c latex -c wiki \
    -r $LOGONROOT/lingo/erg/rpp/tokenizer.rpp \
    $LOGONROOT/lingo/erg/english.tdl $LOGONROOT/lingo/erg/tsdb/gold/ws01
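
A further illustrative invocation (hypothetical, but reusing the grammar,
tokeniser file and profile from the example above), which additionally outputs
item and parse numbers (-n) and includes inflection rules in the tags (-i):

  ./tagextract -t repp -n -i -c ascii -c xml -c latex -c wiki \
    -r $LOGONROOT/lingo/erg/rpp/tokenizer.rpp \
    $LOGONROOT/lingo/erg/english.tdl $LOGONROOT/lingo/erg/tsdb/gold/ws01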
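
As a rough sketch of the default TNT token format, each output line holds one
token and its tag, tab-separated; with -i, the tags would also carry the
applied inflection rules. The tag names below are placeholders only, not real
ERG lexical types, and the exact output may differ:

  the       det_lextype
  dog       noun_lextype
  barked    verb_lextype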