Performance tweaks				 Francis Bond
------------------

(1) Quick Check:

Make sure the two quick-check files are kept up to date.

LKB:	japanese/lkb/checkpaths.lsp
PET:	japanese/pet/qc.tdl


Generate as follows:
====================
PET:
----
Make sure:
 - you are using compatible versions of flop and cheap
 - your grammar is up-to-date:

See japanese/make-qc.bash

flop japanese.tdl 
mv pet/qc.tdl pet/qc.tdl.old
LANG=ja_JP.EUC-JP; flop japanese; time cut -d@ -f7 tsdb/skeletons/kinou1/item | chasen -F "%M " | cheap -limit=100000 -packing -compute-qc=pet/qc.tdl japanese; flop japanese


### Using the Ikehara kinou-shiken-bun
grep -v '#' testsuites/mt-test-set-1.txt | chasen -F "%m " |\
cheap -limit=100000 -packing -compute-qc=pet/qc.tdl japanese
flop japanese

On one line:
flop japanese.tdl; mv pet/qc.tdl pet/qc.tdl.old; grep -v '#' testsuites/mt-test-set-1.txt | chasen -F "%m " | cheap -packing -compute-qc=pet/qc.tdl japanese;  flop japanese.tdl

### You can also do this for vanilla, but you need to have the chasen preprocessor
cut -d@ -f 7 tsdb/skeletons/vanilla/item | cheap -compute-qc=pet/qc.tdl japanese

After you have made the quick-check file, you need to rebuild the
grammar (run flop again).


Note: This is slow, as quick-check is, of course, turned off.  Flop
is also slow at the moment.  Otherwise you should use the mode you
would normally use (e.g. with packing if you use packing).

Try with more sentences, then keep only the first 50 paths and cut the
rest.


FCB: I should write a script to do this.
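
A minimal sketch of such a script, following the order of the one-line
version above.  It assumes flop, chasen and cheap are on your PATH and
that you run it from the grammar directory.  The function name regen_qc
and its three arguments are hypothetical, not part of PET.

```shell
# Sketch only: regen_qc is a hypothetical helper, not a PET command.
# Usage: regen_qc [grammar] [test sentences] [quick-check file]
regen_qc() {
    grammar=${1:-japanese}
    items=${2:-testsuites/mt-test-set-1.txt}
    qc=${3:-pet/qc.tdl}

    # compile the grammar first, so that flop and cheap agree
    flop "$grammar" || return 1

    # keep the previous quick-check file, if there is one
    if [ -f "$qc" ]; then
        mv "$qc" "$qc.old"
    fi

    # drop comment lines, tokenize with chasen, collect quick-check paths
    grep -v '#' "$items" | chasen -F "%m " |
        cheap -limit=100000 -packing -compute-qc="$qc" "$grammar" || return 1

    # rebuild the grammar so that it picks up the new quick-check file
    flop "$grammar"
}
```

Called with no arguments it uses the file names from the commands
above; drop -packing from the cheap call if you do not normally parse
with packing.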

LKB:
----
(see Copestake (2002), indexed under checkpaths)


mv lkb/checkpaths.lsp lkb/checkpaths.lsp.old

from within the *common-lisp* buffer:

(lkb::with-check-path-list-collection 
   "~/delphin/grammars/japanese/lkb/checkpaths.lsp" 
   (parse-sentences 
      "~/delphin/grammars/japanese/testsuites/hinoki-test-a.100" 
      "~/delphin/grammars/japanese/testsuites/hinoki-test-a.100.results"))

(lkb::with-check-path-list-collection 
   "~/logon/dfki/jacy/lkb/checkpaths.lsp" 
   (parse-sentences 
      "~/logon/dfki/jacy/testsuites/mrs.ja" 
      "~/logon/dfki/jacy/testsuites/mrs.ja.results"))

========================================================================

(2) Optimize selection of keyargs (only do this occasionally)
(Oepen + Callmeier 2000: in the book)

You can gain some performance increase by setting the order in which
the daughters of rules are checked.  The order can be specified in 
lkb/globals.lsp
(defparameter *rule-keys*
  '((HEAD-ADJUNCT-RULE1 . 1)
    (COMPOUNDS-RULE . 1)
    (KARA-MADE-RULE . 2) 
    (HEAD_SUBJ_RULE . 2)
    (HEAD-SPECIFIER-RULE . 2)
    (HEAD-COMPLEMENT-RULE . 2) 
    (HEAD-COMPLEMENT2-RULE . 2)
    (HEAD-ADJUNCT-RULE2 . 2)))
and
pet/japanese.set
;; assoc (rules -> keyarg position) (alternative to KEY-ARG mechanism)
rule-keyargs := 
$HEAD-ADJUNCT-RULE1 1 
$HEAD-ADJUNCT-RULE2 2 
$HEAD-ADJUNCT-RULE3 1 
$RELATIVE-CLAUSE-RULE 1 
$COMPOUNDS-RULE 1 
$SENTENCE-TE-COORDINATION-RULE 1
$CONJ-RULE 1
$KARA-MADE-RULE 2 
$HEAD_SUBJ_RULE 2 
$HEAD-SPECIFIER-RULE 2 
$HEAD-COMPLEMENT-HF-RULE 2 
$HEAD-COMPLEMENT-HI-RULE 1 
$HEAD-COMPLEMENT-AFFIXBIND-RULE 2
$HEAD-COMPLEMENT2-RULE 2 
$HEAD-2OBL-COMPLEMENTS-RULE 2
$VN-LIGHT-RULE 2
$VEND-VEND-RULE 1
$VSTEM-VEND-RULE 2 
$VN-VEND-RULE 2
$PREFIX-ATTACH-RULE 1
$NP-QUEST-FRAG-RULE 2.

Key mode in cheap is set with:
  `-key=n' --- select key mode (0=key-driven, 1=l-r, 2=r-l, 3=head-driven)
default is 0.

You get the data by creating two profiles, one with -key=1 and one
with -key=2, turning on -rulestats.  First enable
[Process,switches:write rule relation] in [incr tsdb()].  Use the mode
you would normally use (e.g. with packing if you use packing).

Then run [Analyze:rule table] for both profiles.  For each rule, you
want to check first the daughter that gives the fewest active edges
(the passive edges should be the same, modulo memory overflow errors).

1 means daughter 1 first [l-r for binary rules]
2 means daughter 2 first [r-l for binary rules]
default is l-r or KEY-ARG.

Or you can use KEY-ARG and specify it per rule in the grammar.

FCB: this would also be nice to automate
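
One way the comparison could be automated, as a sketch.  It assumes you
have copied the active-edge counts from the two rule tables by hand
into two plain-text files, one line per rule, of the form
"<rule name> <active edges>".  That file format and the function name
pick_keyargs are hypothetical; nothing here reads a profile directly.

```shell
# Sketch only: pick_keyargs and its input format are hypothetical.
# $1 = counts from the -key=1 profile, $2 = counts from the -key=2 profile
pick_keyargs() {
    awk '
        # first file: remember the active-edge count for each rule
        NR == FNR { lr[$1] = $2; next }
        # second file: pick the key mode with fewer active edges
        ($1 in lr) {
            key = (lr[$1] + 0 <= $2 + 0) ? 1 : 2   # ties stay at 1
            printf "$%s %d\n", $1, key
        }
    ' "$1" "$2"
}
```

The output lines have the shape of the rule-keyargs entries in
pet/japanese.set; remember that the last entry there ends with a full
stop.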
========================================================================

(3) Put some restrictions on the application of morphological rules

lkb/globals.lsp
; maximum depth for chaining morphological rules
(defparameter *maximal-morphological-rule-depth* 3)

pet/japanese.set
; maximum depth for chaining morphological rules
orthographemics-maximum-chain-depth := 3.
; minimum stem length to apply morphological rules
orthographemics-minimum-stem-length := 3.
; don't apply the same rule twice to the same lexeme
orthographemics-duplicate-filter:= true.
;; only allow cohesive chains (whatever they may be)
;orthographemics-cohesive-chains:= true.
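
The chain depth is set once for the LKB and once for PET, so the two
values can drift apart.  A small sketch of a check that they agree; the
function name check_morph_depth is hypothetical, and it assumes each
setting sits on a single line written as shown above.

```shell
# Sketch only: check_morph_depth is a hypothetical helper.
# $1 = lkb/globals.lsp, $2 = pet/japanese.set
check_morph_depth() {
    # pull the number out of the defparameter form
    lkb_depth=$(sed -n 's/^(defparameter \*maximal-morphological-rule-depth\* \([0-9][0-9]*\)).*/\1/p' "$1")
    # pull the number out of the PET setting
    pet_depth=$(sed -n 's/^orthographemics-maximum-chain-depth *:= *\([0-9][0-9]*\)\..*/\1/p' "$2")

    if [ -n "$lkb_depth" ] && [ "$lkb_depth" = "$pet_depth" ]; then
        echo "ok: depth $lkb_depth in both files"
    else
        echo "mismatch: lkb=$lkb_depth pet=$pet_depth"
        return 1
    fi
}
```

Run it from the grammar directory as
check_morph_depth lkb/globals.lsp pet/japanese.set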


(4) If you have rules that are spanning only, mark them in pet/japanese.set

spanning-only-rules := $frg-np $frg-pp $frg-s-adv $frg-i-adv
		       $frg-pp-np $frg-i-adv-np $frg-pp-int 
		       $runon_s.