Fully Open Source LKB (LKB-FOS)

For installation and usage instructions, see http://moin.delph-in.net/LkbFos

Significant changes and bug fixes are summarised below.

3 December 2021

* In certain (rare) cases, a type constraint could be ignored when computing the expanded constraint of another type - fixed.
* DMRS output failed if the parse contained a predicate only 1 or 2 characters long - fixed.
* Worked around LUI failure to display feature structures that: (1) contain types that look like integers, or (2) contain list types or list head/tail features at the top level. Clicking on a type with a numeric name in a LUI window now also works.
* Improved the layout of Dependency MRS at font sizes larger or smaller than the default.
* Disabled the confusing menus called "Operation on x" that used to pop up when right clicking on an active element in an LKB window.
* Fixed a subtle bug in the agenda handling code, which occasionally led to tasks being executed in the wrong order; also, the agenda no longer holds on to memory if processing halts unexpectedly.
* Changed the interpretation of *first-only-p* = 0: a complete chart is created, but there's no attempt to unpack any results.
* LKB Top now correctly displays blank lines. Still waiting for a fix for the McCLIM bug that causes large feature structures to become garbled after scrolling.
* In all recent versions of the ERG, if chart packing is enabled then in LUI parse chart windows every non-lexical edge is labelled `CTYPE`; getting more useful output requires a minor change to the function lui-chart-edge-name in the ERG's lkb/user-fns.lsp. See https://github.com/delph-in/erg/issues/26 and https://delphinqa.ling.washington.edu/t/getting-parse-chart-in-lkb/686

18 July 2021

* Added the ability to declare rules as 'spanning-only'; when parsing, such rules will only ever be applied over the entire input string. To use this facility, set the parameter *spanning-only-rules* to a list of the names of these rules, e.g.
  (defparameter *spanning-only-rules* '(aj-hd_int-inv_c aj-r_frg_c))
* Internal improvements to parser code for maintainability and efficiency.
* Incorporated fix to McCLIM that resolves previous problems with icons in the file selector sometimes not being drawn.
* Known issues: due to McCLIM bugs, large feature structures (such as a "Full FS" from parsing) can become garbled after scrolling, and LKB Top sometimes does not display blank lines. Fixes expected.

30 April 2021

* Added a pre-built native binary for Apple M1 (lkb.darwin_arm64).
* Starting with the (previous) 27 December 2020 release, the default value of mrs::*normalize-predicates-p* has been changed to t. This means that in generation, predicates will be normalised via the SEMI - and indexing for generation therefore expects a SEMI to have been loaded. If there is no SEMI, include (setq mrs::*normalize-predicates-p* nil) at some point in the script file before calling index-for-generator.
* Generating with *gen-extract-surface-hook* unset gave an error when trying to display the results window in LKB native graphics (LUI graphics was OK) - fixed.
* For consistency with Allegro CL builds of the LKB, added to *punctuation-characters* seven characters beyond the Basic Latin Unicode block (Ideographic Full Stop etc.).
* All menus now use modern menu navigation (click to open the menu, click a second time to select a menu item). Also, the LKB Top commands `Options > Shrink/Expand menu' work more smoothly.
* Parse trees now look more balanced. Parse trees window opening and resizing are much more responsive when there are hundreds of parses. Fixed minor bug in computing parse tree node labels (any re-entrancies between values of top-level features were ignored).
* Fixed some minor interface issues in View... command dialogs, parse chart window titles, and the type hierarchy `Show/Hide Types' commands.
* Internal improvements to the parser and generator, including: improvements to the hyperactive parsing strategy, tighter packing of passive edges, more economical calling of the unifier, and reduction of garbage collection overheads in batch parsing / generation.
* Internal improvements to use more appropriate data structures, remove some unnecessary global state, and plug memory leaks. Fixed incorrect code for initializing the dag pool, which could cause batch parsing / generation to fail.
* Known issues: recent changes in McCLIM result in file and directory icons in the file selector sometimes not being drawn, and LKB Top sometimes not displaying blank lines when it should. Fixes expected for next binary release.

27 December 2020

* Made dialog boxes open a little less slowly on macOS, partially working around a widely complained-about graphics issue in XQuartz.
* Internal improvements to use more appropriate data structures in quickcheck and generator. More thorough consistency checking when reading quickcheck paths file.
* In the parser, setting the parameter *non-idiom-root* had no effect; now, if set to the name of an instance, this is checked against each parsing result to see whether *additional-root-condition* needs testing. In the generator, failure of *additional-root-condition* now only outputs a warning.
* Reversed a poor decision in the July 2020 version to generalise passive edge top types internally to improve packing; this caused problems with GG.
* Updated tsdb and swish++ binaries to March 2020 versions from the LOGON distribution; profiles with fields containing integers longer than 30 bits are now retrieved correctly. [incr tsdb()] failed to follow symbolic links to profiles - fixed.
* In LUI, when attempting interactive unification, a failure when applying a type constraint was sometimes ignored - fixed. Also fixed a problem displaying a type with no features in LUI.
* In View...
  command dialogs, grammar entities with names consisting only of digits are now found. Also fixed related bug where the initial suggestion was displayed with vertical bars around it if it started with a digit.
* Reading of transfer and MRS rules now follows the revised TDL syntax specification in TdlRfc; error recovery after TDL syntax errors in these rule files is also improved.

11 July 2020

* Fixed memory leaks in chart display, and in parsing when it does not run to completion.
* Sped up selective unpacking in rare circumstances when the number of possible decompositions of a node reaches the hundreds or more.
* Speed improvements to the generator, especially noticeable with complex inputs.
* Better handling of edge, agenda and sentence length resource limits.

25 June 2020

* The [incr tsdb()] podium failed to find profiles whose directory name contained a dot (e.g. data-test.20.05.22) - fixed.
* Added libtermcap.so.2 to lib/linux.x86.64/ so only a single LD_LIBRARY_PATH entry, /lib/linux.x86.64, is required.
* Some internal data structures changed for better scalability: edge registry, selective unpacking per-edge agenda, and parser passive edge chart. Parser agenda edge priority computation changed to improve compactness of packing.

21 May 2020

* Most of [incr tsdb()] has been ported to LKB-FOS. All general-purpose grammar development support is present, including the podium - although only for Linux; macOS may follow, but requires some work to recompile the C code of tsdb and swish++. Execute (tsdb:tsdb :podium) to start the podium.
* No LOGON-specific functionality is available (i.e. source code enabled by the :logon feature), which means that PVM, WWW demo, SVMs and language models, external MT system interfaces etc. are missing.
* Maxent disambiguation models can be created, either via the function tsdb::train or the [incr tsdb()] podium command Trees -> Train.
  For training and the Trees -> Rank command, `tadm' and `evaluate' in the LOGON distribution are required (in directory sdsu/, invoked through bin/), and must be accessible through the Unix PATH environment variable. In addition, LD_LIBRARY_PATH must include /lib/linux.x86.64
* http://moin.delph-in.net/LkbGeneration explains how to load maxent models and use them to rank trees in unpacking. Unpacking with ranking is now fully integrated into the parser and generator code. Only local trees (mother and daughters) are scored; no grandparent or n-gram features are used. Complete results are not re-scored. Unpacking lost a few trees in certain rare circumstances, caused by incorrect interfacing to the Carroll & Oepen (2005) algorithm - fixed.
* In some cases the generator failed to filter out strings incompatible with the parameter *gen-start-symbol* - fixed. It also did not check results against *additional-root-condition* - fixed.
* Removed the global parameter *gen-maximal-number-of-realizations*; use *gen-first-only-p* instead (analogously to *first-only-p* in the parser).
* In generation, trigger rules that would fire in the LOGON system occasionally failed to fire in the LKB. It turns out that LKB and LOGON used different versions of the function extract-pred-from-rel-fs. I don't understand the reason for the difference, but anyway I changed it to match the LOGON system.
* Made the LSP (Linguistic Server Protocol) fully functional, allowing LKB-FOS to be run as a server over a socket connection.
* Fixed a couple of memory leaks, caused by poorly implemented code in the third-party acl-compat library. There is still a memory leak which needs attention in the interface to yzlui.
* The macOS application LKB.app has been reimplemented to include better error checking/reporting and avoid dependencies on the Finder (which in macOS Mojave 10.14 onwards needed to be explicitly authorised by the user in System Preferences).
* LKB binaries are stored compressed and automatically uncompress themselves when loading, reducing storage requirements by 80%.
* In the LKB Top Advanced menu, added an "Evaluate Lisp expression..." command. The associated dialog comes preloaded with a few useful Lisp expressions, such as (lui-initialize).
* The Edit command in the Transfer Output window Debug menu is now enabled; when running under emacs the user can edit an MRS.
* Fixed a bug in the Generate -> Display Input MRS and Display Internal MRS menu commands which prevented them opening a graphical window. Many other minor interface fixes.
* In the Set Options dialog, added *summary-tree-font-size* and removed *bracketing-p*. Changed defaults for *maximum-number-of-edges* and *unpack-edge-allowance*, both to 2000.
* Removed out-of-date versions of open source Lisp subsystems from the source code tree, and changed the build script to load the latest versions using the Quicklisp library manager.

13 July 2019

* With chart packing enabled, setting *first-only-p* to a non-NIL value could miss some parses; fixed.
* Much faster MRS construction, giving a noticeable speed-up when there are large numbers of parses.
* Better diagnostics in the LKB Top window when there is an unexpected error.
* Fixed a (rarely occurring) infinite loop bug in unpacking, e.g. in the `hike' test suite with the item "There are many possible starting points in order to reach the top of the 1755 meter high Rendalssølen."

14 March 2019

* TDL docstrings weren't accepted in instance definitions; fixed.
* Improvements to the behaviour of dialogs on resize and to their overall appearance. On macOS, clipboard paste into text fields is now instantaneous, but on the other hand an obscure McCLIM bug has slowed down the file chooser dialog.
* The user parameter *dialog-font-size* (previously ignored in CLIM versions of the LKB) now controls font size in dialog text fields, the parse history menu, and the LKB Top window (although for the latter, a size change only takes effect after the window is re-initialised via the menu command Options->Shrink/Expand menu).
* Several cosmetic improvements to windows, including opening with dimensions more appropriate to content size, more consistent margins and info line shading, and no graphical artefacts after resizing and scrolling.
* Implemented the LkbWishlist request for user control of font size in generator result windows; these and other list-like windows (such as the 'Apply all lex rules' window) obey the *parse-tree-font-size* user parameter.
* Code reviews have prompted some re-implementation, improving maintainability and substantially speeding up parsing.
* The LKB-FOS source code is becoming stable; the subversion repository http://svn.delph-in.net/lkb/branches/fos/ should be in sync with releases of pre-built binaries. However, the build process is not fully automated and a few minor patches to McCLIM are not yet in the repository.

30 October 2018

* Implements the new TdlRfc specification of TDL syntax (from TypeDef onwards in the BNF). BlockComments may not be nested. Taking into account recent discussion on the developers list, Identifiers are specified using a 'blacklist' approach: they can contain any character apart from whitespace and a set of 20 or so punctuation characters which are used in operators etc.
* TDL error messages have been made clearer, and recovery after errors is more reliable. Error messages now give line numbers and character positions, rather than byte positions (addressing one of the suggestions in LkbWishlist). A new parameter *brief-format-messages-p* (default nil) controls whether the TDL reader outputs error and warning messages with a brief `filename:line:column' location or with a longer textual description.
* Patterns containing wildcards (e.g. in generator trigger rules) didn't work; fixed.
* Fixed a bug in unserialize-semantics-indices which caused an error with KRG.
* Reading large lexicons is much more efficient. Lexicon batch check and generator indexing are also now much faster for very large lexicons; there are significant improvements for Zhong/zhs and HeGram/lexicon_big.tdl
* In the file chooser dialog, the environment variable DELPHINHOME directory is included in the left pane 'Places' list.
* On macOS, the alt (option) key can be used in dialog text fields to enter a wider range of characters - this is particularly useful with non-western European language keyboard layouts.
* On macOS, LKB.app is a standalone application which provides a way of starting the LKB without going through the Unix command line: just double click it. It can even be put in the Dock.

15 June 2018

* Reduced memory allocation in MRS construction, resulting in a significant speed increase when there are large numbers of parses.
* Decreased the size of DAG node structures.
* More detailed diagnostic message when re-unification fails when attempting to display a tree (which could happen if an edge is only in the chart because the packing restrictor had suppressed a path that would normally have failed).
* The value for *lexdb-params* did not get set properly if changed in the settings dialog - fixed.
* Added a new parameter *gen-start-symbol*, analogous to *start-symbol* for parsing; if *gen-start-symbol* is not set then it defaults to *start-symbol*.

25 April 2018

* The postgresql lexical database facility is now enabled, and works in both Linux and macOS.
* Fixed an interface problem with scroll wheel mice, in which scrolling when the mouse pointer was over a dialog box button was erroneously interpreted as a mouse click.
* Further time and space improvements in GLB computation.

5 February 2018

* On both Linux and macOS, lui-initialize expects the LUI executable file to be on the user's PATH.
* Speed improvements to grammar loading and yet more to GLB computation, giving a big boost for grammars with large type hierarchies.
* (More consistent interpretation of the value of the find-infl-pos function for multiwords is still experimental and is not yet integrated).

7 December 2017

* More convenient viewing of the type hierarchy: the type hierarchy window commands are now Zoom In, Zoom Out, Show/Hide Types, Show/Hide Defns.
* The interface is more responsive after the 'Scoped MRS' command when there are thousands of scopings.

21 November 2017

* First coordinated release of binaries for Linux and macOS (both 64 bit).
* The standard emacs interface to the LKB is available, based on the open source SLIME package (includes 'Show source', emacs shortcut commands etc). Porting it from the proprietary Franz ELI package gave an opportunity to fix bugs such as missing functions.
* Type hierarchy windows now display the ancestors as well as the descendants of the target type.
* Added zoom in and out commands to type hierarchy windows. Also, the 'view hierarchy' dialog has a new check box to specify whether type constraints should also be displayed in the hierarchy.
* Enabled the menu commands 'Apply lex rule...' and 'Apply all lex rules' on all relevant feature structure and lexical rule application windows. Also in this interface, when a lexical rule application is selected but fails the location of unification failure is reported.
* More consistent use of the LKB Top window and the terminal window for output.
* Some bugs in Postscript printing are fixed, but there are still a few problems in this area.
* Very efficient GLB computation, even for large type partitions.
* Added :lkb-fos and :lkb-v5.5 to *features* to distinguish this version of the LKB.
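
  The :lkb-fos feature listed in the 21 November 2017 notes can be tested with standard Common Lisp read-time conditionals, for instance in a grammar script or user-fns file that has to work with several LKB builds. The following is only a sketch: the messages are illustrative placeholders, not part of the LKB.

```lisp
;; Sketch only: choose code at read time depending on whether :lkb-fos
;; is present in *features*. The printed messages are placeholders.
#+:lkb-fos
(format t "~&Running under LKB-FOS~%")
#-:lkb-fos
(format t "~&Running under another LKB build~%")

;; The same check can be made at run time:
(when (member :lkb-fos *features*)
  (format t "~&:lkb-fos is present in *features*~%"))
```

  The read-time form is the usual choice when the alternative branch mentions symbols or packages that exist in only one build, because the unselected form is skipped by the reader.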

9 September 2017

* The type hierarchy redundant edges bug described at http://moin.delph-in.net/LkbBugs is fixed. (Although it did not lead to incorrect results, it cluttered up the type hierarchy display).
* Hiding GLB types in the type hierarchy display sometimes led to spurious edges being shown - fixed.
* Type hierarchy display is faster for large hierarchies, i.e. 20000 types or more.
* Clearer visualisation of subtype relationships in type hierarchy window, and of chart edge relationships in chart window (all links now go from left to right).
* The last line of feature structure windows was missing - fixed.

31 August 2017

* The macOS 'Input Sources' menu can be used to switch between language scripts, and the 'Keyboard Viewer' provides convenient multilingual text entry.
* Window titles are now able to show any Unicode character.
* Much faster GLB computation for large type partitions.
* To guard against slips of the fingers, in the terminal window 3 consecutive Control-D's are required to exit.
* Grammar and lexicon files are no longer subject to Unicode NFC normalisation.

3 August 2017

* Initial release.