Pstone is a software package which lets students experience writing and debugging a grammar to cover a set of sentences.
In particular, it supports an assignment, typically taking 2 to 4 hours, which gives students:
The documentation consists of
Pstone runs on SunOS 5.6 (a Unix variant). It is written in C and Tcl, and uses Tcl 7.0 and Tk 4.0, which are included in the distribution. The distribution also includes the source files, so it can probably be ported to other environments.
Pstone is freeware, produced by Nigel Ward with contributions from T. Maeda.
This release is the first version (Version 1). It has been used successfully in my natural language processing graduate course, but has not been exhaustively tested.
The name `Pstone' is an amalgam of P for parser, PST for phrase structure tree, and Stone from the Stone Soup story (although I no longer have the time to stir even if people contribute juicy pieces).
Pstone's parser code is not something that you could read to learn how parsers work or how to program elegantly. There exist parsers which are much cleaner; the Natural Language Registry is a good place to look.
Pstone's parser is no-frills. More advanced parsers exist; again see the Natural Language Registry. (The reason for using simple context-free grammars is that they are simple and well understood. Students who use Pstone become aware of their weaknesses, which motivates them to learn about more advanced topics, such as probabilistic parsing, lexical disambiguation, simpler models of grammar, more powerful models of grammar, and automatic learning of grammars from corpora.)
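One well-known weakness of plain context-free grammars is structural ambiguity: a single sentence can receive several parse trees, and the grammar alone cannot choose among them. The following is a minimal sketch of that effect, not Pstone's code. The toy grammar, the lexicon, and the function name count_parses are all hypothetical, and the grammar notation is not Pstone's grammar file format. It counts parses with a CYK-style chart over a grammar in Chomsky normal form.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (NOT Pstone's format).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("NP", "D", "N"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> list of categories.
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["D"], "a": ["D"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(words, root="S"):
    """Return the number of distinct parse trees rooted in `root`."""
    n = len(words)
    # chart[i][j][X] = number of trees with root X covering words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[i][i + 1][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):          # split point
                for parent, left, right in BINARY_RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][root]


if __name__ == "__main__":
    sentence = "I saw the man with a telescope".split()
    print(count_parses(sentence))  # prints 2: one tree per PP attachment
```

The two trees differ only in where `with a telescope' attaches. Choosing between them is the kind of problem that probabilistic parsing and lexical disambiguation address.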
Pstone's corpus is tiny. For real corpora, including especially the Penn Treebank, see the Linguistic Data Consortium. (Pstone's corpus size was chosen to be small enough to allow students to approach a feeling of accomplishment, but large enough to make the assignment challenging.)
Pstone uses fairly ad hoc criteria for evaluating the parse trees produced by student grammars; better software for this includes the EVALB bracket scorer.
Invoking browser in cases where the locations of Tcl/Tk etc. were not set properly during installation results in the unenlightening error message `browser not found'.
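For readers unfamiliar with bracket scoring, here is a minimal sketch of labeled bracket precision and recall, the kind of measure a bracket scorer reports. It shows neither Pstone's evaluation criteria nor EVALB's implementation. The nested-tuple tree representation and the function names brackets and score are assumptions made for illustration. As a simplification, every labeled node counts as a bracket.

```python
from collections import Counter


def brackets(tree, start=0):
    """Collect labeled spans from a tree.

    A tree is either a word (str) or a tuple (label, child, child, ...).
    Returns (list of (label, start, end) spans, position after the tree).
    """
    if isinstance(tree, str):
        return [], start + 1
    label, *children = tree
    spans = []
    pos = start
    for child in children:
        child_spans, pos = brackets(child, pos)
        spans.extend(child_spans)
    spans.append((label, start, pos))
    return spans, pos


def score(candidate, reference):
    """Return (precision, recall) of the candidate's labeled brackets."""
    cand = Counter(brackets(candidate)[0])
    ref = Counter(brackets(reference)[0])
    matched = sum((cand & ref).values())   # multiset intersection
    precision = matched / sum(cand.values()) if cand else 0.0
    recall = matched / sum(ref.values()) if ref else 0.0
    return precision, recall


if __name__ == "__main__":
    # Reference attaches the PP inside the object NP; candidate attaches it to the VP.
    reference = ("S", ("NP", "I"),
                 ("VP", "saw",
                  ("NP", ("NP", "the", "man"),
                   ("PP", "with", ("NP", "a", "telescope")))))
    candidate = ("S", ("NP", "I"),
                 ("VP", "saw",
                  ("NP", "the", "man"),
                  ("PP", "with", ("NP", "a", "telescope"))))
    p, r = score(candidate, reference)
    print(f"precision={p:.2f} recall={r:.2f}")  # precision=1.00 recall=0.86
```

In this example the candidate tree misses only the larger NP bracket, so all six of its brackets are correct (precision 1.0) while it recovers six of the reference's seven (recall 6/7).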
Invoking the parser with a one-word sentence on the command line results in a confusing usage message.
The browser displays no trees when first invoked, until the user clicks the `prev input' or `next input' buttons.
The current set of reference trees does not include a separate test set. Thus it is currently possible for students to examine the reference parse trees and from them reverse-engineer the grammar which I used to create them, and so get a perfect match score without much effort.
It would be nice if the browser included a button for invoking the parser.
The browser could support searching for sentences which contain certain words, parse with certain non-terminals, are overly ambiguous, parse incorrectly, parse correctly with one grammar but not with another, and so on.
The browser could display a lot more information.
The appearance of the browser could be better.
Corpora for other languages would be nice.
It would be nice to have a more compact distribution for use by sites with Tcl/Tk already installed. And of course Pstone should use the latest version of Tcl/Tk.
... let me know which of these are important to you, to help me prioritize.