- create-parser
-
create-parser grammarfile
Creates a parser with grammarfile.
The first parser created is assigned to the variable defaultparser.
- parse
-
parse sentence (parser)
Parses the sentence using the given parser.
If no parser is given, $defaultparser is used.
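As a rough illustration of create-parser and parse, a minimal session might look like the sketch below. The grammar file name and the sentence are hypothetical placeholders, and the commands are assumed to be typed at the XLE prompt.

```tcl
# Hypothetical grammar file; substitute the path to your own grammar.
create-parser mygrammar.lfg

# With no parser argument, parse falls back to $defaultparser.
parse {The dog barks.}

# The same parse with the parser named explicitly.
parse {The dog barks.} $defaultparser
```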
- parse-word
-
parse-word word (cat) (parser)
Parses the word using the given category and parser.
If no category is given, uses the categories of sub-lexical rules.
If no parser is given, $defaultparser is used.
- parse-lattice
-
parse-lattice fsmfile (parser)
Parses the lattice stored in fsmfile using the given parser.
If no parser is given, $defaultparser is used.
Uses the PARSECAT property of the fsmfile as the root category if given.
- morphemes
-
morphemes string (twolevel) (parser)
Prints the morphemes in the given string.
Multi-character tags are quoted.
If 'twolevel' is the second argument, prints the two-level
representation of the morphemes, with the output of the
morphology on the upper side, and the output of the
tokenizer on the lower side.
Uses $defaultparser if parser is not given.
- tokens
-
tokens string (parser)
Prints the tokens in the given string.
Multi-character tags are quoted.
Uses $defaultparser if parser is not given.
- apply-to-tokens
-
apply-to-tokens network (parser)
Applies the network to the output of the parser's tokenizer.
network can be a c-fsm file or expression.
Uses $defaultparser if parser is not given.
- analyze-string
-
analyze-string string (limit) (parser)
Lists the paths through the input string produced
by the morph config of the current grammar.
If limit is token, only tokenization is done.
If limit is morph, tokenization is skipped and
morphological analysis is applied to a single word or multiword.
Uses $defaultparser if parser is not given.
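The inspection commands above (tokens, morphemes, analyze-string) can be compared side by side on the same input. This is only a sketch: the input strings are made up, and a parser is assumed to have been created already so that $defaultparser is set.

```tcl
# Output of the tokenizer only.
tokens {the dogs bark}

# Morphemes, then the two-level view
# (morphology output on the upper side, tokenizer output on the lower side).
morphemes {the dogs bark}
morphemes {the dogs bark} twolevel

# analyze-string restricted to tokenization ...
analyze-string {the dogs bark} token

# ... and restricted to morphological analysis of a single word.
analyze-string dogs morph
```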
- save-morphemes-to-file
-
save-morphemes-to-file string filename (parser)
Applies the morphology of parser to the given string.
Saves the resulting network on filename.
Uses $defaultparser if parser is not given.
- save-tokens-to-file
-
save-tokens-to-file string filename (parser)
Applies the tokenizer of parser to the given string.
Saves the resulting network on filename.
Uses $defaultparser if parser is not given.
- grammar-files
-
grammar-files (chart)
Prints the grammar files related to chart.
Uses $defaultchart if chart is not given.
- parse-file
-
parse-file file outputdir
Parses the text file using $defaultparser.
Prints the resulting prolog files in outputdir.
If the parser has a property_weights_file, then parse-file
will only print the most probable parse. Disable this with:
setx property_weights_file "" $defaultparser
parse-file file (-parser parser) (-parseProc proc) (-parseData data)
This pattern is like parse-testfile except that the sentences
are broken using the sentence breaker in parser.
If no parser is given, $defaultparser is used.
parse-file file outputdir is equivalent to
parse-file file -parseProc defaultParseFileProc -parseData outputdir
- parse-testfile
-
parse-testfile filename (start (stop))
(-parser parser)
(-parseProc parseProc)
(-parseData parseData)
(-statsPrefix statsPrefix)
(-outputPrefix outputPrefix)
(-goldPrefix goldPrefix)
(-mostProbable)
Parses each of the sentences in filename from start to stop.
If a value is given for start but not stop, then
parse-testfile will only parse that sentence.
(e.g. parse-testfile foo 7 will just parse sentence 7).
If neither start nor stop is given, then the whole testfile is parsed.
start and stop can be the index of the desired sentence or
a string to match in the desired sentence.
If stop is "end", then XLE will parse to the end of the testfile.
parser defaults to $defaultparser.
If parseProc is given, then 'parseProc parser $sentence $index parseData'
will be called on every sentence instead of defaultParseProc.
If statsPrefix is given, then parse-testfile will print
the stats files on statsPrefix.new, statsPrefix.stats,
and statsPrefix.errors.
If outputPrefix is given, then defaultParseProc will print
a packed prolog file for sentence 1 on outputPrefixS1.pl,
a packed prolog file for sentence 2 on outputPrefixS2.pl,
and so on.
If outputPrefix ends with the slash character
and the sentence doesn't have its own sentence ID, then
the sentence ID will be S followed by the sentence number.
If goldPrefix is given, then defaultParseProc will compare
the output file with the corresponding file in goldPrefix
using triples match and return the f-score.
If '-mostProbable' is given with outputPrefix then
parse-testfile will only print the most probable analysis for each sentence.
See the documentation for more details.
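The start/stop conventions and the -parseProc hook described above could be exercised roughly as follows. The testfile name basic.tfl, the prefix out/basic, the proc name logAndParse, and the tag run1 are all hypothetical. The proc signature follows the call pattern 'parseProc parser $sentence $index parseData' given in this entry; what a custom proc is expected to return is not covered here, so this sketch simply parses and logs.

```tcl
# Whole testfile, a single sentence, and an open-ended range.
parse-testfile basic.tfl
parse-testfile basic.tfl 7
parse-testfile basic.tfl 7 end

# Packed prolog output on out/basicS1.pl, out/basicS2.pl, ...,
# keeping only the most probable analysis for each sentence.
parse-testfile basic.tfl -outputPrefix out/basic -mostProbable

# A custom per-sentence procedure (hypothetical name).
proc logAndParse {parser sentence index data} {
    puts "$data, sentence $index: $sentence"
    parse $sentence $parser
}
parse-testfile basic.tfl -parseProc logAndParse -parseData run1
```

Because -parseProc replaces defaultParseProc, the -outputPrefix and -goldPrefix behavior, which this entry attributes to defaultParseProc, should not be expected from the custom proc unless it implements that behavior itself.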
- make-testfile
-
make-testfile filename (testfilename) (parser)
Breaks filename into sentences and prints the
sentences on testfilename.
If testfilename is not given, uses filename.new.
If parser is not given, uses $defaultparser.
- diff-testfiles
-
diff-testfiles filename1 (filename2)
Looks for mismatches in the number of solutions for
sentences in filename1 and filename2. It also compares
the expected number of solutions against the actual
number of solutions in filename2. If filename2 is
omitted, it defaults to filename1.
- sort-stats
-
sort-stats filename (compareFn)
Sorts the statistics in filename using compareFn.
Writes the results on filename.sorted.
The default compareFn is subtrees-per-second.
Also possible: subtrees-per-word, words-per-second,
seconds-per-sentence, words-per-sentence,
subtrees-per-sentence, and solutions-per-sentence.
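A possible way to combine sort-stats with parse-testfile is sketched below. The testfile name and the prefix run1 are hypothetical, and it is an assumption that the statsPrefix.stats file written by parse-testfile is the file sort-stats is meant to read.

```tcl
# Write run1.new, run1.stats and run1.errors while parsing.
parse-testfile basic.tfl -statsPrefix run1

# Sort by a non-default comparison; results go to run1.stats.sorted.
sort-stats run1.stats words-per-second
```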
- set-character-encoding
-
set-character-encoding filename encoding
Sets the character encoding for filename to encoding.
set-character-encoding stdio encoding
Sets the character encoding of stdin, stdout, and stderr to encoding.
set-character-encoding prolog encoding
Sets the default character encoding of Prolog.
set-character-encoding prolog encoding chart
Sets the character encoding of Prolog for chart.
set-character-encoding * encoding
Sets the character encoding of stdio and prolog to encoding.
Sets the default character encoding of files to encoding.
- create-generator
-
create-generator grammarfile
Creates a generator with the given grammar file.
- generate-from-file
-
generate-from-file filename (enumerate) (generator)
Generates from the Prolog file in filename.
enumerate can be first or all (default is all).
generator is the generator to generate with.
generator defaults to $defaultgenerator.
Returns the number of solutions.
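Since generate-from-file returns the number of solutions, its result can be captured in a Tcl variable. In this sketch the grammar and Prolog file names are hypothetical, and create-generator is assumed to set $defaultgenerator in the same way that create-parser sets the default parser.

```tcl
# Hypothetical grammar file.
create-generator mygrammar.lfg

# Generate only the first solution from a (hypothetical) Prolog f-structure file.
set n [generate-from-file dog.pl first]
puts "generate-from-file returned: $n"
```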
- generate-from-directory
-
generate-from-directory directory (enumerate) (generator)
Generates from the Prolog files in directory.
enumerate can be first or all (default is all).
generator is the generator to generate with.
generator defaults to $defaultgenerator.
- generate-from-morphemes
-
generate-from-morphemes morphemes (generator)
Generates from the morphemes in morphemes.
generator is the generator to generate with.
generator defaults to $defaultgenerator.
- set-gen-outputs
-
set-gen-outputs (stringfile (msgfile))
Sets the standard output and standard error files
for the generator. The defaults are stdout and stderr.
These defaults may be restored by calling set-gen-outputs with
no arguments.
- close-gen-outputs
-
close-gen-outputs
Closes the standard output and standard error files
for the generator.
- set-gen-adds
-
set-gen-adds addmode addlist (OTmark) (generator)
For use with underspecified generator input.
'set-gen-adds remove addlist' replaces the current
list of features to be removed with addlist.
These features are removed from the input before generation.
'set-gen-adds add addlist' replaces the current
list of addable features with addlist.
The generator is free to add addable features
to the input as needed.
If OTmark is given, then the attributes are
added at a cost, depending on GENOPTIMALITYORDER.
'set-gen-adds ignore addlist' is like doing
'set-gen-adds remove addlist' and
'set-gen-adds add addlist'. This is useful for
generating paradigms.
NB: addlist must be enclosed in quotes.
create-generator defaults to set-gen-adds add @INTERNALATTRIBUTES.
- get-gen-adds
-
get-gen-adds addmode (generator)
'get-gen-adds add' returns a list of addable features and their OT marks.
'get-gen-adds remove' returns a list of features
to be removed from the generator input.
generator defaults to $defaultgenerator.
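The interaction between the add and remove lists may be easier to see in a short session. The feature names and the OT mark PersAdded below are hypothetical; single-feature lists are used so that no assumption about the list separator is needed.

```tcl
# Remove this feature from the input before generation.
set-gen-adds remove "TNS-ASP"

# Let the generator add NUM freely.
set-gen-adds add "NUM"

# Each 'add' call replaces the addable list, so after this line
# only PERS is addable, at a cost governed by the mark PersAdded.
set-gen-adds add "PERS" PersAdded

# Inspect the current settings of $defaultgenerator.
puts [get-gen-adds add]
puts [get-gen-adds remove]
```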
- regenerate
-
regenerate string
Parses string using the default parser, picks the
first f-structure, and generates from it using the
grammar used by the default parser.
- regenerate-testfile
-
regenerate-testfile testfile (start) (stop)
This is like parse-testfile, only it regenerates each
item in the test file instead of parsing it.
Checks to make sure that the input to the parser
is contained in the output of the generator.
- regenerate-morphemes
-
regenerate-morphemes (parser)
Tests the generation morphology by applying it to
the morphemes of the valid trees in the given parser.
If no parser is given, then $defaultparser is used.
- debug-gen-input
-
debug-gen-input bad_fs good_fs (generator)
Tries to determine why the f-structure in bad_fs
doesn't generate while the f-structure in good_fs does.
If no generator is given, then $defaultgenerator is used.
- print-tree
-
print-tree (filename) (window)
Prints a tree as a postscript file.
If filename="*" or not given, chooses a new file name.
If window is not given, prints the tree window of the first chart.
- print-tree-as-sexp
-
print-tree-as-sexp (filename) (window)
Prints a tree as an sexp.
If filename="*" or not given, chooses a new file name.
If window is not given, prints the tree window of the first chart.
- find-tree
-
find-tree filename (chart)
Finds a tree in the given chart.
If no chart is given, $defaultchart is used.
filename must contain a tree printed by print-tree-as-sexp.
- print-tree-morphemes
-
print-tree-morphemes tree filename
Prints the leaf morphemes of the given tree on the given file.
tree can be a DTree, a Graph, or a Chart.
- get-tree-morphemes
-
get-tree-morphemes (tree)
Extracts the leaf morphemes of the given tree.
tree can be a DTree, a Graph, or a Chart.
tree defaults to the default parser.
- print-fs
-
print-fs (filename) (window)
Prints an fstructure as a postscript file.
If filename="*" or not given, chooses a new filename
unless the sentence has a SENTENCE_ID associated
with it, in which case it includes its value.
If window is not given, prints the phi projection.
- print-fs-as-lex
-
print-fs-as-lex (filename) (window)
Prints an fstructure as a lexical entry.
If filename="*" or not given, chooses a new filename
unless the sentence has a SENTENCE_ID associated
with it, in which case it includes its value.
If window is not given, prints the phi projection.
- print-fs-as-prolog
-
print-fs-as-prolog (filename) (window)
Prints an fstructure as a Prolog term.
If filename="*" or not given, chooses a new filename
unless the sentence has a SENTENCE_ID associated
with it, in which case it includes its value.
If window is not given, prints the phi projection.
- print-chart-graph
-
print-chart-graph (filename) (parser)
Prints the chart as a contexted lexical entry.
If filename="*" or not given, chooses a new filename
unless the sentence has a SENTENCE_ID associated
with it, in which case it includes its value.
If parser is not given, prints $defaultparser.
- print-prolog-chart-graph
-
print-prolog-chart-graph (filename) (chart) (selected)
Prints the chart as a contexted prolog term.
If filename="*" or not given, chooses a new filename
unless the sentence has a SENTENCE_ID associated
with it, in which case it includes its value.
If printing something from read-prolog-chart-graph,
then the same file name is used by default
(if the file name doesn't have a slash in it, then
the file is printed to default_print_dir).
If selected is set to 1, the currently selected choices
are included in the prolog term. selected defaults to 0.
If chart is not given, prints $defaultparser.
chart can be either a Chart or a Graph.
- read-prolog-chart-graph
-
read-prolog-chart-graph filename (chart)
Inverse of print-prolog-chart-graph.
Displays the packed f-structure.
If there are no c-structure facts, then the prolog variable
ids are used instead of edge ids in the f-structure display.
If chart is not given, uses an internal chart.
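Because read-prolog-chart-graph is the inverse of print-prolog-chart-graph, a packed analysis can be saved and later redisplayed. The sketch below assumes a parser already exists; the file name dog.pl is hypothetical, and the directory the file ends up in may depend on settings not covered in this entry.

```tcl
# Parse, then save the packed result as a contexted prolog term.
parse {The dog barks.}
print-prolog-chart-graph dog.pl

# Later: reload the file and display the packed f-structure
# (an internal chart is used because no chart is given).
read-prolog-chart-graph dog.pl
```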
- display-prolog-files
-
display-prolog-files filelist (chart)
Displays the first Prolog file in filelist.
Use "next-sentence" and "prev-sentence" to see other files.
If filelist is a directory, displays the Prolog
files in that directory.
If chart is not given, uses an internal chart.
- make-bug-grammar
-
make-bug-grammar (chart) (-targetdir targetdir)
Makes a self-contained copy of the grammar that is useful
for reporting bugs. All of the files are copied into targetdir,
which makes it easy to tar and distribute the grammar.
chart defaults to $defaultchart. targetdir defaults to
the next bug directory that is a sister of the grammar's directory.
- make-release-grammar
-
make-release-grammar (chart) (-targetdir targetdir) (-noencrypt)
Makes a self-contained copy of the grammar that is useful
for giving to others. All of the files are copied into targetdir,
which makes it easy to tar and distribute the grammar.
chart defaults to $defaultchart. targetdir defaults to
release-$date at the same level as the grammar's directory.
Files listed in ENCRYPTFILES are encrypted using encrypt-lexicon-file
unless -noencrypt is given.
- encrypt-lexicon-file
-
encrypt-lexicon-file filename
Encrypts the headwords in filename.
Encrypted headwords begin with a space.
Prints encrypted file on filename.new.
- encrypt-property-weights-file
-
encrypt-property-weights-file filename
Encrypts the headwords in filename.
Encrypted headwords begin with a space.
Prints encrypted file on filename.new.
- print-rule
-
print-rule rulename (chart)
Prints the definition of rulename in LFG notation.
The rule is looked up in the grammar given by chart.
chart defaults to $defaultchart.
- print-rule-network
-
print-rule-network rulename (chart)
Prints the rule network for rulename of the grammar given in chart.
chart defaults to $defaultchart.
- print-rule-statistics
-
print-rule-statistics chart
Prints statistics for the rules in the grammar of chart.
- print-lex-entry
-
print-lex-entry headword (cat) (chart)
Prints the effective entry for headword in LFG notation.
If cat is non-empty, only print that category.
Prints the entry in the grammar given by chart.
chart defaults to $defaultchart.
- print-feature-declarations
-
print-feature-declarations (chart)
Prints feature declarations in the grammar of the given chart.
chart defaults to $defaultparser.
- print-unused-feature-declarations
-
print-unused-feature-declarations (chart)
Prints unused feature declarations in the grammar of the given chart.
chart defaults to $defaultparser.
- print-unused-grammar-choices
-
print-unused-grammar-choices (rule) (chart)
Prints the unused grammar choices in rule.
rule defaults to all the rules.
chart defaults to $defaultparser.
- print-prolog-grammar
-
print-prolog-grammar filename (-cfg) (-lexentries)
Prints the current grammar onto filename using Prolog.
If -cfg is included, print the rules in CFG notation.
If -lexentries is included, print the lexical entries.
- print-token-headwords
-
print-token-headwords filename (-cats) (chart)
Prints the token headwords for chart onto filename.
If -cats is included, print the categories as well.
- set-performance-vars
-
set-performance-vars filename (chart)
Sets the performance variables for the given chart.
chart defaults to $defaultparser.
- print-performance-vars
-
print-performance-vars (filename) (chart)
Prints the performance variables of the given chart on filename.
chart defaults to $defaultparser.
If no filename is given, prints on stdout.
- sum-print-subtrees
-
sum-print-subtrees filename (outfilename)
Prints a summary of a print_subtrees output
that has been stored on filename.
If outfilename is given, prints summary on this file.
If sum-print-subtrees-cutoff is set, then only those
lines that exceed the cutoff will be printed.
Note: there is no need to delete spurious lines
when storing the output of print_subtrees.
- chart-statistics
-
chart-statistics (type) (cutoff)
Prints statistics on the number of times each category
appears in the chart and the number of times that each
mother category dominates a non-lexical daughter category.
This is useful for making the grammar faster.
type = all prints all of the edges.
type = graphs (the default) prints only edges with graphs.
type = nogoods prints only nogood edges.
Lines that have fewer items than cutoff are suppressed.
- cat-ranges
-
cat-ranges category (type) (showSubtreeCounts)
Prints the edge ranges of the specified category.
type = all prints all of the edges.
type = graphs (the default) prints only edges with graphs.
- print-ambiguity-sources
-
print-ambiguity-sources (chart)
Prints possible sources of ambiguity for the given chart.
chart defaults to $defaultchart.
print-ambiguity-sources edge# (chart)
Prints possible sources of ambiguity for the given edge.
chart defaults to $defaultchart.
print-ambiguity-sources edge#:subtree# (chart)
Prints possible sources of ambiguity for the given subtree.
chart defaults to $defaultchart.
- setx
-
setx var value (chart)
Like set, only it verifies that var already exists.
This is useful for detecting typos.
If chart is given, then stores the value of var on chart.
- getx
-
getx var (chart)
Verifies that var already exists.
This is useful for detecting typos.
If chart is given, then retrieves the value of var from chart.
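A short sketch of how setx and getx might be paired. It reuses property_weights_file, the variable named in the parse-file entry, and assumes that getx returns the retrieved value so it can be printed.

```tcl
# Clear the weights file on the default parser, then read the value back.
setx property_weights_file "" $defaultparser
puts [getx property_weights_file $defaultparser]

# A misspelled name such as the one below is the kind of typo setx
# is meant to catch, since the variable does not already exist:
# setx property_wieghts_file "" $defaultparser
```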
- set_stdout
-
set_stdout (filename (mode))
Sets the standard output stream to the given file.
Mode can be w (write) or a (append).
Mode defaults to w (write).
If no filename is given, use the original stdout.
- set_stderr
-
set_stderr (filename (mode))
Sets the standard error stream to the given file.
Mode can be w (write) or a (append).
Mode defaults to w (write).
If no filename is given, use the original stderr.
If filename is stdout, use the current stdout.
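The two redirection commands can be combined to collect both streams in one log. This is a sketch with a hypothetical log file name and sentence.

```tcl
# Append standard output to run.log, and send standard error
# to wherever standard output currently points.
set_stdout run.log a
set_stderr stdout

parse {The dog barks.}

# Restore the original streams.
set_stdout
set_stderr
```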
- close-all
-
close-all (chart)
Closes all of the XLE windows.
If chart is given, only closes the chart's windows.
- close-all-except
-
close-all-except chart
Closes all of the XLE windows except those associated with chart.
- set-OT-rank
-
set-OT-rank mark1 mark2 (chart)
Sets mark1 to have the same rank as mark2.
If mark2 is a number, makes mark1 have that rank.
Returns the original rank of mark1.
chart defaults to $defaultchart.
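Since set-OT-rank returns the original rank of mark1 and accepts a number as mark2, a rank can be changed temporarily and then put back. The mark names below are hypothetical, and it is an assumption that the returned rank is a number that can be passed back in directly.

```tcl
# Give MarkA the same rank as MarkB, remembering MarkA's old rank.
set oldRank [set-OT-rank MarkA MarkB]

parse {The dog barks.}

# Restore MarkA's original rank.
set-OT-rank MarkA $oldRank
```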
- prepend-tokenizer
-
prepend-tokenizer tokenizer (chart)
Prepends the given tokenizer to the tokenizer cascade in
chart.
Acts as if the tokenizer had been prepended to TOKENIZE in the
morph config.
The tokenizer can be an entry like {[patterns onomasticon.fst]}.
This only affects the chart, not the underlying grammar.
chart defaults to $defaultchart.
- pop-tokenizer
-
pop-tokenizer (chart)
Pops the first tokenizer off the tokenizer cascade in
chart.
This only affects the chart, not the underlying grammar.
chart defaults to $defaultchart.
- prepend-analyzer
-
prepend-analyzer analyzer (chart)
Prepends the given analyzer to the analyzer cascade in
chart.
Acts as if the analyzer had been prepended to ANALYZE USEALL in the
morph config.
The analyzer can be an entry like {[patterns onomasticon.fst]}.
This only affects the chart, not the underlying grammar.
chart defaults to $defaultchart.
- prepend-priority-analyzer
-
prepend-priority-analyzer analyzer (chart)
Prepends the given analyzer to the analyzer cascade in
chart.
Acts as if the analyzer had been prepended to ANALYZE USEFIRST in the
morph config.
The analyzer can be an entry like {[patterns onomasticon.fst]}.
This only affects the chart, not the underlying grammar.
chart defaults to $defaultchart.
- prepend-multiwords
-
prepend-multiwords transducer (chart)
Prepends the given transducer to the multiword cascade in
chart.
Acts as if the transducer had been prepended to MULTIWORD in the
morph config.
The transducer cannot be an entry like {[patterns onomasticon.fst]}.
This only affects the chart, not the underlying grammar.
chart defaults to $defaultchart.
- create-transducer
-
create-transducer infile outfile
Creates a morphological transducer in outfile
from the specifications in infile.
See the XLE morphology documentation for details.
- count_statistical_features
-
count_statistical_features graph (-per-sentence)
Counts the statistical features in graph.
If -per-sentence is given, then only count one instance of each feature.
(-per-sentence is useful when the sentences are packed.)
The counts can be accumulated using parse-testfile
with a -parseProc that calls count_statistical_features.
They can be printed with print_statistical_features.
- print_statistical_features
-
print_statistical_features chart filename (cutoff)
Prints the accumulated feature counts for chart to filename.
If cutoff is non-zero, then only print features that have
at least cutoff counts and don't print the counts.
- read_statistical_features
-
read_statistical_features chart filename
Adds the feature counts in filename to the
accumulated counts in chart.
This is useful for merging feature counts from different corpora.