DynGenPar
Dynamic Generalized Parser
|
main class More...
Public Member Functions | |
Parser (TokenSource *tokenSource) | |
virtual | ~Parser () |
bool | isToken (CatArg cat) const |
void | addToken (CatArg cat) |
bool | isLiteral (const QList< Cat > &list) const |
is a given list of categories a literal? More... | |
void | initCaches () |
clears all caches, then computes the nullable categories and the initial graph More... | |
void | addRule (CatArg cat, const Rule &rule) |
adds a new rule to the grammar, updates the nullable categories and the initial graph and clears the other caches More... | |
void | loadCfg (const Cfg &cfg) |
Cfg | getCfg () |
get a Cfg object back from the parser, for serialization More... | |
bool | loadPmcfg (const Pmcfg &pmcfg) |
loads a PMCFG in standard form, converting it to the internal representation More... | |
bool | addPmcfgRule (Pmcfg &pmcfg, CatArg cat, const Rule &rule) |
adds a new rule to the grammar (both the PMCFG and the internal representation), updates the nullable categories and the initial graph and clears the other caches More... | |
QList< Match > | parse (int *errorPos=0, Cat *errorToken=0, int *incrementalPos=0, QList< StackItem > *incrementalStacks=0, QList< Match > *incrementalMatches=0, LexerState *lexerState=0) |
parse the input string More... | |
QList< Match > | parse (ParseState *parseState) |
overloaded version using ParseState, for bindings More... | |
void | saveState (ParseState *parseState) |
saves the current parser state, only meaningful during a parse operation More... | |
Predictions | computePredictions (const QList< StackItem > &stacks) const |
compute a set of predictions from the stacks produced by an incremental parse More... | |
Predictions | computePredictions (const ParseState &parseState) const |
overloaded version using ParseState, for bindings More... | |
QHash< Cat, QSet< Cat > > | expandNonterminalPrediction (CatArg cat) const |
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
QHash< Cat, QSet< Cat > > | expandNonterminalPredictionC (CatArg cat) |
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
MultiPredictions | computeMultiPredictions (const QList< StackItem > &stacks) const |
compute a set of multi-token predictions from the stacks produced by an incremental parse More... | |
MultiPredictions | computeMultiPredictions (const ParseState &parseState) const |
overloaded version using ParseState, for bindings More... | |
QHash< Cat, QSet< QList< Cat > > > | expandNonterminalPredictionMulti (CatArg cat) const |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
QHash< Cat, QSet< QList< Cat > > > | expandNonterminalPredictionMultiC (CatArg cat) |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
ConstrainedPredictions | computeConstrainedPredictions (const QList< StackItem > &stacks) const |
compute a set of predictions from the stacks produced by an incremental parse More... | |
ConstrainedPredictions | computeConstrainedPredictions (const ParseState &parseState) const |
overloaded version using ParseState, for bindings More... | |
QHash< Cat, QSet< Cat > > | expandNonterminalPredictionC (CatArg cat, const NextTokenConstraints &nextTokenConstraints) |
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
QHash< Cat, QSet< Cat > > | expandNonterminalPredictionC (CatArg cat, const QList< NextTokenConstraints > &nextTokenConstraintsList) |
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
ConstrainedMultiPredictions | computeConstrainedMultiPredictions (const QList< StackItem > &stacks) const |
compute a set of multi-token predictions from the stacks produced by an incremental parse More... | |
ConstrainedMultiPredictions | computeConstrainedMultiPredictions (const ParseState &parseState) const |
overloaded version using ParseState, for bindings More... | |
QHash< Cat, QSet< QList< Cat > > > | expandNonterminalPredictionMultiC (CatArg cat, const NextTokenConstraints &nextTokenConstraints) |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
QHash< Cat, QSet< QList< Cat > > > | expandNonterminalPredictionMultiC (CatArg cat, const QList< NextTokenConstraints > &nextTokenConstraintsList) |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More... | |
Public Attributes | |
grammar | |
RuleSet | rules |
grammar rules More... | |
TokenSet | tokens |
tokens More... | |
Cat | startCat |
start category More... | |
additional information needed for PMCFGs | |
QHash< Cat, QPair< Cat, QList< Cat > > > | pseudoCats |
pseudo-categories, used to represent PMCFGs internally More... | |
QHash< Cat, QPair< Cat, int > > | componentCats |
maps categories which represent components of a multi-component category to the category and component index they represent More... | |
QHash< Cat, QList< Cat > > | catComponents |
maps multi-component categories to the list of their components More... | |
Protected Attributes | |
TokenSource * | inputSource |
input source More... | |
main class
Definition at line 1158 of file dyngenpar.h.
|
inline |
Definition at line 1161 of file dyngenpar.h.
|
inlinevirtual |
Definition at line 1163 of file dyngenpar.h.
adds a new rule to the grammar (both the PMCFG and the internal representation), updates the nullable categories and the initial graph and clears the other caches
true
on success, false
on failureDefinition at line 974 of file dyngenpar.cpp.
adds a new rule to the grammar, updates the nullable categories and the initial graph and clears the other caches
Definition at line 689 of file dyngenpar.cpp.
|
inline |
Definition at line 1165 of file dyngenpar.h.
ConstrainedMultiPredictions DynGenPar::Parser::computeConstrainedMultiPredictions | ( | const QList< StackItem > & | stacks | ) | const |
compute a set of multi-token predictions from the stacks produced by an incremental parse
An extended category is either a nonterminal or a "literal", meaning a nonempty list of tokens appearing in sequence in a rule.
The table is represented as a (possibly multi-valued) hash table mapping a list of categories (containing either exactly one nonterminal or at least one terminal) to
For terminal/literal predictions, the constraints are immediately validated. For nonterminal predictions, this must be done during expansion.
Definition at line 3299 of file dyngenpar.cpp.
|
inline |
overloaded version using ParseState, for bindings
Definition at line 1240 of file dyngenpar.h.
ConstrainedPredictions DynGenPar::Parser::computeConstrainedPredictions | ( | const QList< StackItem > & | stacks | ) | const |
compute a set of predictions from the stacks produced by an incremental parse
For terminal predictions, the constraints are immediately validated. For nonterminal predictions, this must be done during expansion.
Definition at line 3147 of file dyngenpar.cpp.
|
inline |
overloaded version using ParseState, for bindings
Definition at line 1229 of file dyngenpar.h.
MultiPredictions DynGenPar::Parser::computeMultiPredictions | ( | const QList< StackItem > & | stacks | ) | const |
compute a set of multi-token predictions from the stacks produced by an incremental parse
An extended category is either a nonterminal or a "literal", meaning a nonempty list of tokens appearing in sequence in a rule.
The table is represented as a (possibly multi-valued) hash table mapping a list of categories (containing either exactly one nonterminal or at least one terminal) to
Definition at line 2806 of file dyngenpar.cpp.
|
inline |
overloaded version using ParseState, for bindings
Definition at line 1218 of file dyngenpar.h.
Predictions DynGenPar::Parser::computePredictions | ( | const QList< StackItem > & | stacks | ) | const |
compute a set of predictions from the stacks produced by an incremental parse
Definition at line 2515 of file dyngenpar.cpp.
|
inline |
overloaded version using ParseState, for bindings
Definition at line 1210 of file dyngenpar.h.
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also expand each category only once because we do not need the full parse trees, only the last category
Definition at line 2575 of file dyngenpar.cpp.
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints
Definition at line 2774 of file dyngenpar.cpp.
QHash< Cat, QSet< Cat > > DynGenPar::Parser::expandNonterminalPredictionC | ( | CatArg | cat, |
const NextTokenConstraints & | nextTokenConstraints | ||
) |
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints
This overload also enforces the next token constraints passed as a second argument.
Definition at line 3215 of file dyngenpar.cpp.
QHash< Cat, QSet< Cat > > DynGenPar::Parser::expandNonterminalPredictionC | ( | CatArg | cat, |
const QList< NextTokenConstraints > & | nextTokenConstraintsList | ||
) |
expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints
This overload also enforces the disjunctive (inclusive OR) list of next token constraints passed as a second argument, i.e. if any of the next token constraint sets in nextTokenConstraintsList matches, the prediction is accepted.
Definition at line 3242 of file dyngenpar.cpp.
QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMulti | ( | CatArg | cat | ) | const |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also expand each category only once because we do not need the full parse trees, only the last category
Definition at line 2918 of file dyngenpar.cpp.
QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMultiC | ( | CatArg | cat | ) |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints
Definition at line 3124 of file dyngenpar.cpp.
QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMultiC | ( | CatArg | cat, |
const NextTokenConstraints & | nextTokenConstraints | ||
) |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints
This overload also enforces the next token constraints passed as a second argument.
Definition at line 3397 of file dyngenpar.cpp.
QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMultiC | ( | CatArg | cat, |
const QList< NextTokenConstraints > & | nextTokenConstraintsList | ||
) |
expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent
only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens
also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints
This overload also enforces the disjunctive (inclusive OR) list of next token constraints passed as a second argument, i.e. if any of the next token constraint sets in nextTokenConstraintsList matches, the prediction is accepted.
Definition at line 3425 of file dyngenpar.cpp.
|
inline |
get a Cfg object back from the parser, for serialization
Definition at line 1176 of file dyngenpar.h.
void DynGenPar::Parser::initCaches | ( | void | ) |
clears all caches, then computes the nullable categories and the initial graph
should be called after each direct grammar modification (addRule takes care of updating the caches, which is more efficient than clearing.)
Definition at line 578 of file dyngenpar.cpp.
is a given list of categories a literal?
true
if the given list of categories is a literal, i.e. contains only tokens, false
otherwise Definition at line 553 of file dyngenpar.cpp.
|
inline |
Definition at line 1164 of file dyngenpar.h.
|
inline |
Definition at line 1169 of file dyngenpar.h.
bool DynGenPar::Parser::loadPmcfg | ( | const Pmcfg & | pmcfg | ) |
loads a PMCFG in standard form, converting it to the internal representation
Rules containing categories which are neither tokens nor have rules are discarded, as they're unreachable and as we cannot transform them without knowing the dimension of the unused categories.
true
on success, false
on failureDefinition at line 932 of file dyngenpar.cpp.
QList< Match > DynGenPar::Parser::parse | ( | int * | errorPos = 0 , |
Cat * | errorToken = 0 , |
||
int * | incrementalPos = 0 , |
||
QList< StackItem > * | incrementalStacks = 0 , |
||
QList< Match > * | incrementalMatches = 0 , |
||
LexerState * | lexerState = 0 |
||
) |
parse the input string
[out] | errorPos | if non-NULL , is filled with -1 on success and with the number of accepted tokens before the error occurred on error (Caution: This might not be a good position indicator to show to the end user. For some token sources, lexerState contains a more user-centric position indicator which can be obtained through that token source's API.) |
[out] | errorToken | if non-NULL , is filled with 0 (epsilon) on success and with the token triggering the error on error. |
[in,out] | incrementalPos | should be NULL for a non-incremental parse, a pointer to a negative integer to start an incremental parse or a pointer to a nonnegative integer to continue an incremental parse. It will be set to the current end of input if non-NULL . |
[in,out] | incrementalStacks | should be NULL for a non-incremental, non-predictive parse or a pointer to QList<StackItem> (initially empty) for an incremental parse or when needed for prediction. It represents the internal parser states. Normally, this will be the list of stacks at the end of the parsing process. However, if an error occurred, that list would always be empty, so instead, we return the list of stacks before the token triggering the error, thus allowing to use the prediction functionality to report what token would have been expected instead of the faulty one. (An empty list of stacks means that the input was expected to end before the faulty token.) |
[in,out] | incrementalMatches | should be NULL for a non-incremental parse. For an incremental parse, it can be NULL if you do not need to get your matches back a second time when there is no new input. If set to non-NULL , it will be returned as is if there is no new input in an incremental parse, or updated to the current return value otherwise. |
[in,out] | lexerState | if non-NULL , is filled with the lexer state at the end of the (incremental) parsing process. (In case of an error, it is filled with the lexer state where the error occurred, i.e. before shifting the faulty token, to allow reporting error positions accurately.) It is useful to allow rewinding to a previous position with a stateful lexer. It can be NULL for a non-incremental or a sequential incremental parse (i.e. if rewinding is not needed) or if a stateless token source (stateless lexer, token buffer etc.) is used. When starting a new incremental parse, the lexer state pointed to should be a null (default-constructed) LexerState. If the lexerState pointer is NULL , all rewind operations for the lexer will be passed a null (default-constructed) LexerState; stateful lexers will then fail any rewind operations. |
Definition at line 2437 of file dyngenpar.cpp.
|
inline |
overloaded version using ParseState, for bindings
Definition at line 1185 of file dyngenpar.h.
|
inline |
saves the current parser state, only meaningful during a parse operation
This method is intended to be used in callbacks executed within parse. When called with no running parse operation, parseState will be reset.
Definition at line 1196 of file dyngenpar.h.
maps multi-component categories to the list of their components
used during the import of PMCFG rules in the standard representation
can be left out if the internal representation is used
Definition at line 1300 of file dyngenpar.h.
maps categories which represent components of a multi-component category to the category and component index they represent
also used to look up whether a category is a component of a multi-component category
Definition at line 1294 of file dyngenpar.h.
|
protected |
input source
Definition at line 1305 of file dyngenpar.h.
pseudo-categories, used to represent PMCFGs internally
maps a pseudo-category to:
Note that, when parsing a PMCFG, the initial graph and the set of nullable categories are the ones for the context-free approximation. The PMCFG constraints are only evaluated during the matching resp. reducing steps.
Also note that tokens are always 1-dimensional, so tokens may not be pseudo-categories nor the actual component for a pseudo-category.
Definition at line 1288 of file dyngenpar.h.
RuleSet DynGenPar::Parser::rules |
grammar rules
Definition at line 1258 of file dyngenpar.h.
Cat DynGenPar::Parser::startCat |
start category
Definition at line 1262 of file dyngenpar.h.
TokenSet DynGenPar::Parser::tokens |
tokens
Definition at line 1260 of file dyngenpar.h.