Description
A concrete syntax tree or parse tree is an ordered, rooted tree that represents the syntactic structure of a string according to some formal grammar. In Rascal parse trees, the interior nodes are labeled by rules of the grammar, while the leaf nodes are labeled by terminals (characters) of the grammar.
Tree is the universal parse tree data type in Rascal and can be used to represent parse trees for any language.
- Tree is a subtype of the type Node.
- All SyntaxDefinition types (non-terminals) are sub-types of Tree.
- All ConcreteSyntax expressions produce parse trees, the types of which are non-terminals.
- Trees can be annotated in various ways, see IDEConstruction features. Most importantly, the \loc annotation always points to the source location of any (sub) parse tree.
Parse trees are usually analyzed and constructed using ConcreteSyntax expressions and patterns.
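As a minimal sketch of what this looks like in practice, the module below matches a parse tree with a concrete pattern and builds a new one with a concrete expression. The grammar (Id, Whitespace, Exp) and the functions swap and swapText are hypothetical names introduced only for this illustration; parse and unparse are the ParseTree library functions listed further down.

```rascal
module ConcreteSketch

import ParseTree;

// Hypothetical toy grammar, introduced only for this illustration.
lexical Id = [a-z]+ !>> [a-z];
layout Whitespace = [\ \t\n]* !>> [\ \t\n];

syntax Exp
  = var: Id
  | left add: Exp "+" Exp
  ;

// The concrete pattern on the left takes the tree apart;
// the concrete expression on the right constructs a new tree.
Exp swap((Exp) `<Exp l> + <Exp r>`) = (Exp) `<Exp r> + <Exp l>`;

// Any other expression is returned unchanged.
default Exp swap(Exp e) = e;

// Parse a string, rewrite the resulting tree, and turn it back into text.
str swapText(str input) = unparse(swap(parse(#Exp, input)));
```

Under these assumptions, calling swapText on an input such as "a + b" should parse it as an Exp, exchange the two operands, and return the text of the rewritten tree.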
Advanced users may want to create tools that analyze any parse tree, regardless of the SyntaxDefinition that generated it. Such tools can manipulate parse trees on the abstract level.
The Tree entry contains the full definition of Tree, Production and Symbol. A parse tree is a nested tree structure of type Tree.
- Most internal nodes are applications (appl) of a Production to a list of child Tree nodes. Production is the abstract representation of a SyntaxDefinition rule, which consists of a definition of an alternative for a Symbol by a list of Symbols.
- The leaves of a parse tree are always characters (char), each represented by its integer Unicode code point.
- Some internal nodes encode ambiguity (amb) by pointing to a set of alternative Tree nodes.
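The three kinds of nodes can be handled directly on the abstract level, one case per constructor. The following is a sketch only: textOf is a hypothetical helper, not part of the library, and it assumes that picking an arbitrary alternative of an amb node is acceptable because all alternatives cover the same text. The library function unparse already provides this behavior; the point here is the shape of the traversal.

```rascal
module TreeWalkSketch

import ParseTree;
import Set;
import String;

// Hypothetical helper: rebuild the text of a parse tree by hand,
// with one case for each kind of Tree node.

// appl: concatenate the text of all children, in order.
str textOf(appl(Production _, list[Tree] args)) = ("" | it + textOf(a) | Tree a <- args);

// char: convert the integer character code into a one-character string.
str textOf(char(int c)) = stringChar(c);

// amb: assumption: any alternative will do, so pick one.
str textOf(amb(set[Tree] alts)) = textOf(getOneFrom(alts));

// Any other kind of Tree node contributes no text in this sketch.
default str textOf(Tree _) = "";
```

Because the functions match on appl, char and amb rather than on a specific non-terminal, the same code applies to parse trees of any SyntaxDefinition.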
The Production and Symbol types are an abstract notation for rules in SyntaxDefinitions, while the Tree type is the actual notation for parse trees.
Parse trees are called parse forests when they contain amb nodes.
You can analyze and manipulate parse trees in three ways:
The type of a parse tree is the symbol that its production produces, e.g. appl(prod(sort("A"),[],{}),[]) has type A. Each such non-terminal type has Tree as its immediate super-type.
The ParseTree library provides:
- associativity: Choice under associativity is flattened.
- Condition: constructors for declaring preconditions and postconditions on symbols
- doc: Annotate a parse tree node with a documentation string.
- docs: Annotate a parse tree node with documentation strings for several locations.
- implode: Implode a parse tree according to a given (ADT) type.
- isNonTerminalType:
- link: Annotate a parse tree node with the target of a reference.
- links: Annotate a parse tree node with multiple targets for a reference.
- loc: Annotate a parse tree node with a source location.
- message: Annotate a parse tree node with an (error) message.
- messages: Annotate a parse tree node with a list of (error) messages.
- parse: Parse input text (from a string or a location) and return a parse tree.
- priority: Nested priority is flattened.
- Production:
- saveParser: Save the current object parser to a file.
- Symbol:
- Tree: The Tree data type as produced by the parser.
- treeAt: Select the innermost Tree of a given type which is enclosed by a given location.
- TreeSearchResult: Tree search result type for treeAt.
- unparse: Yield the string of characters that form the leaves of the given parse tree.
Pitfalls
For historical reasons the name of the annotation is "loc", and this interferes with the Rascal keyword loc for the type of Locations. Therefore the annotation name has to be escaped as \loc when it is declared or used.
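A short sketch of the escaped name in use: locationOf is a hypothetical helper, and the fallback location |unknown:///| is an assumption chosen for this example to cover trees that carry no annotation.

```rascal
module LocSketch

import ParseTree;

// Hypothetical helper: read the source location of a (sub)tree.
// Note the backslash: an unescaped loc would be read as the type keyword.
// The ? operator supplies a fallback when the annotation is not set.
loc locationOf(Tree t) = t@\loc ? |unknown:///|;
```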