Synopsis Syntax definition and parser generation for new languages.
Description All source code analysis projects need to extract information directly from the source code.
There are two main approaches to this:
Examples Let's use the Exp language as an example. It contains the following elements:
module demo::lang::Exp::Concrete::WithLayout::Syntax

layout Whitespace = [\t-\n\r\ ]*;

Now you may parse and manipulate programs in the EXP language. Let's demonstrate parsing an expression:

rascal>import demo::lang::Exp::Concrete::WithLayout::Syntax;
ok
rascal>import ParseTree;
ok
rascal>parse(#start[Exp], "2+3*4");
start(sort("Exp")): `2+3*4`
Tree: appl(prod(start(sort("Exp")),[layouts("Whitespace"),label("top",sort("Exp")),layouts("Whitespace")],{}),[appl(prod(layouts("Whitespace"),[\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))],{}),[appl(regular(\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))),[])[@loc=|file://-|(0,0,<1,0>,<1,0>)]])[@loc=|file://-|(0,0,<1,0>,<1,0>)],appl(prod(sort("Exp"),[sort("Exp"),layouts("Whitespace"),lit("+"),layouts("Whitespace"),sort("Exp")],{assoc(left())}),[appl(prod(sort("Exp"),[lex("IntegerLiteral")],{}),[appl(prod(lex("IntegerLiteral"),[iter(\char-class([range(48,57)]))],{}),[appl(regular(iter(\char-class([range(48,57)]))),[char(50)])[@loc=|file://-|(0,1,<1,0>,<1,1>)]])[@loc=|file://-|(0,1,<1,0>,<1,1>)]])[@loc=|file://-|(0,1,<1,0>,<1,1>)],appl(prod(layouts("Whitespace"),[\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))],{}),[appl(regular(\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))),[])[@loc=|file://-|(1,0,<1,1>,<1,1>)]])[@loc=|file://-|(1,0,<1,1>,<1,1>)],appl(prod(lit("+"),[\char-class([range(43,43)])],{}),[char(43)]),appl(prod(layouts("Whitespace"),[\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))],{}),[appl(regular(\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))),[])[@loc=|file://-|(2,0,<1,2>,<1,2>)]])[@loc=|file://-|(2,0,<1,2>,<1,2>)],appl(prod(sort("Exp"),[sort("Exp"),layouts("Whitespace"),lit("*"),layouts("Whitespace"),sort("Exp")],{assoc(left())}),[appl(prod(sort("Exp"),[lex("IntegerLiteral")],{}),[appl(prod(lex("IntegerLiteral"),[iter(\char-class([range(48,57)]))],{}),[appl(regular(iter(\char-class([range(48,57)]))),[char(51)])[@loc=|file://-|(2,1,<1,2>,<1,3>)]])[@loc=|file://-|(2,1,<1,2>,<1,3>)]])[@loc=|file://-|(2,1,<1,2>,<1,3>)],appl(prod(layouts("Whitespace"),[\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))],{}),[appl(regular(\iter-star(\char-class([range(9,10),range(13,13),range(32,32)]))),[])[@loc=|file://-|(3,0,<1,3>,<1,3>)]])[@loc=|file://-|(3,0,<1,3>,<1,3>)],appl(prod(lit("*"),[\char-class([range(42,42)])],{}),[char(42)]),appl(prod(layouts("Whitespace"),[\iter-star(\char-class([range(9,10),range(13,13)...

First we import the syntax definition and the ParseTree module that provides the parsing functionality. Finally, we parse 2+3*4 using the start symbol Exp.
Don't be worried, we are just showing the resulting parse tree here. It is intended for programs, not for humans. The points we want to make are:
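To illustrate what "intended for programs" can mean in practice, here is a minimal sketch of a function that consumes the parse tree through concrete syntax patterns instead of printing it. The module name `EvalSketch` and the function `eval` are hypothetical names chosen for this illustration. The sketch assumes the syntax module defines `IntegerLiteral` and the `+` and `*` productions visible in the tree above; any other productions the grammar may contain (for instance bracketed expressions) are not covered.

```rascal
// Hypothetical module name, used only for this illustration.
module demo::lang::Exp::Concrete::WithLayout::EvalSketch

import demo::lang::Exp::Concrete::WithLayout::Syntax;
import ParseTree;
import String;

// Parse the text and evaluate the Exp stored in the "top" field of the start tree.
int eval(str txt) = eval(parse(#start[Exp], txt).top);

// An integer literal evaluates to its numeric value.
int eval((Exp) `<IntegerLiteral l>`) = toInt("<l>");

// Binary operators evaluate both operands recursively.
int eval((Exp) `<Exp e1> * <Exp e2>`) = eval(e1) * eval(e2);
int eval((Exp) `<Exp e1> + <Exp e2>`) = eval(e1) + eval(e2);
```

Under these assumptions, `eval("2+3*4")` should yield 14, because the tree shown above nests the `*` production under the `+` production. The patterns match on the structure of the tree, so layout between tokens does not have to be handled explicitly.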
Pitfalls