ANTLR4 - how to skip missing tokens in a 'for' loop
I'm developing a 'toy' language to learn ANTLR. My construct for a for loop looks like this.
    for(4,10){
        //program expressions
    };
I have a grammar that I think works, but it's a little ugly. I'm not sure I've handled the semantically unimportant tokens well.
For example, the comma in the middle there appears as a token, but it's unimportant to the parser, which only needs the 2 or 3 loop bounds. This means that when I look at the child() elements that are parts of the loop token, I have to skip the unimportant ones.
You can see this best if you examine the ANTLR viewer and look at the parse tree for this. The red arrows point to tokens I think are redundant.
I feel I should be making more use of the skip() feature than I am, but I can't see how to insert it into the grammar for tokens at this level.
    loop: 'for(' foridxitem ',' foridxitem '){' (programexpression)+ '}';
    foridxitem: num #forindexnum
              | var #forindexvar;
The short answer is that ANTLR produces a parse tree, so there will always be some cruft to step over or otherwise ignore when walking the tree.
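To illustrate "stepping over" the cruft, here is a minimal sketch in plain Python. It does not use the ANTLR runtime: the `Terminal` and `Rule` classes and the `rule_children` helper are hypothetical stand-ins for parse-tree nodes, and the tree is built by hand for `for(4,10){ ... }`. The idea is simply to select children by rule name instead of by position, so the punctuation tokens never need to be counted.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Terminal:
    """Hypothetical stand-in for a token (leaf) node in the parse tree."""
    text: str


@dataclass
class Rule:
    """Hypothetical stand-in for a rule node with ordered children."""
    name: str
    children: List[Union["Rule", Terminal]] = field(default_factory=list)


def rule_children(node: Rule, name: str) -> List[Rule]:
    """Return only the child rule nodes with the given name, ignoring tokens."""
    return [c for c in node.children if isinstance(c, Rule) and c.name == name]


# Hand-built tree mirroring the question's grammar for: for(4,10){ ... }
loop = Rule("loop", [
    Terminal("for("),
    Rule("foridxitem", [Terminal("4")]),
    Terminal(","),
    Rule("foridxitem", [Terminal("10")]),
    Terminal("){"),
    Rule("programexpression", [Terminal("...")]),
    Terminal("}"),
])

# The loop bounds, with 'for(', ',' and '){' stepped over.
bounds = [item.children[0].text for item in rule_children(loop, "foridxitem")]
print(bounds)  # ['4', '10']
```

In a real ANTLR 4 parser you usually get this for free: the generated context class for a rule exposes accessors named after the rule references in it, so the loop context should offer something like a `foridxitem()` accessor that returns only those children.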
The longer answer is that there is a tension between skipping cruft in the lexer and producing tokens that are of limited syntactic value yet nonetheless necessary for writing unambiguous rules.
For example, you identify for( as a candidate for skipping, yet it is syntactically required. Conversely, the comma between the parameters is without syntactic meaning. So, you might clean it up in the lexer (and parser) this way:
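    // Lexer grammar (modes are only allowed in a separate lexer grammar).
    // Lexer rule names must start with an upper-case letter.
    FOR: 'for(' -> pushMode(params) ;
    ENDLOOP: '}' ;
    WS: .... -> skip ;

    mode params;
    NUM: .... ;
    VAR: .... ;
    COMMA: ',' -> skip ;
    ENDPARAMS: '){' -> skip, popMode ;
    P_WS: .... -> skip ;

Your parser rule becomes

    // FOR is not skipped, so the parser rule still has to match it.
    loop: FOR foridxitem* programexpression+ ENDLOOP ;
    foridxitem: NUM | VAR ;
    programexpression: .... ;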
That should clean up the tree a fair bit.
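To see which tokens would survive the mode-based lexer rules above, here is a rough simulation in plain Python. It is only a sketch, not ANTLR: the `NUM`, `VAR` and whitespace patterns are assumptions (the answer elides them), `BODY` is a hypothetical placeholder for whatever the program expressions lex to, and the simulation takes the first rule that matches rather than applying ANTLR's longest-match rule.

```python
import re

# Each entry: (regex, token type or None to skip, mode action or None).
RULES = {
    "DEFAULT": [
        (r"for\(", "FOR", "push:PARAMS"),   # emitted, then enter the params mode
        (r"\}", "ENDLOOP", None),
        (r"\s+", None, None),               # whitespace is skipped
        (r"[^\s}]+", "BODY", None),         # hypothetical placeholder for body text
    ],
    "PARAMS": [
        (r"\d+", "NUM", None),              # assumed pattern
        (r"[A-Za-z_]\w*", "VAR", None),     # assumed pattern
        (r",", None, None),                 # comma is skipped
        (r"\)\{", None, "pop"),             # skipped, then leave the params mode
        (r"\s+", None, None),
    ],
}


def tokenize(text):
    """Return the (type, text) pairs that are not skipped, tracking a mode stack."""
    modes = ["DEFAULT"]
    pos = 0
    tokens = []
    while pos < len(text):
        for pattern, kind, action in RULES[modes[-1]]:
            m = re.compile(pattern).match(text, pos)
            if m:
                if kind is not None:
                    tokens.append((kind, m.group()))
                if action == "pop":
                    modes.pop()
                elif action is not None:
                    modes.append(action.split(":")[1])
                pos = m.end()
                break
        else:
            raise ValueError(f"no rule matches at position {pos}")
    return tokens


print(tokenize("for(4,10){ x }"))
# [('FOR', 'for('), ('NUM', '4'), ('NUM', '10'), ('BODY', 'x'), ('ENDLOOP', '}')]
```

The comma and the `){` never reach the parser, so they cannot appear as children in the parse tree, while `for(` and `}` remain to delimit the loop.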