antlr4 - ANTLR - how to skip missing tokens in a 'for' loop


I'm developing a 'toy' language to learn ANTLR.

My construct for a loop looks like this.

for(4,10){ //program expressions };

I have a grammar that I think works, but it's a little ugly. I'm not sure I've handled the semantically unimportant tokens well.

For example, the comma in the middle there appears as a token, but it's unimportant to the parser, which only needs the 2 and the 3 as the loop bounds. This means that when I look at the child() elements for the parts of the loop token, I have to skip the unimportant ones.

You can see this best if you examine the ANTLR viewer and look at the parse tree for this. The red arrows point to the tokens I think are redundant.

(Image: parse tree from the ANTLR viewer, with red arrows marking the tokens in question.)

I feel I should be making more use of the skip feature than I am, but I can't see how to insert it into the grammar for tokens at this level.

loop: 'for(' foridxitem ',' foridxitem '){' (programexpression)+ '}';
foridxitem: NUM #forindexnum
          | VAR #forindexvar;

The short answer is that ANTLR produces a parse tree, so there will always be some cruft to step over or otherwise ignore when walking the tree.
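To make "stepping over the cruft" concrete, here is a minimal sketch that uses a hand-rolled stand-in for the parse tree rather than the ANTLR runtime. The Terminal and Rule classes and the rule_children helper are hypothetical names invented for this illustration. In generated ANTLR code the same effect usually comes from the typed accessors on the rule context (for the Java target, something like ctx.foridxitem(0)) instead of indexing children by position.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Terminal:
    text: str  # a literal token such as 'for(' or ','


@dataclass
class Rule:
    name: str
    children: List[Union["Rule", Terminal]] = field(default_factory=list)


def rule_children(node, name):
    """Return only the child rule nodes called `name`, ignoring terminals."""
    return [c for c in node.children if isinstance(c, Rule) and c.name == name]


# Hand-built tree for: for(4,10){ ... }
loop = Rule("loop", [
    Terminal("for("),
    Rule("foridxitem", [Terminal("4")]),
    Terminal(","),
    Rule("foridxitem", [Terminal("10")]),
    Terminal("){"),
    Rule("programexpression", [Terminal("...")]),
    Terminal("}"),
])

# Select the bounds by rule name; the punctuation never has to be counted.
lower, upper = (int(item.children[0].text)
                for item in rule_children(loop, "foridxitem"))
print(lower, upper)  # prints: 4 10
```

Selecting children by rule name means the walker keeps working even if the punctuation around the bounds changes.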

The longer answer is that there is a tension between skipping cruft in the lexer and producing tokens that are of limited syntactic value but nonetheless necessary for writing unambiguous rules.

For example, you identify for( as a candidate for skipping, yet it is syntactically required. Conversely, the comma between the parameters has no syntactic meaning. So, you might clean it up in the lexer (and parser) this way, using a lexer mode (modes are only allowed in a separate lexer grammar):

FOR: 'for(' -> pushMode(PARAMS) ;
ENDLOOP: '}' ;
WS: .... -> skip ;

mode PARAMS;
NUM: .... ;
VAR: .... ;
COMMA: ',' -> skip ;
ENDPARAMS: '){' -> skip, popMode ;
P_WS: .... -> skip ;

Your parser rule becomes

// FOR is still emitted by the lexer, so the rule has to match it
loop: FOR foridxitem* programexpression+ ENDLOOP ;
foridxitem: NUM | VAR ;
programexpression: .... ;

That should clean up the tree a fair bit.
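To see why the tree gets cleaner, it may help to simulate what a moded lexer with skip rules hands to the parser. The sketch below is a toy tokenizer written for illustration, not ANTLR itself: the RULES table, the tokenize function, the OTHER rule standing in for program-expression tokens, and the regexes for NUM and VAR are all assumptions. It also takes the first matching rule, whereas ANTLR prefers the longest match.

```python
import re

# mode -> ordered rules of (token name, regex, actions).
# Actions mimic ANTLR lexer commands: "skip" drops the token,
# "push:<MODE>" and "pop" manipulate the mode stack.
RULES = {
    "DEFAULT": [
        ("FOR", r"for\(", ["push:PARAMS"]),
        ("ENDLOOP", r"\}", []),
        ("WS", r"\s+", ["skip"]),
        ("OTHER", r"[^\s}]+", []),  # placeholder for expression tokens
    ],
    "PARAMS": [
        ("NUM", r"\d+", []),
        ("VAR", r"[A-Za-z_]\w*", []),
        ("COMMA", r",", ["skip"]),
        ("ENDPARAMS", r"\)\{", ["skip", "pop"]),
        ("P_WS", r"\s+", ["skip"]),
    ],
}


def tokenize(text):
    modes = ["DEFAULT"]
    pos, tokens = 0, []
    while pos < len(text):
        # Try the rules of the current mode in order; first match wins.
        for name, pattern, actions in RULES[modes[-1]]:
            match = re.compile(pattern).match(text, pos)
            if match:
                break
        else:
            raise ValueError(f"no rule matches at position {pos}")
        pos = match.end()
        if "skip" not in actions:
            tokens.append((name, match.group()))
        for action in actions:
            if action.startswith("push:"):
                modes.append(action[len("push:"):])
            elif action == "pop":
                modes.pop()
    return tokens


print(tokenize("for(4,10){ x }"))
# [('FOR', 'for('), ('NUM', '4'), ('NUM', '10'), ('OTHER', 'x'), ('ENDLOOP', '}')]
```

The comma and the '){' never reach the token list, so a parser consuming it only sees FOR, the two bounds, the body tokens, and ENDLOOP.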

