ANTLR backend: malformed lexer file when using subtraction in regular expression #276

Closed
andreasabel opened this issue Dec 15, 2019 · 0 comments
Labels: bug, Java/ANTLR, lexer (Concerning the generated lexer)

@andreasabel (Member):

Start. S ::= Name ;
token Name (char - [ "(){};.@\" \n\t" ]) + ;

This yields an ANTLR lexer file containing:

// This Antlr4 file was machine-generated by the BNF converter
lexer grammar testLexer;
// Predefined regular expressions in BNFC
fragment LETTER  : CAPITAL | SMALL ;
fragment CAPITAL : [A-Z\u00C0-\u00D6\u00D8-\u00DE] ;
fragment SMALL   : [a-z\u00DF-\u00F6\u00F8-\u00FF] ;
fragment DIGIT   : [0-9] ;


Name : ~[
 "().;@{}]+;





// Whitespace
WS : (' ' | '\r' | '\t' | '\n' | '\f')+ ->  skip;
// Escapable sequences
fragment
Escapable : ('"' | '\\' | 'n' | 't' | 'r' | 'f');
ErrorToken : . ;

This raises the following errors:

java  org.antlr.v4.Tool -lib test -package test test/testLexer.g4
error(50): testLexer.g4:10:8: syntax error: '[' came as a complete surprise to me
error(50): testLexer.g4:11:1: syntax error: '"' came as a complete surprise to me
error(50): testLexer.g4:11:3: syntax error: ')' came as a complete surprise to me while looking for lexer rule element
error(50): testLexer.g4:11:6: syntax error: '@' came as a complete surprise to me
error(50): testLexer.g4:11:9: syntax error: ']' came as a complete surprise to me
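
The errors appear to stem from the character set being emitted with a raw line break inside `~[...]` (the set starts on line 10 of the generated file and continues on line 11). ANTLR 4 expects such characters to be written as escapes (`\n`, `\t`, `\uXXXX`, and `\]`, `\\`, `\-` for the set metacharacters). The following is a minimal sketch in Python of that escaping, not BNFC's actual Haskell code; the helper names `escape_set_char` and `negated_set` are hypothetical and exist only for illustration.

```python
# Sketch only: hypothetical helpers, not part of BNFC or ANTLR.

# Characters that need a backslash escape inside an ANTLR 4 lexer set [...].
ESCAPES = {
    "\n": "\\n", "\t": "\\t", "\r": "\\r", "\f": "\\f", "\b": "\\b",
    "\\": "\\\\", "]": "\\]", "-": "\\-",
}

def escape_set_char(c):
    """Render one character so it is safe inside an ANTLR 4 [...] set."""
    if c in ESCAPES:
        return ESCAPES[c]
    code = ord(c)
    if 0x20 <= code < 0x7F:
        # Printable ASCII can be emitted as-is.
        return c
    if code <= 0xFFFF:
        # Other BMP characters (control, non-ASCII) use \uXXXX.
        return "\\u%04X" % code
    # Characters beyond the BMP use the extended \u{...} form.
    return "\\u{%X}" % code

def negated_set(chars):
    """Build ~[...] from the characters subtracted from `char`."""
    body = "".join(escape_set_char(c) for c in sorted(set(chars)))
    return "~[" + body + "]"

# The characters subtracted in the token definition from this report.
rule = "Name : " + negated_set('(){};.@" \n\t') + "+;"
print(rule)
```

With the characters from the `Name` token above, the sketch produces the rule on a single line, `Name : ~[\t\n "().;@{}]+;`, with the tab and newline written as escape sequences rather than as literal whitespace.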
andreasabel added this to the 2.8.4 milestone Dec 15, 2019
andreasabel self-assigned this Dec 15, 2019
andreasabel added a commit that referenced this issue Dec 15, 2019
andreasabel added the lexer label Dec 15, 2019
andreasabel added a commit that referenced this issue Jan 19, 2020
Don't use showLitChar for unicode characters!

b8701c3 broke #249 for Java/ANTLR in the lexer,
c49d1fd for Java/CUP.