Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ANTLR4 Java backend #155

Merged
merged 30 commits into from
Dec 8, 2015
Merged

Add ANTLR4 Java backend #155

merged 30 commits into from
Dec 8, 2015

Conversation

gapag
Copy link

@gapag gapag commented Oct 23, 2015

I wrote an ANTLR4 backend hopefully substituting the old J(F)Lex+CUP .
ANTLR is a lexer+parser generator originally targetting Java. It has many commonalities with BNFC, but it deals mainly with parse trees whereas BNFC has its focus on ASTs.

The motivation is that

  1. these tools are old and at least one of them is not supported anymore.
  2. ANTLR has interesting features to manipulate parse trees, which could be integrated in BNFC.
  3. ANTLR has also some other target languages (e.g. Python)

Commit c7ba720 's message contains further details about what has been done and what is to do.

Summarizing, what I did is

  • make ANTLR a non-default option for the Java backend
  • CF to ANTLR lexer/parser translation
  • Java Makefiles calling ANTLR
  • Blacklist Java keywords (abstract, native ...) (ONLY for ANTLR backend)

System tests pass successfully on my machine.

The directory .idea is created by the IDE I use and you can of course remove it from the merge -- unless there are other people willing to use my same project.

I did not set ANTLR as the default Java backend because I'd like to have more people trying it before promoting it.

Thanks for your time.

Gabriele Paganelli added 9 commits October 14, 2015 11:05
This commit applies the changes and it does not structurally
change the status quo.

Currently the JavaLexerParser type defined in BNFC.Options
is pushed down to other Java specific files of this commit.
This has to be removed for cleansiness, since the
tool-specific features of JLex, JFlex and ANTLR4 can be
decided for once before the code generation begins.
The different backends are being differentiated.
The file Java.hs now prepares structures containing parsing and lexing
functions contained in the modules relative to the various lexing/parsing
tools.
…le support (works with both (JLex|JFlex->CUP) and AntlrV4)
…class.

TestRunner compares the result of the parsing of the two different Java backends.
DONE:
- ANTLR lexer/parser translation
- Makefiles
- Blacklist Java keywords (abstract, native ...) ONLY for ANTLR backend
- System tests pass correctly

ToDo:
- Composable visitors for ASTs
- Make integer and double literals be varsize (issue 59)
- Generate some mapping AST <--> parse trees to take advantage
  of ANTLR's facilities
- Add position information
- Add layout directives for Java backend
- Implement the following test schema:
    cup accepts ==> antlr accepts and
                cup AST equals antlr AST
- Make ANTLR4 support abstract to use it in other backends
	(C#, Python, JavaScript)

Known bugs: name clashes are not entirely avoided
- Implicit tokens are given the name Surrogate_id_SYMB_<n> where <n> is an
  increasing integer.
  The motivation is that ANTLR lexer ids must start with an uppercase letter.
  The current implementation does nothing to prevent clashes with user-defined
  tokens. ANTLR will complain about this when trying to compile.

Test.java:
The generated Test.java is more complicated than the J(F)lex/CUP backend.
This because ANTLR does not by default
  1) terminate abruptly parsing/lexing if error occurr, but tries to continue
  2) expect to make sense of everything until EOF.
@gapag
Copy link
Author

gapag commented Oct 23, 2015

Oops. Doctests fail. I will fix this on Monday. I will also fix the classpath in the example tests, right now there is no antlr library in the javatools.jar. Sorry!

@@ -0,0 +1,12 @@
<component name="libraryTable">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit too IDE-specific to be in the repository.
Although, I like the idea of using some package manager for java (maven?) to get the dependencies instead of the (rather ugly) javatools.jar...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove it, by itself it does not help to automate the dependencies -- it is not a maven input file AFAIK.

I write the following just so that you can follow my line of though and see if it is the same of yours:
I guess it should be necessary to have some maven file with such dependencies and run maven if necessary when you run the tests, which implies you should have maven on your test server.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have direct experience using Maven anyway, I usually use it through IDEs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I was thinking. Maven is usually available as a distribution package so it shouldn't be a problem to have it on the test server.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I was just thinking out loud :-)
Can you remove the .idea directory though?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will make some commits to address all your comments.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before tonight.

@gdetrez
Copy link
Contributor

gdetrez commented Oct 29, 2015

Hi Gabriele.

Sorry it took me so long to get back to you on this. I think this is definitely a great addition to bnfc as we have sort of fallen behind with respect to the parsing tools available for java.

Globally, this looks good and coherent with the rest of the code. (Thanks in particular for taking the time to add system tests!!). So most of my comments at this point are about polishing.

Can you look at the following in addition the the line comments:

  • I don't like TODOs and FIXMEs. First they have a tendency to stick around forever and second they make it difficult to have a discussion about the problem. I suggest the following:
    • if it's trivial, just fix it
    • if it's more work but it doesn't really matter (like potential refactorings), just remove the TODO
    • for anything else (future work, questions, potential bug...), open an issue
  • There is a few places where the code could be simplified a bit (e.g.: ( "PARSER", (executable parmake)) -> ( "PARSER", executable parmake),, you can run hlint at least on the new files to get reformating suggestion (you can run hlint on the file you only modified but you'll get a lot of suggestion about the old code as well... :-/)

As a FYI, I'm globally trying to follow this as a style guide, although like for hlint, the legacy code is still a mess at this point...

@gapag
Copy link
Author

gapag commented Oct 29, 2015

Great, I will try to stick to the style guide and polish the code as much as I can. I will push the commits before tonight. Thanks to you for reviewing the code.

@gapag
Copy link
Author

gapag commented Nov 19, 2015

I meant haddock not HSpec. Shame on me.


cf2jlex' str cf = cf2jlex str cf False
cf2jflex' str cf = cf2jlex str cf True

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this needs fixing. I suggest the following solution: reorder the argument of cf2jlex and use cf2jlex True and cf2jlex False instead.

Or, if you really want two different functions that's ok but give them names without quotes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll follow your advice, I'll fix this before next week.

@gdetrez
Copy link
Contributor

gdetrez commented Nov 19, 2015

This is great!

One last thing, if I may: can you write a short help section about the antlr backend in the doc? It doesn't have to be very long, just a quick example of how you use it is enough. You can look at what I wrote for pygment for an example.

@gapag
Copy link
Author

gapag commented Nov 19, 2015

I will do it, before next week too.

@gdetrez
Copy link
Contributor

gdetrez commented Dec 8, 2015

Oh, I somehow missed the notification that you updated this branch :-/
Thanks for your fixes, merging now!

gdetrez added a commit that referenced this pull request Dec 8, 2015
@gdetrez gdetrez merged commit ecf4fed into BNFC:master Dec 8, 2015
@andreasabel andreasabel added this to the 2.8.2 milestone Nov 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants