Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make C parser reentrant #349

Closed
wangjia184 opened this issue Mar 19, 2021 · 7 comments · Fixed by #351
Closed

Make C parser reentrant #349

wangjia184 opened this issue Mar 19, 2021 · 7 comments · Fixed by #351
Assignees
Labels
C enhancement parser Issues concerning parser generating
Milestone

Comments

@wangjia184
Copy link

wangjia184 commented Mar 19, 2021

The *.y template file generated by BNFC using global varilables to store the parsed AST and internal status.

/* Global variables holding parse results for entrypoints. */
Proc YY_RESULT_Proc_ = 0;
ListProc YY_RESULT_ListProc_ = 0;

This means the parser is not thread-safe.

Please consider enabling Thread Local Storage for them.
Here is a thread_local macro suggested here.

#ifndef thread_local
# if __STDC_VERSION__ >= 201112 && !defined __STDC_NO_THREADS__
#  define thread_local _Thread_local
# elif defined _WIN32 && ( \
       defined _MSC_VER || \
       defined __ICL || \
       defined __DMC__ || \
       defined __BORLANDC__ )
#  define thread_local __declspec(thread) 
/* note that ICC (linux) and Clang are covered by __GNUC__ */
# elif defined __GNUC__ || \
       defined __SUNPRO_C || \
       defined __xlC__
#  define thread_local __thread
# else
#  error "Cannot define thread_local"
# endif
#endif
@andreasabel
Copy link
Member

andreasabel commented Mar 19, 2021

I am not very proficient in C. Could you contribute a pull request?

Or, if you do not want to touch the Haskell code, could you describe precisely (e.g. by diffs) how the output of the C Backend should change to implement your feature?

@andreasabel andreasabel added C enhancement parser Issues concerning parser generating labels Mar 19, 2021
@wangjia184
Copy link
Author

Hi @andreasabel

  • if I run a single test case (means the test case call the C parser only once), everything works fine;
  • If I run my test cases, even used mutex to synchronize the calls to the C parser. It raises occasional errors. And the error changes, sometime it says syntax error in different location; sometimes it says memory error...

It seems BNFC C parser is not ready to be used in this way. So I give up this approach, and wrap the C parser into a child process. And everytime I need to parse a source file, launch the child process and exit after parsing the file.

The following changes are what I have attempted in *.y file.

First I added a macro in the top part.

#ifndef thread_local
# if __STDC_VERSION__ >= 201112 && !defined __STDC_NO_THREADS__
#  define thread_local _Thread_local
# elif defined _WIN32 && ( \
       defined _MSC_VER || \
       defined __ICL || \
       defined __DMC__ || \
       defined __BORLANDC__ )
#  define thread_local __declspec(thread) 
/* note that ICC (linux) and Clang are covered by __GNUC__ */
# elif defined __GNUC__ || \
       defined __SUNPRO_C || \
       defined __xlC__
#  define thread_local __thread
# else
#  error "Cannot define thread_local"
# endif
#endif

Then added thread_local before all global variables under the following line

/* Global variables holding parse results for entrypoints. */
thread_local Proc YY_RESULT_Proc_ = 0;
thread_local ListProc YY_RESULT_ListProc_ = 0;
...

Reset all global variables in the beginning of the parser method.

Proc pProc(FILE *inp)
{
  YY_RESULT_Proc_ = 0;
  YY_RESULT_ListProc_ = 0;
  //...

//....
}

It just does not work and fail with occasional and different errors..

@andreasabel
Copy link
Member

With bison you have to do some extra work to get a reentrant parser. https://www.gnu.org/software/bison/manual/html_node/Pure-Decl.html

Maybe we could add change BNFC to generate reentrant parsers by default.

@andreasabel andreasabel self-assigned this Mar 24, 2021
@andreasabel andreasabel added this to the 2.9.2 milestone Mar 24, 2021
@andreasabel andreasabel changed the title Add thread_local to global variables in C parser Make C parser reentrant Mar 24, 2021
andreasabel added a commit that referenced this issue Mar 24, 2021
Use INITIAL instead of defining our own YYINITIAL.
Saves us the initial BEGIN, works also for reentrant lexer.
andreasabel added a commit that referenced this issue Mar 24, 2021
Reorganized the input for bison a bit, to prepare for removal of
global YY_RESULT variables.

Got rid of the forward declaration for yyerror.

Entrypoints can be at the end of the file, since nothing depends on them.
andreasabel added a commit that referenced this issue Mar 24, 2021
This is more robust than using global variables, prepares for making
parser reentrant.
andreasabel added a commit that referenced this issue Mar 24, 2021
Parser.h defined things that are only for internal use in the lexer.
These can be automatically defined by bison using the %defines pragma.
andreasabel added a commit that referenced this issue Mar 24, 2021
Phew! That took me a whole day to figure out, but in the end, there
aren't too many changes.
@andreasabel
Copy link
Member

@wangjia184 Wow, I spent the whole day, but now the parser generated should be reentrant. Can you test PR #351 if it solves your problem?

@wangjia184
Copy link
Author

Wow, that's fast, Thanks @andreasabel , I will try

Another small issue may block running multiple C parsers from current process is -- when syntax error occurs, the error message is directly printed to stderr. Which make it impossible to differentiate where the error is from. Perhaps add some api (e.g. getLastErrorMessage) and the returned pointer should be freed by caller.

Seems I need learn how to build the project, never used Haskell before, will have a try for the new pr, thank you very much

andreasabel added a commit that referenced this issue Mar 25, 2021
Use INITIAL instead of defining our own YYINITIAL.
Saves us the initial BEGIN, works also for reentrant lexer.
andreasabel added a commit that referenced this issue Mar 25, 2021
Reorganized the input for bison a bit, to prepare for removal of
global YY_RESULT variables.

Got rid of the forward declaration for yyerror.

Entrypoints can be at the end of the file, since nothing depends on them.
andreasabel added a commit that referenced this issue Mar 25, 2021
This is more robust than using global variables, prepares for making
parser reentrant.
andreasabel added a commit that referenced this issue Mar 25, 2021
Parser.h defined things that are only for internal use in the lexer.
These can be automatically defined by bison using the %defines pragma.
andreasabel added a commit that referenced this issue Mar 25, 2021
Phew! That took me a whole day to figure out, but in the end, there
aren't too many changes.

UPDATE: also use yyextra argument in lexer instead of global variable
literal_buffer.
andreasabel added a commit that referenced this issue Apr 1, 2021
Use INITIAL instead of defining our own YYINITIAL.
Saves us the initial BEGIN, works also for reentrant lexer.
andreasabel added a commit that referenced this issue Apr 1, 2021
Reorganized the input for bison a bit, to prepare for removal of
global YY_RESULT variables.

Got rid of the forward declaration for yyerror.

Entrypoints can be at the end of the file, since nothing depends on them.
andreasabel added a commit that referenced this issue Apr 1, 2021
This is more robust than using global variables, prepares for making
parser reentrant.
andreasabel added a commit that referenced this issue Apr 1, 2021
Parser.h defined things that are only for internal use in the lexer.
These can be automatically defined by bison using the %defines pragma.
andreasabel added a commit that referenced this issue Apr 1, 2021
Phew! That took me a whole day to figure out, but in the end, there
aren't too many changes.

UPDATE: also use yyextra argument in lexer instead of global variable
literal_buffer.
andreasabel added a commit that referenced this issue Apr 2, 2021
Use INITIAL instead of defining our own YYINITIAL.
Saves us the initial BEGIN, works also for reentrant lexer.
andreasabel added a commit that referenced this issue Apr 2, 2021
Reorganized the input for bison a bit, to prepare for removal of
global YY_RESULT variables.

Got rid of the forward declaration for yyerror.

Entrypoints can be at the end of the file, since nothing depends on them.
andreasabel added a commit that referenced this issue Apr 2, 2021
This is more robust than using global variables, prepares for making
parser reentrant.
andreasabel added a commit that referenced this issue Apr 2, 2021
Parser.h defined things that are only for internal use in the lexer.
These can be automatically defined by bison using the %defines pragma.
andreasabel added a commit that referenced this issue Apr 2, 2021
Phew! That took me a whole day to figure out, but in the end, there
aren't too many changes.

UPDATE: also use yyextra argument in lexer instead of global variable
literal_buffer.
andreasabel added a commit that referenced this issue Apr 2, 2021
Use INITIAL instead of defining our own YYINITIAL.
Saves us the initial BEGIN, works also for reentrant lexer.
andreasabel added a commit that referenced this issue Apr 2, 2021
Reorganized the input for bison a bit, to prepare for removal of
global YY_RESULT variables.

Got rid of the forward declaration for yyerror.

Entrypoints can be at the end of the file, since nothing depends on them.
andreasabel added a commit that referenced this issue Apr 2, 2021
This is more robust than using global variables, prepares for making
parser reentrant.
andreasabel added a commit that referenced this issue Apr 2, 2021
Parser.h defined things that are only for internal use in the lexer.
These can be automatically defined by bison using the %defines pragma.
andreasabel added a commit that referenced this issue Apr 2, 2021
The C variants are now parametrized to also cover the C++ backends.
andreasabel added a commit that referenced this issue Apr 2, 2021
The C variants are now parametrized to also cover the C++ backends.
andreasabel added a commit that referenced this issue Apr 2, 2021
@andreasabel
Copy link
Member

@wangjia184: Sorry, I hijacked this issue for one part of it: removing global variables from the C/C++ parser. Likely, the issue is not fixed for you with that. If you could provide me with a test scenario that I could run to chase the remaining problems, I could give it another try.
Concretely, I would need some self-contained test that runs parsers in parallel (as you seem to do) and exhibits the problems you report (syntax/memory errors).

@wangjia184
Copy link
Author

Thanks @andreasabel for the efforts, I am willing to help. but since I am quite busy these days, dont have time to look into it.
I intend to attach a debugger to diagnose the issue. I may get back to you when I have something, but please expect soon about that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C enhancement parser Issues concerning parser generating
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants