Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Drop use of trashbox and setjmp #138

Merged
merged 1 commit into from
Nov 25, 2014

Conversation

vhda
Copy link
Contributor

@vhda vhda commented Nov 23, 2014

Add new token type to identify EOF and use it to break out of loops, properly
cleaning all local memory allocations until parser terminates.

@b4n
Copy link
Member

b4n commented Nov 23, 2014

You missed two loops that can now become infinite:

begin
  if current publisher <> 'publish' then
    signal UE_FailStatement

and

BEGIN
WHEN

As I don't have push write on your branch (well, I didn't test but I shouldn't ^^) here's the diff:

diff --git a/sql.c b/sql.c
index 4b6851c..6f0ca48 100644
--- a/sql.c
+++ b/sql.c
@@ -1292,7 +1292,8 @@ static void parseStatements (tokenInfo *const token, const boolean exit_on_endif
                                         * of a nested BEGIN END block.  So read the next token
                                         * after the THEN and restart the LOOP.
                                         */
-                                       while (! isKeyword (token, KEYWORD_then))
+                                       while (! isKeyword (token, KEYWORD_then) &&
+                                                  ! isType (token, TOKEN_EOF))
                                                readToken (token);

                                        readToken (token);
@@ -1493,6 +1494,7 @@ static void parseStatements (tokenInfo *const token, const boolean exit_on_endif
                         */
                        while ( ! stmtTerm                               && 
                                        ! (   isKeyword (token, KEYWORD_end)     ||
+                                                 isType (token, TOKEN_EOF)          ||
                                                 (isCmdTerm(token))              )       
                                        )
                        {

Feel free to squash it into your commit or anything.

@vhda
Copy link
Contributor Author

vhda commented Nov 23, 2014

Darn! I'll update the commit later tonight.
Update: forgot to say thanks to @b4n! Thanks! :)

@masatake could I kindly ask you to also review this change?

@vhda
Copy link
Contributor Author

vhda commented Nov 24, 2014

Fixed bugs found by @b4n.
Once again, thanks for your help.

@b4n
Copy link
Member

b4n commented Nov 24, 2014

I just quickly reviewed the new version and the changes look good to me.

@vhda
Copy link
Contributor Author

vhda commented Nov 24, 2014

Still wanted to hear from @masatake because he also worked on this file lately.

@masatake
Copy link
Member

I'll review.

@masatake
Copy link
Member

if true then
for idx in z{

How about this input with "ctags -o - -language-force=SQL FILE" ?
I tried and the ctags process entered into infinite loop.
I'm not sure whether this is related to your change or not.

@masatake
Copy link
Member

*I cannot find the time for reviewing now. So I wrote here how my tool says)

if true then
for idx in z(
This also leads ctags an infinite loop.

@b4n
Copy link
Member

b4n commented Nov 25, 2014

Indeed, good catch. That's because of a convoluted test in skipToMatched() that actually is wrong:

while (! (isType (token, close_token) && (nest_level == 0) && !isType (token, TOKEN_EOF)))

has the check for TOKEN_EOF reversed. It should be:

while (! (isType (token, close_token) && (nest_level == 0)) && !isType (token, TOKEN_EOF))

Although it could actually be written a lot simpler like this:

while (nest_level > 0 && !isType (token, TOKEN_EOF))

as actually only a token of type close_token can lower the nest_level value, so the token type check in the loop condition is redundant.

@vhda
Copy link
Contributor Author

vhda commented Nov 25, 2014

Com'on @b4n! You have a fix before I am able to read the email? :P

@b4n
Copy link
Member

b4n commented Nov 25, 2014

hehe :)

BTW, I would think that such mistakes would be easier avoided by using more straightforward (IMO) test conditions in the loops, like instead of ! (a || b || c) use ! a && ! b && ! c -- or at least not have them combined in weird ways like this:

while ( ! stmtTerm                               && 
        ! (   isKeyword (token, KEYWORD_end)     ||
             (isCmdTerm(token))                  ||
             (isType(token, TOKEN_EOF))          )
        )

which, could more or less obviously be rewritten using only one kind of test:

while ( ! stmtTerm && 
        ! isKeyword (token, KEYWORD_end) &&
        ! isCmdTerm(token) &&
        ! isType(token, TOKEN_EOF))

What do you think, should we clean up the tests in sql.c?

@b4n
Copy link
Member

b4n commented Nov 25, 2014

Just noticed going through the tests that addition on line 1183 is redundant: only need to check for EOF in negative conditions (if it is a keyword it can't be EOF). However that's only a remark, maybe we still want to keep the check for EOF in all loops for future use (though, it wouldn't prevent somebody from writing a bad new loop).

@vhda
Copy link
Contributor Author

vhda commented Nov 25, 2014

I should be able to spend some time on this tonight.
Any suggestions/rules regarding indentation?

@b4n
Copy link
Member

b4n commented Nov 25, 2014

Wait, I was giving it a look and ended up mostly doing it already :) If you could fix the bug spotted by @masatake in this PR, as it's introduced here, I could submit a second commit with proposed cleanup.

@masatake
Copy link
Member

If you have time, could you try http://www.srpmix.org/tmp/nano-fuzz ?

There are some input for sql parser in our Units//input.. The can be used as the base input.
nano-fuzz generates many files from the base input by adding one byte or removing one byte.
Give the generated file to ctags - o - --language-force=SQL > /dev/null.
You will find crashes or infinite loops.

I want to merge nano-fuzz to testing.mak. However, next two weeks, I will not have much time.
However, nano-fuzz may help you to spot a bug. Runinng valgrind also interesting.

If you got a crash or an infinite loop, use units shrink like

./units shrink "timeout 1 ../ctags -o - --language-force=SQL %s > /dev/null" /home/yamato/var/ctags-github/misc/sql/Units-3184782.sql.t-input.sql-large-108-91 /tmp/XXX

for debugging. @vhda, with quick reviewing I cannot find anything wrong.
So merge freely. However, this may be good time to fix much potential bugs and find bug patterns as @b4n finds.

Add new token type to identify EOF and use it to break out of loops, properly
cleaning all local memory allocations until parser terminates.
@vhda
Copy link
Contributor Author

vhda commented Nov 25, 2014

@b4n I've fixed the bug that @masatake found.
Please feel free to rebase your patch over mine.

@b4n
Copy link
Member

b4n commented Nov 25, 2014

@masatake sounds interesting indeed :)

BTW, to quickly spot where the problem you found came from I just ran inside GDB and stopped the program when it went into the infinite loop. The stack trace was simple enough for me to spot the issue instantly -- so, also use GDB :)

Also, I went through all tests to try and simplify them, and I couldn't find any other that looked wrong even after simplifying them, so I believe it's safe on this front now -- e.g. I'm quite confident this patch doesn't add new issues.

@b4n
Copy link
Member

b4n commented Nov 25, 2014

@vhda good, thanks. Should we merge this first and I make a new PR, or should I make a new PR with your changes and mines, or would you like to pull mines in this one?

b4n added a commit that referenced this pull request Nov 25, 2014
sql: Drop use of trashbox and setjmp
@b4n b4n merged commit 10ec933 into universal-ctags:master Nov 25, 2014
@b4n
Copy link
Member

b4n commented Nov 25, 2014

Nevermind, I merged and opened #149

@vhda vhda deleted the sql/drop-trashbox branch November 25, 2014 14:03
masatake pushed a commit to masatake/ctags that referenced this pull request Mar 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants