sql: Drop use of trashbox and setjmp #138

vhda · 2014-11-23T20:38:15Z

Add new token type to identify EOF and use it to break out of loops, properly
cleaning all local memory allocations until parser terminates.

b4n · 2014-11-23T22:02:36Z

You missed two loops that can now become infinite:

begin
  if current publisher <> 'publish' then
    signal UE_FailStatement

and

BEGIN
WHEN

As I don't have push write on your branch (well, I didn't test but I shouldn't ^^) here's the diff:

diff --git a/sql.c b/sql.c
index 4b6851c..6f0ca48 100644
--- a/sql.c
+++ b/sql.c
@@ -1292,7 +1292,8 @@ static void parseStatements (tokenInfo *const token, const boolean exit_on_endif
                                         * of a nested BEGIN END block.  So read the next token
                                         * after the THEN and restart the LOOP.
                                         */
-                                       while (! isKeyword (token, KEYWORD_then))
+                                       while (! isKeyword (token, KEYWORD_then) &&
+                                                  ! isType (token, TOKEN_EOF))
                                                readToken (token);

                                        readToken (token);
@@ -1493,6 +1494,7 @@ static void parseStatements (tokenInfo *const token, const boolean exit_on_endif
                         */
                        while ( ! stmtTerm                               && 
                                        ! (   isKeyword (token, KEYWORD_end)     ||
+                                                 isType (token, TOKEN_EOF)          ||
                                                 (isCmdTerm(token))              )       
                                        )
                        {

Feel free to squash it into your commit or anything.

vhda · 2014-11-23T22:04:46Z

Darn! I'll update the commit later tonight.
Update: forgot to say thanks to @b4n! Thanks! :)

@masatake could I kindly ask you to also review this change?

vhda · 2014-11-24T00:18:26Z

Fixed bugs found by @b4n.
Once again, thanks for your help.

b4n · 2014-11-24T16:38:59Z

I just quickly reviewed the new version and the changes look good to me.

vhda · 2014-11-24T16:48:51Z

Still wanted to hear from @masatake because he also worked on this file lately.

masatake · 2014-11-25T00:13:17Z

I'll review.

masatake · 2014-11-25T11:56:33Z

if true then
for idx in z{

How about this input with "ctags -o - -language-force=SQL FILE" ?
I tried and the ctags process entered into infinite loop.
I'm not sure whether this is related to your change or not.

masatake · 2014-11-25T12:11:16Z

*I cannot find the time for reviewing now. So I wrote here how my tool says)

if true then
for idx in z(
This also leads ctags an infinite loop.

b4n · 2014-11-25T12:35:28Z

Indeed, good catch. That's because of a convoluted test in skipToMatched() that actually is wrong:

while (! (isType (token, close_token) && (nest_level == 0) && !isType (token, TOKEN_EOF)))

has the check for TOKEN_EOF reversed. It should be:

while (! (isType (token, close_token) && (nest_level == 0)) && !isType (token, TOKEN_EOF))

Although it could actually be written a lot simpler like this:

while (nest_level > 0 && !isType (token, TOKEN_EOF))

as actually only a token of type close_token can lower the nest_level value, so the token type check in the loop condition is redundant.

vhda · 2014-11-25T12:37:21Z

Com'on @b4n! You have a fix before I am able to read the email? :P

b4n · 2014-11-25T12:44:04Z

hehe :)

BTW, I would think that such mistakes would be easier avoided by using more straightforward (IMO) test conditions in the loops, like instead of ! (a || b || c) use ! a && ! b && ! c -- or at least not have them combined in weird ways like this:

while ( ! stmtTerm                               && 
        ! (   isKeyword (token, KEYWORD_end)     ||
             (isCmdTerm(token))                  ||
             (isType(token, TOKEN_EOF))          )
        )

which, could more or less obviously be rewritten using only one kind of test:

while ( ! stmtTerm && 
        ! isKeyword (token, KEYWORD_end) &&
        ! isCmdTerm(token) &&
        ! isType(token, TOKEN_EOF))

What do you think, should we clean up the tests in sql.c?

b4n · 2014-11-25T12:58:50Z

Just noticed going through the tests that addition on line 1183 is redundant: only need to check for EOF in negative conditions (if it is a keyword it can't be EOF). However that's only a remark, maybe we still want to keep the check for EOF in all loops for future use (though, it wouldn't prevent somebody from writing a bad new loop).

vhda · 2014-11-25T13:07:21Z

I should be able to spend some time on this tonight.
Any suggestions/rules regarding indentation?

b4n · 2014-11-25T13:09:21Z

Wait, I was giving it a look and ended up mostly doing it already :) If you could fix the bug spotted by @masatake in this PR, as it's introduced here, I could submit a second commit with proposed cleanup.

masatake · 2014-11-25T13:36:42Z

If you have time, could you try http://www.srpmix.org/tmp/nano-fuzz ?

There are some input for sql parser in our Units//input.. The can be used as the base input.
nano-fuzz generates many files from the base input by adding one byte or removing one byte.
Give the generated file to ctags - o - --language-force=SQL > /dev/null.
You will find crashes or infinite loops.

I want to merge nano-fuzz to testing.mak. However, next two weeks, I will not have much time.
However, nano-fuzz may help you to spot a bug. Runinng valgrind also interesting.

If you got a crash or an infinite loop, use units shrink like

./units shrink "timeout 1 ../ctags -o - --language-force=SQL %s > /dev/null" /home/yamato/var/ctags-github/misc/sql/Units-3184782.sql.t-input.sql-large-108-91 /tmp/XXX

for debugging. @vhda, with quick reviewing I cannot find anything wrong.
So merge freely. However, this may be good time to fix much potential bugs and find bug patterns as @b4n finds.

Add new token type to identify EOF and use it to break out of loops, properly cleaning all local memory allocations until parser terminates.

vhda · 2014-11-25T13:42:14Z

@b4n I've fixed the bug that @masatake found.
Please feel free to rebase your patch over mine.

b4n · 2014-11-25T13:44:47Z

@masatake sounds interesting indeed :)

BTW, to quickly spot where the problem you found came from I just ran inside GDB and stopped the program when it went into the infinite loop. The stack trace was simple enough for me to spot the issue instantly -- so, also use GDB :)

Also, I went through all tests to try and simplify them, and I couldn't find any other that looked wrong even after simplifying them, so I believe it's safe on this front now -- e.g. I'm quite confident this patch doesn't add new issues.

b4n · 2014-11-25T13:46:55Z

@vhda good, thanks. Should we merge this first and I make a new PR, or should I make a new PR with your changes and mines, or would you like to pull mines in this one?

sql: Drop use of trashbox and setjmp

b4n · 2014-11-25T13:58:03Z

Nevermind, I merged and opened #149

st: Import the latest code from Ruby

vhda mentioned this pull request Nov 23, 2014

[ada, oldC] setjmp/longjmp related memory leaks #90

Open

vhda force-pushed the sql/drop-trashbox branch from a6c8006 to 442a4f4 Compare November 24, 2014 00:14

sql: Drop use of trashbox and setjmp

f44c0b2

Add new token type to identify EOF and use it to break out of loops, properly cleaning all local memory allocations until parser terminates.

vhda force-pushed the sql/drop-trashbox branch from 442a4f4 to f44c0b2 Compare November 25, 2014 13:39

b4n added a commit that referenced this pull request Nov 25, 2014

Merge pull request #138 from vhda/sql/drop-trashbox

10ec933

sql: Drop use of trashbox and setjmp

b4n merged commit 10ec933 into universal-ctags:master Nov 25, 2014

vhda deleted the sql/drop-trashbox branch November 25, 2014 14:03

masatake pushed a commit to masatake/ctags that referenced this pull request Mar 12, 2020

Merge pull request universal-ctags#138 from k-takata/import-st

99db460

st: Import the latest code from Ruby

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: Drop use of trashbox and setjmp #138

sql: Drop use of trashbox and setjmp #138

vhda commented Nov 23, 2014

b4n commented Nov 23, 2014

vhda commented Nov 23, 2014

vhda commented Nov 24, 2014

b4n commented Nov 24, 2014

vhda commented Nov 24, 2014

masatake commented Nov 25, 2014

masatake commented Nov 25, 2014

masatake commented Nov 25, 2014

b4n commented Nov 25, 2014

vhda commented Nov 25, 2014

b4n commented Nov 25, 2014

b4n commented Nov 25, 2014

vhda commented Nov 25, 2014

b4n commented Nov 25, 2014

masatake commented Nov 25, 2014

./units shrink "timeout 1 ../ctags -o - --language-force=SQL %s > /dev/null" /home/yamato/var/ctags-github/misc/sql/Units-3184782.sql.t-input.sql-large-108-91 /tmp/XXX

vhda commented Nov 25, 2014

b4n commented Nov 25, 2014

b4n commented Nov 25, 2014

b4n commented Nov 25, 2014

sql: Drop use of trashbox and setjmp #138

sql: Drop use of trashbox and setjmp #138

Conversation

vhda commented Nov 23, 2014

b4n commented Nov 23, 2014

vhda commented Nov 23, 2014

vhda commented Nov 24, 2014

b4n commented Nov 24, 2014

vhda commented Nov 24, 2014

masatake commented Nov 25, 2014

masatake commented Nov 25, 2014

masatake commented Nov 25, 2014

b4n commented Nov 25, 2014

vhda commented Nov 25, 2014

b4n commented Nov 25, 2014

b4n commented Nov 25, 2014

vhda commented Nov 25, 2014

b4n commented Nov 25, 2014

masatake commented Nov 25, 2014

./units shrink "timeout 1 ../ctags -o - --language-force=SQL %s > /dev/null" /home/yamato/var/ctags-github/misc/sql/Units-3184782.sql.t-input.sql-large-108-91 /tmp/XXX

vhda commented Nov 25, 2014

b4n commented Nov 25, 2014

b4n commented Nov 25, 2014

b4n commented Nov 25, 2014