Skip to content
This repository has been archived by the owner on Oct 3, 2021. It is now read-only.

Issue with zilu benchmark #1159

Closed
hernanponcedeleon opened this issue Oct 7, 2020 · 10 comments
Closed

Issue with zilu benchmark #1159

hernanponcedeleon opened this issue Oct 7, 2020 · 10 comments
Assignees
Labels
C Task in language C issue with benchmark

Comments

@hernanponcedeleon
Copy link
Contributor

benchmark07_linear has true as expected result.
However, isn't the following a violation?

Initial state

  • i=0
  • n=1
  • k=-3999
  • flag=true

End state

  • k=1
  • n=1

Am I missing something?

@hernanponcedeleon
Copy link
Contributor Author

hernanponcedeleon commented Oct 8, 2020

I also think many of the benchmarks can result in overflow, thus leading to undefined behaviour. Below some examples

benchmark12_linear

x = 1610614784
y = 805306368
t = 805306368
y=y+x -> overflow

benchmark28_linear

i = 65536
i * i -> overflow

In the later case, the problem is not just the overflow, the following assignment also leads to a violation

i=-1
j=-2

@MartinSpiessl
Copy link
Contributor

I think you are correct, thanks for the hint!. In the paper they were mainly concerned with (linear) invariants and probably didn't account for overflows.

For benchmark07_linear I think they might have forgotten to constrain k to non-negative values.

That probably explains why none of the verifiers in table 1 in the publication (https://lijiaying.github.io/papers/ase17.pdf) were able to "find the invariant".
Other tasks where this is the case need manual investigation as well I think, these are at least benchmarks 8,20,28,30,52,53.

@hernanponcedeleon
Copy link
Contributor Author

According to Dartagnan, all the following would result in overflow

benchmark12_linear
benchmark27_linear
benchmark28_linear
benchmark29_linear
benchmark31_disjunctive
benchmark39_conjunctive
benchmark40_polynomial
benchmark44_disjunctive
benchmark45_disjunctive
benchmark46_disjunctive
benchmark47_linear
benchmark48_linear
benchmark50_linear
benchmark53_polynomial

And even if the overflow problem would be solved in benchmark28_linear (also none tool in that paper succeed here), I still think the result is wrong (see my comment above)

@hernanponcedeleon
Copy link
Contributor Author

Not sure what is the best course of action to solve the benchmarks for the reachability property (solution might be benchmarks independent), but if we don't find a solution before the verification tasks freeze (tomorrow), I think we should at least remove them from the reachability category and add them to the no-overflow category.

@MartinSpiessl
Copy link
Contributor

I think there is a misunderstanding. Tomorrow is not the final task freezing. You need to have made your PR by tomorrow if you want your new tasks in the competition. PRs that are made later will not be used in the competition, unles the jury votes otherwise. There is time to fix tasks in the repo and in existing PRs until 13th of November:

Nov. 13, 2020 Freezing of verification tasks (pull requests after deadline only merged after jury approval)

@hernanponcedeleon
Copy link
Contributor Author

ohhhh that's very good to know.
I was panicking because I found many UB (also in other benchmarks) and I was not sure I was gonna be able to propose solutions to all of them before the deadline.

@MartinSpiessl
Copy link
Contributor

The first deadline is just to prevent people from adding completely new tasks shortly before the freezing. If there are bugs in the already existing tasks, we accept fixes until the deadline on 13th of November. If we find bugs after that, then those tasks are excluded for SV-COMP 2021 on a separate branch, where they will simply be removed. PRs with fixes can of course still be made, but only against master and not on that SV-COMP 2021 branch.
@dbeyer maybe you can confirm that I am not completely wrong here.

@MartinSpiessl
Copy link
Contributor

CPAchecker actually found even more overflows:

benchmark01_conjunctive.i
benchmark06_conjunctive.i
benchmark07_linear.i
benchmark12_linear.i
benchmark15_conjunctive.i
benchmark21_disjunctive.i
benchmark27_linear.i
benchmark28_linear.i
benchmark29_linear.i
benchmark30_conjunctive.i
benchmark31_disjunctive.i
benchmark39_conjunctive.i
benchmark40_polynomial.i
benchmark42_conjunctive.i
benchmark44_disjunctive.i
benchmark45_disjunctive.i
benchmark46_disjunctive.i
benchmark47_linear.i
benchmark48_linear.i
benchmark49_linear.i
benchmark50_linear.i
benchmark53_polynomial.i

I better get going in fixing all of them 🤕

@MartinSpiessl MartinSpiessl added the C Task in language C label Oct 28, 2020
@MartinSpiessl
Copy link
Contributor

These require quite different fixes, so for now I just marked them as no-overflow with expected verdict false in the yml files.

Here is the fixes I would perform for the first 5 tasks. Only 2 programs have the same fix. Fixing this would become pretty time-consuming and will probably not happen for this year's competition.

benchmark01_conjunctive.i

at start of while loop body insert:

if (__builtin_add_overflow_p (x, y, x)) break;

benchmark06_conjunctive.i

make all types unsigned int

benchmark07_linear.i

bound variable k to be somewhere inbetween -3999<k<INT_MAX-4000*10

benchmark12_linear.i

at start of while loop body insert:

if (__builtin_add_overflow_p (x, y, x)) break;

benchmark15_conjunctive.i

complete rewrite of the precondition?! It currently is:
if (!(low == 0 && mid >= 1 && high == 2*mid)) return 0;
Instead we would want something like this:

low = 0;
if not (high>=2) return 0;
mid = high/2;
high = high - high %2;

@hernanponcedeleon
Copy link
Contributor Author

Fixed in #1191 .

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C Task in language C issue with benchmark
Development

No branches or pull requests

2 participants