Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible issue w/ [*] NOT = #53

Open
jmgnc opened this issue May 2, 2018 · 14 comments
Open

Possible issue w/ [*] NOT = #53

jmgnc opened this issue May 2, 2018 · 14 comments
Labels

Comments

@jmgnc
Copy link

jmgnc commented May 2, 2018

This may be needed to clear up in the spec, but IMO, the results returned for this are not correct.

Given the pattern:
[ a:b[*] NOT = 22 ]

and the object { "type": "a", "b": [ 22, 443 ] }

I believe it should NOT match, as it should be the equivalent to NOT (a:b[*] = 22). There was some confusion about this before, and we may have not made the correct choice where we put the NOT operator.

The spec says:
A Comparison Operator MAY be preceded by the modifier NOT, in which case the resultant Comparison Expression is logically negated.

It does not say the comparison operator is logically negated, but the entire expression is.

But you can see in the verbose output from the matcher, that it is negating the comparison, NOT the expression:

exitObjectType (a): push {0: {u'0': [{u'b': [22, 443], u'type': u'a'}]}}
exitFirstPathComponent (b): pop {0: {u'0': [{u'b': [22, 443], u'type': u'a'}]}}
exitFirstPathComponent (b): push {0: {u'0': [[22, 443]]}}
exitIndexPathStep (*): pop {0: {u'0': [[22, 443]]}}
exitIndexPathStep (*): push {0: {u'0': [22, 443]}}
exitPropTestEqual (NOT = 22): pop {0: {u'0': [22, 443]}}
exitPropTestEqual (NOT = 22): push {0: set([u'0'])}
exitObservationExpression (simple): pop {0: set([u'0'])}
exitObservationExpression (simple): push [(0,)]

MATCH:  [ a:b[*] NOT = 22 ]

It should match on the 22 w/ the =, and then negate it to false via the NOT, instead it is != matching on the 443, and returning true.

@gtback
Copy link
Contributor

gtback commented May 2, 2018

I think I remember @JasonKeirstead saying the intention for the spec was to say

A Comparison Expression MAY be preceded by the modifier NOT, in which case the resultant Comparison Expression is logically negated.

But that's not what it says (☹️), leading to the current confusion.

I agree that the difference between "negated operator" and "negated expression" is subtle but important, and can understand your interpretation. I'm also looking at the part of the spec that says:

The literal '*' indicates that if any of the items contained within a list matches against the Comparison Expression, the Comparison Expression evaluates to true.

I think there is some ambiguity in whether you consider "NOT" to be part of the Comparison Expression or not for purposes of that sentence (in other words, whether it means "there is not any...." or "there is at least one that ... not"). Absent some more definitive statement in a specification, errata, etc., I'm not convinced there's overwhelming evidence the current interpretation is wrong.

For the record, this is also why I'm extremely hesitant to add variables or any other features to the spec. Their meaning may be clear in isolation, but when combined with other features (like how the * and NOT combine in this case), it could lead to ambiguity and misinterpretation.

cc: @chisholm @clenk @ikiril01 @treyka

@jmgnc
Copy link
Author

jmgnc commented May 2, 2018

yeah, now that I'm reading that:

The literal '*' indicates that if any of the items contained within a list matches against the Comparison Expression, the Comparison Expression evaluates to true.

The first Comparison Expression doesn't make any sense here. Because that would be self referential. How can the list compare against itself? Since the list is IN the Comparison Expression.

This should probably changed to read something like this:

The literal '*' indicates that all of the items in the list shall be tried as a value to be evaluated w/ the Comparison Operator and other value. If any of these evaluations are true, then the result is true.

I think this is one of those cases where we "everyone" knew what we wanted to say, but we didn't say it very well.

re: variables
Well, that's why we need to test it out, and run through some real test data, and figure these things out.

@jmgnc
Copy link
Author

jmgnc commented May 8, 2018

We do not have a !LIKE operator, but it can be done with MATCHES.

@jmgnc
Copy link
Author

jmgnc commented May 8, 2018

Here is a truth table. I took a while to decide that [*] != 22 for the empty list should be false, as there needs to be an item to be not equal.

imag3623

@chisholm
Copy link
Contributor

chisholm commented May 8, 2018

Yeah... I always think it's weird when an operator and its supposed complement can ever evaluate the same way. (E.g. = and != both evaluating to the same thing.) But here's a possibly more thorough treatment.

The [*] creates an implicit logical OR. It is the same as

list[0] <op> <val>
OR
list[1] <op> <val>
OR
etc...

If you imagine implementing this as an OR fold/reduction, e.g. reduce() (python2) or functools.reduce() (python3), what would you need as the initializer/identity? For OR, you'd need False, and for AND you'd need True. E.g. when OR'ing, you can start with an accumulator equal to False and keep OR'ing into it: when you encounter a True value, you can short-circuit evaluate the whole thing to True. When AND'ing,
you start with a True and wait until you encounter a False, then short-circuit evaluate the whole thing to False. Since this is an OR, we start with False, and since the list is empty, that's the result. If it were an implicit AND, you'd have the opposite result.

So:

expr [] [22] [22, 443]
= 22 F T T
!= 22 F F T
NOT (= 22) T F F
NOT (!= 22) T T F

Maybe that's way too theoretical, but I am trying to come up with something more solid than "there needs to be an item to be not equal". :) And if there is ever a need to implicitly AND values, where there
may potentially not be any, this addresses that too. It should eval to True.

And I do think that a unary negation operator which applies to the whole comparison expression needs to be positioned before the comparison expression, not in the middle before the operator. Greg made a comment in a different issue that it would be unwise for "NOT =" and "!=" to act differently, and I agree. That would be very counterintuitive.

I would prefer that the current matcher behavior stay as-is, and we revisit this in the next implementation, according to the next version of the spec, which hopefully is worded more clearly.

@chisholm
Copy link
Contributor

chisholm commented May 8, 2018

We do not have a !LIKE operator, but it can be done with MATCHES.

How about "NOT IN", and similar for ISSUBSET and ISSUPERSET?

@jmgnc
Copy link
Author

jmgnc commented May 8, 2018

@chisholm that definition matches Python's any function: https://docs.python.org/2/library/functions.html#any

The case of AND'ing is the all function:
https://docs.python.org/2/library/functions.html#all

Interestingly, with this, I think you can test for an empty list via: lst[*] NOT = 22 AND lst[i] NOT != 22 though I haven't fully worked through the tables to verify this.

@JasonKeirstead
Copy link
Member

JasonKeirstead commented May 9, 2018

I think this conversation has taken a left turn at a fork in the road and gone 1000 miles, when it should have turned right at the start.

[ a:b[*] NOT = 22 ]

This is not even supposed to be a valid pattern in the first place. It's nonsense... has anyone ever seen a grammar structured like that, where the boolean NOT to invert an expression was jammed into the middle of the expression? Of course not.. that was not our intention. The intention was this:

[ NOT a:b[*] = 22 ]

That is what I was saying that @gtback referenced 6 days ago - the text is SUPPOSED to say this:

A Comparison Expression MAY be preceded by the modifier NOT, in which case the resultant Comparison Expression is logically negated.

That change to the text fixes all of these problems. The NOT is supposed to be applied to the entire expression, as a unit. You evaluate the expression, then NOT it... as one would naturally expect.

NOT (= 22)
NOT (!= 22)

I do not believe these are intended to be valid SCO patterns. They are hard to read and thus hard to make any sense of as a reader. If the normative text is corrected as I say above, these are no longer valid.

@gtback
Copy link
Contributor

gtback commented May 9, 2018

@JasonKeirstead : Python supports <item> not in <list> (and is in fact generally considered better style than not <item> in <list>), but I agree it's pretty rare.

The normative text says what it says, even if it's not what was intended. The fact that the current wording is not obviously wrong or unimplementable only affirms that the best thing we can do at this point in this library is to do nothing.

If there's overwhelming evidence that other implementations behave differently, then for interoperability reasons I can see the benefit of standardizing on one approach vs. another for an ambiguous part of the spec.

@gtback gtback closed this as completed May 9, 2018
@JasonKeirstead
Copy link
Member

@gtback in that case, "not in" is the operator though. You can't do that with any operator in python.

@chisholm
Copy link
Contributor

chisholm commented May 9, 2018

I think this conversation has taken a left turn at a fork in the road and gone 1000 miles, when it should have turned right at the start.
[snip]
This is not even supposed to be a valid pattern in the first place. It's nonsense...

Actually, we did turn nearly a year ago. I had similar comments about the grammar structure. This seems like yet another issue stemming from confusion surrounding that NOT operator... :(

@jmgnc
Copy link
Author

jmgnc commented May 9, 2018

Interestingly, I stated the same thing in that thread (that it applies to the results of the expression and NOT the operator.

@gtback
Copy link
Contributor

gtback commented May 15, 2018

I don't think the semantics of NOT are clear enough in the spec that there is a straightforward obvious fix for this right now, but I'll leave this open so we don't forget to address it in the future.

@gtback gtback reopened this May 15, 2018
@gtback gtback added the bug label Jun 12, 2018
@JasonKeirstead
Copy link
Member

Lets continue this discussion in oasis-tcs/cti-stix2#85

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants