-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "alpha" production in CQL2. #789
Conversation
the single quote. Move query.json to the search extension. I inadvertantly checked that in during some TB18 work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Waiting for feedback from @eseglem before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely fixes the issue I was having with alpha
. Swapped the update into the parser I have been working on and it is producing the expected results.
I just had a few other comments brought on by the other updates.
I may be missing something stating otherwise but I believe the two rules: "'..'"
and "'" ".." "'"
are functionally different. As one doesn't allow for white space and the other would. If that is not the case, may want to add some more clarification on that.
|
||
lteq = lt eq; | ||
comparisonOperator = "=" # equal | ||
| "<" ">" # not equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a change from what the grammar said before, but seeing it this way is more readable. It made me realize that I had technically made a mistake in my parser, and combined them into a single token "<>"
.
This definition would potentially allow something like < >
with extra spaces between them. I don't necessarily feel that this is wrong. It just feels a bit odd in contrast with things like ..
and escaped quotes which are exact matches.
|
||
character = alpha | digit | escapedquote; | ||
character = alpha | digit | "''"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that might be worth keeping is escapedquote = "''"
to make sure its clearly distinct from regular single quotes.
| "\xE000".."\xFFFD" # See note 8. | ||
| "\x10000".."\x10FFFF"; # See note 9. | ||
|
||
# Note 8: Private Use, CJK Compatibility Ideographs, Alphabetic Presentation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be a good reason for including them but honestly both of these "Note" sections feel distracting. They are around 20% of the whole file now and have not been useful to me inside the grammar definition. There is already a link to where the charsets came from if someone wants to look into them. As well a variety of references for Unicode in general. I think something along the lines of Additional Unicode Blocks. See: https://www.unicode.org/Public/UCD/latest/ucd/Blocks.txt
(or some other link) would serve the same purpose and not take as much space.
|
||
instantParameter = dateInstantString | ||
| timestampInstantString | ||
| quote dotdot quote | ||
| "'..'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would actually be a change in the definition compared to the current one, wouldn't it? It becomes a single token vs the three it was: "'" ".." "'"
It potentially allowed for whitespace between the quotes and the ..
before and prevents it now. This goes along with the other comment about the comparison operators.
Fix "alpha" production in CQL2 to include missing characters and exclude the single quote. I also removed some of the single-character productions like
colon = ":";
and simply used the literal string. I think that simplified the grammar a bit and makes it a little easier to read.I also move
query.json
to the search extension. I inadvertantly checked that in during some TB18 work.