-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix some issues with painless's strings #22393
Conversation
// `'` inside of a string delimited with `"` should be ok | ||
assertEquals("test'", exec("\"test'\"")); | ||
// `"` inside of a string delimited with `'` should be ok | ||
assertEquals("test\"", exec("'test\"'")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one didn't work without the lexer change. I think we intended it to work.
throw location.createError(new IllegalArgumentException("unexpected character [" + getErrorDisplay(text) + "].", lnvae)); | ||
String message = "unexpected character [" + getErrorDisplay(text) + "]."; | ||
char firstChar = text.charAt(0); | ||
if ((firstChar == '\'' || firstChar == '"') && text.length() - 2 > 0 && text.charAt(text.length() - 2) == '\\') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about using fancy lexer changes to make this more correct or allowing all escapes in the lexer but detecting invalid escapes in the walker. I'm not sure which is better but I'd already written this one so I figured I start with it.
boolean doubleQuoted = string.charAt(0) == '"'; | ||
|
||
// Strip the leading and trailing quotes | ||
string = string.substring(1, ctx.STRING().getText().length() - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do all of these things (stripping last char, backslash escape and quote escape) in one pass with a string builder? Right now 3 copies are made of the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can. I figured the replacements would mostly end up as noops so it wasn't worth it. Especially with this on the compile side. If you'd prefer it I'm happy to do it though.
@rjernst, I replaced the replaces with walking a string builder. I expect I could save some CPU by copying the strings in longer chunks if possible but I figure this performance is adequate and more readable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
StringBuilder string = new StringBuilder(ctx.STRING().getText()); | ||
|
||
// Strip the leading and trailing quotes and replace the escape sequences with their literal equivalents | ||
int src = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add asserts for the leading and trailing quotes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do.
1. Escape sequences we're working. For example `\\` is now correctly interpreted as `\` instead of `\\`. Same with `\'` being `'` and `\"` being `"`. 2. `'` delimited strings weren't allowed to contain `"`s but it looked like they were intended to support it. Now they do. 3. Improves the error message when the script contains an invalid escape sequence inside a string to include a list of the valid escape sequences. Closes elastic#22372
06b9073
to
6648937
Compare
1. Escape sequences we're working. For example `\\` is now correctly interpreted as `\` instead of `\\`. Same with `\'` being `'` and `\"` being `"`. 2. `'` delimited strings weren't allowed to contain `"`s but it looked like they were intended to support it. Now they do. 3. Improves the error message when the script contains an invalid escape sequence inside a string to include a list of the valid escape sequences. Closes #22372
@nik9000 shouldn't this be backported to 5.2? |
@clintongormley, probably. I'm not sure why I didn't do it automatically. The workaround I mention in |
1. Escape sequences we're working. For example `\\` is now correctly interpreted as `\` instead of `\\`. Same with `\'` being `'` and `\"` being `"`. 2. `'` delimited strings weren't allowed to contain `"`s but it looked like they were intended to support it. Now they do. 3. Improves the error message when the script contains an invalid escape sequence inside a string to include a list of the valid escape sequences. Closes #22372
5.2: 3a80f74 |
\\
is now correctlyinterpreted as
\
instead of\\
. Same with\'
being'
and\"
being"
.'
delimited strings weren't allowed to contain"
s but it lookedlike they were intended to support it. Now they do.
escape sequence inside a string to include a list of the valid
escape sequences.
Closes #22372