Don't output extra whitespace in YAML multiline #993

Merged
merged 1 commit into from
Aug 22, 2024


Gusted
Contributor

@Gusted Gusted commented Aug 22, 2024

This resolves a particular issue with parsing YAML multiline, for example:

a: |
  multiline literal
  line 2

The regex captures the amount of indentation in its third capture group and then uses that as a kind of "status" to know which lines are part of the indented multiline. However, because it's a capture group, it has to be assigned a token type, which was TextWhitespace. This meant that the indentation was output again after the multiline. Technically it should be a non-capturing group, but then it's no longer possible to refer to it in the regex. Therefore I've gone with the solution of adding a new token type, Ignore, which is never emitted as a token by the iterator, so capture groups can safely be used without showing up in the output.
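To make the mechanism concrete, here is a minimal sketch of the idea. It is written in Python only because its standard `re` module supports backreferences; it is not the Chroma lexer code. The pattern, the `Token` type, the `tokenize` helper and every token name other than `Ignore` and `TextWhitespace` are assumptions made up for illustration.

```python
import re
from typing import Iterator, NamedTuple

# Token type names. Only IGNORE and TEXT_WHITESPACE mirror names used in
# this PR; the others are hypothetical placeholders for the sketch.
IGNORE = "Ignore"
TEXT_WHITESPACE = "TextWhitespace"
NAME_TAG = "NameTag"
PUNCTUATION = "Punctuation"
STRING_DOC = "StringDoc"


class Token(NamedTuple):
    kind: str
    value: str


# Hypothetical, heavily simplified pattern (not the real YAML lexer rule).
BLOCK = re.compile(
    r"(\w+)"             # group 1: the key
    r"(:[ ]\|\n)"        # group 2: ": |" indicator and the newline
    r"(?=([ ]+))"        # group 3: indentation, captured only so \3 can use it
    r"((?:\3.*\n?)+)"    # group 4: every line that starts with that indentation
)


def tokenize(text: str, indent_kind: str) -> Iterator[Token]:
    """Emit one token per capture group, skipping groups typed as IGNORE."""
    match = BLOCK.match(text)
    if match is None:
        return
    kinds = [NAME_TAG, PUNCTUATION, indent_kind, STRING_DOC]
    for kind, value in zip(kinds, match.groups()):
        if kind == IGNORE:
            continue  # the group was needed by the regex, not by the output
        yield Token(kind, value)


if __name__ == "__main__":
    source = "a: |\n  multiline literal\n  line 2\n"
    for indent_kind in (TEXT_WHITESPACE, IGNORE):
        rendered = "".join(t.value for t in tokenize(source, indent_kind))
        print(indent_kind, "round-trips:", rendered == source)
```

In this sketch, typing group 3 as `TextWhitespace` emits the indentation in addition to the block body, so the concatenated tokens no longer match the input. Typing it as `Ignore` keeps the backreference working while the token stream round-trips to the original text.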

Before

image

After

image

@alecthomas
Owner

Seems reasonable, but an earlier PR that was merged also regenned the enums and this now conflicts.

@Gusted
Contributor Author

Gusted commented Aug 22, 2024

No big deal, rebased the PR and regenerated the enums.

@alecthomas alecthomas merged commit 4d11870 into alecthomas:master Aug 22, 2024
2 checks passed
@alecthomas
Owner

Thanks!

@Gusted Gusted deleted the extra-spaces-yaml branch August 22, 2024 22:12