Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix panic in annotated_snippet dependency (GitHub issue #4968). #5629

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Raekye
Copy link

@Raekye Raekye commented Dec 7, 2022

Ref #4968. Changes based on the code review suggestions in #5039. Tabs are counted as config.tab_spaces columns, but the annotated_snippet dependency counts tabs as 1 space. This difference may lead to a panic in annotated_snippet, which can be seen in the test-case added (with this PR's change, it should nolonger panic).

Note that #5039 is about changing tabs to "align to config.tab_spaces columns". But the code review suggestion by camsteffen also fixes the panic. This PR doesn't change the arithmetic (tabs are still always counted as a fixed number of spaces) - but just addresses the panic in the dependency.

@ytmimi
Copy link
Contributor

ytmimi commented Dec 7, 2022

Thanks for your first contribution to rustfmt 🎉

when you have a moment, could you expand on why the change implemented here addresses the panic in annotated_snippet. I want to make sure I fully understand the fix.

Also, if this does resolve the issue would you mind looking into the following related issues to see if they're also solved by this:

It would be great to add additional test cases for each issue that is also resolved by this PR.

@Raekye
Copy link
Author

Raekye commented Dec 8, 2022

Hi, testing locally, this PR does fix the examples in the first 3 links. I can add them as proper test cases if desired.

Actually, this PR is not directly related to the changes in #5039. Currently, rustfmt always treats tabs as config.tab_spaces spaces. #5039 desires to make tabs count as "up to config.tab_spaces spaces, as needed to round the current column up to the next multiple of config.tab_spaces". For example, if tab_spaces is 4 and a tab is inserted at effective column 98, #5039 makes this tab count as 2 spaces so the next character is at column 100.

#5039 would prevent rustfmt from producing an error at all in those cases, but does not address the following issue (which is only what this PR attempts to fix; this PR doesn't include the variable-width arithmetic for tabs). If rustfmt tries to display an error using the annotated_snippet dependency, there may be a panic in the dependency due to the following discrepancy:

  • rustfmt fills its line buffer with tabs (when applicable), and and counts tabs as >= 1 character (based on tab_spaces) for the effective width/offsets.
  • annotated_snippet, given input with tabs and offsets to highlight, counts tabs as 1 character.
  • the effective width is larger than the "actual" offset, so what rustfmt tells annotated_snippet is the range, may be out of the range annotated_snippet considers, and it panics.

I had been running locally the changes suggested by camsteffen in #5039 to fix this discrepancy. Since I only just got around to trying to submit a proper PR/fix, I forgot that #5039 itself isn't directly related. The suggested changes (which this PR is for) is for rustfmt to build its internal buffer always replacing tabs with the corresponding number of spaces (which is orthogonal to the discussion of how many spaces a tab should be worth). This way, the "effective width" and "actual offsets" are consistent.

I took camsteffen's suggestions as is - which includes counting each character c as c.len_utf8() width. I believe it has the following effect: the line width is based on actual bytes, as opposed to visual characters. Whether this is desired is debatable. It's correctness with respect to annotated_snippet depends on whether annotated_snippet interprets offsets as byte offsets or character offsets. Based on this source, it does seem like annotated_snippets uses byte offsets, which is what the change 1 -> c.len_utf8() is correct for.

Hopefully my explanation is good.

The example in #5431, as far as what I can tell the user wants, is the same, and should be fixed as well.

I believe #5546 wishes there to be "soft tabs" and "hard tabs":

  • soft tabs: the current behaviour - line length is based on "visual" columns
  • hard tabs: line length is based on actual characters (or maybe even bytes): i.e. a tab is always 1 column

This PR doesn't add that feature.

@ytmimi
Copy link
Contributor

ytmimi commented Dec 8, 2022

Hopefully my explanation is good.

It was great! Thank you for being so thorough 😁. It also helped that you linked to the implementation used by annotated_snippets.

testing locally, this PR does fix the examples in the first 3 links. I can add them as proper test cases if desired.

Please and thank you! It's always good to have more test cases!

@Raekye
Copy link
Author

Raekye commented Dec 9, 2022

Hi, my apologies, I just saw/realized that you asked to add the additional test cases in your original post.

I also have to apologize for unclear wording on my part. By "this PR does fix the examples in the first 3 links", I meant it fixes rustfmt causing annotated_snippet to panic. To reiterate/summarize, rustfmt and annotated_snippet count characters/offsets differently. If rustfmt decides to report an error, there may be a panic in annotated_snippet because the range rustfmt passes to annotated_snippet may be out of range (for annotated_snippet). To me, this is the only (hard) bug (and I don't think this would be controversial).

#5039 desires to relax when rustfmt decides there is an error, by counting tabs as "up to tab_spaces alignment" number of spaces. But the PR (as is) still leaves the possibility of panicking in annotated_snippet.

#5546 also desires different rustfmt behaviour, revolving around whether max_width is the maximum number of characters (tab = 1 character) or columns (tab = tab_spaces columns, or with #5039, "up to tab_spaces alignment" columns).

This PR doesn't change rustfmt's behaviour of when it decides something is an error - it doesn't resolve GitHub issues #5039 and #5546. The test case in #4968 (comment) is then functionally identical to the test case I have originally added.

I personally feel and would guess that rustfmt needs further discussion if the requested behaviour in #5039 or #5546 is to be adopted. #5546 would require nontrivial work, but if #5039 is definitely desired I am happy to apply the necessary changes in this PR, or submit it as a different fix (hopefully I have made clear now that there are different issues (in the general sense) that could be fixed).

If it's still desired to have #4968 (comment) as a test case, I am happy to add it. If it's desired to have the other examples as "nolonger crashes", I am also happy to add it. But I felt I gave the wrong impression about "fixing" based on my original poor wording.

Also, if this PR is to be accepted, I believe I should change my commit message to be more clear based on this discussion.

@Raekye
Copy link
Author

Raekye commented Aug 27, 2023

I just rebased the changes in this PR with the latest upstream code (and force pushed).

@Raekye
Copy link
Author

Raekye commented Dec 3, 2023

Added missing // rustfmt-error_on_line_overflow: true declaration to the test case; unless I'm mistaken, the bug/panic is triggered only if rustfmt has decided to report an error (and there's a mismatch of character/columns due to hard tabs). So for the test case added, both error_on_unformatted and error_on_line_overflow are needed to trigger the panic.

On 37489e4 (HEAD as of this comment), if you copy tests/target/issue-4968.rs from this PR and run cargo test, it should fail with something like this (I just tried it; hope I didn't make a silly mistake).

thread '<unnamed>' panicked at /home/.../from_snippet.rs:286:9:                                                                            
SourceAnnotation range `(40, 61)` is bigger than source length `52`                                                                                                                                                                         
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace                                                                                                                                                               
test test::idempotence_tests ... FAILED

In this PR, the CI failed becaused rustfmt failed to format that test case - but it reports the error properly. I didn't look deeply into it, but it wasn't obvious to me how to make a test case that's supposed to fail to format (this PR should be fixed before it's accepted - any pointers on how to do such a test case would be appreciated)

* With hard_tabs, rustfmt preserves tabs and counts them as multiple
  characters/columns (based on configuration).
* The annotated_snippet dependency, used to display errors, always
  counts tabs as 1 character.
* If rustfmt tries to report an error on a line containing tabs,
  the indices are mismatched.
* annotated_snippet will display the wrong range of the source code
  slice; in the extreme case, it can panic with out-of-bounds access.
* The test case added in this commit is expected to currently fail,
  since this commit doesn't include the fix.
* This change is based on the code review by camsteffen on PR
  rust-lang#5039 by karyon: have rustfmt internally replace tabs with
  the corresponding number of spaces, so that columns/indices in the
  buffer passed to annotated_snippet are counted unambiguously.
@stswidwinski
Copy link

@ytmimi -- any chance we could review this PR? I'm hitting very similar issues :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants