Including F841 in the default --fix behaviour is a gigantic footgun #6835

ssokolow · 2023-08-24T04:23:03Z

I have a habit of writing my code from the top down and saving part-way through to trigger things like MyPy at a time when they won't be distracting. After installing Ruff and integrating it into my Vim+ALE-based setup that triggers everything (including autofix/autoformat) on save (thank you, rustfmt, for bringing my habits into the 21st century), I started getting confusing errors.

After a couple of times, I realized I wasn't at risk of dementia, but that Ruff was deleting assignments that I hadn't used yet... sometimes at the worst possible time for derailing my memory of how I planned things to work.

Beyond that, while it's very much a "do not rely on this!" thing in the Python language spec, because, in practice, CPython only uses garbage collection to break cycles and uses reference counting for non-cyclical data structures, dropping memory does have Rust-style determinism in CPython, it does alter the drop order in the same way it would in Rust, and Hyrum's law does apply...

...which means the fixer can alter the semantics of the program in ways other than "pulled the rug out from under the train of thought". Take a look at how ruff alters the observed behaviour of this code:

class Foo:
    def __init__(self, msg):
        self.msg = msg
    def __del__(self):
        print(self.msg)

def main():
    print("Yo Ho Ho")
    ham = Foo("bottle of rum")
    print("and a")

main()

ssokolow@monolith ~ % python3 demo.py
Yo Ho Ho
and a
bottle of rum
ssokolow@monolith ~ % ruff --fix demo.py
Found 1 error (1 fixed, 0 remaining).
ssokolow@monolith ~ % python3 demo.py   
Yo Ho Ho
bottle of rum
and a

I think that, if --fix support is offered for F841, it shouldn't be part of the set of fixes that's enabled by default when a new user (or their IDE/editor integration plugin) just blindly runs ruff --fix.

The text was updated successfully, but these errors were encountered:

charliermarsh · 2023-08-24T13:58:03Z

I think this will be solved once we respect suggested fixes on the CLI, since these are already "suggested" and not "automatic" fixes \cc @zanieb

charliermarsh · 2023-08-24T13:59:09Z

I'm going to merge with #4185 for that reason.

charliermarsh · 2023-08-24T13:59:49Z

Thank you for reporting and sorry that it's been inconvenient for you. If you want, you can turn this off by marking it as unfixable.

ssokolow · 2023-08-24T14:12:57Z

Thanks.

I'll probably want to look up how to get ALE to specify --unfixable instead. I can't be dependant on PyQt or Django in all my projects and, as Rust's ecosystem matures and the allure of strong compile-time correctness beckons, I continue to push Python+MyPy further and further into the "throwaway single-file script with no project folder" niche.

ssokolow · 2023-08-24T21:32:18Z

For the record, it's let g:ale_python_ruff_options = '--unfixable=F841' in your .vimrc.

## Summary This PR adds support for PEP 701 in the parser to use the new tokens emitted by the lexer to construct the f-string node. ### Grammar Without an official grammar, the f-strings were parsed manually. Now that we've the specification, that is being used in the LALRPOP to parse the f-strings. ### `string.rs` This file includes the logic for parsing string literals and joining the implicit string concatenation. Now that we don't require parsing f-strings manually a lot of code involving the same is removed. Earlier, there were 2 entry points to this module: * `parse_string`: Used to parse a single string literal * `parse_strings`: Used to parse strings which were implicitly concatenated Now, there are 3 entry points: * `parse_string_literal`: Renamed from `parse_string` * `parse_fstring_middle`: Used to parse a `FStringMiddle` token which is basically a string literal without the quotes * `concatenate_strings`: Renamed from `parse_strings` but now it takes the parsed nodes instead. So, we just need to concatenate them into a single node. > A short primer on `FStringMiddle` token: This includes the portion of text inside the f-string that's not part of the expression and isn't an opening or closing brace. For example, in `f"foo {bar:.3f{x}} bar"`, the `foo `, `.3f` and ` bar` are `FStringMiddle` token content. ### `Constant::kind` changed in the AST ***Discussion in the official implementation: python/cpython#102855 (comment) This change in the AST is when unicode strings (prefixed with `u`) and f-strings are used in an implicitly concatenated string value. For example, ```python u"foo" f"{bar}" "baz" " some" ``` Pre Python 3.12, the kind field would be assigned only if the prefix was on the first string. So, taking the above example, both `"foo"` and `"baz some"` (implicit concatenation) would be given the `u` kind: <details><summary>Pre 3.12 AST:</summary> ```python Constant(value='foo', kind='u'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='baz some', kind='u') ``` </details> But, post Python 3.12, only the string with the `u` prefix will be assigned the value: <details><summary>Pre 3.12 AST:</summary> ```python Constant(value='foo', kind='u'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='baz some') ``` </details> Here are some more iterations around the change: 1. `"foo" f"{bar}" u"baz" "no"` <details><summary>Pre 3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno') ``` </details> <details><summary>3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno', kind='u') ``` </details> 2. `"foo" f"{bar}" "baz" u"no"` <details><summary>Pre 3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno') ``` </details> <details><summary>3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno') ``` </details> 3. `u"foo" f"bar {baz} realy" u"bar" "no"` <details><summary>Pre 3.12</summary> ```python Constant(value='foobar ', kind='u'), FormattedValue( value=Name(id='baz', ctx=Load()), conversion=-1), Constant(value=' realybarno', kind='u') ``` </details> <details><summary>3.12</summary> ```python Constant(value='foobar ', kind='u'), FormattedValue( value=Name(id='baz', ctx=Load()), conversion=-1), Constant(value=' realybarno') ``` </details> ### Errors With the hand written parser, we were able to provide better error messages in case of any errors such as the following but now they all are removed and in those cases an "unexpected token" error will be thrown by lalrpop: * A closing delimiter was not opened properly * An opening delimiter was not closed properly * Empty expression not allowed The "Too many nested expressions in an f-string" was removed and instead we can create a lint rule for that. And, "The f-string expression cannot include the given character" was removed because f-strings now support those characters which are mainly same quotes as the outer ones, escape sequences, comments, etc. ## Test Plan 1. Refactor existing test cases to use `parse_suite` instead of `parse_fstrings` (doesn't exists anymore) 2. Additional test cases are added as required Updated the snapshots. The change from `parse_fstrings` to `parse_suite` means that the snapshot would produce the module node instead of just a list of f-string parts. I've manually verified that the parts are still the same along with the node ranges. ## Benchmarks #7263 (comment) fixes: #7043 fixes: #6835

This PR adds support for PEP 701 in the parser to use the new tokens emitted by the lexer to construct the f-string node. Without an official grammar, the f-strings were parsed manually. Now that we've the specification, that is being used in the LALRPOP to parse the f-strings. This file includes the logic for parsing string literals and joining the implicit string concatenation. Now that we don't require parsing f-strings manually a lot of code involving the same is removed. Earlier, there were 2 entry points to this module: * `parse_string`: Used to parse a single string literal * `parse_strings`: Used to parse strings which were implicitly concatenated Now, there are 3 entry points: * `parse_string_literal`: Renamed from `parse_string` * `parse_fstring_middle`: Used to parse a `FStringMiddle` token which is basically a string literal without the quotes * `concatenate_strings`: Renamed from `parse_strings` but now it takes the parsed nodes instead. So, we just need to concatenate them into a single node. > A short primer on `FStringMiddle` token: This includes the portion of text inside the f-string that's not part of the expression and isn't an opening or closing brace. For example, in `f"foo {bar:.3f{x}} bar"`, the `foo `, `.3f` and ` bar` are `FStringMiddle` token content. ***Discussion in the official implementation: python/cpython#102855 (comment) This change in the AST is when unicode strings (prefixed with `u`) and f-strings are used in an implicitly concatenated string value. For example, ```python u"foo" f"{bar}" "baz" " some" ``` Pre Python 3.12, the kind field would be assigned only if the prefix was on the first string. So, taking the above example, both `"foo"` and `"baz some"` (implicit concatenation) would be given the `u` kind: <details><summary>Pre 3.12 AST:</summary> ```python Constant(value='foo', kind='u'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='baz some', kind='u') ``` </details> But, post Python 3.12, only the string with the `u` prefix will be assigned the value: <details><summary>Pre 3.12 AST:</summary> ```python Constant(value='foo', kind='u'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='baz some') ``` </details> Here are some more iterations around the change: 1. `"foo" f"{bar}" u"baz" "no"` <details><summary>Pre 3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno') ``` </details> <details><summary>3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno', kind='u') ``` </details> 2. `"foo" f"{bar}" "baz" u"no"` <details><summary>Pre 3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno') ``` </details> <details><summary>3.12</summary> ```python Constant(value='foo'), FormattedValue( value=Name(id='bar', ctx=Load()), conversion=-1), Constant(value='bazno') ``` </details> 3. `u"foo" f"bar {baz} realy" u"bar" "no"` <details><summary>Pre 3.12</summary> ```python Constant(value='foobar ', kind='u'), FormattedValue( value=Name(id='baz', ctx=Load()), conversion=-1), Constant(value=' realybarno', kind='u') ``` </details> <details><summary>3.12</summary> ```python Constant(value='foobar ', kind='u'), FormattedValue( value=Name(id='baz', ctx=Load()), conversion=-1), Constant(value=' realybarno') ``` </details> With the hand written parser, we were able to provide better error messages in case of any errors such as the following but now they all are removed and in those cases an "unexpected token" error will be thrown by lalrpop: * A closing delimiter was not opened properly * An opening delimiter was not closed properly * Empty expression not allowed The "Too many nested expressions in an f-string" was removed and instead we can create a lint rule for that. And, "The f-string expression cannot include the given character" was removed because f-strings now support those characters which are mainly same quotes as the outer ones, escape sequences, comments, etc. 1. Refactor existing test cases to use `parse_suite` instead of `parse_fstrings` (doesn't exists anymore) 2. Additional test cases are added as required Updated the snapshots. The change from `parse_fstrings` to `parse_suite` means that the snapshot would produce the module node instead of just a list of f-string parts. I've manually verified that the parts are still the same along with the node ranges. #7263 (comment) fixes: #7043 fixes: #6835

@evanrittenhouse

Rebase of #5119 authored by @evanrittenhouse with additional refinements. ## Changes - Adds `--unsafe-fixes` / `--no-unsafe-fixes` flags to `ruff check` - Violations with unsafe fixes are not shown as fixable unless opted-in - Fix applicability is respected now - `Applicability::Never` fixes are no longer applied - `Applicability::Sometimes` fixes require opt-in - `Applicability::Always` fixes are unchanged - Hints for availability of `--unsafe-fixes` added to `ruff check` output ## Examples Check hints at hidden unsafe fixes ``` ❯ ruff check example.py --no-cache --select F601,W292 example.py:1:14: F601 Dictionary key literal `'a'` repeated example.py:2:15: W292 [*] No newline at end of file Found 2 errors. [*] 1 fixable with the `--fix` option (1 hidden fix can be enabled with the `--unsafe-fixes` option). ``` We could add an indicator for which violations have hidden fixes in the future. Check treats unsafe fixes as applicable with opt-in ``` ❯ ruff check example.py --no-cache --select F601,W292 --unsafe-fixes example.py:1:14: F601 [*] Dictionary key literal `'a'` repeated example.py:2:15: W292 [*] No newline at end of file Found 2 errors. [*] 2 fixable with the --fix option. ``` Also can be enabled in the config file ``` ❯ cat ruff.toml unsafe-fixes = true ``` And opted-out per invocation ``` ❯ ruff check example.py --no-cache --select F601,W292 --no-unsafe-fixes example.py:1:14: F601 Dictionary key literal `'a'` repeated example.py:2:15: W292 [*] No newline at end of file Found 2 errors. [*] 1 fixable with the `--fix` option (1 hidden fix can be enabled with the `--unsafe-fixes` option). ``` Diff does not include unsafe fixes ``` ❯ ruff check example.py --no-cache --select F601,W292 --diff --- example.py +++ example.py @@ -1,2 +1,2 @@ x = {'a': 1, 'a': 1} -print(('foo')) +print(('foo')) \ No newline at end of file Would fix 1 error. ``` Unless there is opt-in ``` ❯ ruff check example.py --no-cache --select F601,W292 --diff --unsafe-fixes --- example.py +++ example.py @@ -1,2 +1,2 @@ -x = {'a': 1} -print(('foo')) +x = {'a': 1, 'a': 1} +print(('foo')) \ No newline at end of file Would fix 2 errors. ``` #7790 will improve the diff messages following this pull request Similarly, `--fix` and `--fix-only` require the `--unsafe-fixes` flag to apply unsafe fixes. ## Related Replaces #5119 Closes #4185 Closes #7214 Closes #4845 Closes #3863 Addresses #6835 Addresses #7019 Needs follow-up #6962 Needs follow-up #4845 Needs follow-up #7436 Needs follow-up #7025 Needs follow-up #6434 Follow-up #7790 Follow-up #7792 --------- Co-authored-by: Evan Rittenhouse <evanrittenhouse@gmail.com>

@evanrittenhouse

Rebase of #5119 authored by @evanrittenhouse with additional refinements. ## Changes - Adds `--unsafe-fixes` / `--no-unsafe-fixes` flags to `ruff check` - Violations with unsafe fixes are not shown as fixable unless opted-in - Fix applicability is respected now - `Applicability::Never` fixes are no longer applied - `Applicability::Sometimes` fixes require opt-in - `Applicability::Always` fixes are unchanged - Hints for availability of `--unsafe-fixes` added to `ruff check` output ## Examples Check hints at hidden unsafe fixes ``` ❯ ruff check example.py --no-cache --select F601,W292 example.py:1:14: F601 Dictionary key literal `'a'` repeated example.py:2:15: W292 [*] No newline at end of file Found 2 errors. [*] 1 fixable with the `--fix` option (1 hidden fix can be enabled with the `--unsafe-fixes` option). ``` We could add an indicator for which violations have hidden fixes in the future. Check treats unsafe fixes as applicable with opt-in ``` ❯ ruff check example.py --no-cache --select F601,W292 --unsafe-fixes example.py:1:14: F601 [*] Dictionary key literal `'a'` repeated example.py:2:15: W292 [*] No newline at end of file Found 2 errors. [*] 2 fixable with the --fix option. ``` Also can be enabled in the config file ``` ❯ cat ruff.toml unsafe-fixes = true ``` And opted-out per invocation ``` ❯ ruff check example.py --no-cache --select F601,W292 --no-unsafe-fixes example.py:1:14: F601 Dictionary key literal `'a'` repeated example.py:2:15: W292 [*] No newline at end of file Found 2 errors. [*] 1 fixable with the `--fix` option (1 hidden fix can be enabled with the `--unsafe-fixes` option). ``` Diff does not include unsafe fixes ``` ❯ ruff check example.py --no-cache --select F601,W292 --diff --- example.py +++ example.py @@ -1,2 +1,2 @@ x = {'a': 1, 'a': 1} -print(('foo')) +print(('foo')) \ No newline at end of file Would fix 1 error. ``` Unless there is opt-in ``` ❯ ruff check example.py --no-cache --select F601,W292 --diff --unsafe-fixes --- example.py +++ example.py @@ -1,2 +1,2 @@ -x = {'a': 1} -print(('foo')) +x = {'a': 1, 'a': 1} +print(('foo')) \ No newline at end of file Would fix 2 errors. ``` #7790 will improve the diff messages following this pull request Similarly, `--fix` and `--fix-only` require the `--unsafe-fixes` flag to apply unsafe fixes. ## Related Replaces #5119 Closes #4185 Closes #7214 Closes #4845 Closes #3863 Addresses #6835 Addresses #7019 Needs follow-up #6962 Needs follow-up #4845 Needs follow-up #7436 Needs follow-up #7025 Needs follow-up #6434 Follow-up #7790 Follow-up #7792 --------- Co-authored-by: Evan Rittenhouse <evanrittenhouse@gmail.com>

charliermarsh added fixes Related to suggested fixes for violations cli Related to the command-line interface labels Aug 24, 2023

charliermarsh closed this as not planned Won't fix, can't repro, duplicate, stale Aug 24, 2023

dhruvmanila mentioned this issue Sep 3, 2023

Add support for parsing f-string as per PEP 701 #7041

Merged

zanieb mentioned this issue Oct 3, 2023

Update CLI to respect fix applicability #7769

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Including F841 in the default --fix behaviour is a gigantic footgun #6835

Including F841 in the default --fix behaviour is a gigantic footgun #6835

ssokolow commented Aug 24, 2023

charliermarsh commented Aug 24, 2023

charliermarsh commented Aug 24, 2023

charliermarsh commented Aug 24, 2023

ssokolow commented Aug 24, 2023 •

edited

Loading

ssokolow commented Aug 24, 2023

Including F841 in the default --fix behaviour is a gigantic footgun #6835

Including F841 in the default --fix behaviour is a gigantic footgun #6835

Comments

ssokolow commented Aug 24, 2023

charliermarsh commented Aug 24, 2023

charliermarsh commented Aug 24, 2023

charliermarsh commented Aug 24, 2023

ssokolow commented Aug 24, 2023 • edited Loading

ssokolow commented Aug 24, 2023

ssokolow commented Aug 24, 2023 •

edited

Loading