Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argument Clinic: DSLParser.parse_converter() can return an incorrect kwargs dict #106359

Closed
erlend-aasland opened this issue Jul 3, 2023 · 10 comments · Fixed by #106361
Closed
Labels
topic-argument-clinic type-bug An unexpected behavior, bug, or error

Comments

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Jul 3, 2023

Originally posted by @AlexWaygood in #106354 (comment):

Tools/clinic/clinic.py:5039: error: Incompatible return value type (got
"tuple[str, bool, dict[str | None, Any]]", expected
"tuple[str, bool, dict[str, Any]]")  [return-value]
                    return name, False, kwargs
                           ^~~~~~~~~~~~~~~~~~~

I think mypy is flagging a real bug in the code here. The issue is this block of code here:

cpython/Tools/clinic/clinic.py

Lines 5033 to 5039 in d694f04

case ast.Call(func=ast.Name(name)):
symbols = globals()
kwargs = {
node.arg: eval_ast_expr(node.value, symbols)
for node in annotation.keywords
}
return name, False, kwargs

The function is annotated as return tuple[str, bool, KwargDict], and KwargDict is a type alias for dict[str, Any]. It's important that the dictionary that's the third element of the tuple only has strings as keys. If it doesn't, then this will fail:

return_converter = return_converters[name](**kwargs)

The code as written, however, doesn't guarantee that all the keys in the kwargs dictionary will be strings. In the dictionary comprehension, we can see that annotation is an ast.Call instance. That means that annotation.keywords is of type list[ast.keyword] -- we can see this from typeshed's stubs for the ast module (which are an invaluable reference if you're working with ASTs in Python!): https://github.com/python/typeshed/blob/18d45d62aabe68fce78965c4920cbdeddb4b54db/stdlib/_ast.pyi#L324-L329. If annotation.keywords is of type list[ast.keyword], that means that the node variable in the dictionary comprehension is of type keyword, which means (again using typeshed's ast stubs), that node.arg is of type str | None: https://github.com/python/typeshed/blob/18d45d62aabe68fce78965c4920cbdeddb4b54db/stdlib/_ast.pyi#L516-L520. AKA, the keys in this dictionary are not always guaranteed to be strings -- there's a latent bug in this code!

Linked PRs

@erlend-aasland
Copy link
Contributor Author

erlend-aasland commented Jul 3, 2023

Suggested fixes by Alex: #106354 (comment) (C&P):

AKA, the keys in this dictionary are not always guaranteed to be strings -- there's a latent bug in this code!

Here's two possible ways of fixing this latent bug:

(1)

--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -5048,6 +5048,7 @@ def parse_converter(annotation: ast.expr | None) -> tuple[str, bool, KwargDict]:
                 kwargs = {
                     node.arg: eval_ast_expr(node.value, symbols)
                     for node in annotation.keywords
+                    if isinstance(node.arg, str)
                 }
                 return name, False, kwargs
             case _:

(2)

--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -5045,10 +5045,11 @@ def parse_converter(annotation: ast.expr | None) -> tuple[str, bool, KwargDict]:
                 return name, False, {}
             case ast.Call(func=ast.Name(name)):
                 symbols = globals()
-                kwargs = {
-                    node.arg: eval_ast_expr(node.value, symbols)
-                    for node in annotation.keywords
-                }
+                kwargs: dict[str, Any] = {}
+                for node in annotation.keywords:
+                    if not isinstance(node.arg, str):
+                        raise TypeError("Unexpectedly found a keyword node with a non-str arg!")
+                    kwargs[node.arg] = eval_ast_expr(node.value, symbols)
                 return name, False, kwargs
             case _:
                 fail(

You tell me which would be more correct here (or if another solution would be better than either of these)!

@erlend-aasland
Copy link
Contributor Author

erlend-aasland commented Jul 3, 2023

In the spirit of Argument Clinic, I would think it would be best to catch the error and bail (fail(...)); so a modified variant of the second suggestion, @AlexWaygood

@AlexWaygood
Copy link
Member

Here's a minimal repro of the situation where the arg attribute of an ast.keyword node could be None:

>>> module = ast.parse('foo(**kwargs)')
>>> call = module.body[0].value
>>> assert isinstance(call, ast.Call)
>>> keyword = call.keywords[0]
>>> assert isinstance(keyword, ast.keyword)
>>> keyword.arg is None
True

@erlend-aasland
Copy link
Contributor Author

Actually, parse_converter cannot be called unless the param declaration is valid, which means it's guaranteed to be a str. Adding an assert should be sufficient.

@AlexWaygood
Copy link
Member

And here's a failing test case:

--- a/Lib/test/test_clinic.py
+++ b/Lib/test/test_clinic.py
@@ -327,6 +327,9 @@ def test_param_default_expression(self):
         s = self.parse_function_should_fail("module os\nos.access\n    follow_symlinks: int = sys.maxsize")
         self.assertEqual(s, "Error on line 0:\nWhen you specify a named constant ('sys.maxsize') as your default value,\nyou MUST specify a valid c_default.\n")

+    def test_kwarg_splat(self):
+        self.parse_function("module os\nos.access\n    follow_symlinks: int(**{'c_default': 'MAXSIZE'})")
+

Add this test, and this is the result if you run python -m test test_clinic:

test test_clinic failed -- Traceback (most recent call last):
  File "C:\Users\alexw\coding\cpython\Lib\test\test_clinic.py", line 331, in test_kwarg_splat
    self.parse_function("module os\nos.access\n    follow_symlinks: int(**{'c_default': 'MAXSIZE'})")
  File "C:\Users\alexw\coding\cpython\Lib\test\test_clinic.py", line 865, in parse_function
    block = self.parse(text)
            ^^^^^^^^^^^^^^^^
  File "C:\Users\alexw\coding\cpython\Lib\test\test_clinic.py", line 861, in parse
    parser.parse(block)
  File "C:\Users\alexw\coding\cpython\Tools\clinic\clinic.py", line 4481, in parse
    self.state(line)
  File "C:\Users\alexw\coding\cpython\Tools\clinic\clinic.py", line 4751, in state_parameters_start
    return self.next(self.state_parameter, line)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\alexw\coding\cpython\Tools\clinic\clinic.py", line 4521, in next
    self.state(line)
  File "C:\Users\alexw\coding\cpython\Tools\clinic\clinic.py", line 4989, in state_parameter
    converter = dict[name](c_name or parameter_name, parameter_name, self.function, value, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: keywords must be strings

test_clinic failed (1 error)

== Tests result: FAILURE ==

clinic should fail if you do a kwarg-splat in a function call as an annotation -- but it should fail gracefully using the fail() function, rather than crashing with a TypeError

@erlend-aasland
Copy link
Contributor Author

Here's a minimal repro of the situation where the arg attribute of an ast.keyword node could be None:

I've failed to reproduce it in clinic, though.

@AlexWaygood
Copy link
Member

Here's a minimal repro of the situation where the arg attribute of an ast.keyword node could be None:

I've failed to reproduce it in clinic, though.

I just did; see #106359 (comment)

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Jul 3, 2023
…rser

DSLParser.parse_converter() could return unusable kwdicts in some rare cases
@erlend-aasland
Copy link
Contributor Author

Yeah, I managed to do a similar repro; PR up.

@AlexWaygood
Copy link
Member

Yeah, I managed to do a similar repro; PR up.

Dang, you beat me to it; I was also working on a PR :D

@erlend-aasland
Copy link
Contributor Author

Sorry! :) Oh, well you kinda wrote the entire patch. I just filled in a tiny blank!

erlend-aasland added a commit that referenced this issue Jul 3, 2023
…106361)

DSLParser.parse_converter() could return unusable kwdicts in some rare cases

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 3, 2023
…rser (pythonGH-106361)

DSLParser.parse_converter() could return unusable kwdicts in some rare cases

(cherry picked from commit 0da4c88)

Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
AlexWaygood added a commit that referenced this issue Jul 3, 2023
…arser (GH-106361) (#106364)

gh-106359: Fix corner case bugs in Argument Clinic converter parser (GH-106361)

DSLParser.parse_converter() could return unusable kwdicts in some rare cases

(cherry picked from commit 0da4c88)

Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
carljm added a commit to carljm/cpython that referenced this issue Jul 3, 2023
* main: (167 commits)
  pythongh-91053: make func watcher tests resilient to other func watchers (python#106286)
  pythongh-104050: Add more type hints to Argument Clinic DSLParser() (python#106354)
  pythongh-106359: Fix corner case bugs in Argument Clinic converter parser (python#106361)
  pythongh-104146: Remove unused attr 'parameter_indent' from clinic.DLParser (python#106358)
  pythongh-106320: Remove private _PyErr C API functions (python#106356)
  pythongh-104050: Annotate Argument Clinic DSLParser attributes (python#106357)
  pythongh-106320: Create pycore_modsupport.h header file (python#106355)
  pythongh-106320: Move _PyUnicodeWriter to the internal C API (python#106342)
  pythongh-61215: New mock to wait for multi-threaded events to happen (python#16094)
  Document PYTHONSAFEPATH along side -P (python#106122)
  Replace the esoteric term 'datum' when describing dict comprehensions (python#106119)
  pythongh-104050: Add more type hints to Argument Clinic DSLParser() (python#106343)
  pythongh-106320: _testcapi avoids private _PyUnicode_EqualToASCIIString() (python#106341)
  pythongh-106320: Add pycore_complexobject.h header file (python#106339)
  pythongh-106078: Move DecimalException to _decimal state (python#106301)
  pythongh-106320: Use _PyInterpreterState_GET() (python#106336)
  pythongh-106320: Remove private _PyInterpreterState functions (python#106335)
  pythongh-104922: Doc: add note about PY_SSIZE_T_CLEAN (python#106314)
  pythongh-106217: Truncate the issue body size of `new-bugs-announce-notifier` (python#106329)
  pythongh-104922: remove PY_SSIZE_T_CLEAN (python#106315)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-argument-clinic type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants