Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-108494: Argument Clinic: inline parsing code for positional-only parameters in the limited C API #108622

Merged

Conversation

serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Aug 29, 2023

It fixes some code which uses private converters, like _PyLong_UnsignedShort_Converter. Inlined code only uses the limited C API.

But it still includes internal headers. It is difficult to not include them, because they are needed when the parsing code is not inlined, and this depends not only on the specified converter, but on other converters and other used features.

There is a behavior change: converters like _PyLong_UnsignedShort_Converter raise ValueError for negative integer, but the inlined code raise OverflowError. It is very difficult to detect negative integer in the limited C API. It can be fixed by changing exception in PyLong_AsUnsignedLong (see #74019).

I also reorganized the code for keyword parameters, so it will be easier to enable inlining once we add a public version of _PyArg_UnpackKeywords in the limited C API, which is the only stopper.

All tests (except one) are passed with using the limited C API (if possible).

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR avoids private _PyArg_BadArgument() and _PyArg_CheckPositional() functions in the limited C API, and generate similar code for that. What about adding these functions to the limited C API instead?

The advantage of your approach is that we can target limited C API older than version 3.13 which is appealing. But is it worth it?

pl = '' if min_pos == 1 else 's'
parser_code.append(normalize_snippet(f"""
if ({nargs} != {min_pos}) {{{{
PyErr_Format(PyExc_TypeError, "{{name}} expected {min_pos} argument{pl}, got %zd", {nargs});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generate many static C strings for similar error messages. Would it make sense to pass name and min_pos (and "s" for plural) as argument to PyErr_Format(), and have a unique format string? It would be faster but make the Python binary smaller and use less memory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PyArg_BadArgument() is has 4 parameters, but it is not generic enough yet. It formats an error message in the style of some error messages formatted in PyArg_ParseTuple() and PyArg_ParseTupleAndKeywords(), but even these functions has cases which cannot be covered by _PyArg_BadArgument() ("must be %d-item sequence", "must be sequence of length %d"), and there are many other forms of error messages for wrong type of argument in the rest of code. If _PyArg_BadArgument() be general enough and used in many other places outside of Argument Clinic generated code, it will bi worth to add it to the public API.

On other hand, PyErr_Format() is extremely general and flexible variant of _PyArg_BadArgument().

indent=4))
else:
parser_code = [normalize_snippet(f"""
if (!_PyArg_CheckPositional("{{name}}", {nargs}, {min_pos}, {max_args})) {{{{
Copy link
Member

@vstinner vstinner Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about moving _PyArg_CheckPositional() to the limited C API?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it can be done, together with _PyArg_NoKeywords(), _PyArg_NoPositional() and _PyArg_NoKwnames().

But the main problem is handling keyword arguments, so I want to add new API only when we see the whole image.

@@ -1264,27 +1267,8 @@ def parser_body(
parser_prototype = self.PARSER_PROTOTYPE_VARARGS
parser_definition = parser_body(parser_prototype, ' {option_group_parsing}')

elif (not requires_defining_class and pos_only == len(parameters) and
not pseudo_args and limited_capi):
# positional-only for the limited C API
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is no longer needed, because there is other PyArg_ParseTuple() below which is used if inlining the parsing code failed. PyArg_UnpackTuple() no longer used, because inlining is always successful in this simple case.

@@ -1401,24 +1426,10 @@ def parser_body(
nargs = f"Py_MIN(nargs, {max_pos})" if max_pos else "0"

if limited_capi:
# positional-or-keyword arguments
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was moved down, to a fallback code used if inlining failed. For now, it always failed for limited_capi, because there is no public variant of _PyArg_UnpackKeywords.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanatory review comments.

@erlend-aasland
Copy link
Contributor

I don't have the bandwith to review the generated C param parsing code. I guess Victor does a better job of reviewing that, than me. If Alex is fine with the clinic.py changes, you are good to go.

BTW, I really like the introduction of the fastcall variables; it makes the code more readable and understandable.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto to what Erlend said: I can't really review the generated C code, but I approve of the concept and the clinic.py changes look good to me

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The overall change LGTM. clinic.py coding style makes reviews really complicated :-( I prefer testing fastcall than new_or_init, that's a nice improvment.

Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work! if there are bugs lurking in here (it is hard to visually spot everything in this kind of large clinic.py change) i'm sure they'll show themselves in generated code behaviors that can be fixed up later.

@serhiy-storchaka serhiy-storchaka merged commit 1796c19 into python:main Sep 3, 2023
23 checks passed
@serhiy-storchaka serhiy-storchaka deleted the clinic-limited-capi-inline branch September 3, 2023 14:28
@bedevere-bot
Copy link

There's a new commit after the PR has been approved.

@gpshead, @vstinner, @erlend-aasland, @AlexWaygood: please review the changes made to this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants