bpo-1635741: Add a global module state to unicodedata #22712

Merged
merged 1 commit into from
Oct 15, 2020

vstinner
Member

@vstinner vstinner commented Oct 15, 2020

Prepare unicodedata to add a state per module: start with a global
"module" state, pass it to subfunctions which access &UCD_Type. This
change also prepares the conversion of the UCD_Type static type to a
heap type.

https://bugs.python.org/issue1635741
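The pattern described above, one global state instance that entry points fetch and hand down to subfunctions, can be sketched in plain C. This is a minimal illustration, not the code from the PR: `module_state`, `get_global_state`, `is_ucd_instance`, and `entry_point` are hypothetical names, and an integer id stands in for the `&UCD_Type` pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module state. In unicodedata it would hold what the
   subfunctions currently reach through the &UCD_Type global; here an
   integer id stands in for that type pointer. */
typedef struct {
    int ucd_type_id;
} module_state;

/* Step 1 of the migration: a single global instance of the state. */
static module_state global_module_state = { 42 };

/* The only function that knows the state is global. */
module_state *get_global_state(void)
{
    return &global_module_state;
}

/* Subfunction: receives the state as a parameter instead of reading
   a global directly. */
int is_ucd_instance(const module_state *state, int type_id)
{
    assert(state != NULL);
    return type_id == state->ucd_type_id;
}

/* Entry point: fetches the state once and passes it down. */
int entry_point(int type_id)
{
    module_state *state = get_global_state();
    return is_ucd_instance(state, type_id);
}
```

Under these assumptions, moving to a real per-module state later only requires changing how `entry_point` obtains `state`; the subfunctions stay as they are.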

@vstinner
Member Author

With this PR, I understood that my main concern comes from the PyCapsule API: unicodedata.ucnhash_CAPI.

_PyUnicode_DecodeUnicodeEscape() uses it like this:

ucnhash_CAPI->getcode(NULL, start, (int)namelen, &ch, 0)

No state is passed to _getcode() and so currently it can only access global variables.

I looked at how to create a C function that wraps _getcode() and automatically passes a state. Problem: the only portable way is a call like func(closure, ...), where the state must still be passed to the wrapper as an argument. There is a non-portable way to create a real closure in C, but it requires libffi, which sounds like a heavy solution.
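To make the portable func(closure, ...) approach concrete, here is a minimal sketch in plain C. All names (`lookup_state`, `lookup_closure`, `lookup_trampoline`, `call_lookup`) are hypothetical and unrelated to the CPython sources; the point is only that the caller must carry the state pointer alongside the function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state that the wrapped function needs. */
typedef struct {
    int base;
} lookup_state;

/* The real worker: it cannot run without a state. */
int lookup_with_state(lookup_state *state, int key)
{
    assert(state != NULL);
    return state->base + key;
}

/* Portable "closure": a function pointer paired with the data it
   captures. C has no way to hide the second member inside the first. */
typedef struct {
    int (*func)(void *closure, int key);
    void *closure;
} lookup_closure;

/* Trampoline: recovers the typed state from the opaque pointer. */
int lookup_trampoline(void *closure, int key)
{
    return lookup_with_state((lookup_state *)closure, key);
}

/* The caller still has to pass the closure explicitly on every call. */
int call_lookup(const lookup_closure *c, int key)
{
    assert(c != NULL && c->func != NULL);
    return c->func(c->closure, key);
}
```

A caller would build a `lookup_closure` from `lookup_trampoline` and a pointer to its `lookup_state`, then go through `call_lookup`. The state argument never disappears from the call, which is the limitation described above.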

Hopefully we don't need to go that far. _PyUnicode_Name_CAPI is private and excluded from the limited C API. We can move it to the internal C API and introduce an incompatible change, such as requiring callers to pass a state. For example, we can add a state to the _PyUnicode_Name_CAPI structure and require the caller to pass it:

ucnhash_CAPI->getcode(ucnhash_CAPI->state, NULL, start, (int)namelen, &ch, 0)
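A rough sketch of what such a structure could look like, written without Python.h so it compiles on its own. The type and function names (`name_state`, `name_capi`, `demo_getcode`) are hypothetical, the signature is simplified compared to the call shown above, and the real _PyUnicode_Name_CAPI layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-module state owned by the provider of the API. */
typedef struct {
    unsigned int space_code;
} name_state;

/* Hypothetical capsule-style structure: it exposes the state next to
   the function pointers, and callers must hand the state back. */
typedef struct {
    void *state;
    int (*getcode)(void *state, const char *name, int namelen,
                   unsigned int *code);
} name_capi;

/* Toy implementation that knows a single name. It reads only the
   state it was given and touches no global variable. */
int demo_getcode(void *state, const char *name, int namelen,
                 unsigned int *code)
{
    name_state *st = (name_state *)state;
    assert(st != NULL);
    if (namelen == 5 && memcmp(name, "SPACE", 5) == 0) {
        *code = st->space_code;
        return 1;
    }
    return 0;
}

/* What the provider would publish through the capsule. */
name_state demo_state = { 0x20 };
name_capi demo_capi = { &demo_state, demo_getcode };
```

A consumer would then call `demo_capi.getcode(demo_capi.state, name, namelen, &code)`, mirroring the shape of the proposed `ucnhash_CAPI->getcode(ucnhash_CAPI->state, ...)` call.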

@vstinner vstinner changed the title [WIP] bpo-1635741: Add a global module state to unicodedata bpo-1635741: Add a global module state to unicodedata Oct 15, 2020
@vstinner vstinner merged commit e6b8c52 into python:master Oct 15, 2020
@vstinner vstinner deleted the unicodedata_state branch October 15, 2020 14:22
xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021