gh-111140: Adds PyLong_AsNativeBytes and PyLong_FromNative[Unsigned]Bytes functions #114886

Merged
merged 22 commits into from
Feb 12, 2024
Changes from 18 commits
65 changes: 65 additions & 0 deletions Doc/c-api/long.rst
@@ -113,6 +113,28 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
retrieved from the resulting value using :c:func:`PyLong_AsVoidPtr`.


.. c:function:: PyObject* PyLong_FromNativeBytes(const void* buffer, size_t n_bytes, int endianness)

Create a Python integer from the value contained in the first *n_bytes* of
*buffer*, interpreted as a two's-complement signed number.

Pass ``-1`` as *endianness* to use the native byte order that CPython was
compiled with, ``0`` for big endian, or ``1`` for little endian.

.. versionadded:: 3.13
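
As a rough Python-level analogue (an illustration only; it does not call the C function), `int.from_bytes` with `signed=True` applies the same two's-complement interpretation, and its byte order argument plays the role of *endianness*. The helper name `from_native_bytes` is hypothetical.

```python
import sys


def from_native_bytes(buffer: bytes, endianness: int = -1) -> int:
    """Hypothetical pure-Python model of PyLong_FromNativeBytes."""
    # -1 selects the byte order of the running interpreter, 0 big, 1 little.
    byteorder = {-1: sys.byteorder, 0: "big", 1: "little"}[endianness]
    return int.from_bytes(buffer, byteorder, signed=True)


big = from_native_bytes(b"\xff\xd6", 0)     # most significant byte first
little = from_native_bytes(b"\xd6\xff", 1)  # least significant byte first
empty = from_native_bytes(b"", 0)           # zero bytes read gives the zero int
```

Both byte orders decode to the same value here because the second buffer is the first one reversed.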


.. c:function:: PyObject* PyLong_FromUnsignedNativeBytes(const void* buffer, size_t n_bytes, int endianness)

Create a Python integer from the value contained in the first *n_bytes* of
*buffer*, interpreted as an unsigned number.

Pass ``-1`` as *endianness* to use the native byte order that CPython was
compiled with, ``0`` for big endian, or ``1`` for little endian.

.. versionadded:: 3.13
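
The difference between the signed and unsigned variants can be sketched in Python with `int.from_bytes`. This is an analogue of the two C functions, not a call into them; the input bytes are taken from the test table in this PR.

```python
raw = b"\xff\xff"

# Signed interpretation: the top bit is a sign bit (two's complement).
as_signed = int.from_bytes(raw, "big", signed=True)

# Unsigned interpretation: every bit contributes to the magnitude.
as_unsigned = int.from_bytes(raw, "big", signed=False)
```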


.. XXX alias PyLong_AS_LONG (for now)
.. c:function:: long PyLong_AsLong(PyObject *obj)

@@ -332,6 +354,49 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
Returns ``NULL`` on error. Use :c:func:`PyErr_Occurred` to disambiguate.


.. c:function:: int PyLong_AsNativeBytes(PyObject *pylong, void* buffer, size_t n_bytes, int endianness)

Copy the Python integer value to a native *buffer* of size *n_bytes*::

int value;
int bytes = PyLong_AsNativeBytes(v, &value, sizeof(value), -1);
if (bytes < 0) {
    // Error occurred
    return NULL;
}
else if ((size_t)bytes > sizeof(value)) {
    // Overflow occurred, but 'value' contains as much as could fit
}

Pass ``-1`` as *endianness* to use the native byte order that CPython was
compiled with, ``0`` for big endian, or ``1`` for little endian.

Returns ``-1`` with an exception raised if *pylong* cannot be interpreted as
an integer. Otherwise, returns the size of the buffer required to store the
value. If this is equal to or less than *n_bytes*, the entire value was
copied.

Unless an exception is raised, all *n_bytes* of the buffer will be written
with as much of the value as can fit. This allows the caller to ignore all
non-negative results, if the intent is to match the typical behavior of a
C-style downcast.
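
The "C-style downcast" behavior can be modeled in Python by masking off everything above the low *n_bytes* of the two's-complement value. This is a sketch of the described semantics under that assumption, not the actual implementation; `as_native_bytes_truncating` is a hypothetical name.

```python
def as_native_bytes_truncating(value: int, n_bytes: int,
                               byteorder: str = "big") -> bytes:
    """Hypothetical model: keep only the low n_bytes of the value."""
    # Masking a negative int in Python yields its two's-complement low bits.
    mask = (1 << (8 * n_bytes)) - 1
    return (value & mask).to_bytes(n_bytes, byteorder)


# The HRESULT case from the tests in this PR: E_FAIL as a negative number.
hresult = as_native_bytes_truncating(-2147467259, 4)

# A value too large for one byte keeps only its least significant byte.
low_byte = as_native_bytes_truncating(256 + 42, 1)
```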

Values are always copied as two's-complement, and sufficient size will be
requested for a sign bit. For example, this may cause a value that fits into
8 bytes when treated as unsigned to request 9 bytes, even though all eight
bytes were copied into the buffer.
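
The sign-bit requirement can be seen from Python with `int.to_bytes`, used here as an analogue rather than a call into the C API. The value `2**64 - 1` fills eight bytes as an unsigned number but needs a ninth byte once a sign bit is required.

```python
value = 2**64 - 1

# Fits exactly in eight bytes when no sign bit is needed.
unsigned8 = value.to_bytes(8, "big", signed=False)

# The signed encoding cannot fit in eight bytes: the top bit is already used.
try:
    value.to_bytes(8, "big", signed=True)
    fits_signed8 = True
except OverflowError:
    fits_signed8 = False

# Nine bytes leave room for the sign bit (a leading zero byte).
signed9 = value.to_bytes(9, "big", signed=True)
```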

Passing *n_bytes* of zero will always return the requested buffer size.

.. note::

When the value does not fit in the provided buffer, the requested size
returned from the function may be larger than necessary. Passing 0 to this
function is not an accurate way to determine the bit length of a value.
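
For comparison, an exact minimum size can be computed from `int.bit_length`. The sketch below is a hypothetical helper, not part of this PR. The tests in this PR expect the function to report `sizeof(Py_ssize_t)` for `0` and 9 for `-2**63`, whereas the exact minimums are 1 and 8, which shows why the returned size should not be used as a bit length.

```python
def min_signed_bytes(value: int) -> int:
    """Smallest two's-complement size in bytes, sign bit included."""
    # For negative numbers, ~value has the same magnitude bits to store.
    magnitude_bits = (value if value >= 0 else ~value).bit_length()
    # One extra bit for the sign, then round up to whole bytes.
    return (magnitude_bits + 1 + 7) // 8


sizes = {v: min_signed_bytes(v) for v in (0, 255, 2**63, -2**63, 2**256 - 1)}
```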

.. versionadded:: 3.13


.. c:function:: int PyUnstable_Long_IsCompact(const PyLongObject* op)

Return 1 if *op* is compact, 0 otherwise.
7 changes: 6 additions & 1 deletion Doc/whatsnew/3.13.rst
@@ -560,6 +560,7 @@ Tier 2 IR by Mark Shannon and Guido van Rossum.
Tier 2 optimizer by Ken Jin.)



Deprecated
==========

@@ -1490,6 +1491,11 @@ New Features
* Add :c:func:`Py_HashPointer` function to hash a pointer.
(Contributed by Victor Stinner in :gh:`111545`.)

* Add :c:func:`PyLong_AsNativeBytes`, :c:func:`PyLong_FromNativeBytes` and
:c:func:`PyLong_FromUnsignedNativeBytes` functions to simplify converting
between native integer types and Python :class:`int` objects.
(Contributed by Steve Dower in :gh:`111140`.)
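
The kind of conversion these functions target can be sketched from Python using `ctypes` to stand in for a native 64-bit variable. This is an illustration under that assumption and does not call the new C functions.

```python
import ctypes
import sys

# Native variable -> Python int: read the raw bytes in native byte order.
native = ctypes.c_int64(-42)
raw = bytes(native)
as_int = int.from_bytes(raw, sys.byteorder, signed=True)

# Python int -> native variable: write two's-complement bytes back.
encoded = (-42).to_bytes(ctypes.sizeof(ctypes.c_int64), sys.byteorder,
                         signed=True)
round_trip = ctypes.c_int64.from_buffer_copy(encoded)
```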


Porting to Python 3.13
----------------------
@@ -1549,7 +1555,6 @@ Porting to Python 3.13
platforms, the ``HAVE_STDDEF_H`` macro is only defined on Windows.
(Contributed by Victor Stinner in :gh:`108765`.)


Deprecated
----------

36 changes: 35 additions & 1 deletion Include/cpython/longobject.h
@@ -4,6 +4,40 @@

PyAPI_FUNC(PyObject*) PyLong_FromUnicodeObject(PyObject *u, int base);

/* PyLong_AsNativeBytes: Copy the integer value to a native variable.
buffer points to the first byte of the variable.
n_bytes is the number of bytes available in the buffer. Pass 0 to request
the required size for the value.
endianness is -1 for native endian, 0 for big endian or 1 for little.
Big endian mode will write the most significant byte into the address
directly referenced by buffer; little endian will write the least significant
byte into that address.

If an exception is raised, returns a negative value.
Otherwise, returns the number of bytes that are required to store the value.
To check that the full value is represented, ensure that the return value is
less than or equal to n_bytes.
All n_bytes are guaranteed to be written (unless an exception occurs), and
so ignoring a positive return value is the equivalent of a downcast in C.
In cases where the full value could not be represented, the returned value
may be larger than necessary - this function is not an accurate way to
calculate the bit length of an integer object.
*/
PyAPI_FUNC(int) PyLong_AsNativeBytes(PyObject* v, void* buffer, size_t n_bytes,
int endianness);
Member

Shouldn't this return size_t rather than int?

Member Author

Negative values are required, so we'd have to do Py_ssize_t and not size_t. The cast is just as annoying either way unless we make n_bytes also signed, which is then inconsistent with other APIs (but probably less bad than making it accept int).

We might need some agreed upon guidelines for choosing types for these kinds of purposes. int is a very common return type, and IMHO that makes things easier for people trying to call this from non-C languages, but they probably have no choice but to support Py_ssize_t as a return type too and so it really wouldn't make a huge difference.

I pity the poor CPU that has to convert a 32 billion bit number 🙃

Member

I really need to focus on not mixing up signed/unsigned when I write a comment...

Yes, Py_ssize_t is one of the types that users need to support.
I'd much prefer using Py_ssize_t for sizes. These are arbitrary-sized integers, after all; limited by available memory.


/* PyLong_FromNativeBytes: Create an int value from a native integer
n_bytes is the number of bytes to read from the buffer. Passing 0 will
always produce the zero int.
PyLong_FromUnsignedNativeBytes always produces a non-negative int.
endianness is -1 for native endian, 0 for big endian or 1 for little.

Returns the int object, or NULL with an exception set. */
PyAPI_FUNC(PyObject*) PyLong_FromNativeBytes(const void* buffer, size_t n_bytes,
int endianness);
PyAPI_FUNC(PyObject*) PyLong_FromUnsignedNativeBytes(const void* buffer,
size_t n_bytes, int endianness);

PyAPI_FUNC(int) PyUnstable_Long_IsCompact(const PyLongObject* op);
PyAPI_FUNC(Py_ssize_t) PyUnstable_Long_CompactValue(const PyLongObject* op);

@@ -50,7 +84,7 @@ PyAPI_FUNC(PyObject *) _PyLong_FromByteArray(
*/
PyAPI_FUNC(int) _PyLong_AsByteArray(PyLongObject* v,
unsigned char* bytes, size_t n,
- int little_endian, int is_signed);
+ int little_endian, int is_signed, int with_exceptions);

/* For use by the gcd function in mathmodule.c */
PyAPI_FUNC(PyObject *) _PyLong_GCD(PyObject *, PyObject *);
151 changes: 151 additions & 0 deletions Lib/test/test_capi/test_long.py
@@ -1,5 +1,6 @@
import unittest
import sys
import test.support as support

from test.support import import_helper

@@ -423,6 +424,156 @@ def test_long_asvoidptr(self):
        self.assertRaises(OverflowError, asvoidptr, -2**1000)
        # CRASHES asvoidptr(NULL)

    def test_long_asnativebytes(self):
        import math
        from _testcapi import (
            pylong_asnativebytes as asnativebytes,
            pylong_asnativebytes_too_big_n,
            SIZE_MAX
        )

        # Abbreviate sizeof(Py_ssize_t) to SZ because we use it a lot
        SZ = int(math.ceil(math.log(SIZE_MAX + 1) / math.log(2)) / 8)
        MAX_SSIZE = 2 ** (SZ * 8 - 1) - 1
        MAX_USIZE = 2 ** (SZ * 8) - 1
        if support.verbose:
            print(f"SIZEOF_SIZE={SZ}\n{MAX_SSIZE=:016X}\n{MAX_USIZE=:016X}")

        # These tests check that the requested buffer size is correct
        for v, expect in [
            (0, SZ),
            (512, SZ),
            (-512, SZ),
            (MAX_SSIZE, SZ),
            (MAX_USIZE, SZ + 1),
            (-MAX_SSIZE, SZ),
            (-MAX_USIZE, SZ + 1),
            (2**255-1, 32),
            (-(2**255-1), 32),
            (2**256-1, 33),
            (-(2**256-1), 33),
        ]:
            with self.subTest(f"sizeof-{v:X}"):
                buffer = bytearray(1)
                self.assertEqual(expect, asnativebytes(v, buffer, 0, -1),
                    "PyLong_AsNativeBytes(v, NULL, 0, -1)")
                # Also check via the __index__ path
                self.assertEqual(expect, asnativebytes(Index(v), buffer, 0, -1),
                    "PyLong_AsNativeBytes(Index(v), NULL, 0, -1)")

        # We request as many bytes as `expect_be` contains, and always check
        # the result (both big and little endian). We check the return value
        # independently, since the buffer should always be filled correctly even
        # if we need more bytes
        for v, expect_be, expect_n in [
            (0, b'\x00', 1),
            (0, b'\x00' * 2, 2),
            (0, b'\x00' * 8, min(8, SZ)),
            (1, b'\x01', 1),
            (1, b'\x00' * 10 + b'\x01', min(11, SZ)),
            (42, b'\x2a', 1),
            (42, b'\x00' * 10 + b'\x2a', min(11, SZ)),
            (-1, b'\xff', 1),
            (-1, b'\xff' * 10, min(11, SZ)),
            (-42, b'\xd6', 1),
            (-42, b'\xff' * 10 + b'\xd6', min(11, SZ)),
            # Extracts 255 into a single byte, but requests sizeof(Py_ssize_t)
            (255, b'\xff', SZ),
            (255, b'\x00\xff', 2),
            (256, b'\x01\x00', 2),
            # Extracts successfully (unsigned), but requests 9 bytes
            (2**63, b'\x80' + b'\x00' * 7, 9),
            # "Extracts", but requests 9 bytes
            (-2**63, b'\x80' + b'\x00' * 7, 9),
            (2**63, b'\x00\x80' + b'\x00' * 7, 9),
            (-2**63, b'\xff\x80' + b'\x00' * 7, 9),

            (2**255-1, b'\x7f' + b'\xff' * 31, 32),
            (-(2**255-1), b'\x80' + b'\x00' * 30 + b'\x01', 32),
            # Request extra bytes, but result says we only needed 32
            (-(2**255-1), b'\xff\x80' + b'\x00' * 30 + b'\x01', 32),
            (-(2**255-1), b'\xff\xff\x80' + b'\x00' * 30 + b'\x01', 32),

            # Extracting 256 bits of integer will request 33 bytes, but still
            # copy as many bits as possible into the buffer. So we *can* copy
            # into a 32-byte buffer, though negative number may be unrecoverable
            (2**256-1, b'\xff' * 32, 33),
            (2**256-1, b'\x00' + b'\xff' * 32, 33),
            (-(2**256-1), b'\x00' * 31 + b'\x01', 33),
            (-(2**256-1), b'\xff' + b'\x00' * 31 + b'\x01', 33),
            (-(2**256-1), b'\xff\xff' + b'\x00' * 31 + b'\x01', 33),

            # The classic "Windows HRESULT as negative number" case
            #   HRESULT hr;
            #   PyLong_AsNativeBytes(<-2147467259>, &hr, sizeof(HRESULT), -1)
            #   assert(hr == E_FAIL)
            (-2147467259, b'\x80\x00\x40\x05', 4),
        ]:
            with self.subTest(f"{v:X}-{len(expect_be)}bytes"):
                n = len(expect_be)
                buffer = bytearray(n)
                expect_le = expect_be[::-1]

                self.assertEqual(expect_n, asnativebytes(v, buffer, n, 0),
                    f"PyLong_AsNativeBytes(v, buffer, {n}, <big>)")
                self.assertEqual(expect_be, buffer[:n], "<big>")
                self.assertEqual(expect_n, asnativebytes(v, buffer, n, 1),
                    f"PyLong_AsNativeBytes(v, buffer, {n}, <little>)")
                self.assertEqual(expect_le, buffer[:n], "<little>")

        # Check a few error conditions. These are validated in code, but are
        # unspecified in docs, so if we make changes to the implementation, it's
        # fine to just update these tests rather than preserve the behaviour.
        with self.assertRaises(SystemError):
            asnativebytes(1, buffer, 0, 2)
        with self.assertRaises(TypeError):
            asnativebytes('not a number', buffer, 0, -1)

        # We pass any number we like, but the function will pass an n_bytes
        # that is too big to make sure we fail
        with self.assertRaises(SystemError):
            pylong_asnativebytes_too_big_n(100)

    def test_long_fromnativebytes(self):
        import math
        from _testcapi import (
            pylong_fromnativebytes as fromnativebytes,
            SIZE_MAX,
        )

        # Abbreviate sizeof(Py_ssize_t) to SZ because we use it a lot
        SZ = int(math.ceil(math.log(SIZE_MAX + 1) / math.log(2)) / 8)
        MAX_SSIZE = 2 ** (SZ * 8 - 1) - 1
        MAX_USIZE = 2 ** (SZ * 8) - 1

        for v_be, expect_s, expect_u in [
            (b'\x00', 0, 0),
            (b'\x01', 1, 1),
            (b'\xff', -1, 255),
            (b'\x00\xff', 255, 255),
            (b'\xff\xff', -1, 65535),
        ]:
            with self.subTest(f"{expect_s}-{expect_u:X}-{len(v_be)}bytes"):
                n = len(v_be)
                v_le = v_be[::-1]

                self.assertEqual(expect_s, fromnativebytes(v_be, n, 0, 1),
                    f"PyLong_FromNativeBytes(buffer, {n}, <big>)")
                self.assertEqual(expect_s, fromnativebytes(v_le, n, 1, 1),
                    f"PyLong_FromNativeBytes(buffer, {n}, <little>)")
                self.assertEqual(expect_u, fromnativebytes(v_be, n, 0, 0),
                    f"PyLong_FromUnsignedNativeBytes(buffer, {n}, <big>)")
                self.assertEqual(expect_u, fromnativebytes(v_le, n, 1, 0),
                    f"PyLong_FromUnsignedNativeBytes(buffer, {n}, <little>)")

                # Check native endian when the result would be the same either
                # way and we can test it.
                if v_be == v_le:
                    self.assertEqual(expect_s, fromnativebytes(v_be, n, -1, 1),
                        f"PyLong_FromNativeBytes(buffer, {n}, <native>)")
                    self.assertEqual(expect_u, fromnativebytes(v_be, n, -1, 0),
                        f"PyLong_FromUnsignedNativeBytes(buffer, {n}, <native>)")


if __name__ == "__main__":
    unittest.main()
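
The `SZ` value used throughout these tests is the pointer-sized integer width in bytes. The following sketch derives it with integer arithmetic instead of floating-point logarithms. It assumes `SIZE_MAX == 2 * sys.maxsize + 1`, which holds on common platforms but is not stated in this PR.

```python
import sys

# Assumption: size_t is the unsigned counterpart of Py_ssize_t.
SIZE_MAX = 2 * sys.maxsize + 1

# SIZE_MAX is all ones, so its bit length is exactly the width of size_t.
SZ = SIZE_MAX.bit_length() // 8

# Same derived limits as in the tests above.
MAX_SSIZE = 2 ** (SZ * 8 - 1) - 1
MAX_USIZE = 2 ** (SZ * 8) - 1
```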
@@ -0,0 +1,2 @@
Adds :c:func:`PyLong_AsNativeBytes`, :c:func:`PyLong_FromNativeBytes` and
:c:func:`PyLong_FromUnsignedNativeBytes` functions.
2 changes: 1 addition & 1 deletion Modules/_io/textio.c
@@ -2393,7 +2393,7 @@ textiowrapper_parse_cookie(cookie_type *cookie, PyObject *cookieObj)
return -1;

if (_PyLong_AsByteArray(cookieLong, buffer, sizeof(buffer),
- PY_LITTLE_ENDIAN, 0) < 0) {
+ PY_LITTLE_ENDIAN, 0, 1) < 0) {
Py_DECREF(cookieLong);
return -1;
}
3 changes: 2 additions & 1 deletion Modules/_pickle.c
@@ -2162,7 +2162,8 @@ save_long(PicklerObject *self, PyObject *obj)
pdata = (unsigned char *)PyBytes_AS_STRING(repr);
i = _PyLong_AsByteArray((PyLongObject *)obj,
pdata, nbytes,
- 1 /* little endian */ , 1 /* signed */ );
+ 1 /* little endian */ , 1 /* signed */ ,
+ 1 /* with exceptions */);
if (i < 0)
goto error;
/* If the int is negative, this may be a byte more than
3 changes: 2 additions & 1 deletion Modules/_randommodule.c
@@ -342,7 +342,8 @@ random_seed(RandomObject *self, PyObject *arg)
res = _PyLong_AsByteArray((PyLongObject *)n,
(unsigned char *)key, keyused * 4,
PY_LITTLE_ENDIAN,
- 0); /* unsigned */
+ 0, /* unsigned */
+ 1); /* with exceptions */
if (res == -1) {
goto Done;
}
2 changes: 1 addition & 1 deletion Modules/_sqlite/util.c
@@ -162,7 +162,7 @@ _pysqlite_long_as_int64(PyObject * py_val)
sqlite_int64 int64val;
if (_PyLong_AsByteArray((PyLongObject *)py_val,
(unsigned char *)&int64val, sizeof(int64val),
- IS_LITTLE_ENDIAN, 1 /* signed */) >= 0) {
+ IS_LITTLE_ENDIAN, 1 /* signed */, 0) >= 0) {
return int64val;
}
}