Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash #59313

pxd · 2012-06-19T21:52:14Z

BPO	15108
Nosy	@arigo, @rhettinger, @jcea, @amauryfa, @pitrou, @vstinner, @corona10, @pablogsal, @erlend-aasland, @iritkatriel
PRs	bpo-15108: Prevent accessing the result tuple from Python in PySequence_Tuple #24510

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2021-02-22.09:15:10.010>
created_at = <Date 2012-06-19.21:52:13.870>
labels = ['type-bug', '3.8', '3.9', '3.10', 'extension-modules', 'invalid']
title = 'Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash'
updated_at = <Date 2021-02-22.09:15:10.009>
user = 'https://bugs.python.org/pxd'

bugs.python.org fields:

activity = <Date 2021-02-22.09:15:10.009>
actor = 'pablogsal'
assignee = 'none'
closed = True
closed_date = <Date 2021-02-22.09:15:10.010>
closer = 'pablogsal'
components = ['Extension Modules']
creation = <Date 2012-06-19.21:52:13.870>
creator = 'pxd'
dependencies = []
files = []
hgrepos = []
issue_num = 15108
keywords = ['patch']
message_count = 22.0
messages = ['163224', '163225', '163226', '164257', '164263', '164273', '164282', '164285', '228344', '228357', '386841', '387029', '387033', '387034', '387036', '387068', '387070', '387079', '387204', '387224', '387492', '387506']
nosy_count = 13.0
nosy_names = ['arigo', 'rhettinger', 'jcea', 'ghaering', 'amaury.forgeotdarc', 'pitrou', 'vstinner', 'lkraav', 'pxd', 'corona10', 'pablogsal', 'erlendaasland', 'iritkatriel']
pr_nums = ['24510']
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue15108'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

Linked PRs

GH-59313: Do not allow incomplete tuples, or tuples with NULLs. #117747

pxd · 2012-06-19T21:52:12Z

Hi,

Sporadically, while running an sqlite3 query, the above error is seen.
In order to debug, I modified Objects/tupleobject.c, PyTuple_SetItem() as follows:

	if (!PyTuple_Check(op) || op->ob_refcnt != 1) {
		/*
		 * Temp: Bug XYZ. Abort and generate a core so that we can debug,
		 * instead of the usual:
		 *
		 * PyErr_BadInternalCall();
		 * return -1;
		 */
		char errmsg[200];
		Py_XDECREF(newitem);
		/* ob_refcnt is a Py_ssize_t: cast it so that it matches %ld. */
		PyOS_snprintf(errmsg, sizeof(errmsg),
			"Bug XYZ: PyTuple_Check(op) = %d "
			"op->ob_refcnt = %ld; see core\n", PyTuple_Check(op),
			(long)op->ob_refcnt);
		Py_FatalError(errmsg);
	}

This generates a core with the following bt. Showing the top few frames only:
(gdb) bt
#0 0x0000000800acd3fc in thr_kill () at thr_kill.S:3
#1 0x0000000800b5e283 in abort () at /build/mnt/src/lib/libc/stdlib/abort.c:65
#2 0x0000000000494acf in Py_FatalError (msg=Variable "msg" is not available.
) at ./../Python/pythonrun.c:1646
#3 0x000000000044e740 in PyTuple_SetItem (op=0x80c6e6308, i=16, newitem=0x80a80d780) at ./../Objects/tupleobject.c:128
#4 0x0000000807298866 in _pysqlite_fetch_one_row (self=0x80b846e48) at _sqlite/cursor.c:402
#5 0x0000000807298bf5 in pysqlite_cursor_iternext (self=0x80b846e48) at _sqlite/cursor.c:898
#6 0x0000000000476943 in PyEval_EvalFrameEx (f=0x80a94d420, throwflag=Variable "throwflag" is not available.
) at ./../Python/ceval.c:2237
#7 0x000000000047acbf in PyEval_EvalFrameEx (f=0x80a94b820, throwflag=Variable "throwflag" is not available.
) at ./../Python/ceval.c:3765
#8 0x000000000047bf09 in PyEval_EvalCodeEx (co=0x808d2ac60, globals=Variable "globals" is not available.
) at ./../Python/ceval.c:2942
...
(gdb) fr 4
#4 0x0000000807298866 in _pysqlite_fetch_one_row (self=0x80b846e48) at _sqlite/cursor.c:402
402 PyTuple_SetItem(row, i, converted);
Current language: auto; currently c
(gdb) l
397 converted = buffer;
398 }

(gdb) p *(PyTupleObject *)row
$11 = {ob_refcnt = 2, ob_type = 0x60fee0, ob_size = 22, ob_item = {0x80a534030}}

'row' was allocated via PyTuple_New()
but, somehow its refcount has become 2 while setting the 16th item!!!

Is this a known issue? If not, what can I do to debug.

Thanks,
Pankaj

jcea · 2012-06-19T21:57:24Z

Could you possibly reproduce this in 2.7, 3.2 and/or "default" (future 3.3)?

Python 2.6 support is over.

pxd · 2012-06-19T22:11:03Z

Sorry, 2.7 and 3.2 are not an option currently, but I am hoping someone can provide enough info to help probe this more efficiently. There seem to be references to this issue on the web, but no root cause.

pxd · 2012-06-28T14:20:05Z

I believe I have found the root cause of this issue.

It occurs because another thread, "memMonitor", uses the garbage collector (we run it periodically to get stats on objects, track memory leaks, etc.). Since _pysqlite_fetch_one_row() releases the GIL before calling PyTuple_SetItem(), if memMonitor is scheduled to run and, say, calls gc.get_objects(), it increments the refcount on all tracked objects (via append_objects()->PyList_Append()->app1()->Py_INCREF()). I have stack traces to confirm. This seems to rule out the use of gc methods (such as get_objects() and get_referrers()/get_referents()) in multi-threaded programs, or to require that such programs handle the SystemError arising from such usage. Agree?
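The refcount effect described above can be observed from pure Python. This is only a minimal sketch, assuming a regular CPython build: the name `row` is a stand-in for the tuple being filled by the C code, not the actual sqlite3 object, and a list is used because lists are always tracked by the GC.

```python
import gc
import sys

row = []                        # stand-in for a GC-tracked container under construction
before = sys.getrefcount(row)

snapshot = gc.get_objects()     # a new list holding a strong reference to each tracked object
after = sys.getrefcount(row)

# The monitoring snapshot now owns an extra reference to 'row'.
print(any(obj is row for obj in snapshot))
print(after - before)

del snapshot                    # dropping the snapshot releases the extra references
restored = sys.getrefcount(row)
print(restored - before)
```

While `snapshot` is alive, a C-level check such as `op->ob_refcnt != 1` on the same object would fail, which matches the refcount of 2 seen in the core dump.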

amauryfa · 2012-06-28T15:13:30Z

Thanks for the analysis!
This is quite similar to bpo-793822: gc.get_referrers() can access unfinished tuples. The difference here is that what breaks is not the monitoring tool, but the "main" program!

Here is a simple script inspired by the original bug. PySequence_Tuple() uses PyTuple_SET_ITEM(), which is a macro without the ob_refcnt check, but we can make it call _PyTuple_Resize() and fail there. All versions of CPython are affected:

import gc
TAG = object()

def monitor():
    lst = [x for x in gc.get_referrers(TAG)
           if isinstance(x, tuple)]
    t = lst[0]   # this *is* the result tuple
    print(t)     # full of nulls !
    return t     # Keep it alive for some time

def my_iter():
    yield TAG    # 'tag' gets stored in the result tuple
    t = monitor()
    for x in range(10):
        yield x  # SystemError when the tuple needs to be resized

tuple(my_iter())

pitrou · 2012-06-28T16:43:49Z

I wonder why _pysqlite_fetch_one_row() releases the GIL around sqlite3_column_type(). By its name, it doesn't sound like an expensive function.

Another workaround would be to call PyTuple_SET_ITEM instead of PyTuple_SetItem.

pxd · 2012-06-28T17:36:10Z

Wondering the same thing myself, and yes, sqlite3_column_type() by itself doesn't seem expensive. I assumed in general it was to allow more responsiveness for apps with a huge number of columns (i.e. large tuple size). But we have about 20-25 columns, so I was going to try removing it and seeing the results. In any case, it seems fewer GIL acquires/releases will help with throughput.
Are there any guidelines on when the GIL should be released?

Re PyTuple_SET_ITEM...yes that's also a possibility but it would then hide genuine bugs.

pitrou · 2012-06-28T17:40:06Z

> Are there any guidelines on when GIL should be released?

The GIL should be released:

- for CPU-heavy external functions (e.g. compression, cryptography)
- for external functions which wait for I/O

> Re PyTuple_SET_ITEM...yes that's also a possibility but it would then
> hide genuine bugs.

Well, as long as your monitor only increfs the tuple and doesn't mutate
it, there shouldn't be any problem. We use PyTuple_SET_ITEM in many
other places.

lkraav · 2014-10-03T17:28:03Z

I may be seeing this running tracd-1.1.2b1 on python 2.7.7

http://trac.edgewall.org/ticket/11772

pitrou · 2014-10-03T17:52:13Z

What is tracd doing? This issue should only appear when calling one of the debugging functions in the gc module, AFAICT.

iritkatriel · 2021-02-11T23:12:46Z

Still happening in 3.10:

Python 3.10.0a5+ (heads/master:bf2e7e55d7, Feb 11 2021, 23:09:25) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import gc
>>> TAG = object()
>>>
>>> def monitor():
...     lst = [x for x in gc.get_referrers(TAG)
...            if isinstance(x, tuple)]
...     t = lst[0]   # this *is* the result tuple
...     print(t)     # full of nulls !
...     return t     # Keep it alive for some time
...
>>> def my_iter():
...     yield TAG    # 'tag' gets stored in the result tuple
...     t = monitor()
...     for x in range(10):
...         yield x  # SystemError when the tuple needs to be resized
...
>>> tuple(my_iter())
(<object object at 0x00000217225091B0>, <NULL>, <NULL>, <NULL>, <NULL>, <NULL>, <NULL>, <NULL>, <NULL>, <NULL>)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: C:\Users\User\src\cpython-dev\Objects\tupleobject.c:963: bad argument to internal function
>>>

vstinner · 2021-02-15T17:21:38Z

The general issue here is that PyTuple_New() is unsafe: it immediately tracks the newly created tuple in the GC, even though the tuple is not initialized yet. If the GIL is released before the tuple is fully populated and something accesses this tuple via the GC (e.g. gc.get_objects()), accessing the tuple can crash, especially in Python land (for example, repr(the_tuple) is likely to crash).

IMO the unsafe PyTuple_New() API should be avoided. For example, allocate an array of PyObject* on the stack memory, and then call _PyTuple_FromArray(). This API is safe because it only tracks the tuple once it's fully initialized, and it calls INCREF on items. Problem: this safe and efficient API is currently private.

There are other safe alternatives like Py_BuildValue("(OOO)", item1, item2, item3).

_pysqlite_fetch_one_row() calls PyTuple_New() and releases the GIL at each sqlite3_column_type() call, so yeah, it has this exact bug. By the way, it doesn't check for PyTuple_SetItem() failure, whereas it's currently possible that there is more than one strong reference to the tuple which is being populated (because of the GC issue).

PyTuple_New() is ok-ish if there is no way to trigger a GC collection and if the GIL cannot be released until the tuple is fully initialized.

Maybe we need a private _PyTuple_NewUntracked() API to create a tuple which is not tracked by the GC, and also a _PyTuple_ResizeUntracked() API. By the way, _PyTuple_Resize() sounds like nonsense, since a tuple is supposed to be immutable ;-)
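As a rough Python-level analogue of the "collect the items first, create the tuple afterwards" idea (the C version would fill a `PyObject *` array and create the tuple in one call), here is a minimal sketch. The names `build_row`, `raw_columns` and `convert` are hypothetical and are not taken from the sqlite3 module.

```python
def build_row(raw_columns, convert):
    """Build a row tuple without ever exposing a partially filled tuple."""
    items = []                         # a list is in a valid state after every append
    for value in raw_columns:
        items.append(convert(value))   # arbitrary code (or another thread) may run here
    return tuple(items)                # the tuple is created only from complete data


row = build_row(["1", "2", "3"], int)
print(row)  # (1, 2, 3)
```

The point of the design is that the only object visible to other code during the loop is a list, which never contains uninitialized slots; the tuple comes into existence once all items are known.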

pablogsal · 2021-02-15T17:58:07Z

> IMO the unsafe PyTuple_New() API should be avoided.

It's not that simple: there are other APIs that track the tuple, such as _PyTuple_Resize(). This problem is also not unique to tuples, although it is most prominent in them.

We have this warning in the docs:

> Care must be taken when using objects returned by get_referrers() because some of them could still be under construction and hence in a temporarily invalid state. Avoid using get_referrers() for any purpose other than debugging.

That's because *by design* these APIs can potentially access half-initialized objects.

I don't know if it is worth adding a new API just for tuples, given that this problem happens with many other objects.
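For monitoring code that still wants to use these gc functions, one cautious pattern is to look only at each object's type and to drop the snapshot promptly. This is a sketch under stated assumptions, not a fix: `count_tracked_types` is a hypothetical helper, and holding the snapshot, even briefly, still raises refcounts, so the race with C code that checks `ob_refcnt` is narrowed rather than removed.

```python
import gc
from collections import Counter


def count_tracked_types():
    """Summarize GC-tracked objects by type name without reading their contents."""
    snapshot = gc.get_objects()
    try:
        # type(obj) only looks at the object's type; nothing here calls repr(),
        # iterates, or indexes into a container that may be half-built.
        return Counter(type(obj).__name__ for obj in snapshot)
    finally:
        # Release the extra references as soon as the summary is computed.
        del snapshot


counts = count_tracked_types()
print(counts.most_common(5))
```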

pablogsal · 2021-02-15T18:00:14Z

> There are other safe alternatives like Py_BuildValue("(OOO)", item1, item2, item3).

That's a lot slower unfortunately

pablogsal · 2021-02-15T18:02:34Z

> If the GIL is released before the tuple is fully populated and something accesses this tuple via the GC (e.g. gc.get_objects()), accessing the tuple can crash, especially in Python land (for example, repr(the_tuple) is likely to crash).

It can happen even without releasing the GIL: a new tuple is created, then some other object is created using the C API, the GC runs, the callback triggers (or the tuplevisit method is invoked), and then kaboom.
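One way Python code can end up running during a collection is through `gc.callbacks`. The sketch below only illustrates that hook; whether this is the callback meant above is an assumption, and the explicit `gc.collect()` stands in for a collection triggered by allocation. The name `on_gc` is hypothetical.

```python
import gc

events = []


def on_gc(phase, info):
    # Invoked by the collector itself; 'phase' is "start" or "stop".
    events.append((phase, info["generation"]))


was_enabled = gc.isenabled()
gc.disable()                    # keep automatic collections out of the demo
gc.callbacks.append(on_gc)
try:
    gc.collect()                # stand-in for a collection triggered by allocation
finally:
    gc.callbacks.remove(on_gc)
    if was_enabled:
        gc.enable()

print([phase for phase, _ in events])
```

Any code reached from such a hook runs while other objects may still be under construction, with no thread switch involved.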

vstinner · 2021-02-15T22:40:58Z

> That's a lot slower unfortunately

Ah sorry, I forgot PyTuple_Pack(3, item1, item2, item3), which should be very efficient. This function is also safe: it only tracks the tuple once it is fully initialized.

> This problem is also not unique to tuples, although it is most prominent in them.

PyList_New() is also affected. Are you thinking of other types?

PyDict_New() and PySet_New() create empty containers and so are ok.

pablogsal · 2021-02-15T23:01:32Z

> PyList_New() is also affected. Do you think about other types?

Any C extension class that implements a new_whatever() method that leaves the class tracked and not ready.

vstinner · 2021-02-16T00:21:19Z

> Any C extension class that implements a new_whatever() method that leaves the class tracked and not ready.

I'm not aware of any such C extension, but they likely exist. If we have such extensions in the stdlib, we can try to fix them.

We cannot fix such GC bugs in third-party code, and I don't think that we can hack the GC to work around this issue either.

erlend-aasland · 2021-02-18T09:34:57Z

Should we still fix sqlite3, or wait for an agreement on python/issues-test-cpython#24510?

pablogsal · 2021-02-18T11:50:01Z

> Should we still fix sqlite3, or wait for an agreement on python/issues-test-cpython#24510?

I suggest we all first agree on how to fix this on the bigger scale.

rhettinger · 2021-02-22T03:33:01Z

Unless there is a simple, reliable, cheap, and universal solution at hand, consider closing this. Given how long PyTuple_New() has existed, it doesn't seem to be much of a problem in the real world.

Historically, we punted on "crashers" involving either gc.get_referrers or byte code hacks. Both of those reach inside Python's black box, allowing disruption of otherwise sensible invariants.

pablogsal · 2021-02-22T09:15:10Z

I mainly agree on closing this issue as we already have a warning about this behaviour in gc.get_referrers and other friends.

On the other hand, I am still a bit afraid of a crash that could happen if the GC does a pass while one of these tuples is in an inconsistent state. It is possible that this crashes Python without the user ever calling any function from the gc module. I need to do some investigation around this, but I agree that this is a separate issue, so I will open a new one if it happens to be relevant.

nascheme · 2023-08-14T23:42:33Z

I'm pretty sure the problem is not only with gc.get_referrers(). The GC expects that objects that are tracked are valid, at least in the sense that tp_traverse and tp_clear can be called on them. I suppose if they have finalizers, then the finalizers could run too.

> Any C extension class that implements a new_whatever() method that leaves the class tracked and not ready.

That sounds like a crash waiting to happen to me. You can get away with it if the GC happens to not run until the class becomes ready.

nascheme · 2023-08-15T16:51:28Z

In #107183, Guido points out that for tuples, they are pre-filled with NULLs and tp_traverse and tp_clear check for the NULLs. So, that would avoid crashes when the GC is running for that case.

pxd mannequin added the type-bug An unexpected behavior, bug, or error label Jun 19, 2012

jcea added build The build process and cross-build and removed build The build process and cross-build labels Jun 19, 2012

pitrou added the extension-modules C modules in the Modules dir label Jun 28, 2012

iritkatriel added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Feb 11, 2021

vstinner changed the title ~~ERROR: SystemError: ./../Objects/tupleobject.c:118: bad argument to internal function~~ Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash Feb 15, 2021

pablogsal closed this as completed Feb 22, 2021

pablogsal added the invalid label Feb 22, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022

vstinner mentioned this issue Aug 30, 2022

Add _PyTuple_New_Nonzeroed #96446

Closed

vstinner mentioned this issue Jul 3, 2023

Disallow creation of incomplete/inconsistent objects capi-workgroup/problems#56

Open

This was referenced Jul 23, 2023

gh-107137: Add _PyTupleBuilder API to the internal C API #107139

Closed

C API: Add internal C API to build a tuple: _PyTupleBuilder #107137

Closed

gh-107137: Add _PyTuple_NewNoTrack() internal C API #107183

Closed

bedevere-app bot mentioned this issue Apr 11, 2024

GH-59313: Do not allow incomplete tuples, or tuples with NULLs. #117747

Closed
