
TrioInternalError yielding out of an async generator that has opened a nursery #1443

Closed
Contextualist opened this issue Apr 2, 2020 · 2 comments


Contextualist commented Apr 2, 2020

Python 3.8.1
Trio 0.13.0

What I have seen:

Exception ignored in: <async_generator object GrainRemote._loop.<locals>.heartbeat_n_receive at 0x2b177773a9d0>
Traceback (most recent call last):
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/grain/head.py", line 55, in _loop
    await rq.send((tid>0, r))
RuntimeError: async generator ignored GeneratorExit
Traceback (most recent call last):
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1790, in run
    run_impl(runner, async_fn, args)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1940, in run_impl
    runner.task_exited(task, final_outcome)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1425, in task_exited
    task._parent_nursery._child_finished(task, outcome)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 825, in _child_finished 
    self._check_nursery_closed()
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 819, in _check_nursery_closed
    GLOBAL_RUN_CONTEXT.runner.reschedule(self._parent_task)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1229, in reschedule
    assert task._next_send_fn is None
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "frag_range.py", line 133, in <module>
    run_combine(main,
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/grain/combine.py", line 161, in run_combine
    trio.run(boot_combine, subtasks, args, kwargs)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1794, in run
    raise TrioInternalError(
trio.TrioInternalError: internal error in Trio - please file a bug!

Relevant code:

    async def _loop(self, task_status=trio.TASK_STATUS_IGNORED):
        async def heartbeat_s(c):
            while True:
                await c.try_send(b"HBT")
                await trio.sleep(HEARTBEAT_INTERVAL)
        async def heartbeat_n_receive(c):
            async with trio.open_nursery() as _n:
                _n.start_soon(heartbeat_s, c)
                while True:
                    with trio.move_on_after(HEARTBEAT_INTERVAL*HEARTBEAT_TOLERANCE):
                        async for x in c:
                            if x == b"HBT": break
                            yield x
                        else: break
                        continue
                    print(f"remote {self.name} heartbeat response timeout")
                    break
                _n.cancel_scope.cancel()
        async with self._c:
            task_status.started()
            async for x in heartbeat_n_receive(self._c):
                tid, r = pickle.loads(x)
                rq = self.resultq.get(abs(tid))
                if rq:
                    await rq.send((tid>0, r)) # <<< line 55 of grain/head.py
                    # rq is the send end of a memory channel with a buffer size of 0
                else:
                    log_event("late_response")
                    print(f"remote {self.name} received phantom job {abs(tid)}'s result")

This had never happened before, but I have seen it twice today. Other possibly relevant information: the file system and/or network was slow at the time, and I am seeing lots of "heartbeat response timeout" messages in my log (see the relevant code above). It happens more frequently later on. I now suspect that some hidden weak point in my code is getting overwhelmed.

Sorry for dumping lots of shallow information. I am not sure what else I should report. Please rename the title to something more precise.

@oremanj changed the title from "TrioInternalError raised by Runner.reschedule's assertion" to "TrioInternalError yielding out of an async generator that has opened a nursery" on May 19, 2020
Member

oremanj commented May 19, 2020

Sorry for the delay in responding to this.

It looks like you're trying to yield out of an async generator that has opened a nursery. This isn't supported; see #264 and #638 for more discussion on why. It's unlikely to ever be supported, but we do want to make it blow up less horribly than what you experienced. The full fix probably requires a change to the Python interpreter, but it's likely that we can detect certain common cases and give a more informative error. (We already do this for cancel scopes, which have basically the same set of problems.)

I'm going to leave this issue open in case someone has the cycles to figure out why this particular case gave a nasty error instead of a useful one. If you're able to give more information about the crashes you're running into, or an even semi-reliable independent reproducer, that would help, but I understand if you can't.

@Contextualist
Author

@oremanj Thanks for the explanation and the linked issues. I now understand what to fix when rewriting my code. I have not seen this error recur since the day I reported it. Based on the information you provided, I suspect that the error occurs because the cancel scope of the nursery inside the async generator is cancelled while execution is outside the async generator. [1]

I tried to produce a minimal reproducer based on my understanding. The best I could get produces two different tracebacks at random (function names are retained for comparison):

import trio

async def _loop():
    async def heartbeat_s():
        await trio.sleep_forever()
    async def heartbeat_n_receive():
        async with trio.open_nursery() as nursery:
            nursery.start_soon(heartbeat_s)
            with trio.move_on_after(0):
                yield
    rq, _ = trio.open_memory_channel(0)
    async with trio.open_nursery() as _:
        async for x in heartbeat_n_receive():
            await rq.send(x)

trio.run(_loop)

(Note that these were produced with Python 3.8.1, Trio 0.13.0)

Traceback 1 (seen in original post, with additional gc errors):
Exception ignored in: <async_generator object _loop.<locals>.heartbeat_n_receive at 0x2ade7e8cfaf0>
Traceback (most recent call last):
  File "test.py", line 14, in _loop
    await rq.send(x)
RuntimeError: async generator ignored GeneratorExit
Traceback (most recent call last):
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1790, in run
    run_impl(runner, async_fn, args)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1940, in run_impl
    runner.task_exited(task, final_outcome)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1425, in task_exited
    task._parent_nursery._child_finished(task, outcome)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 825, in _child_finished
    self._check_nursery_closed()
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 819, in _check_nursery_closed
    GLOBAL_RUN_CONTEXT.runner.reschedule(self._parent_task)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1229, in reschedule
    assert task._next_send_fn is None
AssertionError
    
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    trio.run(_loop)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1794, in run
    raise TrioInternalError(
trio.TrioInternalError: internal error in Trio - please file a bug!
Exception ignored in: <function Nursery.__del__ at 0x2ade7bf62280>
Traceback (most recent call last):
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 968, in __del__ 
AssertionError: 
Exception ignored in: <coroutine object _loop at 0x2ade7c788340>
Traceback (most recent call last):
  File "test.py", line 14, in _loop
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 717, in __aexit__
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 854, in _nested_child_finished
AssertionError:

Traceback 2:
Exception ignored in: <async_generator object _loop.<locals>.heartbeat_n_receive at 0x2b69a3f36af0>
Traceback (most recent call last):
  File "test.py", line 14, in _loop
    await rq.send(x)
RuntimeError: async generator ignored GeneratorExit
Traceback (most recent call last):
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1790, in run
    run_impl(runner, async_fn, args)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1940, in run_impl
    runner.task_exited(task, final_outcome)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1410, in task_exited
    self.tasks.remove(task)
KeyError: <Task '__main__._loop' at 0x2b69a3f35fd0>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    trio.run(_loop)
  File "$HOME/.pyenv/versions/3.8.1/lib/python3.8/site-packages/trio/_core/_run.py", line 1794, in run
    raise TrioInternalError(
trio.TrioInternalError: internal error in Trio - please file a bug!

I hope these help with future testing! I will follow the updates in the related issues.


[1] In particular, await rq.send((tid>0, r)) (line 55 of grain/head.py) takes too long. This is a rare situation because my rq is designed to have no back pressure (the receiving end is guaranteed to be waiting before the sending end calls send). So I still don't fully understand what really happened on that wild day.

@Zac-HD closed this as not planned on May 18, 2024