Arbitrary kevent support #578

Open
sorcio opened this issue Jul 29, 2018 · 1 comment


@sorcio
Contributor

sorcio commented Jul 29, 2018

This is part of the long-term plan to implement subprocess support as outlined in #4 (comment) (see "Step 3", copied/pasted below for convenience).


On MacOS and the BSDs, there's this neat thing called "kqueue", which lets you efficiently wait on all kinds of different events in an async-event-loop-friendly way. Currently Trio just uses this to wait for file descriptors to become readable/writable, but in general kqueue is much more flexible. For example, the EVFILT_PROC event type lets you wait for a subprocess to exit.

trio/_core/_io_kqueue.py should provide the ability to wait on arbitrary event types. (Then we'll use that in our subprocess support to wait for EVFILT_PROC events.)

There's already a stub implementation of this, in the monitor_kevent method. This isn't tested or used currently though, so if we're going to start using it we'd want to make sure it actually works. (And possibly change how it works, if there's some other semantics that make it easier to use.)
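For reference, here is a minimal sketch of what waiting on EVFILT_PROC looks like with only the standard library's select module, outside of Trio. The function name wait_for_exit_with_kqueue and its timeout parameter are made up for illustration; this is a blocking call, not how _io_kqueue.py would integrate it with the event loop.

```python
import select


def wait_for_exit_with_kqueue(pid, timeout=5.0):
    """Block until process `pid` exits, using EVFILT_PROC / NOTE_EXIT.

    Returns the kevent that was delivered, or None on timeout, if the
    process is already gone, or if this platform has no kqueue.
    (Hypothetical helper, for illustration only.)
    """
    if not hasattr(select, "kqueue"):
        return None  # kqueue only exists on macOS and the BSDs

    kq = select.kqueue()
    try:
        # Register a one-shot filter that fires when the process exits.
        change = select.kevent(
            pid,
            filter=select.KQ_FILTER_PROC,
            flags=select.KQ_EV_ADD | select.KQ_EV_ONESHOT,
            fflags=select.KQ_NOTE_EXIT,
        )
        try:
            # Submit the change and wait for at most one event.
            events = kq.control([change], 1, timeout)
        except ProcessLookupError:
            # Registration can fail with ESRCH if the process no longer exists.
            return None
        return events[0] if events else None
    finally:
        kq.close()
```

Calling this with the pid of a freshly spawned child returns once the child exits; the caller still has to reap the child afterwards (for example with Popen.wait()).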

@njsmith
Member

njsmith commented Jul 30, 2018

Okay, I think I've swapped this back into my brain now, and looking at #579...

First, the thing where monitor_kevent didn't actually register the kevent was intentional – if you look at the comment above current_kqueue, it has some discussion of this. The thing is that the semantics for which kevents you can get when are actually super complicated – like you can request kevent X and then get kevent Y, or you can pass the kqueue to some other syscall entirely and then kevent Z magically appears. So my plan was for us to kind of throw up our hands and tell people who want to use the low-level API: OK, here's your raw kqueue, and here's a primitive to listen for kevents. You register for what you want and then listen for what you want, and please be very careful not to screw things up, because that will break everything – but really, it's Python, so we can't stop you.

But looking at it again, I'm not sure this is a great plan. There are two cases I know of where kevent filters can magically get registered without using EV_ADD. These are:

  • AIO: you can implicitly register a kevent filter by passing a kqueue and some flags to aio_read, etc. It's not supported by MacOS, though, so we can't test it. Also, using AIO from Python would take a lot of work (none of the relevant syscalls are wrapped, and you lose the entire Python I/O stack)... not super interesting or urgent. The API I described above can handle this.

  • EVFILT_PROC with NOTE_TRACK: this allows you to not just monitor a process, but also request that if the process creates any child processes, then you want to automatically (and atomically!) register filters for them as well. This isn't supported by MacOS either. It's pretty useful in theory – I can easily imagine a FreeBSD shop wanting to use this (so... basically just Netflix I guess). But the API I described above can't handle this. For NOTE_TRACK, as soon as you see a notification that a child has been created, you have to immediately – like, inside handle_io – add an entry to self._registrations to start directing the child's events to the appropriate place. So this needs a special case to work, and once we have the special case, it no longer needs the API I described above.

So, never mind all that. Let's drop current_kqueue. Now, what should a basic "add a filter and listen to it and remove it" API look like?

I think we probably want two public functions: one that listens for a single ONESHOT event, and one that listens for events on an ongoing basis. The first can probably be a simple wrapper around the second (that listens until it hears one event and then returns). For the latter... that's basically what #579 makes monitor_kevent (though we might want a way to pass through EV_ONESHOT? I'm not sure if any of the other flags are useful). So that's pretty reasonable. And wait_kevent could become the EV_ONESHOT method.

One more wrinkle: how monitor_kevent reports events. Right now it uses a trio.hazmat.UnboundedQueue. Which is workable, and if you want to leave it as that for now then I don't blame you and we can always revisit it later :-). But UnboundedQueue is a holdover from prehistoric trio and it's a bit odd – this (and an equivalent stub method in _io_windows.py) is literally its only remaining usage, and it's not actually terribly well suited for this. It provides irrelevant functionality, and also is limited in that we can't mark it as closed when the kevent is unregistered (see the NotImplementedError in notify_fd_close for an example of this being an issue).

So I'm wondering if we should replace the UnboundedQueue with a more specialized object, something like:

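# Sketch for trio/_core/_io_kqueue.py: _core, ResourceBusyError and
# ClosedResourceError are assumed to be in scope there.
import attr

@attr.s
class KqueueMonitor:
    data = attr.ib(default=attr.Factory(list))
    closed = attr.ib(default=False)
    waiter = attr.ib(default=None)

    def _append(self, event):
        self.data.append(event)
        if self.waiter is not None:
            # Clear the waiter first so a second event arriving before the
            # task runs doesn't reschedule the same task twice.
            waiter, self.waiter = self.waiter, None
            _core.reschedule(waiter)

    def _close(self):
        self.closed = True
        if self.waiter is not None:
            waiter, self.waiter = self.waiter, None
            _core.reschedule(waiter)

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self.data and not self.closed:
            # Only one task may wait on a monitor at a time.
            if self.waiter is not None:
                raise ResourceBusyError
            task = _core.current_task()
            self.waiter = task
            def abort_fn(_):
                self.waiter = None
                return _core.Abort.SUCCEEDED
            await _core.wait_task_rescheduled(abort_fn)
        if self.data:
            # Hand over the whole batch of pending events and start a new one.
            data = self.data
            self.data = []
            return data
        assert self.closed
        raise ClosedResourceError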
