
Should we do anything about timeouts for subparts of an operation? #686

Closed
alexshpilkin opened this issue Sep 27, 2018 · 8 comments

@alexshpilkin

@njsmith has written rather a lot on timeouts and why they should be provided generically by the asynchronous processing library, instead of being passed as an additional parameter to individual calls.

Imagine my dismay when I saw not one, but two separate timeout arguments in @theelous3’s asks HTTP client library (the only one currently available with trio support). And the accompanying discussion, frankly speaking, felt extremely underwhelming from the point of view of @njsmith’s argument (“what thing is to be accomplished” as opposed to “what sequence of things is going to be performed”).

... And then I realized there was, in my particular context, no obvious way to do better. There is, at any given moment in my application, a pool of 10 to 100 HTTP connections managed by asks, and thousands of tasks trying to make an HTTP request. If I enclose the request/get/... call in trio's move_on_after, the thing that gets time-limited is not the server's response time, but that time plus the time to acquire a connection from the pool. Which might be what's needed in some situations, but not in mine.

So, I guess my question is to both @njsmith and @theelous3: how can trio's timeout API be usefully applied (or extended to apply) in the situation asks faces?

@alexshpilkin (Author)

As usual, a possible answer came to me while I was writing down the question in detail—not sure it is the answer, though:

Be honest about what's happening and separate getting a connection from the pool from making the actual request, so that a timeout can be imposed on either one separately or on both at once. I'm not sure how to prevent user code from grabbing a connection and monopolizing it, but I'm also not sure this needs to be prevented. This requires departing from a requests-like API, but... is that necessarily a problem when there is no de facto standard library for asynchronous HTTP (the way requests is for synchronous HTTP)?

P.S.: That reminds me: this, of course, is equally relevant to the design of requests3, so maybe @kennethreitz has already thought of something.
P.P.S.: I also forgot to mention in the original post that this also appears to apply equally to anything that involves thread pools, which are currently in the design phase in trio IIUC.

theelous3 commented Sep 27, 2018

To further your dismay, there are actually three. One is applied to each iteration of a stream's body.

We could do something nuts like expose a series of steps around which the user of asks can implement their own timeouts with their async lib of choice, but to what gain? Is it worth it?

One option is breaking it into an optional second API that treats creating connections, getting them from the pool, sending the request, reading the headers and status line, reading the body, etc. as separate steps.

Treating those steps individually, we can get this "trio timeout" ethos into asks, or any other lib.

Fundamentally we write libs to simplify. I don't think it's a great idea to remove the current style of API, but I'm also not against exposing a second, more granular API for greater user control.

smurfix (Contributor) commented Sep 27, 2018 via email

@alexshpilkin (Author)

@theelous3 Well, if the question is why I don't like having a timeout argument in asks, the answer is explained at length in my first link above (including the specific case of the original requests library). The short version is that the composability sucks: it becomes extremely tedious to say "I want such-and-such done in five seconds" when such-and-such involves async operations beyond a single asks call, and the common result is that nobody bothers to do so (and ultimately software that appears to hang). Your example of a one-second network operation taking more like one and a half seconds, in particular, sounds like a bug to be fixed in the underlying network API rather than harsh reality to be dealt with. (Let's assume we're not CPU-bound and keep the OS's preemptive scheduling out of the picture.)

If you find it curious that I'm speaking about bunching things together while suggesting that the asks request call be split, well, so do I; but if we (as trio users) have settled on a design principle (here: don't roll your own timeouts), we'd better apply it across the board, and my point is that we can't, really, given the current asks API.

@alexshpilkin (Author)

@smurfix Well, yeah, but this appears to be a separate point: reserving a connection explicitly, as opposed to separating the time to reserve from the time to make a request; I was specifically thinking of any way to do the latter without doing the former. The rabbit hole of a general API for reserving connections is just too deep: I’ll then want to reserve five connections out of the common pool and put them into a smaller pool to use on my own, or pool connections to separate domains separately (as web browsers commonly do), &c. I’d rather not try to solve that without an actual use case.

njsmith (Member) commented Sep 28, 2018

I totally get why asks has the connection limit (and aiohttp does something similar), but over time I'm coming to believe that this is just the wrong level to do connection limiting. If you set the asks connection pool to unlimited, and then use your own CapacityLimiter (which could be per-host or whatever), or some more generic tool to control concurrency level, then that part of the problem goes away.

I'm a bit less dogmatic about the "read timeout" than I used to be... it's absurd that there's no good way for requests to provide an overall timeout, but even once we solve that, the "read timeout" is still somewhat useful: there are lots of cases where it will successfully catch things, and you don't have to think about the size of the file being downloaded when you set it.

oremanj changed the title from "There are timeouts and then there are timeouts" to "Should we do anything about timeouts for subparts of an operation?" May 1, 2019
oremanj (Member) commented May 1, 2019

Renamed this to reflect the open question. I think the answer is "no" -- individual APIs of outside libraries can take a timeout argument if there's some subpart they want to have a separate timeout on; I can't think of a way to make subpart timeouts "better" analogously to how Trio's cancel scopes make entire-operation timeouts "better". But leaving open for now in case anyone has a different opinion.

Zac-HD (Member) commented Mar 17, 2023

Closing because we haven't seen discussion of this - but if you've found this with an idea please do post it below!

Zac-HD closed this as completed Mar 17, 2023