-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
connect() & timeout #20
Comments
Hanging is not good :) we should add this to our regression tests and fix it. What does the JVM do here? |
Interesting question. In the specific ping-pong case here the error never triggers because In the general case: In the connect(where, timeout), the timeout should happen. Directly and not What happens without timeout specified is that there can be a long, UDP sockets are not usually connected, so I have no idea there. The case of interest is TCP. I will skip over recent efforts to make The local sends a connection message to the remote. If If the local tcp gets an acknowledgement back, it sends The TL;DR is that if the wire breaks after a certain The other issue about whatever connection error not |
Sorry, where do you see this? I'm staring hard at the JavaDoc for |
Ah, I see "Channels" original and still useful concept is: I do not know channels well enough if they themselves even The |
@LeeTibbert do you have a reproduction for this issue as well? I am having trouble replicating, because left to my own devices I haven't got a clue 😀 |
The old "works fine on my machine" excuse, eh? The one line reproducer you used earlier works on my machine.
You should see about a two minute hang, the, given your earlier PRs, the client |
Oh, gosh, it's the same reproducer! Ok will give that a try thanks. |
Huh, I think this is a case of works fine on my machine! 😛 here's the change I made to diff --git a/core/src/main/scala/epollcat/internal/ch/EpollAsyncSocketChannel.scala b/core/src/main/scala/epollcat/internal/ch/EpollAsyncSocketChannel.scala
index 4397375..46eb7df 100644
--- a/core/src/main/scala/epollcat/internal/ch/EpollAsyncSocketChannel.scala
+++ b/core/src/main/scala/epollcat/internal/ch/EpollAsyncSocketChannel.scala
@@ -191,7 +191,8 @@ final class EpollAsyncSocketChannel private (fd: Int)
.netdb
.getaddrinfo(
toCString(addr.getAddress().getHostAddress()),
- toCString(addr.getPort.toString),
+ // toCString(addr.getPort.toString),
+ c"8080",
hints,
addrinfo
) Which immediately gets me this result (notice the 0s at the bottom), no hanging.
|
Ok, I was able to replicate on both JVM and Native. I modified the test("HTTP echo".only) {
val address = new InetSocketAddress(InetAddress.getByName("postman-echo.com"), 8080) I also increased the timeout for the test runner (default is 30s, only works for JVM). override val munitTimeout = 1.hour Which gets me this result:
There are your two minutes, give or take 😉 So if I am understanding correctly, I believe we are matching JVM behavior here. In which case, what to do? If this is a nuisance for our test suite, I can investigate how to make the timeouts work properly. |
Now that is a cunning way to replicate. Matching the JVM behavior is good. POSIX says " unspecified timeout interval until the connection is established" I think that is exactly what we are seeing. In C, there is only the one `connect(args) method and Java solves this problem by having two Implementing an async connect_with_timeout may be a pain/time_suck. The two arg epoll could be Then the test program would call `connect(args, FiniteDuration(20, Seconds). A Java style connect-with-timeout could be used in multiple tests within TcpSuite.scala. on re-reading, I realize that I mangled the argument types, but the central idea |
Let me do a quick look to see if there is a C level socket option. |
The problem is the two arg method doesn't exist in the JDK If you are satisfied that this is something we only need to fix in our test suite (and not the socket implementations) then I can make an upstream PR to munit-cats-effect to add support for this. |
This discussion sounds familiar. Let me check the JDK If that is true, I suggest that we push this to the backburner. Now that the server will both show tracebacks I have been looking in the Scala Native javalib to see how it is implemented there. To my thinking As always, thank you for spending time on this. Sometimes one just has to say PS: no obvious socket level option. That would have been too easy. PPS: |
Never leave a bone with a hungry dog. On linux the connection timeout period for the IPv4 the test uses is a kernel parameter: tcp_syn_retries
That 127 seconds looks very familiar. So now we have an answer for people like me who complain about long timeout: decrease your kernel parameter. We will see this again, no doubt. Now we know what to tell people. |
If it is agreeable to you, I may submit a PR for TcpSuite with a short explanation that My backbrain suggested this to me as a way to set expectations for people running |
@LeeTibbert I've opened this PR so that we can set a proper timeout for our test suite, to prevent hangs like this: I will publish one of my bootlegs later and then we can give it a try, see what you think. |
Agree with closure. Will exercise the timeout when it is soup. |
Is there a culturally compatible way to give a timeout value, or a re-try value
to
connect()
?epoll
is not Java, but Java can provide some guidance as to whathas proven helpful to large groups of people over time.
Java Socket class has
connect(SocketAddress endpoint, int timeout)
.There is a place in TcpSuite
test("server-client ping-pong")
whereit calls:
_ <- ch.connect(serverAddr)
which in turn callsa more complicated underlying (4 arg?) version.
I was debugging a network related problem yesterday.
The issue was that the Suite was hanging/ceasing_to_make_progress.
To simplify things, I commented out all tests but "pong-pong".
At one point, I had thought the hang was due to the
connect()
not timing out.
Turns out the problem was bad info going to
connect()
ratherthan lack of a timeout. However, most networking programs
I know are defensive and have a time out
So the issue is, if culturally appropriate:
no timeout parameter to connect (and probably the
chain of things it calls, which could get interesting/complicated)
TcpSuite probably has multiple connect() calls which, understandably, do not call this hypothetical timeout.
The text was updated successfully, but these errors were encountered: