Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add control connection keepalive arguments #1423

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

davidBar-On
Copy link
Contributor

@davidBar-On davidBar-On commented Nov 20, 2022

Adding Control-connection TCP Keepalive arguments: --cntl-ka [<KEEPIDLE>/<KEEPINTV>/<KEEPCNT>].

As suggested in the resolved issues, setting the keepalive period (TCP_KEEPIDLE) resolves #812 by allowing to keep the control connection alive longer than the firewall timeout, and resolves #835 by allowing the server to terminate inactive connections after shorter time.

Added HAVE_TCP_KEEPALIVE, based on TCP_KEEPIDLE availability, as I am not sure if and which OSs support the TCP keepalive. In principle, there could be a separated "HAVE" for the socket (not TCP) SO_KEEPALIVE, but I assume this is not necessary.

(Note based on the discussion below starting with this comment: to use this PR before it is merged into main line, autoconf and autoheader should be run before ./configure. If autoconf fails with error that version 2.71 should be used, run autoconf -V to get its version and change "2.71" in configure.ac to that version.)

I was not able to fully test the code, as I was not able to test the keepalive retries.

Also, for some reason, in Windows WSL2 it seems that the keepalive period (TCP_KEEPIDLE) must be greater than the retry interval (TCP_KEEPINTV). Otherwise, the keepalive is sent only once. As I couldn't test the retries, I assumed that it may even be that the keepalive period must be greater than the retry period (TCP_KEEPINTV * TCP_KEEPCNT). Therefore, in case this condition is not met and only the keepalive period was set, I added code that reduces the retry interval accordingly.

@davidBar-On
Copy link
Contributor Author

davidBar-On commented Jan 10, 2023

@ankitg12, I reply to your comment here, as building the change is using the code of this PR.

Any idea when this change would be merged ? Or how I can get binary with this change?

As this PR is not merged (yet) into the main code, there are probably no available binaries. You can try building it yourself. In general, the process is (under Linux):

git clone https://github.com/davidBar-On/iperf.git --branch issue-812-835-control-connection-keepalive
cd iperf
./bootstrap.sh
./confoigure
make

The iperf3 executable will be ./src/iperf3. To make sure it was built o.k. run ./src/iperf3 -h. The help should include a line for the new --cntl-ka option.

Note that the PR is not fully tested. If you were able to build and run iperf3 successfully and you find problems with the implementation of the new option, please report them in this PR. Thanks.

@lbsou
Copy link

lbsou commented Feb 21, 2023

I guess this is probably obvious, but I had to add #define HAVE_TCP_KEEPALIVE 1 in ./iperf/src/iperf_config.h before launching make to have the --cntl-ka option available.

I can confirm with netstat -tnope that the new parameter change the control channel behavior correctly:

KEEPIDLE = Modify the interval of the keepalive
KEEPINTV = Modify the interval of the retry
KEEPCNT = Modify the amount of keepalive lost before dropping the connection

I would be very happy to see this change merge into the master branch!

@davidBar-On
Copy link
Contributor Author

.... I had to add #define HAVE_TCP_KEEPALIVE 1 in ./iperf/src/iperf_config ....

@lbsou, configure.h is generated by ./iperf3/configure per ./iperf3/configure.ac. Currently, configure.ac defines HAVE_TCP_KEEPALIVE is defined in confugure.h when netinet/tcp.h defines TCP_KEEPIDLE.

Which OS are you using? Can you check if and where TCP_KEEPIDLE is defined?

@lbsou
Copy link

lbsou commented Feb 21, 2023

Hi!

I tried on the following platforms :

  • Rocky Linux release 9.0 (Blue Onyx)
  • Ubuntu 20.04.5 LTS

The option is available on both system :

[root@ROCKY:/]# sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

root@UBUNTU:/# sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

I was able to compile and run the following code:

#include <stdio.h>
#include <netinet/tcp.h>

int main(void)
{
int foo = TCP_KEEPIDLE;
printf("%d", foo);
}

Returned : 4

@davidBar-On
Copy link
Contributor Author

Hi, thanks for the info! Hopefully I now understand the problem. When (if...) the PR will be merged into the mainline, autoconf and autoheader will be run to updated the ./cofigure file based on the ./configure.ac. Since the PR is not merged, ./configure does not handle the definition of HAVE_TCP_KEEPALIVE.

Can you try running autoconf? It should updated ./configure file, which defines HAVE_TCP_KEEPALIVE. If autoconf fails with error that version 2.71 should be used, run autoconf -V to get its version and change "2.71" in configure.ac to that version.

Now run autoheader and then ./configure, ./src/iperf_config.h should now define HAVE_TCP_KEEPALIVE.

Was this successful? If it did, I will add the above instructions to the PR description

@lbsou
Copy link

lbsou commented Feb 22, 2023

Thank you David, that was it!

I did autoconf then autoheader then ./configure and the ./src/iperf_config.h now contain #define HAVE_TCP_KEEPALIVE 1

Any idea when this change will be merged?

@davidBar-On
Copy link
Contributor Author

@lbsou, I have updated the PR description. Thanks for your efforts and time for clarifying this issue (and helping me to better understand the configuration process ...).

I am not from the team that maintains iperf3, so I don't know if and when this PR will be merged into the mainline.

@lbsou
Copy link

lbsou commented Feb 22, 2023

Thank you David!

@bmah888 Can we be of any further help for this PR to be merged?

Thanks!

@lbsou
Copy link

lbsou commented Mar 6, 2023

Hi @davidBar-On !

I did some more testing and I can activate keepalive only from client to server, not from server to client.

Server :
/usr/local/bin/iperf3 -s -i 1 -f k --forceflush --idle-timeout 300 --rcv-timeout 5000 --one-off -p 16000 --timestamps='%F %T ' --cntl-ka=10/1/5

# netstat -tnope | grep iperf3
tcp6 0 0 <server_ip>:16000 <client_ip>:6730 ESTABLISHED 0 40840968 3294693/iperf3 off (0.00/0/0)

Client :
/usr/local/bin/iperf3 -u -l 1400 -c <server_ip> -t 0 -b 1M --udp-counters-64bit --connect-timeout=1000 --dscp 0 --pacing-timer 12000 -f k -p 16000 --timestamps='%F %T ' --bidir --forceflush --cntl-ka=10/1/5 --rcv-timeout 5000

# netstat -tnope | grep iperf3
tcp 0 0 192.168.1.146:47610 23.250.5.250:16001 ESTABLISHED 0 1629116 218390/iperf3 keepalive (4.59/0/0)

This means the server might not notice if the control channel dies (aggressive nat gateway session timeout?) and then the server is unreachable forever (in case of a bidir communication, the server continues sending udp packet indefinitely.).

@davidBar-On davidBar-On force-pushed the issue-812-835-control-connection-keepalive branch from 4c4139d to 266ca8d Compare March 17, 2023 16:32
@davidBar-On
Copy link
Contributor Author

Hi @lbsou, thanks for testing and finding this issue.
Actually, setting the keepalive only on the client side was intentional, and just the help for the option wrongly showed it for both the client and the server ...

However, when evaluating your comment I realized that it is better that the keepalive option will be available for both the client and server. I enhanced the solution and the option should work now on both sides.

@davidBar-On davidBar-On force-pushed the issue-812-835-control-connection-keepalive branch from 266ca8d to 40345dc Compare March 18, 2023 07:48
@davidBar-On
Copy link
Contributor Author

Hi @lbsou, thanks for testing and finding this issue.

Actually, setting the keepalive only on the client side was intentional, as the client is initiating the control stream, but I forgot to limit the option for the client. However, when evaluating your comment I realized that it is better that the keepalive option will be available for both the client and the server. I enhanced the solution and the option should work now on both sides.

@lbsou
Copy link

lbsou commented Mar 18, 2023

Thanks you very much @davidBar-On, may the gods of merging be with us!

@swlars
Copy link
Contributor

swlars commented Oct 5, 2023

We might look into testing and merging this soon. We're interested in the changes. In the meantime, if you could update the man page for the new command line argument that would be great, otherwise we'll probably get to it once multi-threading is merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

intermittent connection Cannot test long UDP sessions
3 participants