self-tests #94 && #95 failing #265

Closed
pabeni opened this issue Mar 18, 2022 · 5 comments

Comments

@pabeni

pabeni commented Mar 18, 2022

running in a loop:

while ./mptcp_join.sh 94 95; do : ; done

I see sporadic failures - VM with 3 vCPUs, debug build.
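
To see how many runs it takes before a sporadic failure shows up, the loop above can be wrapped in a small helper. This is only a sketch: `repeat_until_fail` is a hypothetical name, not part of the selftests, and the run cap is an arbitrary choice.

```shell
#!/bin/bash
# repeat_until_fail <max_runs> <cmd...>
# Hypothetical helper: rerun a command until it fails or the cap is reached.
# Prints how many runs succeeded and returns 1 if a failure was seen.
repeat_until_fail() {
	local max="$1"; shift
	local i=0
	while [ "$i" -lt "$max" ]; do
		if ! "$@"; then
			echo "failed after $i successful run(s)"
			return 1
		fi
		i=$((i + 1))
	done
	echo "no failure in $max run(s)"
	return 0
}
```

For example, `repeat_until_fail 1000 ./mptcp_join.sh 94 95` would stop at the first failing run and report how many runs passed before it.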

Created /tmp/tmp.GhtUQmr445 (size 1024 KB) containing data sent by client
Created /tmp/tmp.zHXDJjzfeb (size 1024 KB) containing data sent by server
copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 4)
copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 4)
 client exit code 2, server 0

netns ns1-KDjXGP socket stat for 10094:
Failed to find cgroup2 mount
TcpPassiveOpens                 2                  0.0
TcpEstabResets                  2                  0.0
TcpInSegs                       34                 0.0
TcpOutSegs                      95                 0.0
TcpOutRsts                      6                  0.0
TcpExtTCPPureAcks               21                 0.0
TcpExtTCPLossProbes             1                  0.0
TcpExtTCPOrigDataSent           81                 0.0
TcpExtTCPDelivered              80                 0.0
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtMPJoinSynRx             1                  0.0
MPTcpExtMPJoinAckRx             1                  0.0
MPTcpExtDataCsumErr             2                  0.0
MPTcpExtMPFailTx                2                  0.0
MPTcpExtMPFastcloseTx           2                  0.0
MPTcpExtMPRstTx                 2                  0.0

netns ns2-KDjXGP socket stat for 10094:
Failed to find cgroup2 mount
TcpActiveOpens                  2                  0.0
TcpEstabResets                  2                  0.0
TcpInSegs                       34                 0.0
TcpOutSegs                      55                 0.0
TcpExtTCPPureAcks               6                  0.0
TcpExtTCPOrigDataSent           31                 0.0
TcpExtTCPDelivered              18                 0.0
MPTcpExtMPCapableSYNTX          1                  0.0
MPTcpExtMPCapableSYNACKRX       1                  0.0
MPTcpExtMPTCPRetrans            1                  0.0
MPTcpExtMPJoinSynAckRx          1                  0.0
MPTcpExtOFOQueueTail            5                  0.0
MPTcpExtOFOQueue                6                  0.0
MPTcpExtOFOMerge                5                  0.0
MPTcpExtMPFailRx                2                  0.0
MPTcpExtMPRstRx                 2                  0.0
./mptcp_join.sh: line 1210: [: [{"total acts":1},{"actions":[{"order":0 pedit ,"control_action":{"type":"pipe"}keys 1
 	 index 1 ref 1 bind 1,"installed":3062,"last_used":3016
	 key #0  at 148: val ff000000 mask ffffffff
1: integer expression expected
095 MP_FAIL MP_RST                       syn[ ok ] - synack[ ok ] - ack[ ok ]
                                         sum[ ok ] - csum  [ ok ]
                                         ftx[fail] got 2 MP_FAIL[s] TX expected 1
 - failrx[fail] got 2 MP_FAIL[s] RX expected 1
Server ns stats
TcpPassiveOpens                 2                  0.0
TcpEstabResets                  2                  0.0
TcpInSegs                       34                 0.0
TcpOutSegs                      95                 0.0
TcpOutRsts                      6                  0.0
TcpExtTCPPureAcks               21                 0.0
TcpExtTCPLossProbes             1                  0.0
TcpExtTCPOrigDataSent           81                 0.0
TcpExtTCPDelivered              80                 0.0
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtMPJoinSynRx             1                  0.0
MPTcpExtMPJoinAckRx             1                  0.0
MPTcpExtDataCsumErr             2                  0.0
MPTcpExtMPFailTx                2                  0.0
MPTcpExtMPFastcloseTx           2                  0.0
MPTcpExtMPRstTx                 2                  0.0
Client ns stats
TcpActiveOpens                  2                  0.0
TcpEstabResets                  2                  0.0
TcpInSegs                       34                 0.0
TcpOutSegs                      55                 0.0
TcpExtTCPPureAcks               6                  0.0
TcpExtTCPOrigDataSent           31                 0.0
TcpExtTCPDelivered              18                 0.0
MPTcpExtMPCapableSYNTX          1                  0.0
MPTcpExtMPCapableSYNACKRX       1                  0.0
MPTcpExtMPTCPRetrans            1                  0.0
MPTcpExtMPJoinSynAckRx          1                  0.0
MPTcpExtOFOQueueTail            5                  0.0
MPTcpExtOFOQueue                6                  0.0
MPTcpExtOFOMerge                5                  0.0
MPTcpExtMPFailRx                2                  0.0
MPTcpExtMPRstRx                 2                  0.0
                                         rtx[fail] got 2 MP_RST[s] TX expected 1
 - rstrx [fail] got 2 MP_RST[s] RX expected 1
Server ns stats
TcpPassiveOpens                 2                  0.0
TcpEstabResets                  2                  0.0
TcpInSegs                       34                 0.0
TcpOutSegs                      95                 0.0
TcpOutRsts                      6                  0.0
TcpExtTCPPureAcks               21                 0.0
TcpExtTCPLossProbes             1                  0.0
TcpExtTCPOrigDataSent           81                 0.0
TcpExtTCPDelivered              80                 0.0
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtMPJoinSynRx             1                  0.0
MPTcpExtMPJoinAckRx             1                  0.0
MPTcpExtDataCsumErr             2                  0.0
MPTcpExtMPFailTx                2                  0.0
MPTcpExtMPFastcloseTx           2                  0.0
MPTcpExtMPRstTx                 2                  0.0
Client ns stats
TcpActiveOpens                  2                  0.0
TcpEstabResets                  2                  0.0
TcpInSegs                       34                 0.0
TcpOutSegs                      55                 0.0
TcpExtTCPPureAcks               6                  0.0
TcpExtTCPOrigDataSent           31                 0.0
TcpExtTCPDelivered              18                 0.0
MPTcpExtMPCapableSYNTX          1                  0.0
MPTcpExtMPCapableSYNACKRX       1                  0.0
MPTcpExtMPTCPRetrans            1                  0.0
MPTcpExtMPJoinSynAckRx          1                  0.0
MPTcpExtOFOQueueTail            5                  0.0
MPTcpExtOFOQueue                6                  0.0
MPTcpExtOFOMerge                5                  0.0
MPTcpExtMPFailRx                2                  0.0
MPTcpExtMPRstRx                 2                  0.0

                                         itx[ ok ] - infirx[ ok ]

1 failure(s) has(ve) been detected:
	- 95: MP_FAIL MP_RST

Note: the bad output:

./mptcp_join.sh: line 1210: [: [{"total acts":1},{"actions":[{"order":0 pedit ,"control_action":{"type":"pipe"}keys 1
 	 index 1 ref 1 bind 1,"installed":3062,"last_used":3016
	 key #0  at 148: val ff000000 mask ffffffff
1: integer expression expected

is due to an old/buggy tc version and does not affect the tests themselves: I see the failure even with a recent tc version, i.e. without the above bash splat.
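
The `integer expression expected` message is what bash's `[` prints when an operand of `-eq` is not an integer, which is what happens when unparsed `tc` output ends up in the comparison. A minimal sketch of a guard, assuming the script compares a parsed packet count against 1 (the actual code at line 1210 is not shown here; `is_uint` and `check_pkt_count` are hypothetical names):

```shell
#!/bin/bash
# is_uint <string>: succeed only for a non-empty string of decimal digits.
is_uint() {
	case "$1" in
		''|*[!0-9]*) return 1 ;;
		*) return 0 ;;
	esac
}

# check_pkt_count <value>: hypothetical stand-in for the comparison in the
# test script. <value> is whatever was parsed from the `tc` output.
check_pkt_count() {
	local pkts="$1"
	if ! is_uint "$pkts"; then
		# Avoid feeding non-numeric text to `[ ... -eq ... ]`.
		echo "unparsable"
		return 2
	fi
	if [ "$pkts" -eq 1 ]; then
		echo "corrupted"
	else
		echo "not-yet"
	fi
}
```

With this guard, `check_pkt_count '[{"total acts":1}'` reports `unparsable` instead of triggering the bash error.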

@geliangtang
Member

A kernel log is attached.

repeat-log-32245-204.dmesg.txt

@matttbe
Member

matttbe commented Mar 31, 2022

From the last meeting:

  - It looks like we got 2 checksum failures, apparently on 2 different subflows:
      - this triggers 2 checksum failures → 2 MP_FAIL
      - if that is the case, could we make sure we only "break" one subflow?
      - but we already should, because the 'iptables' rule should mark only one packet (this can be one packet before TSO)
      - so only one packet should be corrupted, on one single subflow
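
The "got 2, expected 1" checks discussed here boil down to reading one value out of the nstat-style counter dumps shown earlier in this issue. A rough sketch of such a check, using a few lines copied from the server dump as sample input (`get_counter` and `check_counter` are hypothetical helpers for illustration, not the selftest's actual code):

```shell
#!/bin/bash
# Sample counters copied from the "Server ns stats" dump in this issue.
stats='MPTcpExtDataCsumErr             2                  0.0
MPTcpExtMPFailTx                2                  0.0
MPTcpExtMPRstTx                 2                  0.0'

# get_counter <name>: read nstat-style "name value rate" lines on stdin and
# print the value of <name>, or 0 when the counter is absent.
get_counter() {
	awk -v name="$1" '
		$1 == name { print $2; found = 1; exit }
		END { if (!found) print 0 }'
}

# check_counter <name> <expected>: hypothetical pass/fail wrapper.
check_counter() {
	local got
	got=$(printf '%s\n' "$stats" | get_counter "$1")
	if [ "$got" -eq "$2" ]; then
		echo "$1 [ ok ]"
	else
		echo "$1 [fail] got $got expected $2"
		return 1
	fi
}
```

Calling `check_counter MPTcpExtMPFailTx 1` on this sample reproduces the failure reported above, since the dump shows 2 MP_FAIL transmitted.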

@matttbe
Member

matttbe commented May 2, 2022

@pabeni recently sent a new patch:

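New patches for t/upstream:

  • b841c3f: net/sched: act_pedit: really ensure the skb is writable
  • Results: 2767792..8ae619c (export)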

New patches for t/upstream-net:

  • b841c3f: net/sched: act_pedit: really ensure the skb is writable
  • Results: eefc441..8b01b4c (export-net)

I quote:

This almost solves issues/265 here. I'm still getting some rare failures with MPTcpExtMPFailTx==0: sometimes the transfer completes before we are able to use the 2nd/failing link. The relevant fix is a purely self-test one.

I guess we leave this ticket open until we have a fix for that, right?
Should we have more data to transfer? Wait for an event? Reduce BW? Introduce delays/losses on the first link to force using the 2nd subflow?
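
Of these options, "wait for an event" could look roughly like the polling helper below. It is only a sketch of the idea, not the fix that was eventually applied: `wait_until` is a hypothetical name, and which condition to poll (for instance a join counter becoming non-zero) is left to the caller.

```shell
#!/bin/bash
# wait_until <max_tries> <sleep_secs> <cmd...>
# Hypothetical helper: poll <cmd> until it succeeds, sleeping between tries.
# Returns 0 as soon as <cmd> succeeds, 1 if it never does.
wait_until() {
	local tries="$1" pause="$2"; shift 2
	local i=0
	while [ "$i" -lt "$tries" ]; do
		"$@" && return 0
		sleep "$pause"
		i=$((i + 1))
	done
	return 1
}
```

A test could call something like `wait_until 50 0.1 second_subflow_is_up` (with `second_subflow_is_up` being a caller-provided check) before relying on the second link, so the transfer cannot finish before that link is used.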

@geliangtang
Member

geliangtang commented May 4, 2022

The bad output issue seen with old tc versions is fixed by my commit 2770b28 ("selftests: mptcp: fix a mp_fail test warning").

@pabeni
Author

pabeni commented May 9, 2022

> @pabeni recently sent a new patch:
>
> New patches for t/upstream:
>
> * [b841c3f](https://github.com/multipath-tcp/mptcp_net-next/commit/b841c3f765af1e7af669ce7c1b406a60668d4b4e): net/sched: act_pedit: really ensure the skb is writable
>
> * Results: [2767792](https://github.com/multipath-tcp/mptcp_net-next/commit/2767792d035c53b26cbdf47176beba4703acc406)..[8ae619c](https://github.com/multipath-tcp/mptcp_net-next/commit/8ae619c5e009f5a1e0e03be0e9085edc7dfe801e) (export)
>
> New patches for t/upstream-net:
>
> * [b841c3f](https://github.com/multipath-tcp/mptcp_net-next/commit/b841c3f765af1e7af669ce7c1b406a60668d4b4e): net/sched: act_pedit: really ensure the skb is writable
>
> * Results: [eefc441](https://github.com/multipath-tcp/mptcp_net-next/commit/eefc441cb5abe84dc544a1859994a25c8bb443f8)..[8b01b4c](https://github.com/multipath-tcp/mptcp_net-next/commit/8b01b4ca3343862e8dd3167fc0233e7bf905e2b6) (export-net)
>
> I quote:
>
> This almost solves issues/265 here. I'm still getting some rare failures with MPTcpExtMPFailTx==0: sometimes the transfer completes before we are able to use the 2nd/failing link. The relevant fix is a purely self-test one.
>
> I guess we leave this ticket open until we have a fix for that, right? Should we have more data to transfer? Wait for an event? Reduce BW? Introduce delays/losses on the first link to force using the 2nd subflow?

IMHO we are good: all the relevant fixes are in the export branch and I do not see failures anymore.

pabeni closed this as completed May 9, 2022
jenkins-tessares pushed a commit that referenced this issue Oct 6, 2023
Add various tests to check maximum number of supported programs
being attached:

  # ./vmtest.sh -- ./test_progs -t tc_opts
  [...]
  ./test_progs -t tc_opts
  [    1.185325] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.186826] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.270123] tsc: Refined TSC clocksource calibration: 3407.988 MHz
  [    1.272428] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc932722, max_idle_ns: 440795381586 ns
  [    1.276408] clocksource: Switched to clocksource tsc
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK              <--- (new test)
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_replace:OK
  #269     tc_opts_revision:OK
  Summary: 18/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230929204121.20305-2-daniel@iogearbox.net
jenkins-tessares pushed a commit that referenced this issue Oct 13, 2023
Add a new test case which performs double query of the bpf_mprog through
libbpf API, but also via raw bpf(2) syscall. This is testing to gather
first the count and then in a subsequent probe the full information with
the program array without clearing passed structs in between.

  # ./vmtest.sh -- ./test_progs -t tc_opts
  [...]
  ./test_progs -t tc_opts
  [    1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz
  [    1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns
  [    1.402734] clocksource: Switched to clocksource tsc
  [    1.426639] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK            <--- (new test)
  #269     tc_opts_replace:OK
  #270     tc_opts_revision:OK
  Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-4-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
jenkins-tessares pushed a commit that referenced this issue Oct 13, 2023
Add a new test case to query on an empty bpf_mprog and pass the revision
directly into expected_revision for attachment to assert that this does
succeed.

  ./test_progs -t tc_opts
  [    1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz
  [    1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns
  [    1.412419] clocksource: Switched to clocksource tsc
  [    1.428671] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK
  #269     tc_opts_query_attach:OK     <--- (new test)
  #270     tc_opts_replace:OK
  #271     tc_opts_revision:OK
  Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-6-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
matttbe pushed a commit that referenced this issue Oct 27, 2023
Add several new test cases which assert corner cases on the mprog query
mechanism, for example, around passing in a too small or a larger array
than the current count.

  ./test_progs -t tc_opts
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK
  #269     tc_opts_query_attach:OK
  #270     tc_opts_replace:OK
  #271     tc_opts_revision:OK
  Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20231017081728.24769-1-daniel@iogearbox.net

3 participants