Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

selftests: diag: failing on the public CI with the new debug.config #264

Closed
matttbe opened this issue Mar 7, 2022 · 4 comments
Closed
Assignees

Comments

@matttbe
Copy link
Member

matttbe commented Mar 7, 2022

Recently, the public CI is using a slightly different kconfig in debug mode, see: multipath-tcp/mptcp-upstream-virtme-docker@d599eb4

It looks like diag.sh selftests is now often failing, e.g.:

TAP version 13
1..1
not ok 1 test: selftest_diag.tap # exit=6
# no msk on netns creation                          [  ok  ]
# after MPC handshake                               [  ok  ]
# ....chk remote_key                                [ fail ] expected 2 found 0
# ....chk no fallback                               [  ok  ]
# check fallback                                    [  ok  ]
# many msk socket present                           [ fail ] expected 200 found 174

https://api.cirrus-ci.com/v1/artifact/task/6140706417803264/summary/summary.txt

TAP version 13
1..1
not ok 1 test: selftest_diag.tap # exit=6
# no msk on netns creation                          [  ok  ]
# after MPC handshake                               [  ok  ]
# ....chk remote_key                                [ fail ] expected 2 found 0
# ....chk no fallback                               [  ok  ]
# check fallback                                    [  ok  ]
# many msk socket present                           [ fail ] expected 200 found 178

https://api.cirrus-ci.com/v1/artifact/task/4623732357464064/summary/summary.txt

TAP version 13
1..1
not ok 1 test: selftest_diag.tap # exit=6
# no msk on netns creation                          [  ok  ]
# after MPC handshake                               [  ok  ]
# ....chk remote_key                                [  ok  ]
# ....chk no fallback                               [  ok  ]
# check fallback                                    [  ok  ]
# many msk socket present                           [ fail ] expected 200 found 176

https://api.cirrus-ci.com/v1/artifact/task/5455940454449152/summary/summary.txt

One of the two issues look similar to #258.

@matttbe
Copy link
Member Author

matttbe commented Mar 7, 2022

Could we wait more than one second in the new wait_ functions in diag.sh?

@matttbe matttbe self-assigned this Mar 10, 2022
@matttbe
Copy link
Member Author

matttbe commented Mar 10, 2022

Discussed at the last meeting:

  - waiting longer (increasing the timeout) should help, to be tested
  - may be good to:
      - have two timeouts:
          - after the first one, we emit a warning (e.g. 1sec)
          - after the second one, we mark the test as failed (e.g. 10sec) and we don't continue
        → it would help understanding what's wrong.

@matttbe
Copy link
Member Author

matttbe commented Mar 24, 2022

I cannot reproduce it locally :(

I can still increase the timeout to see if it improves stuff on the Public CI but not sure how to validate that

@matttbe
Copy link
Member Author

matttbe commented Jun 20, 2022

@matttbe matttbe closed this as completed Jun 20, 2022
@matttbe matttbe assigned pabeni and unassigned matttbe Jun 20, 2022
jenkins-tessares pushed a commit that referenced this issue Oct 6, 2023
Add various tests to check maximum number of supported programs
being attached:

  # ./vmtest.sh -- ./test_progs -t tc_opts
  [...]
  ./test_progs -t tc_opts
  [    1.185325] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.186826] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.270123] tsc: Refined TSC clocksource calibration: 3407.988 MHz
  [    1.272428] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc932722, max_idle_ns: 440795381586 ns
  [    1.276408] clocksource: Switched to clocksource tsc
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK              <--- (new test)
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_replace:OK
  #269     tc_opts_revision:OK
  Summary: 18/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230929204121.20305-2-daniel@iogearbox.net
jenkins-tessares pushed a commit that referenced this issue Oct 13, 2023
Add a new test case which performs double query of the bpf_mprog through
libbpf API, but also via raw bpf(2) syscall. This is testing to gather
first the count and then in a subsequent probe the full information with
the program array without clearing passed structs in between.

  # ./vmtest.sh -- ./test_progs -t tc_opts
  [...]
  ./test_progs -t tc_opts
  [    1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz
  [    1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns
  [    1.402734] clocksource: Switched to clocksource tsc
  [    1.426639] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK            <--- (new test)
  #269     tc_opts_replace:OK
  #270     tc_opts_revision:OK
  Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-4-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
jenkins-tessares pushed a commit that referenced this issue Oct 13, 2023
Add a new test case to query on an empty bpf_mprog and pass the revision
directly into expected_revision for attachment to assert that this does
succeed.

  ./test_progs -t tc_opts
  [    1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz
  [    1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns
  [    1.412419] clocksource: Switched to clocksource tsc
  [    1.428671] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK
  #269     tc_opts_query_attach:OK     <--- (new test)
  #270     tc_opts_replace:OK
  #271     tc_opts_revision:OK
  Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-6-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
matttbe pushed a commit that referenced this issue Oct 27, 2023
Add several new test cases which assert corner cases on the mprog query
mechanism, for example, around passing in a too small or a larger array
than the current count.

  ./test_progs -t tc_opts
  #252     tc_opts_after:OK
  #253     tc_opts_append:OK
  #254     tc_opts_basic:OK
  #255     tc_opts_before:OK
  #256     tc_opts_chain_classic:OK
  #257     tc_opts_chain_mixed:OK
  #258     tc_opts_delete_empty:OK
  #259     tc_opts_demixed:OK
  #260     tc_opts_detach:OK
  #261     tc_opts_detach_after:OK
  #262     tc_opts_detach_before:OK
  #263     tc_opts_dev_cleanup:OK
  #264     tc_opts_invalid:OK
  #265     tc_opts_max:OK
  #266     tc_opts_mixed:OK
  #267     tc_opts_prepend:OK
  #268     tc_opts_query:OK
  #269     tc_opts_query_attach:OK
  #270     tc_opts_replace:OK
  #271     tc_opts_revision:OK
  Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20231017081728.24769-1-daniel@iogearbox.net
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants