Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS Anvil Server Tests in Nightly CI #3651

Merged
merged 1 commit into from
Aug 15, 2023

Conversation

FAlbertDev
Copy link
Collaborator

@FAlbertDev FAlbertDev commented Jul 31, 2023

(Refers to #3444)
This adds TLS Anvil server tests to Botan's nightly CI. Unfortunately, a current bug of the origin TLS-Anvil repo required me to create a fork and work around it (I already created an issue for the TLS-Anvil repo). Until the bug is fixed, the CI script uses my fork.
The job's workflow is the following:

  1. Build Botan and TLS-Anvil
  2. Run the script src/scripts/ci/ci_tlsanvil_test.py to setup and start a botan tls_server and the TLS-Anvil process. The TLS-Anvil process tests the server by playing a client (repeatedly) connecting to the server. TLS-Anvil creates a TestSuiteResults folder for storing the results with additional test information.
  3. The TestSuiteResults folder and the TLS-Anvil log files are uploaded as an artifact, so we can analyze them if necessary.
  4. The script src/scripts/ci/ci_tlsanvil_check.py checks if the tests are successful, i.e., if the results in TestSuiteResults meet our expectations. Currently, some tests fail. I will look into these together with @reneme after his holidays. Until then, I allowed the respective tests to fail.

An example workflow execution with failing tests (without deactivating the currently failing tests) is here. When allowing the tests to fail, the workflow looks like this. Since the tests take around 1.5h, they are only executed nightly.
I also plan to add the TLS-Anvil client tests in a future PR.

@coveralls
Copy link

coveralls commented Jul 31, 2023

Coverage Status

coverage: 91.703% (-0.02%) from 91.719% when pulling 4fb301b on FAlbertDev:ci/tls-anvil into 8c72bab on randombit:master.

Copy link
Owner

@randombit randombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for setting this up @FAlbertDev, I'll be happy when this is part of our test coverage.

I have a few minor comments/questions but overall looks very good.

cert_path = os.path.join(Config.key_and_cert_storage_path, Config.tmp_cert_file_name)

with open(key_path, 'w', encoding='utf-8') as keyfile:
subprocess.run([botan_exe_path, "keygen", "--algo=RSA"], stdout=keyfile, check=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keygen with RSA defaults to 3072 bits since that is ~~ 128 bit security level, but for testing this is just some needless computational overhead - let's generate 2048 bit instead since that is still sufficient and may save a bit of test time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be not significantly faster with 2048 bit. Won't hurt, though.

tls_anvil_cmd = [
"java", "-jar", tls_anvil_jar_path,
"-strength", "1",
"-parallelHandshakes", "1",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runners we get have 2 cores - would it make sense / be possible to use 2-3 handshakes in parallel here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, parallel handshakes would be great. However, I run TLS-Anvil against the ./botan tls_server process, which only allows one connection at a time.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use tls_http_server instead?

action="store_true",
default=False,
help="Test the Botan TLS client",
)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead a single option --test-target= which could take on any of client, server, or client,server?

cache-key: linux-clang-x86_64-static

- name: Build Botan
run: ./configure.py --compiler-cache=ccache --build-targets=static,cli --without-documentation && make -j8
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I don't know if this even works in GH Actions but should this be something like

make -j$(nproc)

so the build scales to using all (and only) the number of cores available?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to work 👍

uses: ./.github/actions/setup-build-agent
with:
target: static
cache-key: linux-clang-x86_64-static
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cache key seems wrong since we are using gcc to build

I think if we use the linux-gcc-x86_64-static key, then we can share caches with this build and the normal static library GCC CI build.

@FAlbertDev
Copy link
Collaborator Author

Thanks for your review, @randombit! I addressed your suggestions in ba484ec :)

Copy link
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Just a few general remarks below. Detailed review will need to come later.

Comment on lines 83 to 87
- name: Fetch TLS-Anvil fork for Botan tests
uses: actions/checkout@v3
with:
repository: FAlbertDev/TLS-Anvil
path: ./tlsanvil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need a fork, maybe it should live under @randombit's GitHub account. Similarly to the boringssl fork we maintain for BoGo.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm guessing you hope to get rid of the fork after your upstream issue is fixed)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm guessing you hope to get rid of the fork after your upstream issue is fixed)

True. In fact, the same hour you wrote this comment, the issue was fixed ;)

Comment on lines 74 to 75
- name: Build Botan
run: ./configure.py --compiler-cache=ccache --build-targets=static,cli --without-documentation && make -j$(nproc)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe configure and build botan in the ci_tlsanvil_test.py instead?

Comment on lines 99 to 100
env:
DOCKER: 1 # TLS-Anvil specific to disable the loading bar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please set this inside the ci_tlsanvil_test.py script, if possible. Rationale: We try to keep the CI YAML files as "dumb" as possible and move as much of the configuration and tool-intricacies into the dedicated scripts.

Comment on lines 36 to 40
with open(key_path, 'w', encoding='utf-8') as keyfile:
subprocess.run([botan_exe_path, "keygen", "--algo=RSA", "--params=2048"], stdout=keyfile, check=True)

with open(cert_path, 'w', encoding='utf-8') as certfile:
subprocess.run([botan_exe_path, "gen_self_signed", key_path, "localhost"], stdout=certfile, check=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general remark: At one point we should spin-off some internal Utilities python module that we can reuse across the internal python machinery. As a center piece this will likely include a convenience wrapper around subprocess.run and/or subprocess.Popen. No idea, how many scripts already have a copy of run_commend(). 🤡

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. But we need to evaluate if an additional abstraction layer is worth it. I think, in this case, the calls are slim enough.

@FAlbertDev
Copy link
Collaborator Author

FAlbertDev commented Aug 8, 2023

My current state for this PR:
I'm currently working on the usage of the tls_http_server instead of tls_server to speed up the tests by parallelization. Some test results are currently inconsistent between both CLI modules (probably because of a small issue with the tls_http_server). I'll investigate this and will probably open another PR addressing this issue after understanding the underlying problem.
Also, I need to implement your suggestions, @reneme, and redirect the checkout of TLS-Anvil to the origin repo, which is fixed now.

uses: ./.github/actions/setup-build-agent
with:
target: static
cache-key: linux-gcc-x86_64-static
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a separate cache key because we build with --with-boost?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh probably so - maybe best to just use a new unique cache id for each job and let the cache management system sort it out rather than trying to share caches and causing conflicts and cache misses if/when the configuration of different jobs changes over time.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: if we use a unique target here (eg anvil) we can install boost in setup_gh_actions.sh as a side effect of setting up the build agent, rather than having an explicit install of boost later on in the yaml

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. (see a589ed2)

Thanks for your feedback!

@FAlbertDev
Copy link
Collaborator Author

FAlbertDev commented Aug 9, 2023

I applied your suggestions (109f2e4). However, this PR should only be merged after #3660 is merged.

@FAlbertDev FAlbertDev force-pushed the ci/tls-anvil branch 2 times, most recently from 9836910 to a589ed2 Compare August 11, 2023 07:26
@randombit
Copy link
Owner

LGTM. Only remaining comment from me would be please squash before merging.

@FAlbertDev
Copy link
Collaborator Author

please squash before merging

Done.

@reneme reneme merged commit fc4b427 into randombit:master Aug 15, 2023
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants