
ci: add container scanning to default checks #1981

Merged

Conversation

zondervancalvez
Contributor

@zondervancalvez zondervancalvez commented Apr 20, 2022

Trivy is a security tool that enhances the safety of containerized applications by conducting thorough vulnerability assessments. It is built specifically for scanning container images and reports findings ranging from low-severity issues to critical threats. It categorizes vulnerabilities by severity level so that high and critical vulnerabilities receive special attention. When it detects vulnerabilities in this elevated range, Trivy exits with an error.

By integrating Trivy into our deployment pipeline, we can proactively mitigate security risks and enhance the resilience of our repository.
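
For illustration, the gating behavior described above can be expressed as a single GitHub Actions step along the lines of the sketch below. This is a minimal sketch rather than the exact configuration added by this PR; the action version tag and the image reference are placeholders.

```yaml
# Minimal sketch of a Trivy image-scan step (placeholder names and versions).
- name: Scan container image with Trivy
  uses: aquasecurity/trivy-action@0.12.0 # placeholder tag; pin an exact release in practice
  with:
    image-ref: ghcr.io/hyperledger/example-image:latest # placeholder image reference
    format: table
    ignore-unfixed: true
    severity: HIGH,CRITICAL
    exit-code: "1" # non-zero exit fails the check when HIGH/CRITICAL findings are present
```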

Fixes #1876

Depends On #2121
Depends On #2135

@zondervancalvez zondervancalvez force-pushed the zondervancalvez/issue1876 branch 7 times, most recently from 50ab694 to 32f3bbf Compare May 11, 2022 07:49
@zondervancalvez

This comment was marked as outdated.

@zondervancalvez

This comment was marked as resolved.

@zondervancalvez zondervancalvez force-pushed the zondervancalvez/issue1876 branch 3 times, most recently from 767d989 to a3da3cc Compare May 18, 2022 03:28
@zondervancalvez zondervancalvez marked this pull request as ready for review May 18, 2022 03:28
@petermetz

This comment was marked as resolved.

@petermetz
Member

Here is the result during the test run. 5/18 images were successful. Below is the list of vulnerabilities per image:

@zondervancalvez Thank you! Let's not create 100+ separate issues; instead, create one issue per image covering that image's security problems.

Member

@petermetz petermetz left a comment


@zondervancalvez The besu container build and some of the checks are broken.

  1. Please fix the ones that do not have dependencies on other PRs (e.g. the ones that are fixing the sec. vulnerabilities)

@zondervancalvez zondervancalvez force-pushed the zondervancalvez/issue1876 branch 5 times, most recently from 1d57b5f to 9a61d17 Compare May 27, 2022 08:25
@zondervancalvez
Contributor Author

@zondervancalvez The besu container build and some of the checks are broken.

  1. Please fix the ones that do not have dependencies on other PRs (e.g. the ones that are fixing the sec. vulnerabilities)

All images built and the Azure container scan completed successfully.

@zondervancalvez
Contributor Author

Thank you @jagpreetsinghsasan for clocking that exit code bug!

@zondervancalvez This is a good start! Please

  1. Refactor this to have the scans run as part of the build in ci.yaml so that we don't waste resources by building all the images twice for no reason.
  2. Write a legible PR description
  3. Refactor job/workflow names to not use indexing - calling something build1, build2, buildN conveys no useful information.
  4. Follow the usual best practices about pinning versions down exactly for everything you use.
  5. Close down the other PR with the same name (assuming this is the original; it looks like it is, since this is the one with the reviews/discussion).

Hi @petermetz,

Requesting a re-review. I have integrated the container scan into ci.yaml and removed the old YAML file that I created, so the images are not built twice. I have also added a PR description above. A rough sketch of the resulting job shape is shown below.

Thank you.
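
This is a sketch only: the Trivy step runs in the same job, right after the image build, so nothing gets built twice. Job names, paths, and version tags here are placeholders rather than the actual ci.yaml contents.

```yaml
# Sketch only: scan the image in the same job that builds it (placeholder names).
build-and-scan-example-image:
  runs-on: ubuntu-22.04
  steps:
    - uses: actions/checkout@v3
    - name: Build the container image
      run: docker build -t example-image:${{ github.sha }} ./tools/docker/example-image
    - name: Scan the freshly built image with Trivy
      uses: aquasecurity/trivy-action@0.12.0 # placeholder tag
      with:
        image-ref: example-image:${{ github.sha }}
        severity: HIGH,CRITICAL
        exit-code: "1"
```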

Member

@petermetz petermetz left a comment


@zondervancalvez Thank you very much! LGTM

@petermetz petermetz enabled auto-merge (rebase) August 24, 2023 17:49
Member

@petermetz petermetz left a comment


@zondervancalvez Please sync the commit message and the PR description so that all the information is present in both places.

@VRamakrishna
Contributor

@petermetz All the actions in which Trivy is enabled are failing after listing critical vulnerabilities. I suspect this is because the fork is out-of-date. One of the vulnerability report links I clicked on wasn't even valid anymore (404). Should we update this branch and re-run the tests before merging?

The code changes LGTM otherwise.

@petermetz
Member

@petermetz All the actions in which Trivy is enabled are failing after listing critical vulnerabilities. I suspect this is because the fork is out-of-date. One of the vulnerability report links I clicked on wasn't even valid anymore (404). Should we update this branch and re-run the tests before merging?

The code changes LGTM otherwise.

@VRamakrishna Hmm. That is a bit troubling indeed because I took a look at how old the branch was and it is 9 commits behind the upstream main branch (so probably around a week old). Three things:

  1. We need to investigate that 404 because it could be a problem with the tool itself, with the vulnerability database, or with something else entirely (I really have no idea TBH). If you still remember which one you clicked on specifically, please let me know and I'll do a deep dive on that and get us some more context.
  2. Zooming out a bit: regardless of the checks failing due to critical and high vulnerabilities in the images, for now I would just merge this and then apply a divide-and-conquer strategy on the vulnerabilities, e.g., we can work on resolving them in parallel in separate pull requests as needed. That way we avoid the infinite loop where we fix most of the vulnerabilities and, by the time we merge the PR that fixes the existing findings, there are new zero-days and we are back to square one fixing them again. (For the v2.0.0 GA I do want an intense push where we fix all of them quickly and then issue the release, knowing that within a few days or weeks there will be new vulnerabilities anyway.)
  3. I want to look into further customization of the config of these jobs so that images that are not meant for production deployments (which is 90% of our images) do not come up as check failures but only as warnings (if GitHub Actions even has a concept of warnings instead of just pass/fail). The idea is that if we have a critical vulnerability in an image that is only used by the tests in the CI, that should only be a warning, not a check failure. Of course, for the production images the checks should continue to fail hard if vulnerabilities are found in them. A rough sketch of this idea follows below.
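
For illustration only (not something this PR implements): GitHub Actions does not have a first-class "warning" outcome for a check, but marking the scan step of a test-only image with continue-on-error would let findings show up in the logs without failing the job. A sketch with placeholder names:

```yaml
# Sketch: non-blocking Trivy scan for an image that is only used by CI tests.
# continue-on-error lets this step fail without failing the job/check.
- name: Scan test-only image with Trivy (non-blocking)
  continue-on-error: true
  uses: aquasecurity/trivy-action@0.12.0 # placeholder tag
  with:
    image-ref: example-test-image:dev # placeholder image reference
    severity: HIGH,CRITICAL
    exit-code: "1"
```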

@VRamakrishna
Contributor


@petermetz Here is the 404 I found: https://github.com/hyperledger/cacti/actions/runs/5948364958/job/16132017903?pr=1981#step:5:42.

I agree with points (2) and (3).

auto-merge was automatically disabled August 30, 2023 03:49

Head branch was pushed to by a user without write access

@zondervancalvez zondervancalvez force-pushed the zondervancalvez/issue1876 branch 2 times, most recently from cbc55b2 to 305755c Compare August 30, 2023 03:52

@petermetz
Member


@petermetz Here is the 404 I found: https://github.com/hyperledger/cacti/actions/runs/5948364958/job/16132017903?pr=1981#step:5:42.

I agree with points (2) and (3).

@VRamakrishna Thank you for the info! I've done a deep dive based on that and here are the findings:

  1. The vulnerability is there and valid.
  2. The link the tool prints is broken, and it seems like it's a bug in the tool itself, so I filed an issue here: fix(pkg/vulnerability): invalid PrimaryURLs produced - HTTP 404 aquasecurity/trivy#5079
  3. I still think it's OK to merge, because the vulnerabilities the tool is finding are valid even though the URLs themselves are broken; we can just look up the CVE ID with a search engine, other websites where the CVE does exist come up, and we can take action based on that information.

@jagpreetsinghsasan
Contributor

LGTM (not approving as Peter has already approved and we need only 1 approval per org)

@VRamakrishna
Contributor

LGTM.

Trivy is a security tool that enhances the safety of
containerized applications by conducting thorough
vulnerability assessments. It is built specifically for
scanning container images and reports findings ranging from
low-severity issues to critical threats. It categorizes
vulnerabilities by severity level so that high and critical
vulnerabilities receive special attention. When it detects
vulnerabilities in this elevated range, Trivy exits with an
error.

By integrating Trivy into our deployment pipeline, we can
proactively mitigate security risks and enhance the resilience
of our repository.

Fixes hyperledger#1876

Depends On: hyperledger#2121
Depends On: hyperledger#2135

Signed-off-by: zondervancalvez <zondervan.v.calvez@accenture.com>
@petermetz petermetz enabled auto-merge (rebase) September 7, 2023 23:00
@petermetz petermetz dismissed jagpreetsinghsasan’s stale review September 7, 2023 23:31

Jagpreet also approved in the meantime so the old review is out of date now.

@petermetz petermetz merged commit 55a1507 into hyperledger:main Sep 7, 2023
110 of 135 checks passed
@petermetz petermetz deleted the zondervancalvez/issue1876 branch September 7, 2023 23:47