Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

View for OpenScanHub results #441

Open
Tracked by #2516
lachmanfrantisek opened this issue Aug 21, 2024 · 15 comments
Open
Tracked by #2516

View for OpenScanHub results #441

lachmanfrantisek opened this issue Aug 21, 2024 · 15 comments
Labels
area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement.

Comments

@lachmanfrantisek
Copy link
Member

lachmanfrantisek commented Aug 21, 2024

It would be nice to present people results from the analysis including the relation to other related jobs (like SRPM/RPM) build.

This would most probably require a reporting mechanism to be implemented.
(Currently, Packit only triggers the scan.)

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

@lachmanfrantisek
Copy link
Member Author

We also need to think about what needs to be displayed on Packit side and what (if) needs to be on OSH only.

@siteshwar
Copy link

As mentioned by @kdudka in packit/packit#2371 (reply in thread), this needs to be implemented on the Packit side.

As per my understanding, the workflow should look like:

  • Packit submits the scan to OSH, and waits for the scan to finish.
  • If the scan was successful, download scan-results.js and convert it to SARIF through csgrep --mode=sarif.
  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.
  • Link the user to the report through GitHub actions, or similar CI job.

@davidmalcolm Let us know if you have any suggestions here. Thanks!

@siteshwar
Copy link

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

As I understand, it is Project Mycorrhiza and its a terrible name.

@lachmanfrantisek
Copy link
Member Author

As I understand, it is Project Mycorrhiza and its a terrible name.

I like the meaning but always struggle with the spelling..;-) Btw. we are collecting use cases people want to see help with here: https://packit.limesurvey.net/project-mycorrhiza

@lachmanfrantisek
Copy link
Member Author

@siteshwar thanks for the task list.

Do you (or anyone else) know if there is a way to get HTML or some other nice visualisation/representation so we can integrate it directly instead of linking an external service? (People do not like these clicking workflows.)

@lachmanfrantisek
Copy link
Member Author

  • Packit submits the scan to OSH, and waits for the scan to finish.

@siteshwar btw, for this, we would appreciate messages coming to the Fedora message bus to avoid busy-wait which is really hard to do with the the current Packit's event-driven architecture. Do you think it's realistic to do?

Related issues about this:

@mfocko mfocko added area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement. labels Aug 22, 2024
@siteshwar
Copy link

@siteshwar btw, for this, we would appreciate messages coming to the Fedora message bus to avoid busy-wait which is really hard to do with the the current Packit's event-driven architecture. Do you think it's realistic to do?

It is in my todo list. I would get back to it at some point in future.

@davidmalcolm
Copy link

As mentioned by @kdudka in packit/packit#2371 (reply in thread), this needs to be implemented on the Packit side.

As per my understanding, the workflow should look like:

* Packit submits the scan to OSH, and waits for the scan to finish.

* If the scan was successful, download `scan-results.js` and convert it to SARIF through `csgrep --mode=sarif`.

* Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

* Link the user to the report through GitHub actions, or similar CI job.

@davidmalcolm Let us know if you have any suggestions here. Thanks!

@siteshwar I confess my knowledge here is mostly on the diagnostics and SARIF generation side, less on filtering, presentation, etc. I haven't played around with CodeQL yet; the pertinent docs would appear to be here. I notice that they are using a few non-standard properties via property bags. FWIW I regularly meet a CodeQL representative at the monthly OASIS SARIF TC meetings.

I'm nervous that going from, say GCC SARIF output to scan-results.js then back to SARIF might lose information. Ideally we'd use SARIF throughout the process to keep as much info as possible (perhaps using property bags if there's anything that OSH wants to capture that isn't yet expressible in SARIF). I'm familiar with a large subset of SARIF 2.1.0, so let me know if you have any questions about it.

@davidmalcolm
Copy link

@siteshwar Also, please note that (with my upstream GCC hat on), I regard GCC's "json" diagnostic output format to be "legacy"; all of my future development for machine-readable diagnostics is on its SARIF output format instead. See https://gcc.gnu.org/wiki/SARIF for details on GCC's SARIF support, bug tracking, etc.

@kdudka
Copy link

kdudka commented Aug 23, 2024

@davidmalcolm OSH still captures the plain-text format of GCC diagnostics. Although csdiff supports all the three formats (plain-text, JSON, SARIF), it is not easy to migrate csmock-plugin-gcc to the newer diagnostic formats. As it is implemented now, the diagnostic output from all GCC processes is written to a single text file. This works well for the plain-text format but not for the JSON-based formats because concatenation of multiple JSON files is not a valid JSON.

Can GCC write a separate SARIF file to a specified directory for each invocation? Something like valgrind --xml-file=${dir}/%p-%n.xml ...? I think something like that would ease the transition of OSH to GCC's SARIF output...

@davidmalcolm
Copy link

Can GCC write a separate SARIF file to a specified directory for each invocation? Something like valgrind --xml-file=${dir}/%p-%n.xml ...? I think something like that would ease the transition of OSH to GCC's SARIF output...

@kdudka Sadly not. I had thought that it respected the dumpfile command-line arguments, but it turns out it doesn't. Sorry about that. See GCC bug 110522 for more info, and ideas on fixing this.

Would being able to specify an output filename for the .sarif file on the gcc command line be enough?

BTW, which version of GCC are you using?

@siteshwar
Copy link

siteshwar commented Aug 28, 2024

BTW, which version of GCC are you using?

The scans that are run through Packit are run on rawhide, so we use the same version of gcc as available in rawhide. Currently it is 14.1.1-7.fc41, it can be actually seen in the stdout.log of any scan.

@kdudka
Copy link

kdudka commented Aug 28, 2024

Would being able to specify an output filename for the .sarif file on the gcc command line be enough?

Yes. If GCC was able to write a SARIF file with a unique name in a specified directory (like valgrind --xml-file=${dir}/%p-%n.xml ... does), it would be much easier to build automation on top of it. Nevertheless, if GCC is able to write the SARIF file to a specified path, we should be able to script around it with flock and mktemp.

Also does GCC always provide absolute paths to source code files in the SARIF data? Relative paths to source code files are currently translated by the compiler wrapper that processes the diagnostic messages printed by GCC to stderr.

BTW, which version of GCC are you using?

OSH uses the GCC version that is available in the buildroot. So, as @siteshwar says, it will be the Fedora Rawhide version by default for OSH tasks triggered by Packit. OSH needs to work with GCC versions available in EPEL buildroots, too, but csmock-plugin-gcc can probe for the needed features in GCC and eventually fallback to the plain-text capture that it uses today.

csutils/csmock#41 introduced an option to use a custom version of GCC Analyzer, which can be different from the GCC version that drives the native build (as long as these GCC versions can be installed in parallel into a single buildroot).

@siteshwar
Copy link

  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

There is a limit of 10 MB on the results file. The SARIF file should be compressed before being uploaded.

@Venefilyn
Copy link
Collaborator

Venefilyn commented Sep 20, 2024

  • Upload the SARIF output to CodeQL, or a similar service that can present the results in a user friendly way.

I think outputting it with CodeQL would be great to keep things within GitHub. As for the dashboard we could make use of the SARIF output as well, for that I ask that packit-service can indicate in a response that a SARIF file exists or is being generated, and create a new endpoint to return the SARIF results

Given SARIF has a schema it should be relatively trivial to handle it in the dashboard as we can just install the types or generate the types to make things easier for us through something like https://www.npmjs.com/package/@types/sarif/v/2.1.5

We might also want to think about this as part of Project Mycorrhisa, ping @Venefilyn .

As I understand, it is Project Mycorrhiza and its a terrible name.

It's a project/WG name for an idea! It's a good name 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fedora Related to Fedora ecosystem gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/feature New feature or a request for enhancement.
Projects
Status: backlog
Development

No branches or pull requests

6 participants