Module API #332

Open
osandov opened this issue Jul 5, 2023 · 4 comments

@osandov
Owner

osandov commented Jul 5, 2023

drgn currently provides limited control over how debugging information is found: drgn.Program.load_debug_info() accepts a list of files for drgn to try, and nothing more. By default, drgn uses built-in logic to decide where to search: a custom implementation for the Linux kernel, a partial implementation for userspace core dumps, and libdwfl for live userspace processes. Each of these has issues, and they need to be unified and made more flexible.

The solution to this is an API that exposes the main executable and every shared library, loadable kernel module, etc. as a "module". We can then allow providing debugging information per module, and even allow the user to create modules in case drgn gets it wrong. The existing load_debug_info() API will then be re-implemented on top of this API.
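To make the idea concrete, here is a minimal sketch of what a per-module model could look like. Every name in it (Module, ModuleTable, set_debug_file, missing_debug_info) is hypothetical and chosen for illustration; it is not the API in the modules branch, and the addresses and paths are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Module:
    """Hypothetical stand-in for one loaded binary (not drgn's real class)."""

    name: str
    start: int  # first address covered by the module
    end: int  # one past the last address covered by the module
    debug_file: Optional[str] = None  # path to debug info, if one was supplied

    def contains(self, address: int) -> bool:
        return self.start <= address < self.end


@dataclass
class ModuleTable:
    """Hypothetical per-program registry of modules, keyed by name."""

    modules: Dict[str, Module] = field(default_factory=dict)

    def add(self, module: Module) -> None:
        # Lets the user create a module by hand if autodetection missed it.
        self.modules[module.name] = module

    def set_debug_file(self, name: str, path: str) -> None:
        # Debugging information is attached per module, not globally.
        self.modules[name].debug_file = path

    def missing_debug_info(self) -> List[str]:
        return sorted(m.name for m in self.modules.values() if m.debug_file is None)

    def load_debug_info(self, paths_by_name: Dict[str, str]) -> None:
        # A bulk loader expressed in terms of the per-module primitive.
        for name, path in paths_by_name.items():
            if name in self.modules:
                self.set_debug_file(name, path)


table = ModuleTable()
table.add(Module("vmlinux", 0xFFFFFFFF81000000, 0xFFFFFFFF83000000))
table.add(Module("ext4", 0xFFFFFFFFC0000000, 0xFFFFFFFFC0100000))
table.load_debug_info({"vmlinux": "/tmp/vmlinux"})  # made-up path
print(table.missing_debug_info())  # ['ext4']
```

The point of the sketch is only the layering: once debug info is tracked per module, a bulk call such as load_debug_info() reduces to a loop over the per-module operation.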

This will also solve, or add the flexibility needed to solve, several related issues: #16, #17, #25.

I'm working on this in the modules branch.

@osandov osandov added the enhancement New feature or request label Jul 5, 2023
@osandov osandov self-assigned this Jul 5, 2023
@osandov osandov added the debuginfo Support for debugging information formats label Jul 5, 2023
osandov added a commit that referenced this issue Oct 2, 2023
This will simplify the implementation of the module API (#332).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
Asphaltt pushed a commit to Asphaltt/drgn-bpf that referenced this issue Oct 4, 2023
This will simplify the implementation of the module API (osandov#332).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
osandov added a commit that referenced this issue Oct 9, 2023
In my branch for the module API (#332), I want to log an error without
any additional context. Passing an empty format string causes a
"zero-length gnu_printf format string" warning from GCC, and passing
NULL crashes in vsnprintf().

Empty format strings are totally valid, but NULL clearly isn't, so
annotate the format parameter as non-NULL and disable
-Wformat-zero-length.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
@osandov
Owner Author

osandov commented Nov 4, 2023

One other feature to consider, which doesn't exactly fit in with the API as it is currently implemented in my branch, is support for plugins that get debug info, i.e., a configuration file that defines how to get debug info on a particular system/distro.

@brenns10
Contributor

brenns10 commented Nov 5, 2023

I've done a little bit of thinking about debuginfo in drgn-tools, and one thing I've found useful is splitting the concept into "finding" and "fetching" debuginfo. For "finding", the assumption is that the files exist on the filesystem if you know where to look. Drgn does this well. But for example on our analysis systems, we have an NFS mount that contains a bunch of vmlinux/ko files. That's a nonstandard location so it's nice to have a "finder" for that.

The "fetching" falls into the same category as debuginfod: the files are either in a remote location, or require a lengthy extraction process to find. For instance, I have two fetcher implementations now: one which can find kernel RPM debuginfo packages, and download and extract them, and another for internal analysis systems which finds the RPM on a (different) NFS share and does the same. The important thing about fetching is to place the newly created files into a location that the finder will find next time :)

I find the separation useful, but it could be a bit tied to our use cases!
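The finder/fetcher split described above can be sketched roughly as follows. This is an illustration only: DirectoryFinder and CopyingFetcher are hypothetical names, not drgn or drgn-tools classes, and the "fetch" here is a plain file copy standing in for downloading and extracting a package.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Optional


class DirectoryFinder:
    """Hypothetical finder: only checks directories that are available locally."""

    def __init__(self, search_dirs: List[Path]) -> None:
        self.search_dirs = search_dirs

    def find(self, filename: str) -> Optional[Path]:
        for directory in self.search_dirs:
            candidate = directory / filename
            if candidate.is_file():
                return candidate
        return None


class CopyingFetcher:
    """Hypothetical fetcher: copies from a slow source into a finder directory.

    A real fetcher might download and extract an RPM instead of copying.
    """

    def __init__(self, source_dir: Path, dest_dir: Path) -> None:
        self.source_dir = source_dir
        self.dest_dir = dest_dir

    def fetch(self, filename: str) -> bool:
        source = self.source_dir / filename
        if not source.is_file():
            return False
        self.dest_dir.mkdir(parents=True, exist_ok=True)
        # Place the result where the finder will look next time.
        shutil.copyfile(source, self.dest_dir / filename)
        return True


with tempfile.TemporaryDirectory() as tmp:
    remote = Path(tmp) / "remote"  # stands in for an NFS share or package repo
    local = Path(tmp) / "local"  # directory the finder searches
    remote.mkdir()
    local.mkdir()
    (remote / "vmlinux").write_bytes(b"debug info placeholder")

    finder = DirectoryFinder([local])
    fetcher = CopyingFetcher(remote, local)

    print(finder.find("vmlinux"))  # None: nothing local yet
    print(fetcher.fetch("vmlinux"))  # True: copied into `local`
    print(finder.find("vmlinux") is not None)  # True: the finder now sees it
```

Keeping the two roles separate means the cheap lookup never triggers the expensive one by accident; the caller decides when a fetch is worth it.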

@osandov
Owner Author

osandov commented Mar 21, 2024

I'm working on hammering out the remaining bits of this now.

Re: "finding" vs "fetching", at least for debuginfod, the API is that you call debuginfod_find_executable() or debuginfod_find_debuginfo(), which first checks the debuginfod client cache and, on a miss, downloads the file and stores the result in the cache. I.e., it's a single entrypoint that "finds" locally and "fetches" if that fails, and that's how I've been picturing the drgn equivalents. For your examples, I think a similar approach would work?
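As a rough model of that single-entrypoint behavior, consider the sketch below. It is not the debuginfod client API and does not reproduce its cache layout; find_or_fetch, the download callback, the directory structure, and the build ID are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Optional


def find_or_fetch(
    cache_dir: Path,
    build_id: str,
    download: Callable[[str], Optional[bytes]],
) -> Optional[Path]:
    """Return a local path for build_id, downloading it on a cache miss.

    Hypothetical model only; the cache layout here is invented.
    """
    cached = cache_dir / build_id / "debuginfo"
    if cached.is_file():
        return cached  # cache hit: no download needed

    data = download(build_id)  # cache miss: potentially slow
    if data is None:
        return None

    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)  # store so the next lookup is a hit
    return cached


downloads: List[str] = []


def slow_download(build_id: str) -> Optional[bytes]:
    # Stand-in for a network request; records each call for the demo.
    downloads.append(build_id)
    return b"debug info placeholder"


with tempfile.TemporaryDirectory() as tmp:
    first = find_or_fetch(Path(tmp), "deadbeef", slow_download)
    second = find_or_fetch(Path(tmp), "deadbeef", slow_download)
    print(first == second)  # True: both calls resolve to the same cached file
    print(len(downloads))  # 1: the second call was served from the cache
```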

osandov added a commit that referenced this issue Mar 22, 2024
We currently only have one test resource file, sample.coredump.zst, but
the tests for #332 will add more. Create a package, tests.resources, to
contain test resources and a function, get_resource(), to decompress
them. It can also be used on the command line:

  python3 -m tests.resources $resource_name

Signed-off-by: Omar Sandoval <osandov@osandov.com>
@brenns10
Contributor

I only really meant that it's nice to be able to check whether a request for debuginfo can be satisfied quickly, before committing to a long, blocking call. If debuginfod provides a way to check for cached debuginfo only, then it would be nice to have that option. But obviously, go with whatever is easiest to implement; if there's something I'd like to see, I could probably take a look at adding it too :)
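One way to picture the "check quickly first" request is a cache-only pass that sorts modules into those that can be satisfied immediately and those that would need a slow fetch. The sketch below is purely illustrative: partition_by_cache is a hypothetical helper, the cache is modeled as an in-memory dict, and nothing here says whether debuginfod itself offers a cache-only mode.

```python
from typing import Dict, Iterable, List, Tuple


def partition_by_cache(
    cache: Dict[str, str], build_ids: Iterable[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Split build IDs into cached ones and ones that would need a slow fetch.

    Hypothetical helper: `cache` maps a build ID to a local path.
    This function never downloads anything, so it returns quickly.
    """
    ready: Dict[str, str] = {}
    pending: List[str] = []
    for build_id in build_ids:
        path = cache.get(build_id)
        if path is not None:
            ready[build_id] = path
        else:
            pending.append(build_id)
    return ready, pending


cache = {"aaaa": "/cache/aaaa/debuginfo"}  # made-up build ID and path
ready, pending = partition_by_cache(cache, ["aaaa", "bbbb"])
print(ready)  # {'aaaa': '/cache/aaaa/debuginfo'}
print(pending)  # ['bbbb']
```

With a split like this, a caller could load everything in `ready` right away and ask the user (or a policy) before starting the blocking fetches for `pending`.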

Projects
Status: In Progress