Stack unwinding cannot go beneath IRQ frames #304
I've seen this in production, too, but hadn't gotten around to reproducing it, so this is super helpful! I'm still under the impression that ORC is supposed to help us unwind through an IRQ, but maybe I'm wrong or maybe there's a bug in the ORC unwinding code.
I'm not so sure that we can rely on the ORC or DWARF to get us through the exception, but I likely haven't spent as much time looking at it as you have! Looking at how crash does it, it does seem like the transition from the exception stack into the normal task stack has a bit of manual effort involved... But then again, I don't know that crash's way is necessarily the way we should emulate, it's just another data point. In the case of my particular reproducer, it does seem as simple as "keep unwinding, assuming that the pt_regs location got pushed to the stack":
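Concretely (and only as an illustration, not drgn's actual implementation), the "keep unwinding from the pt_regs pushed to the stack" idea can be sketched in plain Python. The field order below is the x86_64 `struct pt_regs` layout; the blob and addresses are made up:

```python
import struct

# Field order of struct pt_regs on x86_64 (21 eight-byte fields, 168 bytes).
PT_REGS_FIELDS = [
    "r15", "r14", "r13", "r12", "rbp", "rbx", "r11", "r10",
    "r9", "r8", "rax", "rcx", "rdx", "rsi", "rdi", "orig_rax",
    "rip", "cs", "eflags", "rsp", "ss",
]

def parse_pt_regs(stack: bytes, offset: int) -> dict:
    """Decode a struct pt_regs that was pushed at `offset` in a stack dump."""
    values = struct.unpack_from("<21Q", stack, offset)
    return dict(zip(PT_REGS_FIELDS, values))

# Fake stack blob with a pt_regs at offset 0, recording the interrupted context.
regs = {name: 0 for name in PT_REGS_FIELDS}
regs["rip"] = 0xFFFFFFFF81000000  # hypothetical interrupted program counter
regs["rsp"] = 0xFFFFC90000003F00  # hypothetical task stack pointer
blob = struct.pack("<21Q", *(regs[f] for f in PT_REGS_FIELDS))

parsed = parse_pt_regs(blob, 0)
# "Keep unwinding" then means: resume the walk at parsed["rip"]/parsed["rsp"],
# i.e. treat them as the PC and SP of the interrupted task stack.
```

The only subtle part is knowing *where* on the IRQ stack the pt_regs sits, which is exactly the "use the last frame's sp" heuristic discussed here.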
edit: and this approach works on the customer vmcore which finally caused me to go down the rabbit hole, which is pretty nice! The only difference is that sometimes, there's a "bogus" frame at the end of the stack trace, e.g.:
And so rather than using the last frame's sp, you need to use the second-to-last frame's sp.
This gets curiouser and curiouser :) Drgn unwinds through the IRQ back into the task stack with no problems on 5.10... but crash can't handle that kernel. I'm updating the original comment with a table summarizing all of the results. Super bizarre stuff. The original comment also now has my reproducer branch linked with instructions.
Ok, table is now updated with every vmtest kernel version for x86_64/default. I also ran the test with
With the full table, I can at least start compiling some possible kernel commits to "blame". Not that the bug is necessarily the kernel's fault, but merely that there's something different between kernel versions which may point us to the part of the unwind that is correct or incorrect.
If we can find a suspect change, it would be good to revert it and see if that allows drgn to unwind past the exception frame again.
Thanks again for the reproducer! I got it running locally and investigated a bit. I found a nasty typo in the ORC unwinder:

```diff
diff --git a/libdrgn/arch_x86_64.c b/libdrgn/arch_x86_64.c
index 57ebcd4..d65b181 100644
--- a/libdrgn/arch_x86_64.c
+++ b/libdrgn/arch_x86_64.c
@@ -76,7 +76,7 @@ orc_to_cfi_x86_64(const struct drgn_orc_entry *orc,
 		break;
 	case ORC_REG_SP_INDIRECT:
 		rule.kind = DRGN_CFI_RULE_AT_REGISTER_ADD_OFFSET;
-		rule.regno = DRGN_REGISTER_NUMBER(rbp);
+		rule.regno = DRGN_REGISTER_NUMBER(rsp);
 		rule.offset = orc->sp_offset;
 		break;
 	case ORC_REG_BP_INDIRECT:
```

With that fixed, 6.2.16-vmtest21.1default works with
But not without it:
This means that
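To make the effect of that one-register fix concrete, here is a toy simulation (not drgn code; the register and memory values are invented) of how a rule like `DRGN_CFI_RULE_AT_REGISTER_ADD_OFFSET` would resolve the CFA under the buggy and fixed register choices, assuming `ORC_REG_SP_INDIRECT` means "load the CFA from memory at sp + sp_offset":

```python
# Simulated register state at the point of unwinding (made-up values).
regs = {"rsp": 0x1000, "rbp": 0x2000}

# Fake memory: the real saved stack-pointer value lives at rsp + 8,
# while rbp + 8 points at unrelated garbage.
memory = {0x1000 + 8: 0x3000, 0x2000 + 8: 0xDEAD}
sp_offset = 8

def resolve_cfa(base_reg: str) -> int:
    """AT_REGISTER_ADD_OFFSET-style rule: CFA = *(base_reg + sp_offset)."""
    return memory[regs[base_reg] + sp_offset]

cfa_fixed = resolve_cfa("rsp")   # correct base register after the patch
cfa_buggy = resolve_cfa("rbp")   # what the typo computed instead

assert cfa_fixed == 0x3000
assert cfa_buggy == 0xDEAD
```

With the wrong base register, the unwinder dereferences a plausible-looking but wrong stack slot, which is exactly the kind of failure that only shows up on code paths (like `ORC_REG_SP_INDIRECT`) that are rarely exercised.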
I applied your ORC patch and re-ran
So that was a pretty awesome fix, great find there! For the remainder of the matrix, it seems likely that there was some change in the 5.12 release cycle which broke the DWARF CFI unwinding. The commit I linked above was definitely not the commit that caused it, but it was part of a series that reworked the x86 IRQ entry code. It seems likely that the change is in there. Comparing the ORC unwind on 5.12 and 5.11, we can see that the stack traces are in fact different: 5.12:
5.11:
In 5.11 we have the
The patch series you linked to is indeed the problem: since that series, the stack switch happens in inline assembly via macros. This type of stack switch seems to be represented by
Just an interesting note regarding unwinding through exception frames... There's a reason that I've probably noticed more issues with this than your average drgn user. The Oracle UEK 5 & 6 kernels that I tend to do a lot of support on (4.14 and 5.4 based) do not have ORC enabled! They do have frame pointers, and I typically have DWARF available, but there's no ORC to fall back on. Contrast that to UEK7 (5.15 based), which has ORC enabled and no frame pointers.
I've regularly seen this with customer vmcores, and I decided to reproduce this with the vmtest kernels so that it would be super easy for anyone to reproduce and debug. I brought up this issue in #206, but I believe that was a different (possibly related) issue, so I thought it would be cleaner to keep this one separate.
Suppose that during an interrupt handler, the kernel calls `panic()`. One common instance of this would be the soft lockup detector. Unfortunately, the soft lockup detector is not enabled on the vmtest kernels. So instead, I implemented a simple kernel module that calls `panic()` in IRQ context, by way of an IPI:

I then configured kdump inside the vmtest kernel and loaded this module, triggering a panic. In the kexec kernel, I used `makedumpfile /proc/vmcore /io/vmcore` to write the vmcore.

The kernel's dmesg for the panic included the following stack trace:

So there are two segments: first, the IRQ stack where the IPI was handled and `panic()` was called. But beneath that is the idle task's stack. In other situations (e.g. the soft lockup case), the interrupted task may not be the idle task; it may be some very interesting task. We should be able to see this stack, so long as the task was executing in the kernel -- obviously userspace stacks aren't necessarily in scope when debugging a kernel.

However, drgn (0.0.22, and main) cannot unwind past the IRQ frame:
For comparison, crash is normally able to unwind past the IRQ frames, e.g.:

I say normally because in more recent kernels, crash actually fails to go past the IRQ frame with a `bt: WARNING: possibly bogus exception frame`. I've tested the following on crash 8.0.3 and drgn main:

- 6.3.7-vmtest21.1default (??? frame)
- 5.15.116-vmtest21.1default (??? frame)
- 5.14.21-vmtest21.1default (??? frame)
- 5.13.19-vmtest21.1default (??? frame)
- 5.12.19-vmtest21.1default (??? frame)
- 5.11.22-vmtest21.1default
- 5.10.183-vmtest21.1default
- 5.4.246-vmtest21.1default
- 4.19.285-vmtest21.1default (??? frame) (??? frame)
- 4.14.317-vmtest21.1default (??? frame) (??? frame)

Reproducer steps:
- The `drgn_test.ko` module
- `makedumpfile` installed (from kexec-tools)
- `python -m vmtest.kmod -k $KVER` to build the module
- `mkdir io`
- `python -m vmtest.vm -k $KVER -w $(pwd)/io` to run the VM
- `python3 -m vmtest.enter_kdump -n` to configure the kexec kernel (but skip panicking)
- `insmod build/vmtest/x86_64/drgn_test-$KVER.ko` to trigger the panic
- `makedumpfile /proc/vmcore /io/vmcore-$KVER` to create the dump

I have encountered some low-memory issues, but none yet have prevented saving the vmcore. I assume increasing the crashkernel reservation would resolve them.