Message-ID: <YifuVmkcb1ie7bzk@shell.armlinux.org.uk>
Date: Wed, 9 Mar 2022 00:01:26 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Corentin Labbe <clabbe.montjoie@...il.com>,
Linus Walleij <linus.walleij@...aro.org>,
Arnd Bergmann <arnd@...db.de>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: boot flooded with unwind: Index not found
On Wed, Mar 02, 2022 at 11:22:29AM +0000, Russell King (Oracle) wrote:
> On Wed, Mar 02, 2022 at 12:19:40PM +0100, Ard Biesheuvel wrote:
> > On Wed, 2 Mar 2022 at 12:12, Russell King (Oracle)
> > <linux@...linux.org.uk> wrote:
> > >
> > > On Wed, Mar 02, 2022 at 11:09:49AM +0100, Corentin Labbe wrote:
> > > > The crash disappeared (but the suspicious RCU usage is still here).
> > >
> > > As the trace on those is:
> > >
> > > [ 0.239629] unwind_backtrace from show_stack+0x10/0x14
> > > [ 0.239654] show_stack from init_stack+0x1c54/0x2000
> > >
> > > unwind_backtrace() and show_stack() are both C code, the compiler will
> > > emit the unwind information for it. show_stack() isn't called from
> > > assembly code, only from C code, so the next function's unwind
> > > information should also be generated by the compiler.
> > >
> > > However, init_stack is not a function - it's an array of unsigned long.
> > > There is no way this should appear in the trace, and this suggests that
> > > the unwind of show_stack() has gone wrong.
> > >
> > > I don't see anything obvious in Ard's changes that would cause that
> > > though.
> > >
> > > Did it used to work fine with previous versions of linux-next - those
> > > versions where we had Ard's "arm-vmap-stacks-v6" tag merged in
> > > (commit 2fa394824493) and did this only appear when I merged
> > > "arm-ftrace-for-rmk" (commit 74aaaa1e9bba) ? Did merging
> > > "arm-ftrace-for-rmk" cause any change in your .config?
> > >
> >
> > I can reproduce the RCU warnings, and I have tracked this down to the
> > change I made to return_address() for the graph tracer, which I
> > thought was justified after removing the call to
> > kernel_text_address():
> >
> > --- a/arch/arm/include/asm/ftrace.h
> > +++ b/arch/arm/include/asm/ftrace.h
> > @@ -35,26 +35,8 @@ static inline unsigned long
> > ftrace_call_adjust(unsigned long addr)
> >
> > #ifndef __ASSEMBLY__
> >
> > -#if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND)
> > -/*
> > - * return_address uses walk_stackframe to do it's work. If both
> > - * CONFIG_FRAME_POINTER=y and CONFIG_ARM_UNWIND=y walk_stackframe uses unwind
> > - * information. For this to work in the function tracer many functions would
> > - * have to be marked with __notrace. So for now just depend on
> > - * !CONFIG_ARM_UNWIND.
> > - */
> > -
> > void *return_address(unsigned int);
> >
> > -#else
> > -
> > -static inline void *return_address(unsigned int level)
> > -{
> > - return NULL;
> > -}
> > -
> > -#endif
> > -
> > #define ftrace_return_address(n) return_address(n)
> >
> > #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
> >
> > However, the function graph tracer works happily with this bit
> > reverted, and so that is probably the best course of action here.
> >
> > I have already sent the patch that reintroduces the
> > kernel_text_address() check - would you prefer a v2 of that one with
> > this change incorporated? Or a second patch that just reverts the
> > above? (Given that the bogus dereference was invoked from
> > return_address() as well, I suspect that this change would make the
> > get_kernel_nofault() change I proposed in this thread redundant)
>
> I'd prefer patches on top of my devel-stable branch, thanks.

To reiterate what I've just put on IRC - we have not got to the bottom
of this problem yet - it still very much exists.

There seems to be something of a fundamental issue with the unwinder:
it now appears to be going wrong and failing to unwind beyond a
couple of functions, and the address it's coming out with appears to
be incorrect. I've only just discovered this because I created my very
own bug, and yet again, the timing sucks with the proximity of the
merge window.

I'm getting:

[ 13.198803] [<c0017728>] (unwind_backtrace) from [<c0012828>] (show_stack+0x10/0x14)
[ 13.198820] [<c0012828>] (show_stack) from [<c2be78d4>] (0xc2be78d4)

for the WARN_ON() stacktrace, and that address that apparently called
show_stack() is most definitely rubbish and incorrect. This makes any
WARN_ON() condition undebuggable.
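
To illustrate why an address like 0xc2be78d4 cannot be a caller of
show_stack(): a return address has to lie inside executable kernel text,
which is the kind of test kernel_text_address() applies. The following is
a minimal userspace sketch of that idea only. struct text_range,
plausible_return_address() and the bounds in the note below are made up
for illustration and are not taken from this kernel build; the real
kernel_text_address() also accepts module text and other executable
regions.

```c
#include <stdbool.h>

/* Hypothetical description of one executable text region: [start, end). */
struct text_range {
	unsigned long start;	/* first address inside the region */
	unsigned long end;	/* first address past the region */
};

/*
 * Return true if addr could be a return address, i.e. it falls inside
 * the given text region. An unwinder that reports a caller outside of
 * any text region has gone wrong somewhere.
 */
bool plausible_return_address(const struct text_range *text,
			      unsigned long addr)
{
	return addr >= text->start && addr < text->end;
}
```

With an assumed text range of 0xc0008000..0xc1000000, the show_stack
address 0xc0012828 passes this check and 0xc2be78d4 is rejected.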

This is with both 9183/1 and 9184/1 applied on top of pulling your
"arm-ftrace-for-rmk" tag and also with just the "arm-vmap-stacks-v6"
tag. This seems to point at one of these patches breaking the
unwinder:

a1c510d0adc6 ARM: implement support for vmap'ed stacks
532319b9c418 ARM: unwind: disregard unwind info before stack frame is set up
4ab6827081c6 ARM: unwind: dump exception stack from calling frame
b6506981f880 ARM: unwind: support unwinding across multiple stacks

Given that the unwinder is broken, I wonder whether 9183/1 and 9184/1
are actually required.

I did try to point this problem out a few emails back:

"As the trace on those is:

[ 0.239629] unwind_backtrace from show_stack+0x10/0x14
[ 0.239654] show_stack from init_stack+0x1c54/0x2000

unwind_backtrace() and show_stack() are both C code, the compiler will
emit the unwind information for it. show_stack() isn't called from
assembly code, only from C code, so the next function's unwind
information should also be generated by the compiler.

However, init_stack is not a function - it's an array of unsigned long.
There is no way this should appear in the trace, and this suggests that
the unwind of show_stack() has gone wrong."

In Corentin's case, there is no way init_stack should ever appear in
the stack trace. In my case, it's not init_stack, but 0xc2be78d4.

Can you try testing out a dummy WARN_ON(1) test in your kernel please?

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!