Message-ID: <CAKwvOdnOpgo9rEctZZR9Y9rEc60FCthbPtp62UsdMtkGDF5nUg@mail.gmail.com>
Date: Fri, 12 Jul 2019 09:59:02 -0700
From: Nick Desaulniers <ndesaulniers@...gle.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: Jann Horn <jannh@...gle.com>,
    Peter Zijlstra <peterz@...radead.org>,
    Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
    clang-built-linux <clang-built-linux@...glegroups.com>,
    Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: objtool crashes on clang output (drivers/hwmon/pmbus/adm1275.o)

On Fri, Jul 12, 2019 at 7:29 AM Josh Poimboeuf <jpoimboe@...hat.com> wrote:
>
> On Fri, Jul 12, 2019 at 04:19:02PM +0200, Arnd Bergmann wrote:
> > On Fri, Jul 12, 2019 at 3:57 PM Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > >
> > > On Fri, Jul 12, 2019 at 09:51:35AM +0200, Arnd Bergmann wrote:
> > > > I no longer see any of the "can't find switch jump table" warnings
> > > > in last night's randconfig builds. I do see one other rare warning,
> > > > see attached object file:
> > > >
> > > > fs/reiserfs/do_balan.o: warning: objtool: replace_key()+0x158: stack state mismatch: cfa1=7+40 cfa2=7+56
> > > > fs/reiserfs/do_balan.o: warning: objtool: balance_leaf()+0x2791: stack state mismatch: cfa1=7+176 cfa2=7+192
> > > > fs/reiserfs/ibalance.o: warning: objtool: balance_internal()+0xe8f: stack state mismatch: cfa1=7+240 cfa2=7+248
> > > > fs/reiserfs/ibalance.o: warning: objtool: internal_move_pointers_items()+0x36f: stack state mismatch: cfa1=7+152 cfa2=7+144
> > > > fs/reiserfs/lbalance.o: warning: objtool: leaf_cut_from_buffer()+0x58b: stack state mismatch: cfa1=7+128 cfa2=7+112
> > > > fs/reiserfs/lbalance.o: warning: objtool: leaf_copy_boundary_item()+0x7a9: stack state mismatch: cfa1=7+104 cfa2=7+96
> > > > fs/reiserfs/lbalance.o: warning: objtool: leaf_copy_items_entirely()+0x3d2: stack state mismatch: cfa1=7+120 cfa2=7+128
> > > >
> > > > I suspect this comes from the calls to the __reiserfs_panic() noreturn function,
> > > > but have not actually looked at the object file.
> > >
> > > Looking at one of the examples:
> > >
> > >     2346:  0f 85 6a 01 00 00      jne    24b6 <leaf_copy_items_entirely+0x3a8>
> > >     ...
> > >     23b1:  e9 2a 01 00 00         jmpq   24e0 <leaf_copy_items_entirely+0x3d2>
> > >     ...
> > >     24b6:  31 ff                  xor    %edi,%edi
> > >     24b8:  48 c7 c6 00 00 00 00   mov    $0x0,%rsi
> > >                 24bb: R_X86_64_32S    .rodata.str1.1
> > >     24bf:  48 c7 c2 00 00 00 00   mov    $0x0,%rdx
> > >                 24c2: R_X86_64_32S    .rodata.str1.1+0x127b
> > >     24c6:  48 c7 c1 00 00 00 00   mov    $0x0,%rcx
> > >                 24c9: R_X86_64_32S    .rodata.str1.1+0x1679
> > >     24cd:  41 b8 90 01 00 00      mov    $0x190,%r8d
> > >     24d3:  49 c7 c1 00 00 00 00   mov    $0x0,%r9
> > >                 24d6: R_X86_64_32S    .rodata.str1.1+0x127b
> > >     24da:  b8 00 00 00 00         mov    $0x0,%eax
> > >     24df:  55                     push   %rbp
> > >     24e0:  41 52                  push   %r10
> > >     24e2:  e8 00 00 00 00         callq  24e7 <leaf_item_bottle>
> > >                 24e3: R_X86_64_PC32   __reiserfs_panic-0x4
> > >
> > > Objtool is correct this time: There *is* a stack state mismatch at
> > > 0x24e0. The stack size is different at 0x24e0, depending on whether it
> > > came from 0x2346 or from 0x23b1.
> > >
> > > In this case it's not a problem for code flow, because the basic block
> > > is a dead end.
> > >
> > > But it *is* a problem for unwinding. The location of the previous stack
> > > frame is nondeterministic.
> > >
> > > And that's extra important for calls to noreturn functions, because they
> > > often dump the stack before exiting.
> > >
> > > So it looks like a compiler bug to me.
> >
> > The change below would shut up the warnings, and presumably avoid
> > the unwinding problem as well. Should I submit that for inclusion,
> > or should we try to fix clang first?
>
> That should work, though I guess it's up to the reiserfs maintainers.
>
> The issue still needs to get fixed in clang regardless. There are other
> noreturn functions in the kernel and this problem could easily pop back
> up.

Sure, thanks for the report. Arnd, can you help us reduce this to a
smaller test case so we can understand the issue better?
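
For anyone following along, the mismatch Josh describes at 0x24e0 can be
modelled roughly as below. This is only a toy sketch, not objtool's actual
code: the struct and the helpers walk_path(), check_merge() and demo_24e0()
are made-up names for illustration, and the shared entry state of 7+120 is
an assumption taken from the leaf_copy_items_entirely() warning quoted
above. Which of the two paths objtool reports as cfa1 and which as cfa2 is
also assumed here.

```c
#include <stdio.h>

/* Toy stand-in for a tracked CFA: "base register + offset", e.g. 7+120. */
struct cfa_state {
    int base;    /* register number as printed in the warning (7 here) */
    int offset;  /* offset in bytes, as printed after the '+' */
};

/* Apply a sequence of stack adjustments (a push adds 8 on x86-64). */
static struct cfa_state walk_path(struct cfa_state s, const int *deltas, int n)
{
    for (int i = 0; i < n; i++)
        s.offset += deltas[i];
    return s;
}

/*
 * Compare the states that two paths bring to the same instruction.
 * Returns 1 and formats a warning-like message into buf on mismatch,
 * returns 0 and leaves buf empty otherwise.
 */
static int check_merge(struct cfa_state a, struct cfa_state b,
                       char *buf, size_t len)
{
    if (a.base == b.base && a.offset == b.offset) {
        if (len)
            buf[0] = '\0';
        return 0;
    }
    snprintf(buf, len, "stack state mismatch: cfa1=%d+%d cfa2=%d+%d",
             a.base, a.offset, b.base, b.offset);
    return 1;
}

/*
 * Model the merge point at 0x24e0: the jmpq at 0x23b1 lands there directly,
 * while the path through 0x24b6 executes "push %rbp" at 0x24df first.
 */
static int demo_24e0(char *buf, size_t len)
{
    const struct cfa_state entry = { 7, 120 };  /* assumed common state */
    const int push_rbp[] = { 8 };               /* one 8-byte push */
    struct cfa_state direct = walk_path(entry, NULL, 0);
    struct cfa_state via_push = walk_path(entry, push_rbp, 1);

    return check_merge(direct, via_push, buf, len);
}
```

Calling demo_24e0() with a buffer of 128 bytes or so returns 1 and fills the
buffer with a message of the same shape as the objtool warning, since the
two paths disagree by the size of one push. The warnings that differ by 16
would correspond to two such pushes under the same model.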
--
Thanks,
~Nick Desaulniers