Message-ID: <20210927234543.6waods7rraxseind@treble>
Date: Mon, 27 Sep 2021 16:45:43 -0700
From: Josh Poimboeuf <jpoimboe@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Dmitry Vyukov <dvyukov@...gle.com>, Marco Elver <elver@...gle.com>,
syzbot <syzbot+d08efd12a2905a344291@...kaller.appspotmail.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk,
the arch/x86 maintainers <x86@...nel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
kasan-dev <kasan-dev@...glegroups.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [syzbot] upstream test error: KFENCE: use-after-free in
kvm_fastop_exception

On Mon, Sep 27, 2021 at 04:07:51PM +0000, Sean Christopherson wrote:
> I was asking about the exact location to confirm that the explosion is indeed
> from exception fixup, which is the "unwinder scenario get confused" I was thinking
> of. Based on the disassembly from syzbot, that does indeed appear to be the case
> here, i.e. this
>
> 2a: 4c 8b 21 mov (%rcx),%r12
>
> is from exception fixup from somewhere in __d_lookup (can't tell exactly what
> it's from, maybe KASAN?).
>
> > Is there more info on this "the unwinder gets confused"? Bug filed
> > somewhere or an email thread? Is it on anybody's radar?
>
> I don't know if there's a bug report or if this is on anyone's radar. The issue
> I've encountered in the past, and what I'm pretty sure is being hit here, is that
> the ORC unwinder doesn't play nice with out-of-line fixup code, presumably because
> there are no tables for the fixup. I believe kvm_fastop_exception() gets blamed
> because it's the first label that's found when searching back through the tables.

The ORC unwinder actually knows about .fixup, and unwinding through the
.fixup code worked here, as evidenced by the entire stacktrace getting
printed. Otherwise there would have been a bunch of question marks in
the stack trace.

The problem reported here -- falsely printing kvm_fastop_exception -- is
actually in the arch-independent printing of symbol names, done by
__sprint_symbol(). Most .fixup code fragments are anonymous, in the
sense that they don't have symbols associated with them. For x86, here
are the only defined symbols in .fixup:

  ffffffff81e02408 T kvm_fastop_exception
  ffffffff81e02728 t .E_read_words
  ffffffff81e0272b t .E_leading_bytes
  ffffffff81e0272d t .E_trailing_bytes
  ffffffff81e02734 t .E_write_words
  ffffffff81e02740 t .E_copy

There's a lot of anonymous .fixup code which happens to be placed in the
gap between "kvm_fastop_exception" and ".E_read_words". The kernel
symbol printing code will go backwards from the given address and will
print the first symbol it finds. So any anonymous code in that gap will
falsely be reported as kvm_fastop_exception().
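
The misattribution described above can be illustrated with a small
userspace sketch. This is not the kernel's kallsyms implementation; the
names struct sym, nearest_symbol() and format_symbol() are hypothetical
and exist only for this example. The table holds the six .fixup symbols
listed above, and the lookup returns the closest symbol at or below the
queried address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical symbol entry; the real kallsyms tables are laid out differently. */
struct sym {
    uint64_t addr;
    const char *name;
};

/* The .fixup symbols quoted in the message, sorted by address. */
static const struct sym fixup_syms[] = {
    { 0xffffffff81e02408ULL, "kvm_fastop_exception" },
    { 0xffffffff81e02728ULL, ".E_read_words" },
    { 0xffffffff81e0272bULL, ".E_leading_bytes" },
    { 0xffffffff81e0272dULL, ".E_trailing_bytes" },
    { 0xffffffff81e02734ULL, ".E_write_words" },
    { 0xffffffff81e02740ULL, ".E_copy" },
};

#define NSYMS (sizeof(fixup_syms) / sizeof(fixup_syms[0]))

/* Return the last symbol whose address is <= addr, or NULL if there is none. */
static const struct sym *nearest_symbol(uint64_t addr)
{
    size_t lo = 0, hi = NSYMS;  /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (fixup_syms[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo ? &fixup_syms[lo - 1] : NULL;
}

/* Format "name+0xoffset", roughly as a stack trace line would show it. */
static int format_symbol(char *buf, size_t len, uint64_t addr)
{
    const struct sym *s = nearest_symbol(addr);

    assert(buf != NULL && len > 0);
    if (!s)
        return snprintf(buf, len, "0x%llx", (unsigned long long)addr);
    return snprintf(buf, len, "%s+0x%llx", s->name,
                    (unsigned long long)(addr - s->addr));
}
```

With this table, any address in the gap (for instance the made-up
address 0xffffffff81e02500) resolves to kvm_fastop_exception, even if
the code there belongs to an unrelated anonymous fixup fragment.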

I'm thinking the ideal way to fix this would be getting rid of the
.fixup section altogether, and instead place a function's corresponding
fixup code in a cold part of the original function, with the help of
asm_goto and cold label attributes.

That way, the original faulting function would be printed instead of an
obscure reference to an anonymous .fixup code fragment. It would have
other benefits as well. For example, not breaking livepatch...

I'll try to play around with it.

--
Josh