[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211029200324.GR174703@worktop.programming.kicks-ass.net>
Date: Fri, 29 Oct 2021 22:03:24 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Sami Tolvanen <samitolvanen@...gle.com>
Cc: Ard Biesheuvel <ardb@...nel.org>,
Mark Rutland <mark.rutland@....com>, X86 ML <x86@...nel.org>,
Kees Cook <keescook@...omium.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Nathan Chancellor <nathan@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Sedat Dilek <sedat.dilek@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
linux-hardening@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
llvm@...ts.linux.dev
Subject: Re: [PATCH v5 00/15] x86: Add support for Clang CFI
On Wed, Oct 27, 2021 at 08:50:17AM -0700, Sami Tolvanen wrote:
> On Wed, Oct 27, 2021 at 7:18 AM Ard Biesheuvel <ardb@...nel.org> wrote:
> > > /*
> > > * Turns a Clang CFI jump-table entry into an actual function pointer.
> > > * These jump-table entries are simply jmp.d32 instruction with their
> > > * relative offset pointing to the actual function, therefore decode the
> > > * instruction to find the real function.
> > > */
> > > static __always_inline void *nocfi_ptr(void *func)
> > > {
> > > union text_poke_insn insn = *(union text_poke_insn *)func;
> > >
> > > return func + sizeof(insn) + insn.disp;
> > > }
> > >
> > > But really, that wants to be a compiler intrinsic.
> >
> > Agreed. We could easily do something similar on arm64, but I'd prefer
> > to avoid that too.
>
> I'll see what we can do. Note that the compiler built-in we previously
> discussed would have semantics similar to function_nocfi(). It would
> return the raw function address from a symbol name, but it wouldn't
> decode the address from an arbitrary pointer, so this would require
> something different.
So I had a bit of a peek at what clang generates:
3fa4: 48 c7 c7 00 00 00 00 mov $0x0,%rdi 3fa7: R_X86_64_32S __SCK__x86_pmu_handle_irq
3fab: 48 c7 c6 00 00 00 00 mov $0x0,%rsi 3fae: R_X86_64_32S __SCT__x86_pmu_handle_irq.cfi_jt
3fb2: e8 00 00 00 00 call 3fb7 <init_hw_perf_events+0x1dc> 3fb3: R_X86_64_PLT32 __static_call_update-0x4
So this then gives the trampoline jump table entry to
__static_call_update(), with the result that it will rewrite the
jump-table entry, not the trampoline!
Now it so happens that the trampoline looks *exactly* like the
jump-table entry (one jmp.d32 instruction), so in that regards it'll
again 'work'.
But this is all really, as in *really*, wrong. And I'm really sad I'm
the one to have to discover this, even though I've mentioned
static_call()s being tricky in previous reviews.
Powered by blists - more mailing lists