Date:   Wed, 27 Oct 2021 17:55:15 +0200
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Sami Tolvanen <samitolvanen@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Mark Rutland <mark.rutland@....com>, X86 ML <x86@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Nathan Chancellor <nathan@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Sedat Dilek <sedat.dilek@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-hardening@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        llvm@...ts.linux.dev
Subject: Re: [PATCH v5 00/15] x86: Add support for Clang CFI

On Wed, 27 Oct 2021 at 17:50, Sami Tolvanen <samitolvanen@...gle.com> wrote:
>
> On Wed, Oct 27, 2021 at 7:18 AM Ard Biesheuvel <ardb@...nel.org> wrote:
> >
> > On Wed, 27 Oct 2021 at 16:03, Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Wed, Oct 27, 2021 at 03:30:11PM +0200, Ard Biesheuvel wrote:
> > >
> > > > As far as I can tell from playing around with Clang, the stubs can
> > > > actually be executed directly,
> > >
> > > I had just finished reading the clang docs which suggest as much and was
> > > about to try what the compiler actually generates.
> > >
> > > > > they just jump to the actual function.
> > > > The compiler simply generates a jump table for each prototype that
> > > > appears in the code as the target of an indirect jump, and checks
> > > > whether the target appears in the list.
> > > >
> > > > E.g., the code below
> > > >
> > > > void foo(void) {}
> > > > void bar(int) {}
> > > > void baz(int) {}
> > > > void (* volatile fn1)(void) = foo;
> > > > void (* volatile fn2)(int) = bar;
> > > >
> > > > int main(int argc, char *argv[])
> > > > {
> > > >   fn1();
> > > >   fn2 = baz;
> > > >   fn2(-1);
> > > > }
> > > >
> > > > produces
> > > >
> > > > 0000000000400594 <foo.cfi>:
> > > >   400594: d65f03c0 ret
> > > >
> > > > 0000000000400598 <bar.cfi>:
> > > >   400598: d65f03c0 ret
> > > >
> > > > 000000000040059c <baz.cfi>:
> > > >   40059c: d65f03c0 ret
> > >
> > > Right, so these are the actual functions ^.
> > >
> > > > 00000000004005a0 <main>:
> > > >   4005a0: a9bf7bfd stp x29, x30, [sp, #-16]!
> > > >
> > > > // First indirect call
> > > >   4005a4: b0000088 adrp x8, 411000 <__libc_start_main@...BC_2.17>
> > > >   4005a8: f9401508 ldr x8, [x8, #40]
> > > >   4005ac: 90000009 adrp x9, 400000 <__abi_tag-0x278>
> > > >   4005b0: 91182129 add x9, x9, #0x608
> > > >   4005b4: 910003fd mov x29, sp
> > > >   4005b8: eb09011f cmp x8, x9
> > > >   4005bc: 54000241 b.ne 400604 <main+0x64>  // b.any
> > > >   4005c0: d63f0100 blr x8
> > >
> > > That's impenetrable to me, sorry.
> > >
> >
> > This loads the value of fn1 in x8, and takes the address of the jump
> > table in x9. Since it is only one entry long, it does a simple compare
> > to check whether x8 appears in the jump table, and branches to the BRK
> > at the end if they are different.
> >
> > > > // Assignment of fn2
> > > >   4005c4: 90000009 adrp x9, 400000 <__abi_tag-0x278>
> > > >   4005c8: b0000088 adrp x8, 411000 <__libc_start_main@...BC_2.17>
> > > >   4005cc: 91184129 add x9, x9, #0x610
> > > >   4005d0: f9001909 str x9, [x8, #48]
> > >
> > > I'm struggling here, x9 points to the branch at 400610, but then what?
> > >
> > > x8 is in .data somewhere?
> > >
> >
> > This takes the address of the jump table entry of 'baz' in x9, and
> > stores it in fn2 whose address is taken in x8.
> >
> >
> > > > // Second indirect call
> > > >   4005d4: f9401908 ldr x8, [x8, #48]
> > > >   4005d8: 90000009 adrp x9, 400000 <__abi_tag-0x278>
> > > >   4005dc: 91183129 add x9, x9, #0x60c
> > > >   4005e0: cb090109 sub x9, x8, x9
> > > >   4005e4: 93c90929 ror x9, x9, #2
> > > >   4005e8: f100053f cmp x9, #0x1
> > > >   4005ec: 540000c8 b.hi 400604 <main+0x64>  // b.pmore
> > > >   4005f0: 12800000 mov w0, #0xffffffff            // #-1
> > > >   4005f4: d63f0100 blr x8
> > > >
> > > >
> > > >   4005f8: 2a1f03e0 mov w0, wzr
> > > >   4005fc: a8c17bfd ldp x29, x30, [sp], #16
> > > >   400600: d65f03c0 ret
> > > >   400604: d4200020 brk #0x1
> > >
> > >
> > > > 0000000000400608 <__typeid__ZTSFvvE_global_addr>:
> > > >   400608: 17ffffe3 b 400594 <foo.cfi>
> > > >
> > > > 000000000040060c <__typeid__ZTSFviE_global_addr>:
> > > >   40060c: 17ffffe3 b 400598 <bar.cfi>
> > > >   400610: 17ffffe3 b 40059c <baz.cfi>
> > >
> > > And these are the stubs per type.
> > >
> > > > So it looks like taking the address is fine, although not optimal due
> > > > to the additional jump.
> > >
> > > Right.
> > >
> >
> > ... although it does seem that function_nocfi() doesn't actually work
> > as expected, given that we want the address of <func>.cfi and not the
> > address of the stub.
>
> This is because the example wasn't compiled with
> -fno-sanitize-cfi-canonical-jump-tables, which we use in the kernel.
> With non-canonical jump tables, <func> continues to point to the
> function and <func>.cfi_jt points to the jump table, and therefore,
> function_nocfi() returns the raw function address.
>

Ah excellent. So that means that we don't need function_nocfi() at
all, given that
- statically allocated references (i.e., DEFINE_STATIC_CALL()) will
refer to the function directly;
- runtime assignments can decode the target of the *func pointer and
strip off the initial branch.

It would still be nice to have an intrinsic for that, or some variable
attribute that signifies that assigning the address of a function to
it will produce the actual function rather than the jump table entry.
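
For illustration only, here is a rough user-space sketch of what "decode
the target and strip off the initial branch" could look like on arm64.
It assumes the jump table entry is a single unconditional B instruction,
as in the disassembly quoted above. The helper names aarch64_is_b() and
aarch64_b_target() are made up for this example and are not existing
kernel functions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero if insn is an unconditional
 * AArch64 "B <label>" instruction (bits 31..26 == 0b000101).
 */
static int aarch64_is_b(uint32_t insn)
{
	return (insn & 0xfc000000u) == 0x14000000u;
}

/*
 * Hypothetical helper: given the address (pc) of a B instruction and
 * its encoding, return the branch target. The low 26 bits hold a
 * signed word offset relative to the instruction itself.
 */
static uint64_t aarch64_b_target(uint64_t pc, uint32_t insn)
{
	uint32_t imm26 = insn & 0x03ffffffu;
	/* Sign-extend the 26-bit field, then scale to a byte offset. */
	int64_t words = (int64_t)(imm26 ^ 0x02000000u) - 0x02000000;

	return pc + (uint64_t)(words * 4);
}
```

Fed the entry at 400608 from the dump above (encoding 17ffffe3), this
yields 400594, i.e. foo.cfi. A real implementation would read the
instruction from kernel text and would have to handle entries that are
not a plain B (e.g. with BTI landing pads), which this sketch ignores.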

> > > > We could fudge around that by checking the
> > > > opcode at the target of the call, or token paste ".cfi" after the
> > > > > symbol name in the static_call_update() macro, but it doesn't look
> > > > like anything is terminally broken tbh.
> > >
> > > Agreed, since the jump table entries are actually executable it 'works'.
> > >
> > > I really don't like that extra jump though, so I think I really do want
> > > that nocfi_ptr() thing. And going by:
> > >
> > >   https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html#forward-edge-cfi-for-indirect-function-calls
> > >
> > > and the above, that might be possible (on x86) with something like:
> > >
> > > /*
> > >  * Turns a Clang CFI jump-table entry into an actual function pointer.
> > > >  * These jump-table entries are simply jmp.d32 instructions with their
> > >  * relative offset pointing to the actual function, therefore decode the
> > >  * instruction to find the real function.
> > >  */
> > > static __always_inline void *nocfi_ptr(void *func)
> > > {
> > >         union text_poke_insn insn = *(union text_poke_insn *)func;
> > >
> > >         return func + sizeof(insn) + insn.disp;
> > > }
> > >
> > > But really, that wants to be a compiler intrinsic.
> >
> > Agreed. We could easily do something similar on arm64, but I'd prefer
> > to avoid that too.
>
> I'll see what we can do. Note that the compiler built-in we previously
> discussed would have semantics similar to function_nocfi(). It would
> return the raw function address from a symbol name, but it wouldn't
> decode the address from an arbitrary pointer, so this would require
> something different.
>
> Sami
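
As a side note on the second indirect call in the listing above, the
sub/ror/cmp sequence can be read as a combined alignment and range
check against the jump table. Below is a minimal C sketch of that
check, assuming 4-byte jump table entries as on arm64. The function
names and the nentries parameter are invented for the example; the
compiler emits the table size as an immediate instead.

```c
#include <stdint.h>

/* Rotate a 64-bit value right by n bits (n must be in 1..63). */
static uint64_t ror64(uint64_t v, unsigned int n)
{
	return (v >> n) | (v << (64 - n));
}

/*
 * Hypothetical model of the emitted check: returns nonzero if target
 * is one of the nentries 4-byte slots starting at table.
 *
 * Rotating the byte offset right by 2 turns it into a slot index and
 * moves any misaligned low bits to the top, so a misaligned or
 * out-of-range pointer (including one below the table, which wraps
 * around) produces a huge value and fails the single unsigned compare.
 */
static int cfi_target_ok(uint64_t target, uint64_t table, uint64_t nentries)
{
	return ror64(target - table, 2) < nentries;
}
```

With table = 40060c and nentries = 2 this accepts 40060c and 400610
(the bar and baz entries) and rejects everything else, including the
raw address of bar.cfi at 400598. That matches the "cmp x9, #0x1;
b.hi" pair in the listing.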
