lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yj3OEI+WHV/A5uf8@hirez.programming.kicks-ass.net>
Date:   Fri, 25 Mar 2022 15:13:36 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Mark Rutland <mark.rutland@....com>
Cc:     Nick Desaulniers <ndesaulniers@...gle.com>,
        Borislav Petkov <bp@...en8.de>,
        Nathan Chancellor <nathan@...nel.org>, x86-ml <x86@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>, llvm@...ts.linux.dev,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        linux-toolchains@...r.kernel.org
Subject: Re: clang memcpy calls


+linux-toolchains

On Fri, Mar 25, 2022 at 12:15:28PM +0000, Mark Rutland wrote:
> On Thu, Mar 24, 2022 at 11:43:46AM -0700, Nick Desaulniers wrote:
> > On Thu, Mar 24, 2022 at 4:19 AM Borislav Petkov <bp@...en8.de> wrote:
> > >
> > > Hi folks,
> > >
> > > so I've been looking at a recent objtool noinstr warning from clang
> > > builds:
> > >
> > > vmlinux.o: warning: objtool: sync_regs()+0x20: call to memcpy() leaves .noinstr.text section
> > >
> > > The issue is that clang generates a memcpy() call when a struct copy
> > > happens:
> > >
> > >         if (regs != eregs)
> > >                 *regs = *eregs;
> > 
> > Specifically, this is copying one struct pt_regs to another. It looks
> > like the sizeof struct pt_regs is just large enough to have clang emit
> > the libcall.
> > https://godbolt.org/z/scx6aa8jq
> > Otherwise clang will also use rep; movsq; when -mno-sse -O2 is set and
> > the structs are below ARBITRARY_THRESHOLD.  Should ARBITRARY_THRESHOLD
> > be raised so that we continue to inline the memcpy? *shrug*
> > 
> > Though, looking at the compiled memcpy (`llvm-objdump -D
> > --disassemble-symbols=memcpy vmlinux`), maybe we *should* try harder.
> > Filed
> > https://github.com/llvm/llvm-project/issues/54535.
> > 
> > > see below for asm output.
> > >
> > > While gcc does simply generate an actual "rep; movsq".
> > >
> > > So, how hard would it be to make clang do that too pls?
> > 
> > As Mark said in the sibling reply; I don't know of general ways to
> > inhibit libcall optimizations on the level you're looking for, short
> > of heavy handy methods of disabling optimizations entirely.  There's
> > games that can be played with -fno-builtin-*, but they're not super
> > portable, and I think there's a handful of *blessed* functions that
> > must exist in any env, freestanding or not: memcpy, memmove, memset,
> > and memcmp for which you cannot yet express "these do not exist."
> 
> Talking with Peter on IRC, I think there's an oversight on the compiler
> side here w.r.t. the expectations around these blessed functions, since
> either:
> 
> a) The compiler expects the out-of-line implementations of functions
>    ARE NOT instrumented by address-sanitizer.
> 
>    If this is the case, then it's legitimate for the compiler to call
>    these functions anywhere, and we should NOT instrument the kernel
>    implementations of these. If the compiler wants those instrumented it
>    needs to add the instrumentation in the caller.
> 
> b) The compiler expects the out-of-line implementations of functions
>    ARE instrumented by address-sanitizer.
> 
>    If this is the case, the compiler MUST NOT generate implicit calls to
>    these "blessed" functions from functions marked with:
> 
>      __attribute__((no_sanitize_address)).
> 
>    ... or the compiler is violating the premise of that attribute.
> 
>    AFAICT The two options for the compiler here are:
> 
>    1) Always inline an uninstrumented form of the function in this case
> 
>    2) Have distinct instrumented/uninstrumented out-of-line
>       implementations, and call the uninstrumented form in this case.
> 
> To see what clang and GCC do today, I hacked the following in:
> 
> | diff --git a/init/main.c b/init/main.c
> | index 65fa2e41a9c0..30406c472b5d 100644
> | --- a/init/main.c
> | +++ b/init/main.c
> | @@ -1637,3 +1637,31 @@ static noinline void __init kernel_init_freeable(void)
> |  
> |         integrity_load_keys();
> |  }
> | +
> | +void
> | +test_implicit_memcpy(struct task_struct *dest,
> | +                    const struct task_struct *src)
> | +{
> | +       *dest = *src;
> | +}
> | +
> | +void
> | +test_explicit_memcpy(struct task_struct *dest,
> | +                    const struct task_struct *src)
> | +{
> | +       memcpy(dest, src, sizeof(*dest));
> | +}
> | +
> | +void __no_sanitize_address
> | +test_implicit_memcpy_nokasan(struct task_struct *dest,
> | +                            const struct task_struct *src)
> | +{
> | +       *dest = *src;
> | +}
> | +
> | +void __no_sanitize_address
> | +test_explicit_memcpy_nokasan(struct task_struct *dest,
> | +                            const struct task_struct *src)
> | +{
> | +       memcpy(dest, src, sizeof(*dest));
> | +}
> 
> 
> For arm64, GCC 11.1.0, KASAN_OUTLINE I see:
> 
> | <test_implicit_memcpy>:
> |        d503245f        bti     c
> |        d503233f        paciasp
> |        a9be7bfd        stp     x29, x30, [sp, #-32]!
> |        910003fd        mov     x29, sp
> |        a90153f3        stp     x19, x20, [sp, #16]
> |        aa0103f3        mov     x19, x1
> |        aa0003f4        mov     x20, x0
> |        d281c001        mov     x1, #0xe00                      // #3584
> |        940b9534        bl      ffff8000082f9d90 <__asan_storeN>
> |        aa1303e0        mov     x0, x19
> |        d281c001        mov     x1, #0xe00                      // #3584
> |        940b951e        bl      ffff8000082f9d44 <__asan_loadN>
> |        aa1303e1        mov     x1, x19
> |        aa1403e0        mov     x0, x20
> |        d281c002        mov     x2, #0xe00                      // #3584
> |        940b98c5        bl      ffff8000082fabf0 <memcpy>
> |        a94153f3        ldp     x19, x20, [sp, #16]
> |        a8c27bfd        ldp     x29, x30, [sp], #32
> |        d50323bf        autiasp
> |        d65f03c0        ret
> | 
> | <test_explicit_memcpy>:
> |        d503245f        bti     c
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        d281c002        mov     x2, #0xe00                      // #3584
> |        910003fd        mov     x29, sp
> |        940b98bb        bl      ffff8000082fabf0 <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> | 
> | <test_implicit_memcpy_nokasan>:
> |        d503245f        bti     c
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        d281c002        mov     x2, #0xe00                      // #3584
> |        910003fd        mov     x29, sp
> |        940b98b2        bl      ffff8000082fabf0 <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> | 
> | <test_explicit_memcpy_nokasan>:
> |        d503245f        bti     c
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        d281c002        mov     x2, #0xe00                      // #3584
> |        910003fd        mov     x29, sp
> |        940b98a7        bl      ffff8000082fabf0 <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> 
> For x86_64, GCC 11.1.0, KASAN_OUTLINE I see:
> 
> | <test_implicit_memcpy>:
> |        41 54                   push   %r12
> |        49 89 fc                mov    %rdi,%r12
> |        55                      push   %rbp
> |        48 89 f5                mov    %rsi,%rbp
> |        be 40 1c 00 00          mov    $0x1c40,%esi
> |        e8 0d 9a 32 00          call   ffffffff8132b0f0 <__asan_storeN>
> |        48 89 ef                mov    %rbp,%rdi
> |        be 40 1c 00 00          mov    $0x1c40,%esi
> |        e8 f0 99 32 00          call   ffffffff8132b0e0 <__asan_loadN>
> |        4c 89 e7                mov    %r12,%rdi
> |        48 89 ee                mov    %rbp,%rsi
> |        b9 88 03 00 00          mov    $0x388,%ecx
> |        f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
> |        5d                      pop    %rbp
> |        41 5c                   pop    %r12
> |        c3                      ret   
> | 
> | <test_explicit_memcpy>:
> |        ba 40 1c 00 00          mov    $0x1c40,%edx
> |        e9 b6 9f 32 00          jmp    ffffffff8132b6d0 <memcpy>
> | 
> | 
> | <test_implicit_memcpy_nokasan>:
> |        b9 88 03 00 00          mov    $0x388,%ecx
> |        f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
> |        c3                      ret    
> | 
> | <test_explicit_memcpy_nokasan>:
> |        ba 40 1c 00 00          mov    $0x1c40,%edx
> |        e9 96 9f 32 00          jmp    ffffffff8132b6d0 <memcpy>
> 
> So from those examples it seems GCC falls into bucket (a), and assumes the
> blessed functions ARE NOT instrumented.
> 
> We can make this noinstr-safe AND get instrumentation for the first two cases
> by removing the instrumentation from the out-of-line copies (always using
> noinstr asm implementations) and using ifdeffery to make the explicit calls
> target as distinct kasan_instrumented_memcpy() or similar...
> 
> 
> For arm64, clang 13.0.0, KASAN_OUTLINE I see:
> 
> | <test_implicit_memcpy>:
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        910003fd        mov     x29, sp
> |        5281c002        mov     w2, #0xe00                      // #3584
> |        940c0f66        bl      ffff8000083185fc <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> | 
> | <test_explicit_memcpy>:
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        910003fd        mov     x29, sp
> |        5281c002        mov     w2, #0xe00                      // #3584
> |        940c0f5e        bl      ffff8000083185fc <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> | 
> | <test_implicit_memcpy_nokasan>:
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        910003fd        mov     x29, sp
> |        5281c002        mov     w2, #0xe00                      // #3584
> |        940c0f56        bl      ffff8000083185fc <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> | 
> | <test_explicit_memcpy_nokasan>:
> |        d503233f        paciasp
> |        a9bf7bfd        stp     x29, x30, [sp, #-16]!
> |        910003fd        mov     x29, sp
> |        5281c002        mov     w2, #0xe00                      // #3584
> |        940c0f4e        bl      ffff8000083185fc <memcpy>
> |        a8c17bfd        ldp     x29, x30, [sp], #16
> |        d50323bf        autiasp
> |        d65f03c0        ret
> 
> For x86_64, clang 13.0.0, KASAN_OUTLINE I see:
> 
> | <test_implicit_memcpy>:
> |        ba 40 1c 00 00          mov    $0x1c40,%edx
> |        e8 d6 94 36 00          call   ffffffff8136a830 <memcpy>
> |        c3                      ret  
> | 
> | <test_explicit_memcpy>:
> |        ba 40 1c 00 00          mov    $0x1c40,%edx
> |        e8 c6 94 36 00          call   ffffffff8136a830 <memcpy>
> |        c3                      ret    
> | 
> | 
> | <test_implicit_memcpy_nokasan>:
> |        ba 40 1c 00 00          mov    $0x1c40,%edx
> |        e9 b6 94 36 00          jmp    ffffffff8136a830 <memcpy>
> | 
> | 
> | <test_explicit_memcpy_nokasan>:
> |        ba 40 1c 00 00          mov    $0x1c40,%edx
> |        e9 a6 94 36 00          jmp    ffffffff8136a830 <memcpy>
> 
> ... for which the first two suggests clang thinks the blessed functions *are*
> instrumented, which means that generating calls to those in the latter two
> cases is a bug.
> 
> We can make this noinstr-safe as with the GCC case, but we'll lose the
> desirable instrumentation for the test_implicit_memcpy() case.
> 
> I think something has to change on the compiler side here (e.g. as per
> options above), and we should align GCC and clang on the same
> approach...
> 
> Thanks,
> Mark.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ