linux-kernel - Re: [PATCH 0/4] riscv: uaccess: optimizations

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wiv=9zGSwsu+=tKNgDg8oBUJn_25OEy_0wqO+rvz7p8wg@mail.gmail.com>
Date: Sun, 30 Jun 2024 09:59:36 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Arnd Bergmann <arnd@...db.de>, Will Deacon <will@...nel.org>, 
	Catalin Marinas <catalin.marinas@....com>
Cc: Jisheng Zhang <jszhang@...nel.org>, Paul Walmsley <paul.walmsley@...ive.com>, 
	Palmer Dabbelt <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>, 
	Linux-Arch <linux-arch@...r.kernel.org>, linux-riscv@...ts.infradead.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] riscv: uaccess: optimizations

On Tue, 25 Jun 2024 at 11:12, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> But yes, it would be lovely if we did things as "implement the
> low-level accessor functions and we'll wrap them for the generic case"
> rather than have every architecture have to do the wrapping..

Btw, to do that _well_, we need to expand on the user access functions
a bit more.

In particular, we can already implement "get_user()" on top of
user_access_begin() and friends something like this:

  #define get_user(x,ptr) ({                                    \
        __label__ Efault_read;                                  \
        long __err;                                             \
        __typeof__(ptr) __ptr = (ptr);                          \
        if (likely(user_access_begin(__ptr, sizeof(x)))) {      \
                unsafe_get_user(x, __ptr, Efault_read);         \
                user_access_end();                              \
                __err = 0;                                      \
        } else {                                                \
                if (0) {                                        \
Efault_read:            user_access_end();                      \
                }                                               \
                x = (__typeof__(x))0;                           \
                __err = -EFAULT;                                \
        }                                                       \
        __err; })

and it doesn't generate _horrible_ code. It looks pretty bad, but all
the error handling goes out-of-line, so on x86-64 (without debug
options) it generates code something like this:

        test   %rdi,%rdi
        js     <cap_validate_magic+0x98>
        stac
        lfence
        mov    (%rdi),%ecx
        clac

which is certainly not horrid. But it's not great, because that lfence
ends up really screwing up the pipeline.

The manually coded out-of-line code generates this instead:

        mov    %rax,%rdx
        sar    $0x3f,%rdx
        or     %rdx,%rax
        stac
        movzbl (%rax),%edx
        xor    %eax,%eax
        clac
        ret

because it doesn't do a conditional branch (with the required spectre
thing), but instead does the address as a data dependency and knows
that "all bits set" if the address was negative will cause a page
fault.

But we *can* get the user accesses to do the same with a slight
expansion of user_access_begin():

        stac
        mov    %rdi,%rax
        sar    $0x3f,%rax
        or     %rdi,%rax
        mov    (%rax),%eax
        clac

by just introducing a notion of "masked_user_access". The get_user()
macro part would look like this:

        __typeof__(ptr) __ptr;                                  \
        __ptr = masked_user_access_begin(ptr);                  \
        if (1) {                                                \
                unsafe_get_user(x, __ptr, Efault_read);         \
                user_access_end();                              \

and the patch to implement this is attached. I've had it around for a
while, but I don't know how many architectures can do this.

Note this part of the commit message:

  This model only works for dense accesses that start at 'src' and on
  architectures that have a guard region that is guaranteed to fault in
  between the user space and the kernel space area.

which is true on x86-64, but that "guard region" thing might not be
true everywhere.

Will/Catalin - would that

        src = masked_user_access_begin(src);

work on arm64? The code does do something like that with
__uaccess_mask_ptr() already, but at least currently it doesn't do the
"avoid conditional entirely", the masking is just in _addition_ to the
access_ok().

                 Linus

View attachment "patch.diff" of type "text/x-patch" (5321 bytes)