[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wiv=9zGSwsu+=tKNgDg8oBUJn_25OEy_0wqO+rvz7p8wg@mail.gmail.com>
Date: Sun, 30 Jun 2024 09:59:36 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Arnd Bergmann <arnd@...db.de>, Will Deacon <will@...nel.org>,
Catalin Marinas <catalin.marinas@....com>
Cc: Jisheng Zhang <jszhang@...nel.org>, Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
Linux-Arch <linux-arch@...r.kernel.org>, linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] riscv: uaccess: optimizations
On Tue, 25 Jun 2024 at 11:12, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> But yes, it would be lovely if we did things as "implement the
> low-level accessor functions and we'll wrap them for the generic case"
> rather than have every architecture have to do the wrapping..
Btw, to do that _well_, we need to expand on the user access functions
a bit more.
In particular, we can already implement "get_user()" on top of
user_access_begin() and friends something like this:
#define get_user(x,ptr) ({ \
__label__ Efault_read; \
long __err; \
__typeof__(ptr) __ptr = (ptr); \
if (likely(user_access_begin(__ptr, sizeof(x)))) { \
unsafe_get_user(x, __ptr, Efault_read); \
user_access_end(); \
__err = 0; \
} else { \
if (0) { \
Efault_read: user_access_end(); \
} \
x = (__typeof__(x))0; \
__err = -EFAULT; \
} \
__err; })
and it doesn't generate _horrible_ code. It looks pretty bad, but all
the error handling goes out-of-line, so on x86-64 (without debug
options) it generates code something like this:
test %rdi,%rdi
js <cap_validate_magic+0x98>
stac
lfence
mov (%rdi),%ecx
clac
which is certainly not horrid. But it's not great, because that lfence
ends up really screwing up the pipeline.
The manually coded out-of-line code generates this instead:
mov %rax,%rdx
sar $0x3f,%rdx
or %rdx,%rax
stac
movzbl (%rax),%edx
xor %eax,%eax
clac
ret
because it doesn't do a conditional branch (with the required spectre
thing), but instead does the address as a data dependency and knows
that "all bits set" if the address was negative will cause a page
fault.
But we *can* get the user accesses to do the same with a slight
expansion of user_access_begin():
stac
mov %rdi,%rax
sar $0x3f,%rax
or %rdi,%rax
mov (%rax),%eax
clac
by just introducing a notion of "masked_user_access". The get_user()
macro part would look like this:
__typeof__(ptr) __ptr; \
__ptr = masked_user_access_begin(ptr); \
if (1) { \
unsafe_get_user(x, __ptr, Efault_read); \
user_access_end(); \
and the patch to implement this is attached. I've had it around for a
while, but I don't know how many architectures can do this.
Note this part of the commit message:
This model only works for dense accesses that start at 'src' and on
architectures that have a guard region that is guaranteed to fault in
between the user space and the kernel space area.
which is true on x86-64, but that "guard region" thing might not be
true everywhere.
Will/Catalin - would that
src = masked_user_access_begin(src);
work on arm64? The code does do something like that with
__uaccess_mask_ptr() already, but at least currently it doesn't do the
"avoid conditional entirely", the masking is just in _addition_ to the
access_ok().
Linus
View attachment "patch.diff" of type "text/x-patch" (5321 bytes)
Powered by blists - more mailing lists