[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b2c0acf3e9af4ca9bf1595f8b66aee78@AcuMS.aculab.com>
Date: Fri, 22 Nov 2024 09:38:52 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>, Josh Poimboeuf
<jpoimboe@...nel.org>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>, "Pawan
Gupta" <pawan.kumar.gupta@...ux.intel.com>, Waiman Long <longman@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>,
Michael Ellerman <mpe@...erman.id.au>, "linuxppc-dev@...ts.ozlabs.org"
<linuxppc-dev@...ts.ozlabs.org>, Andrew Cooper <andrew.cooper3@...rix.com>,
Mark Rutland <mark.rutland@....com>, "Kirill A . Shutemov"
<kirill@...temov.name>
Subject: RE: [PATCH v3 2/6] x86/uaccess: Avoid barrier_nospec() in 64-bit
__get_user()
From: Linus Torvalds
> Sent: 21 November 2024 22:16
>
> On Thu, 21 Nov 2024 at 13:40, Josh Poimboeuf <jpoimboe@...nel.org> wrote:
> >
> > The profile is showing futex_get_value_locked():
>
> Ahh.
>
> > That has several callers, so we can probably just use get_user() there?
>
> Yeah, that's the simplest thing. That thing isn't even some inline
> function, so the real cost is the call.
>
> That said, exactly because it's not inlined, and calls are expensive,
> and this is apparently really critical, we can just do it with the
> full "unsafe_get_user()" model.
>
> It's not so complicated. The attached patch is untested, but I did
> check that it generates almost perfect code:
>
> mov %gs:0x0,%rax # current
> incl 0x1a9c(%rax) # current->pagefault_disable++
> movabs $0x123456789abcdef,%rcx # magic virtual address size
> cmp %rsi,%rcx # address masking
> sbb %rcx,%rcx
> or %rsi,%rcx
> stac # enable user space acccess
> mov (%rcx),%ecx # get the value
> clac # disable user space access
> decl 0x1a9c(%rax) # current->pagefault_disable--
> mov %ecx,(%rdi) # save the value
> xor %eax,%eax # return 0
> ret
>
> (with the error case for the page fault all out-of-line).
I presume you are assuming an earlier access_ok() call?
IIRC all x86 from 286 onwards fault accesses that cross the ~0 to 0 boundary.
So you don't need to have called access_ok() prior to the above
for non-byte accesses.
Even for byte accesses (and with ~0 being a valid address) do the
address mask before pagefault_disable++ and the error test is 'jc label'
after the 'cmp'.
Don't you also get better code for an 'asm output with goto' form?
While it requires the caller have a 'label: return -EFAULT;' somewhere
that is quite common in kernel code.
> So this should be _faster_ than the old __get_user(), because while
> the address masking is not needed, it's cheaper than the function call
> used to be and the error handling is better.
Perhaps get_user() should be the function call and __get_user() inlined.
Both including address validation but different calling conventions?
(After fixing the code that doesn't want address masking.)
...
> Now, we could possibly say "just remove the fence in __get_user()
> entirely", but that would involve moving it to access_ok().
How valid would it be to assume an explicit access_ok() was far
enough away from the get_user() that you couldn't speculate all the
way to something that used the read value to perform another
kernel access?
> And then it wouldn't actually speed anything up (except the horrid
> architecture-specific kernel uses that then don't call access_ok() at
> all - and we don't care about *those*).
Find and kill :-)
David
>
> Linus
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Powered by blists - more mailing lists