lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b2c0acf3e9af4ca9bf1595f8b66aee78@AcuMS.aculab.com>
Date: Fri, 22 Nov 2024 09:38:52 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>, Josh Poimboeuf
	<jpoimboe@...nel.org>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>,
	Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>, "Pawan
 Gupta" <pawan.kumar.gupta@...ux.intel.com>, Waiman Long <longman@...hat.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>,
	Michael Ellerman <mpe@...erman.id.au>, "linuxppc-dev@...ts.ozlabs.org"
	<linuxppc-dev@...ts.ozlabs.org>, Andrew Cooper <andrew.cooper3@...rix.com>,
	Mark Rutland <mark.rutland@....com>, "Kirill A . Shutemov"
	<kirill@...temov.name>
Subject: RE: [PATCH v3 2/6] x86/uaccess: Avoid barrier_nospec() in 64-bit
 __get_user()

From: Linus Torvalds
> Sent: 21 November 2024 22:16
> 
> On Thu, 21 Nov 2024 at 13:40, Josh Poimboeuf <jpoimboe@...nel.org> wrote:
> >
> > The profile is showing futex_get_value_locked():
> 
> Ahh.
> 
> > That has several callers, so we can probably just use get_user() there?
> 
> Yeah, that's the simplest thing. That thing isn't even some inline
> function, so the real cost is the call.
> 
> That said, exactly because it's not inlined, and calls are expensive,
> and this is apparently really critical, we can just do it with the
> full "unsafe_get_user()" model.
> 
> It's not so complicated. The attached patch is untested, but I did
> check that it generates almost perfect code:
> 
>     mov    %gs:0x0,%rax                 # current
>     incl   0x1a9c(%rax)                 # current->pagefault_disable++
>     movabs $0x123456789abcdef,%rcx      # magic virtual address size
>     cmp    %rsi,%rcx                    # address masking
>     sbb    %rcx,%rcx
>     or     %rsi,%rcx
>     stac                                # enable user space acccess
>     mov    (%rcx),%ecx                  # get the value
>     clac                                # disable user space access
>     decl   0x1a9c(%rax)                 # current->pagefault_disable--
>     mov    %ecx,(%rdi)                  # save the value
>     xor    %eax,%eax                    # return 0
>     ret
> 
> (with the error case for the page fault all out-of-line).

I presume you are assuming an earlier access_ok() call? 

IIRC all x86 from 286 onwards fault accesses that cross the ~0 to 0 boundary.
So you don't need to have called access_ok() prior to the above
for non-byte accesses.
Even for byte accesses (and with ~0 being a valid address) do the
address mask before pagefault_disable++ and the error test is 'jc label'
after the 'cmp'.

Don't you also get better code for an 'asm output with goto' form?
While it requires the caller have a 'label: return -EFAULT;' somewhere
that is quite common in kernel code.

> So this should be _faster_ than the old __get_user(), because while
> the address masking is not needed, it's cheaper than the function call
> used to be and the error handling is better.

Perhaps get_user() should be the function call and __get_user() inlined.
Both including address validation but different calling conventions?
(After fixing the code that doesn't want address masking.)

...
> Now, we could possibly say "just remove the fence in __get_user()
> entirely", but that would involve moving it to access_ok().

How valid would it be to assume an explicit access_ok() was far
enough away from the get_user() that you couldn't speculate all the
way to something that used the read value to perform another
kernel access?

> And then it wouldn't actually speed anything up (except the horrid
> architecture-specific kernel uses that then don't call access_ok() at
> all - and we don't care about *those*).

Find and kill :-)

	David

> 
>                Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ