lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wigHm2J4LkUL1=y_H8zGwM0JsK2CrWyLNbz9fvXfbaBQA@mail.gmail.com>
Date: Thu, 21 Nov 2024 14:16:12 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: David Laight <David.Laight@...lab.com>, "x86@...nel.org" <x86@...nel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>, 
	Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>, 
	Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>, Waiman Long <longman@...hat.com>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>, 
	Michael Ellerman <mpe@...erman.id.au>, 
	"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>, Andrew Cooper <andrew.cooper3@...rix.com>, 
	Mark Rutland <mark.rutland@....com>, "Kirill A . Shutemov" <kirill@...temov.name>
Subject: Re: [PATCH v3 2/6] x86/uaccess: Avoid barrier_nospec() in 64-bit __get_user()

On Thu, 21 Nov 2024 at 13:40, Josh Poimboeuf <jpoimboe@...nel.org> wrote:
>
> The profile is showing futex_get_value_locked():

Ahh.

> That has several callers, so we can probably just use get_user() there?

Yeah, that's the simplest thing. That thing isn't even some inline
function, so the real cost is the call.

That said, exactly because it's not inlined, and calls are expensive,
and this is apparently really critical, we can just do it with the
full "unsafe_get_user()" model.

It's not so complicated. The attached patch is untested, but I did
check that it generates almost perfect code:

    mov    %gs:0x0,%rax                 # current
    incl   0x1a9c(%rax)                 # current->pagefault_disable++
    movabs $0x123456789abcdef,%rcx      # magic virtual address size
    cmp    %rsi,%rcx                    # address masking
    sbb    %rcx,%rcx
    or     %rsi,%rcx
    stac                                # enable user space acccess
    mov    (%rcx),%ecx                  # get the value
    clac                                # disable user space access
    decl   0x1a9c(%rax)                 # current->pagefault_disable--
    mov    %ecx,(%rdi)                  # save the value
    xor    %eax,%eax                    # return 0
    ret

(with the error case for the page fault all out-of-line).

So this should be _faster_ than the old __get_user(), because while
the address masking is not needed, it's cheaper than the function call
used to be and the error handling is better.

If you can test this and verify that it actually help, I'll take it as
a patch. Consider it signed-off after testing.

> Also, is there any harm in speeding up __get_user()?  It still has ~80
> callers and it's likely to be slowing down things we don't know about.

How would you speed it up?  We definitely can't replace the fence with
addressing tricks. So we can't just replace it with "get_user()",
because of those horrid architecture-specific kernel uses.

Now, we could possibly say "just remove the fence in __get_user()
entirely", but that would involve moving it to access_ok().

And then it wouldn't actually speed anything up (except the horrid
architecture-specific kernel uses that then don't call access_ok() at
all - and we don't care about *those*).

               Linus

View attachment "patch.diff" of type "text/x-patch" (1128 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ