Message-ID: <b90410d3f213496ebfdd2f561281791b@AcuMS.aculab.com>
Date: Sun, 24 Nov 2024 20:49:29 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>
CC: Andrew Cooper <andrew.cooper3@...rix.com>, "bp@...en8.de" <bp@...en8.de>,
	Josh Poimboeuf <jpoimboe@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Arnd Bergmann
	<arnd@...nel.org>, Mikel Rychliski <mikel@...elr.com>, Thomas Gleixner
	<tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Dave Hansen
	<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: RE: [PATCH v2] x86: Allow user accesses to the base of the guard page

From: Linus Torvalds
> Sent: 24 November 2024 18:53

> On Sun, 24 Nov 2024 at 07:39, David Laight <David.Laight@...lab.com> wrote:
> >
> > v2: Rewritten commit message.
> 
> Grr. Now I remember why I did it this way - I started looking around
> for the bigger context and history.
> 
> I wanted that "valid_user_address()" to really be "is this a valid
> user address", because it's also used by the fault handling code (for
> that reason).

Doesn't that just need a <= changed to < ?
(And possibly a change of name.)

...
> and that would make this all go away, and that was why I was
> (incorrectly) fixating on the zero-sized access at the end of the
> address space, because I wasn't even thinking about this part of
> __access_ok().

access_ok(NULL, 0) is probably the annoying case that stops it using
valid_user_address(ptr + size - 1).
And the 'lea' needed for 'x + y - 1' runs on fewer execution 'ports'
than a plain add.

> IOW, my *preferred* fix for this all would actually look like this:
> 
>   --- a/arch/x86/include/asm/uaccess_64.h
>   +++ b/arch/x86/include/asm/uaccess_64.h
>   @@ -86,24 +86,12 @@ static inline void __user *mask_user_address(const void __user *ptr)
>     *
>     * Note that we always have at least one guard page between the
>     * max user address and the non-canonical gap, allowing us to
>   + * ignore the size entirely, since any kernel accesses will be in
>   + * increasing address order starting at 'ptr'.
>     */
>    static inline bool __access_ok(const void __user *ptr, unsigned long size)
>    {
>   +     return valid_user_address(ptr);
>    }
>    #define __access_ok __access_ok
> 
> but I suspect that I'm too chicken to actually do that.
> 
> Please somebody convince me.

I didn't know you really had a 'chicken streak' :-)

You'd need to double-check that nothing is parsing TLV data
(like CMSG or netlink buffers) directly from userspace, having
done an outer access_ok() and then using __get_user().
OTOH there are few enough calls to access_ok() they can all
be checked.

Another place might be a copy_to/from_user() implementation
that does a copy of the final 'word' first and then copies
a whole number of words from the start of the buffer.
x86 should just be using 'rep movsb' (except for some constant sizes)
because I doubt anything else is worth the overhead of the (mispredicted
half the time) branch.

As an aside the:
>     movabs $0x123456789abcdef,%rcx      # magic virtual address size
>     cmp    %rsi,%rcx                    # address masking
>     sbb    %rcx,%rcx
>     or     %rsi,%rcx
sequence could be
>     movabs $0x123456789abcdef,%rcx      # magic virtual address size
>     cmp    %rsi,%rcx                    # address masking
>     cmovc  %rsi,%rcx
Provided the constant is TASK_SIZE_MAX (without the -1).

Remember that cmov is an arithmetic instruction much like adc, except
that it contains a multiplexor/selector rather than an adder.
Interestingly, the Intel implementation dropped to 1 clock a CPU family
earlier than adc/sbb did (the architecture didn't support 3 register
inputs to one u-op).
(Actually Ryzen might implement cmov as a conditional register rename.)

In either case it won't be subject to misprediction.

That actually removes the requirement to access the base address first.
You just need to avoid jumps of more than PAGE_SIZE.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
