Message-ID: <b90410d3f213496ebfdd2f561281791b@AcuMS.aculab.com>
Date: Sun, 24 Nov 2024 20:49:29 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>
CC: Andrew Cooper <andrew.cooper3@...rix.com>, "bp@...en8.de" <bp@...en8.de>,
Josh Poimboeuf <jpoimboe@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Arnd Bergmann
<arnd@...nel.org>, Mikel Rychliski <mikel@...elr.com>, Thomas Gleixner
<tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Dave Hansen
<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: RE: [PATCH v2] x86: Allow user accesses to the base of the guard page
From: Linus Torvalds
> Sent: 24 November 2024 18:53
> On Sun, 24 Nov 2024 at 07:39, David Laight <David.Laight@...lab.com> wrote:
> >
> > v2: Rewritten commit message.
>
> Grr. Now I remember why I did it this way - I started looking around
> for the bigger context and history.
>
> I wanted that "valid_user_address()" to really be "is this a valid
> user address", because it's also used by the fault handling code (for
> that reason).
Doesn't that just need a '<=' changed to a '<'?
(And possibly a change of name.)
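Something like this minimal sketch, assuming the current definition
compares against runtime_const_ptr(USER_PTR_MAX) and the patch raises
USER_PTR_MAX to the guard page base:

	/* Sketch: with USER_PTR_MAX raised to the guard page base, a
	 * strict '<' keeps the helper meaning "really a valid user
	 * address" for the fault handling code. */
	#define valid_user_address(x) \
		((__force unsigned long)(x) < runtime_const_ptr(USER_PTR_MAX))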
...
> and that would make this all go away, and that was why I was
> (incorrectly) fixating on the zero-sized access at the end of the
> address space, because I wasn't even thinking about this part of
> __access_ok().
access_ok(NULL, 0) is probably the annoying case that stops it using
valid_user_address(ptr + size - 1).
And the 'lea' that would compute 'x + y - 1' runs on fewer execution
'ports' than an add.
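That is, the obvious one-liner (a sketch) breaks for a zero size:

	/* Sketch: with 'size' folded in, access_ok(NULL, 0) fails
	 * because ptr + size - 1 wraps to all-ones, even though a
	 * zero-sized access has always been allowed. */
	static inline bool __access_ok(const void __user *ptr, unsigned long size)
	{
		return valid_user_address((const char __user *)ptr + size - 1);
	}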
> IOW, my *preferred* fix for this all would actually look like this:
>
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -86,24 +86,12 @@ static inline void __user *mask_user_address(const void __user *ptr)
> *
> * Note that we always have at least one guard page between the
> * max user address and the non-canonical gap, allowing us to
> + * ignore the size entirely, since any kernel accesses will be in
> + * increasing address order starting at 'ptr'.
> */
> static inline bool __access_ok(const void __user *ptr, unsigned long size)
> {
> + return valid_user_address(ptr);
> }
> #define __access_ok __access_ok
>
> but I suspect that I'm too chicken to actually do that.
>
> Please somebody convince me.
I didn't know you really had a 'chicken streak' :-)
You'd need to double-check that nothing is parsing TLV data
(like CMSG or netlink buffers) directly from userspace having
done an outer access_ok() and then using __get_user().
OTOH there are few enough calls to access_ok() that they can
all be checked.
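The pattern I'm worried about is something like this (hypothetical
code, read_tail() is just a name for illustration):

	/* Hypothetical: one access_ok() for the whole buffer, then a
	 * __get_user() (which does no range check of its own) of a
	 * field near the end.  If access_ok() ignores 'size', a huge
	 * 'len' puts the read well past the single guard page. */
	static int read_tail(const char __user *buf, size_t len, u16 *val)
	{
		if (len < sizeof(u16) || !access_ok(buf, len))
			return -EFAULT;
		return __get_user(*val,
				(const u16 __user *)(buf + len - sizeof(u16)));
	}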
Another place might be a copy_to/from_user() implementation
that does a copy of the final 'word' first and then copies
a whole number of words from the start of the buffer.
x86 should just be using 'rep movsb' (except for some constant sizes)
because I doubt anything else is worth the overhead of the (mispredicted
half the time) branch.
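By 'final word first' I mean a copier of this shape (hypothetical,
assumes len >= 8):

	/* Hypothetical copier of the problematic shape: the final,
	 * possibly misaligned, word is copied first, so the very first
	 * access is 'len - 8' bytes past the checked base address. */
	static void copy_tail_first(char *dst, const char *src, size_t len)
	{
		size_t i;

		memcpy(dst + len - 8, src + len - 8, 8);	/* tail word first */
		for (i = 0; i + 8 <= len; i += 8)
			memcpy(dst + i, src + i, 8);
	}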
As an aside, the:
> movabs $0x123456789abcdef,%rcx # magic virtual address size
> cmp %rsi,%rcx # address masking
> sbb %rcx,%rcx
> or %rsi,%rcx
sequence could be:
> movabs $0x123456789abcdef,%rcx # magic virtual address size
> cmp %rsi,%rcx # address masking
> cmovnc %rsi,%rcx
Provided the constant is TASK_SIZE_MAX (without the -1).
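In C terms the cmov version clamps instead of masks, roughly this
(clamp_user_address is just an illustrative name):

	/* Sketch: sbb/or forces a bad pointer to all-ones; the cmov
	 * version clamps it to TASK_SIZE_MAX (the base of the guard
	 * page), which still faults on access. */
	static inline void __user *clamp_user_address(const void __user *ptr)
	{
		unsigned long p = (__force unsigned long)ptr;

		return (__force void __user *)(p > TASK_SIZE_MAX ? TASK_SIZE_MAX : p);
	}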
Remember, cmov is an arithmetic instruction much like adc, except
that it contains a multiplexor/selector rather than an adder.
Interestingly, the Intel implementation drops to 1 clock one CPU
family earlier than adc/sbb does (the older architecture didn't
support 3 register inputs to a single u-op).
(Actually, Ryzen might implement cmov as a conditional register rename.)
In either case it won't be subject to misprediction.
That actually removes the requirement to access the base address first.
You just need to avoid skipping forwards by PAGE_SIZE or more.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)