Message-ID: <20251231103511.5ddb768b@pumpkin>
Date: Wed, 31 Dec 2025 10:35:11 +0000
From: David Laight <david.laight.linux@...il.com>
To: Vivian Wang <wangruikang@...as.ac.cn>
Cc: Deepak Gupta <debug@...osinc.com>, Lukas Gerlach
<lukas.gerlach@...pa.de>, linux-riscv@...ts.infradead.org,
palmer@...belt.com, pjw@...nel.org, aou@...s.berkeley.edu, alex@...ti.fr,
linux-kernel@...r.kernel.org, daniel.weber@...pa.de,
michael.schwarz@...pa.de, marton.bognar@...euven.be,
jo.vanbulck@...euven.be
Subject: Re: [PATCH 1/2] riscv: Use pointer masking to limit uaccess
speculation
On Wed, 31 Dec 2025 11:47:16 +0800
Vivian Wang <wangruikang@...as.ac.cn> wrote:
> On 12/29/25 20:32, David Laight wrote:
> > On Sun, 28 Dec 2025 22:34:30 +0000
> > David Laight <david.laight.linux@...il.com> wrote:
> >
> >> On Sat, 27 Dec 2025 17:59:38 -0800
> >> Deepak Gupta <debug@...osinc.com> wrote:
> >>
> >>> On Sat, Dec 27, 2025 at 09:28:59PM +0000, David Laight wrote:
> >>>> On Fri, 19 Dec 2025 16:44:11 -0800
> >>>> Deepak Gupta <debug@...osinc.com> wrote:
> >>>>
> >>>>> On Thu, Dec 18, 2025 at 08:13:31PM +0100, Lukas Gerlach wrote:
> >>>>>> Similarly to x86 and arm64, mitigate speculation past an access_ok()
> >>>>>> check by masking the pointer before use.
> >>>>>>
> >>>>>> On RISC-V, user addresses have the MSB clear while kernel addresses
> >>>>>> have the MSB set. The uaccess_mask_ptr() function clears the MSB,
> >>>>>> ensuring any kernel pointer becomes invalid and will fault, while
> >>>>>> valid user pointers remain unchanged. This prevents speculative
> >>>>>> access to kernel memory via user copy functions.
> >>>>>>
> >>>>>> The masking is applied to __get_user, __put_user, raw_copy_from_user,
> >>>>>> raw_copy_to_user, clear_user, and the unsafe_* variants.
> >>>>>>
> >>>>>> Signed-off-by: Lukas Gerlach <lukas.gerlach@...pa.de>
> >>>>>> ---
> >>>>>> arch/riscv/include/asm/uaccess.h | 41 +++++++++++++++++++++++++-------
> >>>>>> 1 file changed, 32 insertions(+), 9 deletions(-)
> >>>>>>
> >>>>>> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
> >>>>>> index 36bba6720c26..ceee1d62ff9b 100644
> >>>>>> --- a/arch/riscv/include/asm/uaccess.h
> >>>>>> +++ b/arch/riscv/include/asm/uaccess.h
> >>>>>> @@ -74,6 +74,23 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, unsigne
> >>>>>> #define __typefits(x, type, not) \
> >>>>>> __builtin_choose_expr(sizeof(x) <= sizeof(type), (unsigned type)0, not)
> >>>>>>
> >>>>>> +/*
> >>>>>> + * Sanitize a uaccess pointer such that it cannot reach any kernel address.
> >>>>>> + *
> >>>>>> + * On RISC-V, virtual addresses are sign-extended from the top implemented bit.
> >>>>>> + * User addresses have the MSB clear; kernel addresses have the MSB set.
> >>>>>> + * Clearing the MSB ensures any kernel pointer becomes non-canonical and will
> >>>>>> + * fault, while valid user pointers remain unchanged.
> >>>>>> + */
> >>>>>> +#define uaccess_mask_ptr(ptr) ((__typeof__(ptr))__uaccess_mask_ptr(ptr))
> >>>>>> +static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
> >>>>>> +{
> >>>>>> + unsigned long val = (unsigned long)ptr;
> >>>>>> +
> >>>>>> + val = (val << 1) >> 1;
> >>>>>> + return (void __user *)val;
> >>>>> This is only clearing b63 which is what we don't need here.
> >>>> It is also entirely the wrong operation.
> >>>> A kernel address needs converting into an address that is guaranteed
> >>>> to fault - not a user address that might be valid.
> >>> This is about speculative accesses and not actual accesses. Due to
> >>> speculation it is possible that a wrong address with MSB=1 is generated
> >>> on the speculative path. This simply ensures that the bit is cleared for
> >>> address generation (agen) even on the speculative path.
> >> You said you were following what x86 did - this isn't what it does.
> >> Avoiding the conditional branch in access_ok() is actually a big win.
> >> That is true even without the issues with speculative accesses to kernel
> >> memory.
> >> The 'address masking' (badly named - it isn't just masking) is a replacement
> >> for access_ok(); it changes kernel addresses to invalid ones so that the
> >> access faults and the trap/fault fixup code detects the error.
> > If you can guarantee that address 0 is never mapped into userspace
> > (not true for x86 for historic reasons) then I think you can convert
> > kernel addresses to zero - which will then fault.
> > This is simpler than using the base of the guard page (which is still
> > needed for sequential accesses).
> > Something like:
> > addr &= (addr >= guard_page) - 1;
> > will then work - but it needs (partially) coding in assembler to
> > guarantee that 'setae' (or equivalent) is used rather than a
> > conditional branch.
> >
>
> Maybe use:
>
> addr |= (long)addr >> 63
>
> To map kernel addresses to -1?
That also requires that address 0 is never mapped into userspace.
The first access might be offset from the actual base address.
(It is too hard to check that code accesses the first member of a
structure first - and some obvious candidate bits of code don't.)
There are also issues on x86-64 (I think on AMD CPUs) which mean that
just testing the high bit isn't good enough; the check has to use
the highest valid user address (i.e. one below the 'big address hole').
Linus added code to patch the correct constant into the comparison
instruction in order to avoid a memory read, because a different limit
is needed for 4 and 5 level page tables.
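Only a sketch of that shape (the limit value is invented and this is not
the actual x86 code), clamping to the base of a guard page so the first
access and any sequential ones both land on unmapped memory:

/* Made-up limit: base of a guard page just above the last user address. */
#define GUARD_PAGE_BASE	0x0000800000000000ULL

static inline unsigned long long clamp_user_addr(unsigned long long addr)
{
	/*
	 * Replace anything at or above the guard page with the guard page
	 * base itself, which is never mapped, so the access still faults.
	 * In the real thing the limit differs for 4 and 5 level page
	 * tables (hence patching the constant), and a cmov-style
	 * instruction has to be guaranteed - a compiler is free to turn
	 * this ?: into the conditional branch we are trying to avoid.
	 */
	return addr >= GUARD_PAGE_BASE ? GUARD_PAGE_BASE : addr;
}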
Whether that applies to riscv is another matter, but if the hardware
actually uses a lower bit to select kernel addresses (perhaps in
parallel with validating that the top few bits are all the same)
then you do need to compare against the top user address.
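On riscv the comparison can be kept branch-free with sltu, something like
this (again only a sketch; 'user_limit' is a placeholder for whatever the
top user address turns out to be, and clearing to 0 still relies on the
low pages never being mapped):

static inline unsigned long mask_user_addr(unsigned long addr,
					   unsigned long user_limit)
{
	unsigned long mask;

	/*
	 * sltu sets mask to 1 when addr is below user_limit and to 0
	 * otherwise; neg turns that into all-ones / all-zeroes, so the
	 * AND clears out-of-range addresses to 0 with no conditional
	 * branch for the CPU to speculate past.
	 */
	asm ("sltu	%0, %1, %2\n\t"
	     "neg	%0, %0"
	     : "=&r" (mask)
	     : "r" (addr), "r" (user_limit));
	return addr & mask;
}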
I suspect that if you don't do it now, at some point there'll be
an 'issue' that needs the correct limit address used.
David