Message-ID: <Z-xOFuT9Sl6VuFYi@gmail.com>
Date: Tue, 1 Apr 2025 22:35:34 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: mingo@...hat.com, x86@...nel.org, linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Uros Bizjak <ubizjak@...il.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH] x86: predict __access_ok() returning true

* Mateusz Guzik <mjguzik@...il.com> wrote:
> This works around what appears to be an optimization bug in gcc (at
> least in 13.3.0), where it predicts that access_ok() will fail despite
> the hint to the contrary.
>
> _copy_to_user contains:
> if (access_ok(to, n)) {
> instrument_copy_to_user(to, from, n);
> n = raw_copy_to_user(to, from, n);
> }
>
> Here access_ok() is likely(__access_ok(addr, size)), yet the compiler
> emits forward conditional jumps for the case where the check succeeds:
>
> <+0>: endbr64
> <+4>: mov %rdx,%rcx
> <+7>: mov %rdx,%rax
> <+10>: xor %edx,%edx
> <+12>: add %rdi,%rcx
> <+15>: setb %dl
> <+18>: movabs $0x123456789abcdef,%r8
> <+28>: test %rdx,%rdx
> <+31>: jne 0xffffffff81b3b7c6 <_copy_to_user+38>
> <+33>: cmp %rcx,%r8
> <+36>: jae 0xffffffff81b3b7cb <_copy_to_user+43>
> <+38>: jmp 0xffffffff822673e0 <__x86_return_thunk>
> <+43>: nop
> <+44>: nop
> <+45>: nop
> <+46>: mov %rax,%rcx
> <+49>: rep movsb %ds:(%rsi),%es:(%rdi)
> <+51>: nop
> <+52>: nop
> <+53>: nop
> <+54>: mov %rcx,%rax
> <+57>: nop
> <+58>: nop
> <+59>: nop
> <+60>: jmp 0xffffffff822673e0 <__x86_return_thunk>
>
> Patching _copy_to_user() to wrap the access_ok() use in likely() does
> not change the asm.
>
> However, spelling out the prediction *within* __access_ok() does the
> trick:
> <+0>: endbr64
> <+4>: xor %eax,%eax
> <+6>: mov %rdx,%rcx
> <+9>: add %rdi,%rdx
> <+12>: setb %al
> <+15>: movabs $0x123456789abcdef,%r8
> <+25>: test %rax,%rax
> <+28>: jne 0xffffffff81b315e6 <_copy_to_user+54>
> <+30>: cmp %rdx,%r8
> <+33>: jb 0xffffffff81b315e6 <_copy_to_user+54>
> <+35>: nop
> <+36>: nop
> <+37>: nop
> <+38>: rep movsb %ds:(%rsi),%es:(%rdi)
> <+40>: nop
> <+41>: nop
> <+42>: nop
> <+43>: nop
> <+44>: nop
> <+45>: nop
> <+46>: mov %rcx,%rax
> <+49>: jmp 0xffffffff82255ba0 <__x86_return_thunk>
> <+54>: mov %rcx,%rax
> <+57>: jmp 0xffffffff82255ba0 <__x86_return_thunk>
>
> Signed-off-by: Mateusz Guzik <mjguzik@...il.com>
> ---
>
> I did not investigate what's going on here. It may be that other spots
> are also suffering.
>
> If someone commits to figuring out what went wrong, I'll be happy to
> drop this patch. Otherwise, this at least works around the problem for
> access_ok() consumers.
>
> arch/x86/include/asm/uaccess_64.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c52f0133425b..30c912375260 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -98,11 +98,11 @@ static inline void __user *mask_user_address(const void __user *ptr)
>  static inline bool __access_ok(const void __user *ptr, unsigned long size)
>  {
>  	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
> -		return valid_user_address(ptr);
> +		return likely(valid_user_address(ptr));
>  	} else {
>  		unsigned long sum = size + (__force unsigned long)ptr;
>
> -		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> +		return likely(valid_user_address(sum) && sum >= (__force unsigned long)ptr);
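
The codegen difference described above can also be inspected outside the
kernel tree with a minimal standalone sketch along the following lines.
Everything in it is made up for illustration (USER_MAX, range_ok_plain(),
range_ok_hinted(), do_copy(), copy_plain(), copy_hinted()); it only
approximates the shape of _copy_to_user()/__access_ok(), where the real
limit is USER_PTR_MAX via runtime_const_ptr() and the copy is done by
raw_copy_to_user():

#include <string.h>

#define likely(x)	__builtin_expect(!!(x), 1)

#define USER_MAX	0x123456789abcdefUL	/* made-up stand-in for USER_PTR_MAX */

/* No hint inside the helper; callers may add likely() themselves. */
static inline int range_ok_plain(unsigned long addr, unsigned long size)
{
	unsigned long sum = addr + size;

	return sum >= addr && sum <= USER_MAX;	/* overflow + limit check */
}

/* The prediction is spelled out inside the helper, as in the patch. */
static inline int range_ok_hinted(unsigned long addr, unsigned long size)
{
	unsigned long sum = addr + size;

	return likely(sum >= addr && sum <= USER_MAX);
}

/* Placeholder for raw_copy_to_user(). */
static unsigned long do_copy(void *to, const void *from, unsigned long n)
{
	memcpy(to, from, n);
	return 0;
}

/* Hint only at the call site, as in the current _copy_to_user(). */
unsigned long copy_plain(void *to, const void *from, unsigned long n)
{
	if (likely(range_ok_plain((unsigned long)to, n)))
		n = do_copy(to, from, n);
	return n;
}

/* Hint inside the inline helper. */
unsigned long copy_hinted(void *to, const void *from, unsigned long n)
{
	if (range_ok_hinted((unsigned long)to, n))
		n = do_copy(to, from, n);
	return n;
}

Building this with something like "gcc -O2 -S" and comparing the branch
layout of copy_plain() against copy_hinted() shows whether a given
compiler only honors the hint when it is written inside the inline
helper, which is the behavior reported above.
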
Can't we put this hint into the valid_user_address() definition instead,
via something like the patch below?

It's also the right place for the hint: user addresses being valid is the
common case we optimize for.

Thanks,

	Ingo

arch/x86/include/asm/uaccess_64.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c52f0133425b..4c13883371aa 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -54,7 +54,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
 #endif
 
 #define valid_user_address(x) \
-	((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
+	likely((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
 
 /*
  * Masking the user address is an alternative to a conditional
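
For illustration, a standalone sketch of that macro-level placement could
look like the following; ADDR_MAX, valid_addr(), access_ok_sketch() and
check_range() are made-up stand-ins for USER_PTR_MAX, valid_user_address()
and __access_ok(), and 4096 stands in for PAGE_SIZE:

#define likely(x)	__builtin_expect(!!(x), 1)

#define ADDR_MAX	0x123456789abcdefUL	/* stand-in for USER_PTR_MAX */

/* The hint lives in the macro itself, as in the patch above. */
#define valid_addr(x)	likely((unsigned long)(x) <= ADDR_MAX)

static inline int access_ok_sketch(const void *ptr, unsigned long size)
{
	if (__builtin_constant_p(size <= 4096) && size <= 4096) {
		return valid_addr(ptr);
	} else {
		unsigned long sum = size + (unsigned long)ptr;

		/* This path inherits the hint too, without touching the caller. */
		return valid_addr(sum) && sum >= (unsigned long)ptr;
	}
}

int check_range(const void *ptr, unsigned long size)
{
	return access_ok_sketch(ptr, size);
}

Compared with annotating each call site, the hint placed in the macro
covers only the address-limit comparison, but every valid_user_address()
user picks it up without further changes.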