[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y8KE2h8LSKsrkJhX@zn.tnic>
Date: Sat, 14 Jan 2023 11:42:13 +0100
From: Borislav Petkov <bp@...en8.de>
To: Daniel Verkamp <dverkamp@...omium.org>
Cc: x86@...nel.org, Tony Luck <tony.luck@...el.com>,
Jiri Slaby <jirislaby@...nel.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86: combine memmove FSRM and ERMS alternatives
+ lkml.
Always CC lkml on patches pls.
Ok, let's see. I hope the coffee's working already and I'm not missing an
aspect...
On Fri, Jan 13, 2023 at 12:34:27PM -0800, Daniel Verkamp wrote:
> diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
> index 724bbf83eb5b..1fc36dbd3bdc 100644
> --- a/arch/x86/lib/memmove_64.S
> +++ b/arch/x86/lib/memmove_64.S
> @@ -38,8 +38,10 @@ SYM_FUNC_START(__memmove)
>
> /* FSRM implies ERMS => no length checks, do the copy directly */
> .Lmemmove_begin_forward:
> - ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
> - ALTERNATIVE "", "jmp .Lmemmove_erms", X86_FEATURE_ERMS
> + ALTERNATIVE_2 \
> + "cmp $0x20, %rdx; jb 1f", \
> + "jmp .Lmemmove_erms", X86_FEATURE_FSRM, \
> + "jmp .Lmemmove_erms", X86_FEATURE_ERMS
This is wrong in the ERMS case:
* If you have FSRM, you can simply do
REP; MOVSB
as any size is handled fine. So that's ok. BUT:
* If you have ERMS, you need to jump to 1f for smaller sizes. ERMS makes sense
only for bigger than, well, we have 0x20 there.
So if you have ERMS, you can't simply replace:
"cmp $0x20, %rdx; jb 1f"
with
"jmp .Lmemmove_erms"
You still need that size check.
IOW, it should be something like this:
ALTERNATIVE_2
"cmp $0x20, %rdx; jb 1f; jmp .Lmemmove_erms", X86_FEATURE_ERMS,
"jmp .Lmemmove_erms", X86_FEATURE_FSRM
But you can't have JMPs as NOT the first insn in alternatives because we fixup
the JMP offsets only for the first insn, see where recompute_jump() is called.
So, in order for the above to work, you'd need to use the insn decoder and look
at every insn in replacement and recompute_jump() it if it is a JMP.
And there are nice examples how to do that - see the loops in alternative.c
doing insn_decode_kernel().
Feel like getting your hands dirty with that?
:-)
Or, altenatively (pun intended), you can do what copy_user_generic() does and
move all that logic into C and inline asm. Which I'd prefer, actually, instead of
doing ugly asm hacks. Depends on how ugly it gets...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists