[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YzVRJ3NY2w1NSoM2@gmail.com>
Date: Thu, 29 Sep 2022 10:02:47 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Nick Desaulniers <ndesaulniers@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Kees Cook <keescook@...omium.org>,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
llvm@...ts.linux.dev, Andy Lutomirski <luto@...nel.org>,
Rasmus Villemoes <linux@...musvillemoes.dk>
Subject: Re: [PATCH v4] x86, mem: move memmove to out of line assembler
* Nick Desaulniers <ndesaulniers@...gle.com> wrote:
> +SYM_FUNC_START(memmove)
> +/*
> + * void *memmove(void *dest_in, const void *src_in, size_t n)
> + * -mregparm=3 passes these in registers:
> + * dest_in: %eax
> + * src_in: %edx
> + * n: %ecx
> + *
> + * n can remain in %ecx, but for `rep movsl`, we'll need dest in %edi and src
> + * in %esi.
> + */
> +.set dest_in, %eax
> +.set dest, %edi
> +.set src_in, %edx
> +.set src, %esi
> +.set n, %ecx
> +
> +/*
> + * Need 3 scratch registers. These need to be saved+restored. Section 3.2.1
> + * Footnote 7 of the System V Application Binary Interface Version 1.0 aka
> + * "psABI" notes:
> + * Note that in contrast to the Intel386 ABI, %rdi, and %rsi belong to the
> + * called function, not the caller.
> + * i.e. %edi and %esi are callee saved for i386 (because they belong to the
> + * caller).
> + */
> +.set tmp0, %edx
> +.set tmp0w, %dx
> +.set tmp1, %ebx
> +.set tmp1w, %bx
> +.set tmp2, %eax
> +.set tmp3b, %cl
> +
> + pushl %ebp
> + movl %esp, %ebp
> +
> + pushl dest_in
> + pushl dest
> + pushl src
> + pushl tmp1
Yeah, so you did various whitespace & indentation cleanups, and I think if
we are touching trivialities we might as well fix/improve the documentation
of this function too...
For example the comments around parameters and register clobbering are
somewhat inaccurate and actively obfuscate what is going on.
1)
Firstly, the function uses not "3 scratch registers", but four:
eax [tmp2]
ebx [tmp1]
ecx [tmp3]
edx [tmp0]
[ Confusion probably comes from the fact that the main logic uses 3 of
these registers to move stuff around: tmp0/1/2, and tmp3 is clobbered as
part of the 'byteswap' branch. ]
2)
The description of the calling convention is needlessly obfuscated with
calling standards details. If we want to mention it to make it clear what
we are saving on the stack and what not, the best description is the one
from calling.h:
x86 function calling convention, 32-bit:
----------------------------------------
arguments | callee-saved | extra caller-saved | return
[callee-clobbered] | | [callee-clobbered] |
-------------------------------------------------------------------------
eax edx ecx | ebx edi esi ebp [*] | <none> | eax
This makes it clear that of the 4 temporary scratch registers used by
memmove(), only ebx [tmp1] needs to be saved explicitly.
Beyond the (content-)scratch registers, the function will also internally
clobber three other registers:
esi [src]
edi [dest]
ebp [frame pointer]
These esi/edi are the indices into the memory regions.
Since esi/edi are callee-saved, these need to be saved/restored too.
This fully explains the prologue - with annotations in the comments added
by me:
+ pushl %ebp // save callee-saved ebp
+ movl %esp, %ebp // set standard frame pointer
+ pushl dest_in // 'dest_in' will be the return value
+ pushl dest // save callee-saved edi
+ pushl src // save callee-saved esi
+ pushl tmp1 // save callee-saved ebx
...
+ popl tmp1 // restore callee-saved ebx
+ popl src // restore callee-saved esi
+ popl dest // restore callee-saved edi
+ popl %eax // memmove returns 'dest_in'
+ popl %ebp // restore callee-saved ebp
+ RET
3)
But since this large function clobbers *all* callee-saved general purpose
registers of the i386 kernel function call ABI, we might as well make that
explicit, via something like:
/*
* Save all callee-saved registers, because this function is
* going to clobber all of them:
*/
pushl %ebp
movl %esp, %ebp // set standard frame pointer
pushl %ebx
pushl %edi
pushl %esi
pushl dest_in // save 'dest_in' parameter [eax] as the return value
...
popl dest_in // restore 'dest_in' [eax] as the return value
/* Restore all callee-saved registers: */
popl %esi
popl %edi
popl %ebx
popl %ebp
RET
This IMO makes it a lot more clear what is going on in the
prologue/epilogue and why.
Feel free to carry these changes over into your patch.
Thanks,
Ingo
Powered by blists - more mailing lists