lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YzVRJ3NY2w1NSoM2@gmail.com>
Date:   Thu, 29 Sep 2022 10:02:47 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Nick Desaulniers <ndesaulniers@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Kees Cook <keescook@...omium.org>,
        linux-kernel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        llvm@...ts.linux.dev, Andy Lutomirski <luto@...nel.org>,
        Rasmus Villemoes <linux@...musvillemoes.dk>
Subject: Re: [PATCH v4] x86, mem: move memmove to out of line assembler


* Nick Desaulniers <ndesaulniers@...gle.com> wrote:

> +SYM_FUNC_START(memmove)
> +/*
> + * void *memmove(void *dest_in, const void *src_in, size_t n)
> + * -mregparm=3 passes these in registers:
> + * dest_in: %eax
> + * src_in: %edx
> + * n: %ecx
> + *
> + * n can remain in %ecx, but for `rep movsl`, we'll need dest in %edi and src
> + * in %esi.
> + */
> +.set dest_in, %eax
> +.set dest, %edi
> +.set src_in, %edx
> +.set src, %esi
> +.set n, %ecx
> +
> +/*
> + * Need 3 scratch registers. These need to be saved+restored. Section 3.2.1
> + * Footnote 7 of the System V Application Binary Interface Version 1.0 aka
> + * "psABI" notes:
> + *   Note that in contrast to the Intel386 ABI, %rdi, and %rsi belong to the
> + *   called function, not the caller.
> + * i.e. %edi and %esi are callee saved for i386 (because they belong to the
> + * caller).
> + */
> +.set tmp0, %edx
> +.set tmp0w, %dx
> +.set tmp1, %ebx
> +.set tmp1w, %bx
> +.set tmp2, %eax
> +.set tmp3b, %cl
> +
> +	pushl	%ebp
> +	movl	%esp, %ebp
> +
> +	pushl	dest_in
> +	pushl	dest
> +	pushl	src
> +	pushl	tmp1

Yeah, so you did various whitespace & indentation cleanups, and I think if 
we are touching trivialities we might as well fix/improve the documentation 
of this function too...

For example the comments around parameters and register clobbering are 
somewhat inaccurate and actively obfuscate what is going on.

1)

Firstly, the function uses not "3 scratch registers", but four:

   eax [tmp2]
   ebx [tmp1]
   ecx [tmp3]
   edx [tmp0]

[ Confusion probably comes from the fact that the main logic uses 3 of 
  these registers to move stuff around: tmp0/1/2, and tmp3 is clobbered as 
  part of the 'byteswap' branch. ]

2)

The description of the calling convention is needlessly obfuscated with 
calling standards details. If we want to mention it to make it clear what 
we are saving on the stack and what not, the best description is the one 
from calling.h:

   x86 function calling convention, 32-bit:
   ----------------------------------------
    arguments         | callee-saved        | extra caller-saved | return
   [callee-clobbered] |                     | [callee-clobbered] |
   -------------------------------------------------------------------------
   eax edx ecx        | ebx edi esi ebp [*] | <none>             | eax

This makes it clear that of the 4 temporary scratch registers used by 
memmove(), only ebx [tmp1] needs to be saved explicitly.

Beyond the (content-)scratch registers, the function will also internally 
clobber three other registers:

   esi [src]
   edi [dest]
   ebp [frame pointer]

These esi/edi are the indices into the memory regions.

Since esi/edi are callee-saved, these need to be saved/restored too.

This fully explains the prologue - with annotations in the comments added 
by me:

+       pushl   %ebp                // save callee-saved ebp
+       movl    %esp, %ebp          // set standard frame pointer

+       pushl   dest_in             // 'dest_in' will be the return value
+       pushl   dest                // save callee-saved edi
+       pushl   src                 // save callee-saved esi
+       pushl   tmp1                // save callee-saved ebx

...

+       popl    tmp1                // restore callee-saved ebx
+       popl    src                 // restore callee-saved esi
+       popl    dest                // restore callee-saved edi
+       popl    %eax                // memmove returns 'dest_in'

+       popl    %ebp                // restore callee-saved ebp
+       RET

3)

But since this large function clobbers *all* callee-saved general purpose 
registers of the i386 kernel function call ABI, we might as well make that 
explicit, via something like:

        /*
         * Save all callee-saved registers, because this function is
         * going to clobber all of them:
         */
        pushl   %ebp
        movl    %esp, %ebp          // set standard frame pointer
        pushl   %ebx
        pushl   %edi
        pushl   %esi

        pushl   dest_in             // save 'dest_in' parameter [eax] as the return value

        ...

        popl    dest_in             // restore 'dest_in' [eax] as the return value

        /* Restore all callee-saved registers: */
        popl    %esi
        popl    %edi
        popl    %ebx
        popl    %ebp

        RET

This IMO makes it a lot more clear what is going on in the 
prologue/epilogue and why.

Feel free to carry these changes over into your patch.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ