[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wiAwEA5KekL4+t+v6qAdKLSOQvBkGbumvk+Rxhaay8aFg@mail.gmail.com>
Date: Fri, 23 Sep 2022 10:29:58 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Nick Desaulniers <ndesaulniers@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Kees Cook <keescook@...omium.org>,
linux-kernel@...r.kernel.org, llvm@...ts.linux.dev,
Andy Lutomirski <luto@...nel.org>, Ma Ling <ling.ma@...el.com>
Subject: Re: [PATCH] x86, mem: move memmove to out of line assembler
On Fri, Sep 23, 2022 at 10:02 AM Nick Desaulniers
<ndesaulniers@...gle.com> wrote:
>
> memmove is quite large and probably shouldn't be inlined due to size
> alone. A noinline function attribute would be the simplest fix, but
> there's a few things that stand out with the current definition:
I don't think your patch is wrong, and it's not that different from
what the x86-64 code did long ago back in 2011 in commit 9599ec0471de
("x86-64, mem: Convert memmove() to assembly file and fix return value
bug")
But I wonder if we might be better off getting rid of that horrid
memmove thing entirely. The original thing seems to be from 2010,
where commit 3b4b682becdf ("x86, mem: Optimize memmove for small size
and unaligned cases") did this thing for both 32-bit and 64-bit code.
And it's really not likely used all that much - memcpy is the *much*
more important function, and that's the one that has actually been
updated for modern CPU's instead of looking like it's some copy from
glibc or whatever.
To make things even stranger, on the x86-64 side, we actually have
*two* copies of 'memmove()' - it looks like memcpy_orig() is already a
memmove, in that it does that
cmp %dil, %sil
jl .Lcopy_backward
thing that seems to mean that it already handles the overlapping case.
Anyway, that 32-bit memmove() implemented (badly) as inline asm does
need to go. As you point out, it seems to get the clobbers wrong too,
so it's actively buggy and just happens to work because there's
nothing else in that function, and it clobbers %bx that is supposed to
be callee-saved, but *presumably* the compile has to allocate %bx one
of the other registers, so it will save and restore %bx anyway.
So that current memmove() seems to be truly horrendously buggy, but in
a "it happens to work" way. Your patch seems an improvement, but I get
the feeling that it could be improved a lot more...
Linus
Powered by blists - more mailing lists