Message-ID: <Ys1p27uWqjWlcaa1@localhost.localdomain>
Date: Tue, 12 Jul 2022 15:32:27 +0300
From: Alexey Dobriyan <adobriyan@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mark Hemment <markhemm@...glemail.com>,
Andrew Morton <akpm@...ux-foundation.org>,
the arch/x86 maintainers <x86@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
patrice.chotard@...s.st.com, Mikulas Patocka <mpatocka@...hat.com>,
Lukas Czerner <lczerner@...hat.com>,
Christoph Hellwig <hch@....de>,
"Darrick J. Wong" <djwong@...nel.org>,
Chuck Lever <chuck.lever@...cle.com>,
Hugh Dickins <hughd@...gle.com>, patches@...ts.linux.dev,
Linux-MM <linux-mm@...ck.org>, mm-commits@...r.kernel.org,
Mel Gorman <mgorman@...e.de>
Subject: Re: [PATCH -final] x86/clear_user: Make it faster
On Mon, Jul 11, 2022 at 12:33:20PM +0200, Borislav Petkov wrote:
> On Wed, Jul 06, 2022 at 12:24:12PM +0300, Alexey Dobriyan wrote:
> > On Tue, Jul 05, 2022 at 07:01:06PM +0200, Borislav Petkov wrote:
> >
> > > + asm volatile(
> > > + "1:\n\t"
> > > + ALTERNATIVE_3("rep stosb",
> > > + "call clear_user_erms", ALT_NOT(X86_FEATURE_FSRM),
> > > + "call clear_user_rep_good", ALT_NOT(X86_FEATURE_ERMS),
> > > + "call clear_user_original", ALT_NOT(X86_FEATURE_REP_GOOD))
> > > + "2:\n"
> > > + _ASM_EXTABLE_UA(1b, 2b)
> > > + : "+&c" (size), "+&D" (addr), ASM_CALL_CONSTRAINT
> > > + : "a" (0)
> > > + /* rep_good clobbers %rdx */
> > > + : "rdx");
> >
> > "+c" and "+D" should be enough for 1 instruction assembly?
>
> I'm looking at
>
> e0a96129db57 ("x86: use early clobbers in usercopy*.c")
>
> which introduced the early clobbers and I'm thinking we want them
> because "this operand is an earlyclobber operand, which is written
> before the instruction is finished using the input operands" and we have
> exception handling.
>
> But maybe you need to be more verbose as to what you mean exactly...
This is the original code:
-#define __do_strncpy_from_user(dst,src,count,res) \
-do { \
- long __d0, __d1, __d2; \
- might_fault(); \
- __asm__ __volatile__( \
- " testq %1,%1\n" \
- " jz 2f\n" \
- "0: lodsb\n" \
- " stosb\n" \
- " testb %%al,%%al\n" \
- " jz 1f\n" \
- " decq %1\n" \
- " jnz 0b\n" \
- "1: subq %1,%0\n" \
- "2:\n" \
- ".section .fixup,\"ax\"\n" \
- "3: movq %5,%0\n" \
- " jmp 2b\n" \
- ".previous\n" \
- _ASM_EXTABLE(0b,3b) \
- : "=&r"(res), "=&c"(count), "=&a" (__d0), "=&S" (__d1), \
- "=&D" (__d2) \
- : "i"(-EFAULT), "0"(count), "1"(count), "3"(src), "4"(dst) \
- : "memory"); \
-} while (0)
I meant to say that the earlyclobber is necessary only because the asm body
is more than 1 instruction, so there is a possibility of writing to some
outputs before all inputs are consumed.
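A minimal user-space sketch of that multi-instruction case, assuming x86-64 and GCC/Clang extended asm (the function name is made up for illustration and is not from usercopy*.c). The first instruction writes the output while the second still needs an input, so the output is marked "=&r" to keep it out of the input registers:

```c
#include <assert.h>

/* Hypothetical helper: returns (a + b) * 2. */
static long add_then_double(long a, long b)
{
	long out;
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__("mov %1, %0\n\t"	/* out is written here ...          */
		"add %2, %0\n\t"	/* ... while input b is still live  */
		"add %0, %0"
		: "=&r" (out)		/* earlyclobber: never shares a/b's reg */
		: "r" (a), "r" (b)
		: "cc");
#else
	/* Portable fallback so the sketch builds on other targets. */
	out = (a + b) * 2;
#endif
	return out;
}
```

Without the "&", the compiler would be free to put out and b in the same register, and the first mov would destroy b before the add reads it.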
If the asm body is 1 insn there is no such possibility at all.
Now "rep stosb" is 1 instruction and the alternative functions masquerade
as a single instruction.
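For comparison, a user-space sketch of the single-instruction form with plain "+c" and "+D", again assuming x86-64 and GCC/Clang. zero_fill() is a hypothetical name, not the kernel's __clear_user(), and there is no exception table here, so it only illustrates the register-constraint side of the argument and not the fault-handling concern from the quoted reply:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zeroes n bytes at p, returns bytes left unwritten. */
static size_t zero_fill(void *p, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
	/*
	 * One instruction: %al is read and %rcx/%rdi are updated by the
	 * same "rep stosb", so there is no earlier instruction in the body
	 * that could write an output while an input is still pending.
	 */
	__asm__ __volatile__("rep stosb"
			     : "+c" (n), "+D" (p)
			     : "a" (0)
			     : "memory");
#else
	/* Portable fallback so the sketch builds on other targets. */
	memset(p, 0, n);
	n = 0;
#endif
	return n;
}
```

On completion %rcx is 0, so the function returns 0 for a buffer that was fully written.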