[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z_zuMvZq43nc0SoO@gmail.com>
Date: Mon, 14 Apr 2025 13:14:58 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, x86@...nel.org, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, mingo@...hat.com,
luto@...nel.org, paulmck@...nel.org, rostedt@...dmis.org,
tglx@...utronix.de, willy@...radead.org, jon.grimm@....com,
bharata@....com, raghavendra.kt@....com, boris.ostrovsky@...cle.com,
konrad.wilk@...cle.com
Subject: Re: [PATCH v3 1/4] x86/clear_page: extend clear_page*() for
multi-page clearing
* Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, Apr 14, 2025 at 08:32:29AM +0200, Ingo Molnar wrote:
>
> > > static inline void clear_page(void *page)
> > > {
> > > + unsigned int length = PAGE_SIZE;
> > > /*
> > > - * Clean up KMSAN metadata for the page being cleared. The assembly call
> > > + * Clean up KMSAN metadata for the pages being cleared. The assembly call
> > > * below clobbers @page, so we perform unpoisoning before it.
> >
> > > */
> > > - kmsan_unpoison_memory(page, PAGE_SIZE);
> > > - alternative_call_2(clear_page_orig,
> > > - clear_page_rep, X86_FEATURE_REP_GOOD,
> > > - clear_page_erms, X86_FEATURE_ERMS,
> > > + kmsan_unpoison_memory(page, length);
> > > +
> > > + alternative_call_2(clear_pages_orig,
> > > + clear_pages_rep, X86_FEATURE_REP_GOOD,
> > > + clear_pages_erms, X86_FEATURE_ERMS,
> > > "=D" (page),
> > > - "D" (page),
> > > + ASM_INPUT("D" (page), "S" (length)),
> > > "cc", "memory", "rax", "rcx");
> > > }
> > >
> > > diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
> > > index a508e4a8c66a..bce516263b69 100644
> > > --- a/arch/x86/lib/clear_page_64.S
> > > +++ b/arch/x86/lib/clear_page_64.S
> > > @@ -13,20 +13,35 @@
> > > */
> > >
> > > /*
> > > - * Zero a page.
> > > - * %rdi - page
> > > + * Zero kernel page aligned region.
> > > + *
> > > + * Input:
> > > + * %rdi - destination
> > > + * %esi - length
> > > + *
> > > + * Clobbers: %rax, %rcx
> > > */
> > > -SYM_TYPED_FUNC_START(clear_page_rep)
> > > - movl $4096/8,%ecx
> > > +SYM_TYPED_FUNC_START(clear_pages_rep)
> > > + movl %esi, %ecx
> > > xorl %eax,%eax
> > > + shrl $3,%ecx
> > > rep stosq
> > > RET
> > > -SYM_FUNC_END(clear_page_rep)
> > > -EXPORT_SYMBOL_GPL(clear_page_rep)
> > > +SYM_FUNC_END(clear_pages_rep)
> > > +EXPORT_SYMBOL_GPL(clear_pages_rep)
> > >
> > > -SYM_TYPED_FUNC_START(clear_page_orig)
> > > +/*
> > > + * Original page zeroing loop.
> > > + * Input:
> > > + * %rdi - destination
> > > + * %esi - length
> > > + *
> > > + * Clobbers: %rax, %rcx, %rflags
> > > + */
> > > +SYM_TYPED_FUNC_START(clear_pages_orig)
> > > + movl %esi, %ecx
> > > xorl %eax,%eax
> > > - movl $4096/64,%ecx
> > > + shrl $6,%ecx
> >
> > So if the natural input parameter is RCX, why is this function using
> > RSI as the input 'length' parameter? Causes unnecessary register
> > shuffling.
>
> This symbol is written as a C function with C calling convention, even
> though it is only meant to be called from that clear_page() alternative.
>
> If we want to go change all this, then we should go do the same we do
> for __clear_user() and write it thusly:
>
> asm volatile(ALTERNATIVE("rep stosb",
> "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRS)
> : "+c" (size), "+D" (addr), ASM_CALL_CONSTRAINT
> : "a" (0))
>
> And forget about all those clear_page_*() thingies.
Yeah.
Thanks,
Ingo
Powered by blists - more mailing lists