Message-ID: <877c1bwmki.fsf@oracle.com>
Date: Mon, 16 Jun 2025 11:47:25 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, x86@...nel.org, akpm@...ux-foundation.org,
bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
mingo@...hat.com, mjguzik@...il.com, luto@...nel.org,
peterz@...radead.org, acme@...nel.org, namhyung@...nel.org,
tglx@...utronix.de, willy@...radead.org, jon.grimm@....com,
bharata@....com, raghavendra.kt@....com, boris.ostrovsky@...cle.com,
konrad.wilk@...cle.com
Subject: Re: [PATCH v4 13/13] x86/folio_zero_user: Add multi-page clearing

Dave Hansen <dave.hansen@...el.com> writes:

> On 6/15/25 22:22, Ankur Arora wrote:
>> Override the common code version of folio_zero_user() so we can use
>> clear_pages() to do multi-page clearing instead of the standard
>> page-at-a-time clearing.
>
> I'm not a big fan of the naming in this series.
>
> To me multi-page means "more than one 'struct page'". But this series is
> clearly using multi-page clearing to mean clearing >PAGE_SIZE in one
> clear. But oh well.
I'd say it's doing both of those. Seen from the folio side, it is
clearing more than one struct page.
Once you descend to the clearing primitive, that's just page aligned
memory.
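
To make the two views concrete, here is a rough userspace sketch. It is not code from the patch: SKETCH_PAGE_SIZE and both function names are made up for illustration, and memset() stands in for the real clearing primitive. The first helper clears one page per call, which is how the folio side sees the range; the second hands the whole page-aligned extent down in a single call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Page-at-a-time view: one clearing call per page.
 * Returns the number of clearing calls issued (npages).
 */
static size_t sketch_clear_page_at_a_time(void *addr, size_t npages)
{
	unsigned char *p = addr;
	size_t calls = 0;

	assert(((uintptr_t)addr % SKETCH_PAGE_SIZE) == 0);

	for (size_t i = 0; i < npages; i++) {
		memset(p + i * SKETCH_PAGE_SIZE, 0, SKETCH_PAGE_SIZE);
		calls++;
	}
	return calls;
}

/*
 * Extent view: the primitive only sees page-aligned memory of a given
 * length, so the full size is known up front.
 * Returns the number of clearing calls issued (0 or 1).
 */
static size_t sketch_clear_extent(void *addr, size_t npages)
{
	assert(((uintptr_t)addr % SKETCH_PAGE_SIZE) == 0);

	if (npages == 0)
		return 0;

	memset(addr, 0, npages * SKETCH_PAGE_SIZE);
	return 1;
}
```

Both helpers leave the buffer in the same state; the only difference is how much of the range the clearing primitive gets to see at once.
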
> The second problem with where this ends up is that none of the code is
> *actually* x86-specific. The only thing that x86 provides that's
> interesting is a clear_pages() implementation that hands >PAGE_SIZE
> units down to the CPUs.
>
> The result is ~100 lines of code that will compile and run functionally
> on any architecture.
True. The underlying assumption is that you can provide extent level
information to string instructions which AFAIK only exists on x86.
> To me, that's deserving of an ARCH_HAS_FOO bit that we can set on the
> x86 side that then cajoles the core mm/ code to use the fancy new
> clear_pages_resched() implementation.
This seems straightforward enough.
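
One possible shape for that split, again as a userspace sketch under stated assumptions: SKETCH_ARCH_HAS_CLEAR_PAGES is a hypothetical opt-in macro (a real architecture would presumably select a Kconfig symbol), the batch size is arbitrary, and sketch_cond_resched() is a counter standing in for a real rescheduling point. The generic loop is the same for every architecture; only the batch size depends on the opt-in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical arch opt-in. With it, the generic code hands down
 * multi-page batches; without it, it falls back to one page per call.
 */
#ifdef SKETCH_ARCH_HAS_CLEAR_PAGES
#define SKETCH_CLEAR_BATCH 2048u	/* arbitrary value for the sketch */
#else
#define SKETCH_CLEAR_BATCH 1u
#endif

/* Counts how often the loop offered to reschedule. */
static unsigned long sketch_resched_points;

/* Stand-in for a rescheduling point such as cond_resched(). */
static void sketch_cond_resched(void)
{
	sketch_resched_points++;
}

/*
 * Generic clearing loop: clear npages in batches of at most
 * SKETCH_CLEAR_BATCH pages, with a rescheduling point after each batch.
 */
static void sketch_clear_pages_resched(void *addr, size_t npages)
{
	unsigned char *p = addr;

	while (npages) {
		size_t n = npages < SKETCH_CLEAR_BATCH ? npages : SKETCH_CLEAR_BATCH;

		memset(p, 0, n * SKETCH_PAGE_SIZE);
		p += n * SKETCH_PAGE_SIZE;
		npages -= n;
		sketch_cond_resched();
	}
}
```

Built without the opt-in macro, this degenerates to page-at-a-time clearing with a rescheduling point after every page; built with it, the same loop passes larger extents to the primitive.
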
> Because what are the arm64 guys going to do when their CPUs start doing
> this? They're either going to copy-and-paste the x86 implementation or
> they're going to go refactor the x86 implementation into common
> code.
These instructions have been around for an awfully long time. Are other
architectures looking at adding similar instructions?
I think this is definitely worth doing if there are performance advantages on
arm64 -- maybe just because of the reduced per-page overhead.
Let me try this out on arm64.
> My money is on the refactoring, because those arm64 guys do good work.
> Could we save them the trouble, please?
> Oh, and one other little thing:
>
>> +/*
>> + * Limit the optimized version of folio_zero_user() to !CONFIG_HIGHMEM.
>> + * We do that because clear_pages() works on contiguous kernel pages
>> + * which might not be true under HIGHMEM.
>> + */
>
> The tip trees are picky about imperative voice, so no "we's". But if you
> stick this in mm/, folks are less picky. ;)
Hah. That might come in handy ;).
--
ankur