Message-ID: <87bjsoq9fd.fsf@oracle.com>
Date: Tue, 22 Apr 2025 12:32:22 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Zi Yan <ziy@...dia.com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, x86@...nel.org, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, mingo@...hat.com, luto@...nel.org, peterz@...radead.org,
paulmck@...nel.org, rostedt@...dmis.org, tglx@...utronix.de,
willy@...radead.org, jon.grimm@....com, bharata@....com,
raghavendra.kt@....com, boris.ostrovsky@...cle.com,
konrad.wilk@...cle.com
Subject: Re: [PATCH v3 0/4] mm/folio_zero_user: add multi-page clearing

Zi Yan <ziy@...dia.com> writes:
> On 13 Apr 2025, at 23:46, Ankur Arora wrote:
>
>> This series adds multi-page clearing for hugepages. It is a rework
>> of [1] which took a detour through PREEMPT_LAZY [2].
>>
>> Why multi-page clearing?: multi-page clearing improves upon the
>> current page-at-a-time approach by providing the processor with a
>> hint as to the real region size. A processor could use this hint to,
>> for instance, elide cacheline allocation when clearing a large
>> region.
>>
>> This optimization in particular is done by REP; STOS on AMD Zen
>> where regions larger than L3-size use non-temporal stores.
>>
>> This results in significantly better performance.
>
> Do you have init_on_alloc=1 in your kernel?
> With that, pages coming from buddy allocator are zeroed
> in post_alloc_hook() by kernel_init_pages(), which is a for loop
> of clear_highpage_kasan_tagged(), a wrap of clear_page().
> And folio_zero_user() is not used.
>
> At least Debian, Fedora, and Ubuntu by default have
> CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y, which means init_on_alloc=1.
>
> Maybe kernel_init_pages() should get your optimization as well,
> unless you only target hugetlb pages.

Thanks for the suggestion. I do plan to look for other places where
we could be zeroing contiguous regions.

Often the problem is that even if the underlying region is physically
contiguous, it isn't necessarily contiguous in the kernel's address
space under CONFIG_HIGHMEM. For instance, clear_highpage_kasan_tagged()
does a kmap_local_page()/kunmap_local() around the clearing. Since that
is done one page at a time, the contiguous region gets cleared as
multiple separate 4K pages even when CONFIG_HIGHMEM is not defined.

I wonder if we need a clear_highpages() kind of abstraction which lets
HIGHMEM and non-HIGHMEM go their separate ways.
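
To make the idea concrete, here is a rough userspace model of what such
an abstraction might look like. This is only a sketch, not kernel code:
the MODEL_HIGHMEM macro and every model_* function are hypothetical
stand-ins invented for illustration, and only the clear_highpages()
name comes from the suggestion above. The HIGHMEM variant maps and
clears one page at a time, while the non-HIGHMEM variant hands the
whole extent to a single clearing call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL

/* Number of clearing calls issued, so the two paths can be told apart. */
static size_t clear_calls;

/*
 * Hypothetical stand-ins for a per-page local mapping. In this userspace
 * model the "mapping" is simply the page's address inside the buffer.
 */
static inline void *model_kmap_local(unsigned char *base, size_t idx)
{
	return base + idx * MODEL_PAGE_SIZE;
}

static inline void model_kunmap_local(void *addr)
{
	(void)addr;
}

/* Clear npages contiguous pages starting at addr with one call. */
static void model_clear_pages(void *addr, size_t npages)
{
	memset(addr, 0, npages * MODEL_PAGE_SIZE);
	clear_calls++;
}

/*
 * Hypothetical clear_highpages(): with MODEL_HIGHMEM each page needs its
 * own temporary mapping, so clearing stays page-at-a-time. Without it the
 * extent is already linearly addressable and is cleared in one go, which
 * lets the clearing primitive see the real region size.
 */
static void model_clear_highpages(unsigned char *base, size_t npages)
{
	assert(npages > 0);
#ifdef MODEL_HIGHMEM
	for (size_t i = 0; i < npages; i++) {
		void *kaddr = model_kmap_local(base, i);

		model_clear_pages(kaddr, 1);
		model_kunmap_local(kaddr);
	}
#else
	model_clear_pages(base, npages);
#endif
}
```

Built without MODEL_HIGHMEM, clearing an 8-page buffer issues a single
clearing call; built with -DMODEL_HIGHMEM, the same request is split
into 8 single-page calls.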
--
ankur