Message-ID: <f8140a17-c4ec-489b-b314-d45abe48bf36@redhat.com>
Date: Mon, 25 Aug 2025 17:42:33 +0200
From: David Hildenbrand <david@...hat.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Mika Penttilä <mpenttil@...hat.com>,
linux-kernel@...r.kernel.org, Alexander Potapenko <glider@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Brendan Jackman <jackmanb@...gle.com>, Christoph Lameter <cl@...two.org>,
Dennis Zhou <dennis@...nel.org>, Dmitry Vyukov <dvyukov@...gle.com>,
dri-devel@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org,
iommu@...ts.linux.dev, io-uring@...r.kernel.org,
Jason Gunthorpe <jgg@...dia.com>, Jens Axboe <axboe@...nel.dk>,
Johannes Weiner <hannes@...xchg.org>, John Hubbard <jhubbard@...dia.com>,
kasan-dev@...glegroups.com, kvm@...r.kernel.org,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>, linux-arm-kernel@...s.com,
linux-arm-kernel@...ts.infradead.org, linux-crypto@...r.kernel.org,
linux-ide@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-mips@...r.kernel.org, linux-mmc@...r.kernel.org, linux-mm@...ck.org,
linux-riscv@...ts.infradead.org, linux-s390@...r.kernel.org,
linux-scsi@...r.kernel.org, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Marco Elver <elver@...gle.com>, Marek Szyprowski <m.szyprowski@...sung.com>,
Michal Hocko <mhocko@...e.com>, Muchun Song <muchun.song@...ux.dev>,
netdev@...r.kernel.org, Oscar Salvador <osalvador@...e.de>,
Peter Xu <peterx@...hat.com>, Robin Murphy <robin.murphy@....com>,
Suren Baghdasaryan <surenb@...gle.com>, Tejun Heo <tj@...nel.org>,
virtualization@...ts.linux.dev, Vlastimil Babka <vbabka@...e.cz>,
wireguard@...ts.zx2c4.com, x86@...nel.org, Zi Yan <ziy@...dia.com>
Subject: Re: [PATCH RFC 10/35] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()

On 25.08.25 16:59, Mike Rapoport wrote:
> On Mon, Aug 25, 2025 at 04:38:03PM +0200, David Hildenbrand wrote:
>> On 25.08.25 16:32, Mike Rapoport wrote:
>>> On Mon, Aug 25, 2025 at 02:48:58PM +0200, David Hildenbrand wrote:
>>>> On 23.08.25 10:59, Mike Rapoport wrote:
>>>>> On Fri, Aug 22, 2025 at 08:24:31AM +0200, David Hildenbrand wrote:
>>>>>> On 22.08.25 06:09, Mika Penttilä wrote:
>>>>>>>
>>>>>>> On 8/21/25 23:06, David Hildenbrand wrote:
>>>>>>>
>>>>>>>> All pages were already initialized and set to PageReserved() with a
>>>>>>>> refcount of 1 by MM init code.
>>>>>>>
>>>>>>> Just to be sure, how is this working with MEMBLOCK_RSRV_NOINIT, where MM is supposed not to
>>>>>>> initialize struct pages?
>>>>>>
>>>>>> Excellent point, I did not know about that one.
>>>>>>
>>>>>> Spotting that we don't do the same for the head page made me assume that
>>>>>> it's just a misuse of __init_single_page().
>>>>>>
>>>>>> But the nasty thing is that we use memblock_reserved_mark_noinit() to only
>>>>>> mark the tail pages ...
>>>>>
>>>>> An even nastier thing is that when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
>>>>> disabled, struct pages are initialized regardless of
>>>>> memblock_reserved_mark_noinit().
>>>>>
>>>>> I think this patch should go in before your updates:
>>>>
>>>> Shouldn't we fix this in memblock code?
>>>>
>>>> Hacking around that in the memblock_reserved_mark_noinit() user sounds wrong
>>>> -- and nothing in the doc of memblock_reserved_mark_noinit() spells that
>>>> behavior out.
>>>
>>> We can surely update the docs, but unfortunately I don't see how to avoid
>>> hacking around it in hugetlb.
>>> Since it's used to optimise HVO even further to the point hugetlb open
>>> codes memmap initialization, I think it's fair that it should deal with all
>>> possible configurations.
>>
>> Remind me, why can't we support memblock_reserved_mark_noinit() when
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT is disabled?
>
> When CONFIG_DEFERRED_STRUCT_PAGE_INIT is disabled we initialize the entire
> memmap early (setup_arch()->free_area_init()), and we may have a bunch of
> memblock_reserved_mark_noinit() afterwards

Oh, you mean that we get effective memblock modifications after already
initializing the memmap.

That sounds ... interesting :)

So yeah, we have to document this for memblock_reserved_mark_noinit().

Is it also a problem for kexec_handover?

We should do something like:

diff --git a/mm/memblock.c b/mm/memblock.c
index 154f1d73b61f2..ed4c563d72c32 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1091,13 +1091,16 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 /**
  * memblock_reserved_mark_noinit - Mark a reserved memory region with flag
- * MEMBLOCK_RSRV_NOINIT which results in the struct pages not being initialized
- * for this region.
+ * MEMBLOCK_RSRV_NOINIT which allows for the "struct pages" corresponding
+ * to this region not getting initialized, because the caller will take
+ * care of it.
  * @base: the base phys addr of the region
  * @size: the size of the region
  *
- * struct pages will not be initialized for reserved memory regions marked with
- * %MEMBLOCK_RSRV_NOINIT.
+ * "struct pages" will not be initialized for reserved memory regions marked
+ * with %MEMBLOCK_RSRV_NOINIT if this function is called before initialization
+ * code runs. Without CONFIG_DEFERRED_STRUCT_PAGE_INIT, it is more likely
+ * that this function is not effective.
  *
  * Return: 0 on success, -errno on failure.
  */
Optimizing the hugetlb code could be done, but I am not sure how high
the priority is (nobody complained so far about the double init).
--
Cheers
David / dhildenb