Message-ID: <0D5ABF4F-B1F2-4EB2-BD3B-A8312629D55E@infradead.org>
Date: Fri, 25 Apr 2025 21:36:21 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: David Hildenbrand <david@...hat.com>, Mike Rapoport <rppt@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
"Sauerwein, David" <dssauerw@...zon.de>,
Anshuman Khandual <anshuman.khandual@....com>,
Ard Biesheuvel <ardb@...nel.org>, Catalin Marinas <catalin.marinas@....com>,
Marc Zyngier <maz@...nel.org>, Mark Rutland <mark.rutland@....com>,
Mike Rapoport <rppt@...ux.ibm.com>, Will Deacon <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Ruihan Li <lrh2000@....edu.cn>
Subject: Re: [PATCH v4 7/7] mm/mm_init: Use for_each_valid_pfn() in init_unavailable_range()
On 25 April 2025 21:12:49 BST, David Hildenbrand <david@...hat.com> wrote:
>On 25.04.25 21:08, David Woodhouse wrote:
>> On 25 April 2025 17:17:25 BST, David Hildenbrand <david@...hat.com> wrote:
>>> On 23.04.25 15:33, David Woodhouse wrote:
>>>> From: David Woodhouse <dwmw@...zon.co.uk>
>>>>
>>>> Currently, memmap_init initializes pfn_hole with 0 instead of
>>>> ARCH_PFN_OFFSET. Then init_unavailable_range will start iterating each
>>>> page from the page at address zero to the first available page, but it
>>>> won't do anything for pages below ARCH_PFN_OFFSET because pfn_valid
>>>> won't pass.
>>>>
>>>> If ARCH_PFN_OFFSET is very large (e.g., something like 2^64-2GiB if the
>>>> kernel is used as a library and loaded at a very high address), the
>>>> pointless iteration for pages below ARCH_PFN_OFFSET will take a very
>>>> long time, and the kernel will look stuck at boot time.
>>>>
>>>> Use for_each_valid_pfn() to skip the pointless iterations.
>>>>
>>>> Reported-by: Ruihan Li <lrh2000@....edu.cn>
>>>> Suggested-by: Mike Rapoport <rppt@...nel.org>
>>>> Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>
>>>> Reviewed-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
>>>> Tested-by: Ruihan Li <lrh2000@....edu.cn>
>>>> ---
>>>> mm/mm_init.c | 6 +-----
>>>> 1 file changed, 1 insertion(+), 5 deletions(-)
>>>>
>>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>>> index 41884f2155c4..0d1a4546825c 100644
>>>> --- a/mm/mm_init.c
>>>> +++ b/mm/mm_init.c
>>>> @@ -845,11 +845,7 @@ static void __init init_unavailable_range(unsigned long spfn,
>>>>  	unsigned long pfn;
>>>>  	u64 pgcnt = 0;
>>>> -	for (pfn = spfn; pfn < epfn; pfn++) {
>>>> -		if (!pfn_valid(pageblock_start_pfn(pfn))) {
>>>> -			pfn = pageblock_end_pfn(pfn) - 1;
>>>> -			continue;
>>>> -		}
>>>
>>> So, if the first pfn in a pageblock is not valid, we skip the whole pageblock ...
>>>
>>>> +	for_each_valid_pfn(pfn, spfn, epfn) {
>>>>  		__init_single_page(pfn_to_page(pfn), pfn, zone, node);
>>>>  		__SetPageReserved(pfn_to_page(pfn));
>>>>  		pgcnt++;
>>>
>>> but here, we would process further pfns inside such a pageblock?
>>>
>>
>> Is it not the case that either *all*, or *none*, of the PFNs within a given pageblock will be valid?
>
>Hmm, good point. I was thinking about sub-sections, but all early sections are fully valid.
>
>(Also, at least on x86, the subsection size should match the pageblock size; might not be the case on other architectures, like arm64 with 64K base pages ...)
>
>>
>> I assumed that was *why* it had that skip, as an attempt at the kind of optimisation that for_each_valid_pfn() now gives us?
>
>But it's interesting that in this code we didn't optimize for the case "if the first pfn is valid, all the remaining ones are valid". We would still check each PFN.
>
>In any case, trying to figure out why Lorenzo ran into an issue ... if it's not because of the pageblock, maybe something in for_each_valid_pfn with sparsemem is still shaky.
>
A previous round of the patch series had a less aggressively optimised version of the sparsemem implementation...?
Will see if I can reproduce in the morning. A boot in QEMU worked here before I sent it out.
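
For readers following the thread, here is a small standalone sketch (not kernel code; the toy_* helpers and the 0x8000 validity threshold are invented for illustration) of the two loops under discussion. It assumes, as noted above, that PFN validity is uniform within a pageblock for early sections, in which case the old pageblock-skipping loop and a loop over only the valid PFNs visit the same pages. The real for_each_valid_pfn() additionally jumps over invalid ranges instead of testing each PFN, which is what avoids the long crawl below ARCH_PFN_OFFSET.

/*
 * Standalone toy model of the two iteration strategies. The helpers
 * below are simplified stand-ins, not the real pfn_valid()/pageblock
 * helpers from the kernel.
 */
#include <stdio.h>
#include <stdbool.h>

#define PAGEBLOCK_ORDER		9
#define PAGEBLOCK_NR_PAGES	(1UL << PAGEBLOCK_ORDER)

/* Toy validity: pretend only PFNs >= 0x8000 have a memmap. */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn >= 0x8000;
}

static unsigned long toy_pageblock_start_pfn(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

static unsigned long toy_pageblock_end_pfn(unsigned long pfn)
{
	return toy_pageblock_start_pfn(pfn) + PAGEBLOCK_NR_PAGES;
}

/* Old style: test the start of each pageblock, skip the block if invalid. */
static unsigned long count_old(unsigned long spfn, unsigned long epfn)
{
	unsigned long pfn, pgcnt = 0;

	for (pfn = spfn; pfn < epfn; pfn++) {
		if (!toy_pfn_valid(toy_pageblock_start_pfn(pfn))) {
			pfn = toy_pageblock_end_pfn(pfn) - 1;
			continue;
		}
		pgcnt++;
	}
	return pgcnt;
}

/*
 * New style: visit only PFNs that are individually valid. (The real
 * for_each_valid_pfn() skips whole invalid ranges rather than testing
 * each PFN as this toy loop does.)
 */
static unsigned long count_new(unsigned long spfn, unsigned long epfn)
{
	unsigned long pfn, pgcnt = 0;

	for (pfn = spfn; pfn < epfn; pfn++) {
		if (!toy_pfn_valid(pfn))
			continue;
		pgcnt++;
	}
	return pgcnt;
}

int main(void)
{
	unsigned long spfn = 0, epfn = 0x9000;

	/* Both count the same pages when validity is uniform per pageblock. */
	printf("old=%lu new=%lu\n", count_old(spfn, epfn), count_new(spfn, epfn));
	return 0;
}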