Message-ID: <0D5ABF4F-B1F2-4EB2-BD3B-A8312629D55E@infradead.org>
Date: Fri, 25 Apr 2025 21:36:21 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: David Hildenbrand <david@...hat.com>, Mike Rapoport <rppt@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
"Sauerwein, David" <dssauerw@...zon.de>,
Anshuman Khandual <anshuman.khandual@....com>,
Ard Biesheuvel <ardb@...nel.org>, Catalin Marinas <catalin.marinas@....com>,
Marc Zyngier <maz@...nel.org>, Mark Rutland <mark.rutland@....com>,
Mike Rapoport <rppt@...ux.ibm.com>, Will Deacon <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Ruihan Li <lrh2000@....edu.cn>
Subject: Re: [PATCH v4 7/7] mm/mm_init: Use for_each_valid_pfn() in init_unavailable_range()
On 25 April 2025 21:12:49 BST, David Hildenbrand <david@...hat.com> wrote:
>On 25.04.25 21:08, David Woodhouse wrote:
>> On 25 April 2025 17:17:25 BST, David Hildenbrand <david@...hat.com> wrote:
>>> On 23.04.25 15:33, David Woodhouse wrote:
>>>> From: David Woodhouse <dwmw@...zon.co.uk>
>>>>
>>>> Currently, memmap_init initializes pfn_hole with 0 instead of
>>>> ARCH_PFN_OFFSET. Then init_unavailable_range will start iterating each
>>>> page from the page at address zero to the first available page, but it
>>>> won't do anything for pages below ARCH_PFN_OFFSET because pfn_valid
>>>> won't pass.
>>>>
>>>> If ARCH_PFN_OFFSET is very large (e.g., something like 2^64-2GiB if the
>>>> kernel is used as a library and loaded at a very high address), the
>>>> pointless iteration for pages below ARCH_PFN_OFFSET will take a very
>>>> long time, and the kernel will look stuck at boot time.
>>>>
>>>> Use for_each_valid_pfn() to skip the pointless iterations.
>>>>
>>>> Reported-by: Ruihan Li <lrh2000@....edu.cn>
>>>> Suggested-by: Mike Rapoport <rppt@...nel.org>
>>>> Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>
>>>> Reviewed-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
>>>> Tested-by: Ruihan Li <lrh2000@....edu.cn>
>>>> ---
>>>> mm/mm_init.c | 6 +-----
>>>> 1 file changed, 1 insertion(+), 5 deletions(-)
>>>>
>>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>>> index 41884f2155c4..0d1a4546825c 100644
>>>> --- a/mm/mm_init.c
>>>> +++ b/mm/mm_init.c
>>>> @@ -845,11 +845,7 @@ static void __init init_unavailable_range(unsigned long spfn,
>>>>  	unsigned long pfn;
>>>>  	u64 pgcnt = 0;
>>>> -	for (pfn = spfn; pfn < epfn; pfn++) {
>>>> -		if (!pfn_valid(pageblock_start_pfn(pfn))) {
>>>> -			pfn = pageblock_end_pfn(pfn) - 1;
>>>> -			continue;
>>>> -		}
>>>
>>> So, if the first pfn in a pageblock is not valid, we skip the whole pageblock ...
>>>
>>>> +	for_each_valid_pfn(pfn, spfn, epfn) {
>>>>  		__init_single_page(pfn_to_page(pfn), pfn, zone, node);
>>>>  		__SetPageReserved(pfn_to_page(pfn));
>>>>  		pgcnt++;
>>>
>>> but here, we would process further pfns inside such a pageblock?
>>>
>>
>> Is it not the case that either *all*, or *none*, of the PFNs within a given pageblock will be valid?
>
>Hmm, good point. I was thinking about sub-sections, but all early sections are fully valid.
>
>(Also, at least on x86, the subsection size should match the pageblock size; might not be the case on other architectures, like arm64 with 64K base pages ...)
>
>>
>> I assumed that was *why* it had that skip, as an attempt at the kind of optimisation that for_each_valid_pfn() now gives us?
>
>But it's interesting that in this code we didn't optimize for the case "if the first pfn is valid, all the remaining ones are valid". We would still check each PFN.
>
>In any case, trying to figure out why Lorenzo ran into an issue ... if it's not because of the pageblock, maybe something in for_each_valid_pfn with sparsemem is still shaky.
>
A previous round of the patch series had a less aggressively optimised version of the sparsemem implementation...?
Will see if I can reproduce in the morning. A boot in QEMU worked here before I sent it out.
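
For readers following the thread, here is a small standalone sketch (not kernel code; the toy_* helpers and the 0x8000 validity threshold are invented for illustration) of the two loops under discussion. It assumes, as noted above, that PFN validity is uniform within a pageblock for early sections, in which case the old pageblock-skipping loop and a loop over only the valid PFNs visit the same pages. The real for_each_valid_pfn() additionally jumps over invalid ranges instead of testing each PFN, which is what avoids the long crawl below ARCH_PFN_OFFSET.

/*
 * Standalone toy model of the two iteration strategies. The helpers
 * below are simplified stand-ins, not the real pfn_valid()/pageblock
 * helpers from the kernel.
 */
#include <stdio.h>
#include <stdbool.h>

#define PAGEBLOCK_ORDER		9
#define PAGEBLOCK_NR_PAGES	(1UL << PAGEBLOCK_ORDER)

/* Toy validity: pretend only PFNs >= 0x8000 have a memmap. */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn >= 0x8000;
}

static unsigned long toy_pageblock_start_pfn(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

static unsigned long toy_pageblock_end_pfn(unsigned long pfn)
{
	return toy_pageblock_start_pfn(pfn) + PAGEBLOCK_NR_PAGES;
}

/* Old style: test the start of each pageblock, skip the block if invalid. */
static unsigned long count_old(unsigned long spfn, unsigned long epfn)
{
	unsigned long pfn, pgcnt = 0;

	for (pfn = spfn; pfn < epfn; pfn++) {
		if (!toy_pfn_valid(toy_pageblock_start_pfn(pfn))) {
			pfn = toy_pageblock_end_pfn(pfn) - 1;
			continue;
		}
		pgcnt++;
	}
	return pgcnt;
}

/*
 * New style: visit only PFNs that are individually valid. (The real
 * for_each_valid_pfn() skips whole invalid ranges rather than testing
 * each PFN as this toy loop does.)
 */
static unsigned long count_new(unsigned long spfn, unsigned long epfn)
{
	unsigned long pfn, pgcnt = 0;

	for (pfn = spfn; pfn < epfn; pfn++) {
		if (!toy_pfn_valid(pfn))
			continue;
		pgcnt++;
	}
	return pgcnt;
}

int main(void)
{
	unsigned long spfn = 0, epfn = 0x9000;

	/* Both count the same pages when validity is uniform per pageblock. */
	printf("old=%lu new=%lu\n", count_old(spfn, epfn), count_new(spfn, epfn));
	return 0;
}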