Message-ID: <7c8b8dd2-0348-4311-b237-6129fcc60b08@linux.ibm.com>
Date: Tue, 11 Mar 2025 20:33:30 +0530
From: Donet Tom <donettom@...ux.ibm.com>
To: David Hildenbrand <david@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org
Cc: Ritesh Harjani <ritesh.list@...il.com>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Danilo Krummrich <dakr@...nel.org>
Subject: Re: [PATCH] driver/base/node.c: Fix softlockups during the
 initialization of large systems with interleaved memory blocks


On 3/11/25 2:52 PM, David Hildenbrand wrote:
> On 10.03.25 12:53, Donet Tom wrote:
>> On large systems with more than 64TB of DRAM, if the memory blocks
>> are interleaved, node initialization (node_dev_init()) could take
>> a long time since it iterates over each memory block. If the memory
>> block belongs to the current iterating node, the first pfn_to_nid
>> will provide the correct value. Otherwise, it will iterate over all
>> PFNs and check the nid. On non-preemptive kernels, this can result
>> in a watchdog softlockup warning. Even though CONFIG_PREEMPT_LAZY
>> is enabled in kernels now [1], we may still need to fix older
>> stable kernels to avoid encountering these kernel warnings during
>> boot.
>
> If it's not an issue upstream, there is no need for an upstream patch.
>
> Fix stable kernels separately.
>
> Or did I get you wrong and this can be triggered upstream?


Yes, the issue is present upstream if CONFIG_PREEMPT_LAZY is disabled.

Thanks
Donet


>
>>
>> This patch adds a cond_resched() call in node_dev_init() to avoid
>> this warning.
>>
>> node_dev_init()
>>      register_one_node
>>          register_memory_blocks_under_node
>>              walk_memory_blocks()
>>                  register_mem_block_under_node_early
>>                      get_nid_for_pfn
>>                          early_pfn_to_nid
>>
>> On my system, node4 has memory blocks ranging from memory30351
>> to memory38524, plus memory128433. The memory blocks between
>> memory38524 and memory128433 do not belong to this node.
>>
>> In walk_memory_blocks(), we iterate over all memory blocks from
>> memory38524 to memory128433.
>> In register_mem_block_under_node_early(), up to memory38524, the
>> first pfn correctly returns the corresponding nid and the function
>> returns from there. But after memory38524 and until memory128433,
>> the loop iterates through each pfn and checks the nid. Since the nid
>> does not match the required nid, the loop continues. This causes
>> the soft lockups.
>>
>> [1]: https://lore.kernel.org/linuxppc-dev/20241116192306.88217-1-sshegde@linux.ibm.com/
>> Fixes: 2848a28b0a60 ("drivers/base/node: consolidate node device subsystem initialization in node_dev_init()")
>
> That commit only moved code; so very likely, that is not the 
> problematic commit.
>
>
>> Signed-off-by: Donet Tom <donettom@...ux.ibm.com>
>> ---
>>   drivers/base/node.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 0ea653fa3433..107eb508e28e 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -975,5 +975,6 @@ void __init node_dev_init(void)
>>           ret = register_one_node(i);
>>           if (ret)
>>               panic("%s() failed to add node: %d\n", __func__, ret);
>> +        cond_resched();
>>       }
>>   }
>
>
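
For readers following the thread, the cost described in the quoted commit
message can be modelled in user space. The sketch below is only an
illustration, not kernel code: PAGES_PER_BLOCK, toy_pfn_to_nid(),
lookups_for_block() and lookups_for_range() are hypothetical names, and it
assumes every memory block belongs entirely to one node. It counts how many
PFN lookups a walk performs when the scan of a block stops at the first PFN
that matches the node being registered.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration only: PFNs covered by one memory block. */
#define PAGES_PER_BLOCK 4096UL

/*
 * Toy stand-in for the PFN-to-node lookup: block_nid[i] is the node that
 * owns memory block i, and every PFN in a block maps to that node.
 */
static int toy_pfn_to_nid(const int *block_nid, unsigned long pfn)
{
	return block_nid[pfn / PAGES_PER_BLOCK];
}

/*
 * Count the lookups needed for one block while registering node 'nid'.
 * The scan stops at the first PFN that belongs to 'nid'; a block owned
 * by another node is scanned to its end.
 */
static unsigned long lookups_for_block(const int *block_nid,
				       unsigned long blk, int nid)
{
	unsigned long start = blk * PAGES_PER_BLOCK;
	unsigned long lookups = 0;
	unsigned long pfn;

	for (pfn = start; pfn < start + PAGES_PER_BLOCK; pfn++) {
		lookups++;
		if (toy_pfn_to_nid(block_nid, pfn) == nid)
			break;
	}
	return lookups;
}

/* Total lookups when walking blocks first..last (inclusive) for 'nid'. */
unsigned long lookups_for_range(const int *block_nid, unsigned long first,
				unsigned long last, int nid)
{
	unsigned long total = 0;
	unsigned long blk;

	assert(first <= last);
	for (blk = first; blk <= last; blk++)
		total += lookups_for_block(block_nid, blk, nid);
	return total;
}
```

With block_nid = {4, 4, 0, 0, 0, 4} and nid 4, the three matching blocks cost
one lookup each, while the three foreign blocks in between cost
PAGES_PER_BLOCK lookups each. In this model the gap between a node's blocks
dominates the walk, which matches the memory38524 to memory128433 range
described above.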
