Message-ID: <8d906c2a-8666-430c-aa41-2db6ec0088e5@suse.com>
Date: Fri, 31 May 2024 08:21:52 +0200
From: Jan Beulich <jbeulich@...e.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
Andrew Lutomirski <luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/NUMA: don't pass MAX_NUMNODES to memblock_set_node()
On 29.05.2024 18:08, Dave Hansen wrote:
> On 5/29/24 09:00, Jan Beulich wrote:
>>> In other words, it's not completely clear why ff6c3d81f2e8 introduced
>>> this problem.
>> It is my understanding that said change, by preventing the NUMA
>> configuration from being rejected, resulted in different code paths
>> being taken. The observed crash occurred somewhat later than the "No
>> NUMA configuration found" etc. messages. Thus I don't really see a
>> connection between said change not having had any MAX_NUMNODES check
>> and it having introduced the (only perceived?) regression.
>
> So your system has a bad NUMA config. If it's rejected, then all is
> merry. Something goes and writes over the nids in all of the memblocks
> to point to 0 (probably).
>
> If it _isn't_ rejected, then it leaves a memblock in place that points
> to MAX_NUMNODES. That MAX_NUMNODES is a ticking time bomb for later.
>
> So this patch doesn't actually revert the rejection behavior change in
> the Fixes: commit. It just makes the rest of the code more tolerant to
> _not_ rejecting the NUMA config?
No, the NUMA config is now properly rejected again:
NUMA: no nodes coverage for 2041MB of 8185MB RAM
No NUMA configuration found
Faking a node at [mem 0x0000000000000000-0x000000027fffffff]
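
For reference, the kind of call-site change being discussed boils down to
something like the sketch below. It is based on mainline's x86 numa_init()
path and is only an illustration of the idea (pass NUMA_NO_NODE rather than
MAX_NUMNODES), not the patch text itself:

	/*
	 * Sketch: mark all memory as "no node assigned yet" using
	 * NUMA_NO_NODE instead of MAX_NUMNODES, so later consumers of
	 * the memblock node ids never see MAX_NUMNODES as a node id.
	 */
	ret = memblock_set_node(0, ULLONG_MAX, &memblock.memory,
				NUMA_NO_NODE);
	if (ret < 0)
		return ret;
	ret = memblock_set_node(0, ULLONG_MAX, &memblock.reserved,
				NUMA_NO_NODE);
	if (ret < 0)
		return ret;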
Jan