[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ab5cfba0-1d49-4e4d-e2c8-171e24473c1b@redhat.com>
Date: Tue, 7 Dec 2021 13:28:31 +0100
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Alexey Makhalov <amakhalov@...are.com>,
Dennis Zhou <dennis@...nel.org>,
Eric Dumazet <eric.dumazet@...il.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Oscar Salvador <osalvador@...e.de>, Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v3] mm: fix panic in __alloc_pages
>>> + free_area_init_memoryless_node(nid);
>>
>> That's really just free_area_init_node() below, I do wonder what value
>> free_area_init_memoryless_node() has as of today.
>
> I am not sure there is any real value in having this special name for
> this but I have kept is sync with what x86 does currently. If we want to
> remove the wrapper then just do it everywhere. I can do that on top.
>
Sure, just a general comment.
>>> + continue;
>>> + }
>>> +
>>> free_area_init_node(nid);
>>>
>>> /* Any memory on that node */
>>>
>>> Could you give it a try? I do not have any machine which would exhibit
>>> the problem so I cannot really test this out. I hope build_zone_info
>>> will not choke on this. I assume the node distance table is
>>> uninitialized for these nodes and IIUC this should lead to an assumption
>>> that all other nodes are close. But who knows that can blow up there.
>>>
>>> Btw. does this make any sense at all to others?
>>>
>>
>> __build_all_zonelists() has to update the zonelists of all nodes I think.
>
> I am not sure what you mean. This should be achieved by this patch
> because the boot time build_all_zonelists will go over all online nodes
"Over all possible nodes", including online and offline ones, to make
sure any possible node has a valid pgdat. IIUC, you're not changing
anything about online vs offline nodes, only that we have a pgdat also
for offline nodes.
> (i.e. with pgdat). free_area_init happens before that. I am just worried
> that the arch specific node_distance() will generate a complete garbage
> or blow up for some reason.
Assume you online a new zone and then call __build_all_zonelists() to
include the zone in all zonelists (via online_pages()).
__build_all_zonelists() will not include offline nodes (that still have
a pgdat with a valid zonelist now).
Similarly, assume you online a zone and then call
__build_all_zonelists() to exclude the zone from all zonelists (via
offline_pages()). __build_all_zonelists() will not include offline nodes
(that still have a pgdat with a valid zonelist now).
Essentially, IIRC, even after your change
start_kernel()->build_all_zonelists(NULL)->build_all_zonelists_init()->__build_all_zonelists(NULL)
won't initialize the zonelist of the new pgdat, because the nodes are
offline.
I'd assume we'd need
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c5952749ad40..e5d958abc7cc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6382,7 +6382,7 @@ static void __build_all_zonelists(void *data)
if (self && !node_online(self->node_id)) {
build_zonelists(self);
} else {
- for_each_online_node(nid) {
+ for_each_node(nid) {
pg_data_t *pgdat = NODE_DATA(nid);
build_zonelists(pgdat);
to properly initialize the zonelist also for the offline nodes with a
valid pgdat.
But maybe I am missing something important regarding online vs. offline
nodes that your patch changes?
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists