Message-ID: <840cb3d0-61fe-b6cb-9918-69146ba06cf7@redhat.com>
Date: Mon, 6 Dec 2021 12:00:50 +0100
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Nico Pache <npache@...hat.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, akpm@...ux-foundation.org, shakeelb@...gle.com,
ktkhai@...tuozzo.com, shy828301@...il.com, guro@...com,
vbabka@...e.cz, vdavydov.dev@...il.com, raquini@...hat.com
Subject: Re: [RFC PATCH 2/2] mm/vmscan.c: Prevent allocating shrinker_info on
offlined nodes
On 06.12.21 11:54, Michal Hocko wrote:
> On Mon 06-12-21 11:45:54, David Hildenbrand wrote:
>>> This doesn't seem complete. Slab shrinkers are used in the reclaim
>>> context. Previously offline nodes could be onlined later and this would
>>> lead to NULL ptr because there is no hook to allocate new shrinker
>>> infos. This would be also really impractical because this would have to
>>> update all existing memcgs...
>>
>> Instead of going through the trouble of updating...
>>
>> ... maybe just keep for_each_node() and check if the target node is
>> offline. If it's offline, just allocate from the first online node.
>> After all, we're not using __GFP_THISNODE, so there are no guarantees
>> either way ...
>
> This looks like another way to paper over a deeper underlying problem
> IMHO. Fundamentally we have a problem that some pgdata are not allocated
> and that causes a lot of headache. Not to mention that node_online
> is just adding to the confusion because it doesn't really tell anything
> about the logical state of the node.
>
> I think we really should get rid of this approach rather than play
> whack-a-mole. We should really drop all notion of node_online and
> instead allocate pgdat for each possible node. Arch specific code should
> make sure that zone lists are properly initialized.
>
I'm not sure whack-a-mole really applies here. It's just the
for_each_node_* calls that need love. In other cases, we shouldn't
really stumble over an offline node.

Someone deliberately decided to use for_each_node() instead of
for_each_online_node() without taking care of online vs. offline
semantics. That's just a BUG and needs fixing IMHO.

After all, we do need patches to backport, and reworking pgdat init isn't
really feasible for that, I think. And I've heard of PPC systems that can
hotplug thousands of nodes ...
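
To make the fallback suggested earlier in the thread concrete (keep
iterating over all possible nodes, but allocate from the first online
node when the target node is offline), here is a rough userspace sketch.
It is not the actual patch and not kernel code: the two bool arrays, the
*_mock helper, alloc_node_for() and plan_shrinker_info() are all
hypothetical stand-ins for the kernel's node masks and the per-node
shrinker_info allocation, and only the node-selection logic is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel's node state masks. */
#define MAX_NUMNODES 8
#define NUMA_NO_NODE (-1)

static bool node_possible_map[MAX_NUMNODES];
static bool node_online_map[MAX_NUMNODES];

/* Mock of "first online node": lowest online node id, or NUMA_NO_NODE. */
static int first_online_node_mock(void)
{
	for (int nid = 0; nid < MAX_NUMNODES; nid++)
		if (node_online_map[nid])
			return nid;
	return NUMA_NO_NODE;
}

/*
 * Choose the node to allocate from for per-node data belonging to @nid:
 * the node itself if it is online, otherwise the first online node.
 */
static int alloc_node_for(int nid)
{
	assert(nid >= 0 && nid < MAX_NUMNODES);
	return node_online_map[nid] ? nid : first_online_node_mock();
}

/*
 * Walk every possible node (the for_each_node() behaviour) and record
 * which node the memory would come from. Entries for nodes that are not
 * possible are set to NUMA_NO_NODE. Returns the number of possible nodes.
 */
static int plan_shrinker_info(int alloc_nid[MAX_NUMNODES])
{
	int count = 0;

	for (int nid = 0; nid < MAX_NUMNODES; nid++) {
		alloc_nid[nid] = NUMA_NO_NODE;
		if (!node_possible_map[nid])
			continue;
		alloc_nid[nid] = alloc_node_for(nid);
		count++;
	}
	return count;
}
```

With nodes 0-2 possible and only node 1 online, every possible node
still gets an entry, and the entries for nodes 0 and 2 are simply backed
by node 1. That way a node that is onlined later already has its
structure in place, which is the NULL pointer concern raised above.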
--
Thanks,
David / dhildenb