[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200317140122.GQ26018@dhcp22.suse.cz>
Date: Tue, 17 Mar 2020 15:01:22 +0100
From: Michal Hocko <mhocko@...nel.org>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Sachin Sant <sachinp@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
Mel Gorman <mgorman@...e.de>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Andrew Morton <akpm@...ux-foundation.org>,
linuxppc-dev@...ts.ozlabs.org, Christopher Lameter <cl@...ux.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Kirill Tkhai <ktkhai@...tuozzo.com>
Subject: Re: [PATCH 1/3] powerpc/numa: Set numa_node for all possible cpus
On Tue 17-03-20 14:44:45, Vlastimil Babka wrote:
> On 3/16/20 10:06 AM, Michal Hocko wrote:
> > On Thu 12-03-20 17:41:58, Vlastimil Babka wrote:
> > [...]
> >> with nid present in:
> >> N_POSSIBLE - pgdat might not exist, node_to_mem_node() must return some online
> >
> > I would rather have a dummy pgdat for those. Have a look at
> > $ git grep "NODE_DATA.*->" | wc -l
> > 63
> >
> > Who knows how many else we have there. I haven't looked more closely.
> > Besides that what is a real reason to not have pgdat ther and force all
> > users of a $random node from those that the platform considers possible
> > for special casing? Is that a memory overhead? Is that really a thing?
>
> I guess we can ignore memory overhead. I guess there only might be some concern
> that for nodes that are initially offline, we will allocate the pgdat on a
> different node, and after they are online, it will stay on a different node with
> more access latency from local cpus. If we only allocate for online nodes, it
> can always be local? But I guess it doesn't matter that much.
This is not the case even now because of chicke&egg. You need a memory
to allocate from and that memory has to be managed somewhere per node
(pgdat). Keep in mind we do not have the bootmem allocator for the
hotplug. Have a look at hotadd_new_pgdat and when it is called. There
are some attempts to allocate memmap from the hotpluged memory but I am
not sure we can do the whole thing without pgdat in place. If we can
then can come up with some replace the pgdat magic. But still I am not
even sure this is something we really have to optimize for.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists