Message-ID: <20190731144114.GY9330@dhcp22.suse.cz>
Date: Wed, 31 Jul 2019 16:41:14 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Mike Rapoport <rppt@...ux.ibm.com>
Cc: Hoan Tran OS <hoan@...amperecomputing.com>,
Will Deacon <will@...nel.org>,
Catalin Marinas <catalin.marinas@....com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
"open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
Paul Mackerras <paulus@...ba.org>,
"H . Peter Anvin" <hpa@...or.com>,
"sparclinux@...r.kernel.org" <sparclinux@...r.kernel.org>,
Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
Michael Ellerman <mpe@...erman.id.au>,
"x86@...nel.org" <x86@...nel.org>,
Christian Borntraeger <borntraeger@...ibm.com>,
Ingo Molnar <mingo@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Open Source Submission <patches@...erecomputing.com>,
Pavel Tatashin <pavel.tatashin@...rosoft.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Will Deacon <will.deacon@....com>,
Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Oscar Salvador <osalvador@...e.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
"David S . Miller" <davem@...emloft.net>,
"willy@...radead.org" <willy@...radead.org>
Subject: Re: microblaze HAVE_MEMBLOCK_NODE_MAP dependency (was Re: [PATCH v2
0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA)
On Wed 31-07-19 17:21:29, Mike Rapoport wrote:
> On Wed, Jul 31, 2019 at 03:00:37PM +0200, Michal Hocko wrote:
> > On Wed 31-07-19 15:26:32, Mike Rapoport wrote:
> > > On Wed, Jul 31, 2019 at 01:40:16PM +0200, Michal Hocko wrote:
> > > > On Wed 31-07-19 14:14:22, Mike Rapoport wrote:
> > > > > On Wed, Jul 31, 2019 at 10:03:09AM +0200, Michal Hocko wrote:
> > > > > > On Wed 31-07-19 09:24:21, Mike Rapoport wrote:
> > > > > > > [ sorry for a late reply too, somehow I missed this thread before ]
> > > > > > >
> > > > > > > On Tue, Jul 30, 2019 at 10:14:15AM +0200, Michal Hocko wrote:
> > > > > > > > [Sorry for a late reply]
> > > > > > > >
> > > > > > > > On Mon 15-07-19 17:55:07, Hoan Tran OS wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On 7/12/19 10:00 PM, Michal Hocko wrote:
> > > > > > > > [...]
> > > > > > > > > > Hmm, I thought this was selectable. But I am obviously wrong here.
> > > > > > > > > > Looking more closely, it seems that this is indeed only about
> > > > > > > > > > __early_pfn_to_nid and as such not something that should add a config
> > > > > > > > > > symbol. This should have been called out in the changelog though.
> > > > > > > > >
> > > > > > > > > Yes, do you have any other comments about my patch?
> > > > > > > >
> > > > > > > > Not really. Just make sure to explicitly state that
> > > > > > > > CONFIG_NODES_SPAN_OTHER_NODES is only about __early_pfn_to_nid and that
> > > > > > > > > doesn't really deserve its own config and can be pulled under NUMA.
> > > > > > > >
> > > > > > > > > > Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar
> > > > > > > > > > bucket? Do we have any NUMA architecture that doesn't enable it?
> > > > > > > > > >
> > > > > > >
> > > > > > > > HAVE_MEMBLOCK_NODE_MAP makes a huge difference in node/zone initialization
> > > > > > > > sequence so it's not only about a single function.
> > > > > >
> > > > > > The question is whether we want to have this a config option or enable
> > > > > > it unconditionally for each NUMA system.
> > > > >
> > > > > We can make it 'default NUMA', but we can't drop it completely because
> > > > > microblaze uses sparse_memory_present_with_active_regions() which is
> > > > > unavailable when HAVE_MEMBLOCK_NODE_MAP=n.
> > > >
> > > > I suppose you mean that microblaze is using
> > > > sparse_memory_present_with_active_regions even without CONFIG_NUMA,
> > > > right?
> > >
> > > Yes.
> > >
> > > > I have to confess I do not understand that code. What is the deal
> > > > with setting node id there?
> > >
> > > The sparse_memory_present_with_active_regions() iterates over
> > > memblock.memory regions and uses the node id of each region as the
> > > parameter to memory_present(). The assumption here is that sometime before
> > > each region was assigned a proper non-negative node id.
> > >
> > > microblaze uses device tree for memory enumeration and the current FDT code
> > > does memblock_add() that implicitly sets nid in memblock.memory regions to -1.
> > >
> > > So in order to have proper node id passed to memory_present() microblaze
> > > has to call memblock_set_node() before it can use
> > > sparse_memory_present_with_active_regions().
> >
> > I am sorry, but I still do not follow. Who is consuming that node id
> > information when NUMA=n? In other words, why can we not simply do
>
> We can, I think nobody cared to change it.

It would be great if somebody with the actual HW could try it out.
I can throw a patch but I do not even have a cross compiler in my
toolbox.

>
> > diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
> > index a015a951c8b7..3a47e8db8d1c 100644
> > --- a/arch/microblaze/mm/init.c
> > +++ b/arch/microblaze/mm/init.c
> > @@ -175,14 +175,9 @@ void __init setup_memory(void)
> >
> > start_pfn = memblock_region_memory_base_pfn(reg);
> > end_pfn = memblock_region_memory_end_pfn(reg);
> > - memblock_set_node(start_pfn << PAGE_SHIFT,
> > - (end_pfn - start_pfn) << PAGE_SHIFT,
> > - &memblock.memory, 0);
> > + memory_present(0, start_pfn << PAGE_SHIFT, end_pfn << PAGE_SHIFT);
>
> memory_present() expects pfns, the shift is not needed.

Right.
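
To make the difference concrete, here is a small userspace model of the
two approaches discussed above. It is only a sketch, not kernel code:
every toy_* name, the section size and the page shift are made-up
stand-ins, and the real memory_present() does more than record a node
id. It only shows that walking regions and passing their nid hands -1
through when nobody has set it, while passing node 0 together with
plain pfns does not depend on the region nid at all.

```c
#include <assert.h>
#include <stddef.h>

/* All names and constants below are hypothetical stand-ins, not kernel code. */
#define TOY_PAGE_SHIFT        12   /* assumed 4 KiB pages */
#define TOY_SECTION_SHIFT     4    /* toy value: 16 pages per section */
#define TOY_PAGES_PER_SECTION (1UL << TOY_SECTION_SHIFT)
#define TOY_MAX_SECTIONS      64
#define TOY_NOT_PRESENT       (-2) /* marker for "section never registered" */

struct toy_region {
	unsigned long base;  /* physical start address */
	unsigned long size;  /* length in bytes */
	int nid;             /* node id; -1 when nobody assigned one */
};

static int section_nid[TOY_MAX_SECTIONS];

static void toy_reset(void)
{
	for (size_t i = 0; i < TOY_MAX_SECTIONS; i++)
		section_nid[i] = TOY_NOT_PRESENT;
}

/* Stand-in for memory_present(): takes page frame numbers, not addresses. */
static void toy_memory_present(int nid, unsigned long start_pfn,
			       unsigned long end_pfn)
{
	assert(start_pfn <= end_pfn);
	/* Round down to a section boundary, then mark each covered section. */
	start_pfn &= ~(TOY_PAGES_PER_SECTION - 1);
	for (unsigned long pfn = start_pfn; pfn < end_pfn;
	     pfn += TOY_PAGES_PER_SECTION) {
		unsigned long sec = pfn >> TOY_SECTION_SHIFT;

		if (sec < TOY_MAX_SECTIONS)
			section_nid[sec] = nid;
	}
}

/* Models the existing walk: each region's own nid is passed through. */
static void toy_present_with_region_nid(const struct toy_region *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_memory_present(r[i].nid,
				   r[i].base >> TOY_PAGE_SHIFT,
				   (r[i].base + r[i].size) >> TOY_PAGE_SHIFT);
}

/* Models the proposed change: node 0 is passed directly, in pfns. */
static void toy_present_on_node0(const struct toy_region *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_memory_present(0,
				   r[i].base >> TOY_PAGE_SHIFT,
				   (r[i].base + r[i].size) >> TOY_PAGE_SHIFT);
}
```

With a single region whose nid was left at -1, the first walker records
-1 for the covered sections and the second records 0. Whether anything
actually consumes that value when NUMA=n is the open question above.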
--
Michal Hocko
SUSE Labs