Message-ID: <20140201131312.5b63fde3@hananiah.suse.cz>
Date: Sat, 1 Feb 2014 13:13:12 +0100
From: Petr Tesarik <ptesarik@...e.cz>
To: Dave Hansen <dave@...1.net>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Jiang Liu <liuj97@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: fix the initialization of physnode_map
On Fri, 31 Jan 2014 13:14:29 -0800
Dave Hansen <dave@...1.net> wrote:
> On 01/31/2014 02:05 AM, Petr Tesarik wrote:
> > With DISCONTIGMEM, the mapping between a pfn and its owning node is
> > initialized using data provided by the BIOS or from the command line.
> > However, the initialization may fail if the extents are not aligned
> > to a section boundary (64M).
>
> So is this a problem that shows up with DISCONTIGMEM?
Yes, that's it.
> Just curious, but
> what the heck kind of 32-bit NUMA hardware is still in the wild? Did
> someone buy a NUMA-Q on eBay? :)
In fact, this is a patch that has been floating around in SUSE
Enterprise kernels for some time. It was originally added to pass
certification on IBM SurePOS 700 x4900-785.
When cleaning up our kernel patches, I noticed that the bug is still
present in the upstream kernel, so I posted this patch. While I don't
have any evidence that someone actually needs the fix today, it seems
wrong to leave buggy code in the kernel.
If you all agree that we should rip out DISCONTIGMEM instead, I can post
patches to do that and be equally happy. ;-)
> >  void memory_present(int nid, unsigned long start, unsigned long end)
> >  {
> > -	unsigned long pfn;
> > +	unsigned long sect, endsect;
> >
> >  	printk(KERN_INFO "Node: %d, start_pfn: %lx, end_pfn: %lx\n",
> >  			nid, start, end);
> >  	printk(KERN_DEBUG " Setting physnode_map array to node %d for pfns:\n", nid);
> >  	printk(KERN_DEBUG " ");
> > -	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
> > -		physnode_map[pfn / PAGES_PER_SECTION] = nid;
> > -		printk(KERN_CONT "%lx ", pfn);
> > +	endsect = (end - 1) / PAGES_PER_SECTION;
> > +	for (sect = start / PAGES_PER_SECTION; sect <= endsect; ++sect) {
> > +		physnode_map[sect] = nid;
> > +		printk(KERN_CONT "%lx ", sect * PAGES_PER_SECTION);
> >  	}
> >  	printk(KERN_CONT "\n");
> >  }
>
> So, if start and end are not aligned to section boundaries, we will miss
> setting physnode_map[] for the final section?
If end belongs to a different section than start, the final section
will not be initialized, yes.
> For instance, if we have a 64MB section size and try to call
> memory_present(32MB -> 96MB), we will set 0->64MB present, but not set
> the 64MB->128MB section as present.
>
> Right?
Exactly.
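
To make that 32MB -> 96MB case concrete, here is a small user-space
model of the two loops. It is only a sketch, not kernel code: it assumes
4 KiB pages and 64 MiB sections, and MAX_SECTIONS, MB_TO_PFN(),
reset_map(), mark_old() and mark_new() are names made up for this
illustration. The array stands in for the kernel's physnode_map[].

```c
#include <assert.h>
#include <string.h>

/* Assumed for illustration: 4 KiB pages and 64 MiB sections. */
#define PAGE_SHIFT		12
#define PAGES_PER_SECTION	(1UL << (26 - PAGE_SHIFT))	/* 16384 pfns */
#define MAX_SECTIONS		64	/* hypothetical: enough for 4 GiB */
#define MB_TO_PFN(mb)		((unsigned long)(mb) << (20 - PAGE_SHIFT))

/* User-space stand-in for physnode_map[]; -1 means "no node assigned". */
static signed char physnode_map[MAX_SECTIONS];

static void reset_map(void)
{
	memset(physnode_map, -1, sizeof(physnode_map));
}

/* Loop before the patch: steps by one section starting at 'start'. */
static void mark_old(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	assert(start < end);
	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION)
		physnode_map[pfn / PAGES_PER_SECTION] = nid;
}

/* Loop from the patch: iterates over section indices, last one inclusive. */
static void mark_new(int nid, unsigned long start, unsigned long end)
{
	unsigned long sect, endsect;

	assert(start < end);
	endsect = (end - 1) / PAGES_PER_SECTION;
	for (sect = start / PAGES_PER_SECTION; sect <= endsect; ++sect)
		physnode_map[sect] = nid;
}
```

With start = MB_TO_PFN(32) and end = MB_TO_PFN(96), mark_old() visits
only pfn 8192 (section 0), because the next step lands exactly on 'end'
and the loop stops. mark_new() computes endsect = 1 and sets both
section 0 and section 1.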
> Can you just align 'start' down to the section's start and 'end' up to
> the end of the section that contains it? I guess you do that
> implicitly, but you should be able to do it without refactoring the for
> loop entirely.
Works for me.
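
Something along these lines, I suppose. This is an untested user-space
sketch of the idea rather than a proposed patch: it assumes 64 MiB
sections with 4 KiB pages, and sect_align_down(), sect_align_up(),
mark_aligned(), MAX_SECTIONS and the map array are hypothetical names
local to the example.

```c
#include <assert.h>

#define PAGES_PER_SECTION	16384UL	/* assumed: 64 MiB sections, 4 KiB pages */
#define MAX_SECTIONS		64	/* hypothetical table size */

/* User-space stand-in for physnode_map[]. */
static signed char node_of_section[MAX_SECTIONS];

/* Round a pfn down to the first pfn of its section. */
static unsigned long sect_align_down(unsigned long pfn)
{
	return pfn - pfn % PAGES_PER_SECTION;
}

/* Round a pfn up to the next section boundary (no overflow check). */
static unsigned long sect_align_up(unsigned long pfn)
{
	return sect_align_down(pfn + PAGES_PER_SECTION - 1);
}

/* Original loop shape, with the range widened to whole sections first. */
static void mark_aligned(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	assert(start < end);
	start = sect_align_down(start);
	end = sect_align_up(end);
	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION)
		node_of_section[pfn / PAGES_PER_SECTION] = nid;
}
```

For pfns 8192..24576 (32MB -> 96MB) the range becomes 0..32768, so the
loop touches sections 0 and 1. Strictly speaking, aligning 'start' down
is what makes the loop reach the last section; aligning 'end' up is
harmless and keeps the intent obvious.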
Petr Tesarik