linux-kernel - Re: [PATCH] x86: fix the initialization of physnode

Message-ID: <20140201131312.5b63fde3@hananiah.suse.cz>
Date:	Sat, 1 Feb 2014 13:13:12 +0100
From:	Petr Tesarik <ptesarik@...e.cz>
To:	Dave Hansen <dave@...1.net>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	Jiang Liu <liuj97@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: fix the initialization of physnode_map

On Fri, 31 Jan 2014 13:14:29 -0800
Dave Hansen <dave@...1.net> wrote:

> On 01/31/2014 02:05 AM, Petr Tesarik wrote:
> > With DISCONTIGMEM, the mapping between a pfn and its owning node is
> > initialized using data provided by the BIOS or from the command line.
> > However, the initialization may fail if the extents are not aligned
> > to section boundary (64M).
> 
> So is this a problem that shows up with DISCONTIGMEM?

Yes, that's it.

> Just curious, but
> what the heck kind of 32-bit NUMA hardware is still in the wild?  Did
> someone buy a NUMA-Q on eBay? :)

In fact, this is a patch that has been floating around in SUSE
Enterprise kernels for some time. It was originally added to pass
certification on IBM SurePOS 700 x4900-785.

When cleaning up our kernel patches, I noticed that the bug is still
present in the upstream kernel, so I posted this patch. While I don't
have any evidence that someone actually needs the fix today, it seems
wrong to leave buggy code in the kernel.

If you all agree that we should rip out DISCONTIGMEM instead, I can post
patches to do that and be equally happy. ;-)

> >  void memory_present(int nid, unsigned long start, unsigned long end)
> >  {
> > -	unsigned long pfn;
> > +	unsigned long sect, endsect;
> >  
> >  	printk(KERN_INFO "Node: %d, start_pfn: %lx, end_pfn: %lx\n",
> >  			nid, start, end);
> >  	printk(KERN_DEBUG "  Setting physnode_map array to node %d for pfns:\n", nid);
> >  	printk(KERN_DEBUG "  ");
> > -	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
> > -		physnode_map[pfn / PAGES_PER_SECTION] = nid;
> > -		printk(KERN_CONT "%lx ", pfn);
> > +	endsect = (end - 1) / PAGES_PER_SECTION;
> > +	for (sect = start / PAGES_PER_SECTION; sect <= endsect; ++sect) {
> > +		physnode_map[sect] = nid;
> > +		printk(KERN_CONT "%lx ", sect * PAGES_PER_SECTION);
> >  	}
> >  	printk(KERN_CONT "\n");
> >  }
> 
> So, if start and end are not aligned to section boundaries, we will miss
> setting physnode_map[] for the final section?

If end belongs to a different section than start, the final section
will not be initialized, yes.

> For instance, if we have a 64MB section size and try to call
> memory_present(32MB -> 96MB), we will set 0->64MB present, but not set
> the 64MB->128MB section as present.
> 
> Right?

Exactly.
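
To make the 32MB -> 96MB case concrete, here is a minimal userspace
sketch of both loops. It is not kernel code: the 4 KiB page size, the
MAX_SECTIONS bound, the NO_NODE marker and the helper names reset_map(),
mark_old() and mark_new() are assumptions made up for illustration. Only
the two loop bodies are taken from the patch above.

```c
#include <assert.h>

#define PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define PAGES_PER_SECTION	(1UL << (26 - PAGE_SHIFT))	/* 64 MiB sections */
#define MAX_SECTIONS		64	/* hypothetical bound for this sketch */
#define NO_NODE			(-1)	/* hypothetical "not set" marker */
#define MB_TO_PFN(mb)		((unsigned long)(mb) << (20 - PAGE_SHIFT))

static int physnode_map[MAX_SECTIONS];

/* Mark every section as unassigned. */
static void reset_map(void)
{
	int i;

	for (i = 0; i < MAX_SECTIONS; i++)
		physnode_map[i] = NO_NODE;
}

/* Old loop: steps by one section starting from the unaligned start pfn. */
static void mark_old(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
		assert(pfn / PAGES_PER_SECTION < MAX_SECTIONS);
		physnode_map[pfn / PAGES_PER_SECTION] = nid;
	}
}

/* Patched loop: iterates over section indexes, first to last inclusive. */
static void mark_new(int nid, unsigned long start, unsigned long end)
{
	unsigned long sect, endsect;

	endsect = (end - 1) / PAGES_PER_SECTION;
	for (sect = start / PAGES_PER_SECTION; sect <= endsect; ++sect) {
		assert(sect < MAX_SECTIONS);
		physnode_map[sect] = nid;
	}
}
```

With start = MB_TO_PFN(32) and end = MB_TO_PFN(96), mark_old() visits
pfn 32MB only: the next step lands exactly on 96MB, which is not below
end, so section 1 is never written. mark_new() computes endsect = 1 and
writes sections 0 and 1.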

> Can you just align 'start' down to the section's start and 'end' up to
> the end of the section that contains it?  I guess you do that
> implicitly, but you should be able to do it without refactoring the for
> loop entirely.

Works for me.
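
A rough userspace sketch of that alternative, keeping the original pfn
loop and only aligning its bounds. The helper names section_align_down(),
section_align_up() and mark_aligned(), the 4 KiB page size, MAX_SECTIONS
and NO_NODE are assumptions for illustration, not the final patch. In the
kernel itself the existing round_down()/round_up() macros could probably
do the alignment.

```c
#include <assert.h>

#define PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define PAGES_PER_SECTION	(1UL << (26 - PAGE_SHIFT))	/* 64 MiB sections */
#define MAX_SECTIONS		64	/* hypothetical bound for this sketch */
#define NO_NODE			(-1)	/* hypothetical "not set" marker */
#define MB_TO_PFN(mb)		((unsigned long)(mb) << (20 - PAGE_SHIFT))

static int physnode_map[MAX_SECTIONS];

/* Mark every section as unassigned. */
static void reset_map(void)
{
	int i;

	for (i = 0; i < MAX_SECTIONS; i++)
		physnode_map[i] = NO_NODE;
}

/* Round a pfn down to the first pfn of its section. */
static unsigned long section_align_down(unsigned long pfn)
{
	return pfn - (pfn % PAGES_PER_SECTION);
}

/* Round a pfn up to the next section boundary (no-op if already aligned). */
static unsigned long section_align_up(unsigned long pfn)
{
	return section_align_down(pfn + PAGES_PER_SECTION - 1);
}

/* Original loop shape, but over section-aligned bounds. */
static void mark_aligned(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	start = section_align_down(start);
	end = section_align_up(end);
	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
		assert(pfn / PAGES_PER_SECTION < MAX_SECTIONS);
		physnode_map[pfn / PAGES_PER_SECTION] = nid;
	}
}
```

For 32MB -> 96MB the bounds become 0 -> 128MB, so the unchanged loop
covers sections 0 and 1. An already aligned range such as 64MB -> 128MB
is left as is and still covers only section 1.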

Petr Tesarik