Message-ID: <alpine.DEB.2.02.1404221849330.4008@chino.kir.corp.google.com>
Date:	Tue, 22 Apr 2014 18:59:15 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Jiang Liu <jiang.liu@...ux.intel.com>,
	Ingo Molnar <mingo@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
	Tony Luck <tony.luck@...el.com>, linux-kernel@...r.kernel.org
Subject: Re: [Bugfix] sched: fix possible invalid memory access caused by
 CPU hot-addition

On Tue, 22 Apr 2014, Peter Zijlstra wrote:

> On Tue, Apr 22, 2014 at 01:01:51PM -0700, Andrew Morton wrote:
> > On Tue, 22 Apr 2014 10:15:15 +0200 Peter Zijlstra <peterz@...radead.org> wrote:
> > 
> > > On Tue, Apr 22, 2014 at 01:27:15PM +0800, Jiang Liu wrote:
> > > > When calling kzalloc_node(size, flags, node), we should first check
> > > > whether node is onlined, otherwise it may cause invalid memory access
> > > > as below.
> > > 
> > > But this is only for memory less node crap, right? 
> > 
> > um, why are memoryless nodes crap?
> 
> Why wouldn't they be? Having CPUs with no local memory seems decidedly
> suboptimal.

The quick fix for memoryless node issues is usually just to use cpu_to_mem() 
rather than cpu_to_node() in the caller.  This assumes that the arch is 
set up correctly to handle memoryless nodes with 
CONFIG_HAVE_MEMORYLESS_NODES (and we've had problems recently with 
memoryless nodes not being configured correctly on powerpc).
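
As a rough userspace illustration of the difference between the two lookups 
(not kernel code): the topology tables and every toy_* name below are 
assumptions made up for this sketch.  The only point is that the node lookup 
can return a memoryless node, while the memory lookup resolves to a node 
that actually has pages.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS  4
#define TOY_NR_NODES 2

/* Hypothetical topology: CPUs 0-1 sit on node 0, CPUs 2-3 on node 1. */
static const int toy_cpu_node[TOY_NR_CPUS] = { 0, 0, 1, 1 };

/* Hypothetical memory layout: node 0 has memory, node 1 is memoryless. */
static const bool toy_node_has_memory[TOY_NR_NODES] = { true, false };

/* Hypothetical fallback table: nearest node with memory for each node. */
static const int toy_nearest_mem_node[TOY_NR_NODES] = { 0, 0 };

/* Models cpu_to_node(): the node the CPU belongs to, memory or not. */
int toy_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return toy_cpu_node[cpu];
}

/* Models local_memory_node(): the node itself if it has memory,
 * otherwise the nearest node that does. */
int toy_local_memory_node(int node)
{
	assert(node >= 0 && node < TOY_NR_NODES);
	return toy_node_has_memory[node] ? node : toy_nearest_mem_node[node];
}

/* Models cpu_to_mem(): the node to allocate from on behalf of this CPU. */
int toy_cpu_to_mem(int cpu)
{
	return toy_local_memory_node(toy_cpu_to_node(cpu));
}
```

In this model a caller that passes toy_cpu_to_node(2) to a node-aware 
allocator hands it a memoryless node, while toy_cpu_to_mem(2) hands it 
node 0.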

That type of fix would probably be better handled in the slab allocator, 
though, since kmalloc_node(nid) shouldn't crash just because nid is 
memoryless; we should be doing local_memory_node(node) when allocating the 
slab pages.

However, I don't think memoryless nodes are the problem here: Jiang 
is testing for !node_online(nid) in his patch, so it's a problem with 
cpu_to_node() pointing to an offline node.  It makes sense for the page 
allocator to crash in such a case, since the node id is erroneous.

So either the cpu-to-node mapping is invalid or alloc_fair_sched_group() 
is allocating memory for a cpu on an offline node.  The 
for_each_possible_cpu() looks suspicious.  There's no guarantee that 
local_memory_node(node) for an offline node will return anything with 
affinity, so falling back to NUMA_NO_NODE looks appropriate in Jiang's 
patch.
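
The fallback can be modelled in plain userspace C roughly as follows.  This 
is only a sketch of the idea, not Jiang's patch: the tables and all toy_* / 
TOY_* names are hypothetical, and TOY_NUMA_NO_NODE stands in for the 
kernel's NUMA_NO_NODE (-1), which means "no node preference".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS      4
#define TOY_NR_NODES     2
#define TOY_NUMA_NO_NODE (-1)	/* stand-in for the kernel's NUMA_NO_NODE */

/* Hypothetical mapping: possible CPUs 2-3 belong to node 1. */
static const int toy_possible_cpu_node[TOY_NR_CPUS] = { 0, 0, 1, 1 };

/* Hypothetical state: node 1 has not been brought online yet. */
static const bool toy_node_is_online[TOY_NR_NODES] = { true, false };

/* Pick the node id to pass to a node-aware allocator for this CPU.
 * If the CPU maps to an offline node, express no node preference
 * instead of handing the allocator an invalid node id. */
int toy_alloc_node_for_cpu(int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	nid = toy_possible_cpu_node[cpu];
	if (!toy_node_is_online[nid])
		return TOY_NUMA_NO_NODE;
	return nid;
}
```

A loop over all possible CPUs in this model gets node 0 for CPUs 0-1 and 
TOY_NUMA_NO_NODE for CPUs 2-3, so the offline node id never reaches the 
allocator.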
