Message-ID: <alpine.DEB.2.00.1210090259250.24261@chino.kir.corp.google.com>
Date:	Tue, 9 Oct 2012 03:04:52 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Tang Chen <tangchen@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>
cc:	Wen Congyang <wency@...fujitsu.com>, mingo@...hat.com,
	peterz@...radead.org, miaox@...fujitsu.com,
	linux-kernel@...r.kernel.org, linux-numa@...r.kernel.org
Subject: Re: [PATCH] Do not use cpu_to_node() to find an offlined cpu's
 node.

On Tue, 9 Oct 2012, Tang Chen wrote:

> > > Eek, the nid shouldn't be -1 yet, though, for cpu hotplug since this
> > > should be called at CPU_DYING level and migrate_tasks() still sees a valid
> > > cpu.
> 
> As Wen said below, nid is now set to -1 when cpu is hotremoved.
> I reproduced this problem in this situation:
> 
> all cpus are online, and a system board is hot-removed directly, without
> offlining any cpu.
> 
> As a result, the removed cpu's nid is set to -1, and this causes
> problems.
> 

Let's add Andrew to the cc list then, because I'm nacking 
cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch in the -mm 
tree for this reason.

We can only clear a cpu-to-node mapping when the cpu is completely 
offline, not before or during the CPU_DYING stage.  Kernel code, such as 
the sched code that you are now trying to "fix", depends on this mapping 
to work correctly; obviously no audit was done of cpu hotplug code 
depending on it before the patch was proposed.
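
The ordering constraint above can be illustrated with a minimal userspace
sketch. This is not kernel code: the stage names, `struct cpu_slot`, and
`unmap_cpu_node()` are hypothetical and exist only to show the mapping
being left intact until the cpu is fully offline.

```c
#include <stdbool.h>

#define NO_NODE (-1)

/* Hypothetical lifecycle stages, loosely modelled on the hotplug stages. */
enum cpu_stage { STAGE_ONLINE, STAGE_DYING, STAGE_OFFLINE };

/* Hypothetical per-cpu record; not a real kernel structure. */
struct cpu_slot {
    enum cpu_stage stage;
    int node;              /* cpu-to-node mapping, NO_NODE once unmapped */
};

/*
 * Clear the mapping only once the cpu is fully offline.
 * Returns true if the mapping was cleared, false if it was left intact.
 */
static bool unmap_cpu_node(struct cpu_slot *c)
{
    if (c->stage != STAGE_OFFLINE)
        return false;      /* migration code may still look up c->node */
    c->node = NO_NODE;
    return true;
}
```

In this model, a caller that tries to unmap while the cpu is still dying
gets `false` back and the node id stays readable for any code that is
still migrating tasks away.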

I say "fix" because even this workaround isn't a good solution since it 
would be much better to pick another cpu on the same node as the offlining 
cpu for the runqueue before falling back to the set of all allowed nodes.  
We lose all NUMA affinity information with that patch.  There's no reason 
why we shouldn't know the node of a cpu that is being offlined.
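
The node-first fallback described above could look roughly like the
following userspace sketch. It is an illustration under stated
assumptions, not the scheduler's actual fallback code: `struct cpu_info`,
`pick_fallback_cpu()`, `NR_CPUS` as used here, and the `allowed` array are
all hypothetical stand-ins for the real cpumask and runqueue machinery.

```c
#include <stdbool.h>

#define NR_CPUS 8
#define NO_NODE (-1)

/* Hypothetical stand-in for kernel state; not the real scheduler data. */
struct cpu_info {
    int  node;     /* NUMA node of this cpu, still valid while it is dying */
    bool online;   /* cpu can accept tasks */
};

/*
 * Pick a destination cpu for a task leaving dying_cpu.
 * allowed[i] is true when the task may run on cpu i.
 * Returns a cpu index, or -1 if no online allowed cpu exists.
 */
static int pick_fallback_cpu(const struct cpu_info cpus[NR_CPUS],
                             const bool allowed[NR_CPUS], int dying_cpu)
{
    int nid = cpus[dying_cpu].node;
    int cpu;

    /* First pass: stay on the dying cpu's node to keep NUMA affinity. */
    if (nid != NO_NODE) {
        for (cpu = 0; cpu < NR_CPUS; cpu++) {
            if (cpu != dying_cpu && cpus[cpu].online &&
                allowed[cpu] && cpus[cpu].node == nid)
                return cpu;
        }
    }

    /* Second pass: any online cpu the task is allowed on. */
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu != dying_cpu && cpus[cpu].online && allowed[cpu])
            return cpu;
    }
    return -1;
}
```

If the dying cpu's node has already been set to -1, the first pass is
skipped entirely and the task lands on whichever allowed cpu comes first,
regardless of node. That is the loss of NUMA affinity at issue here.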

So nack to cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch.  
After it's removed because it's buggy, this "fix" will no longer be 
necessary.
