linux-kernel - Re: [bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning


Message-Id: <20160816221953.GA3373@osiris>
Date:	Wed, 17 Aug 2016 00:19:53 +0200
From:	Heiko Carstens <heiko.carstens@...ibm.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ming Lei <tom.leiming@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Michael Holzheu <holzheu@...ux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: [bisected] "sched: Allow per-cpu kernel threads to run on online
 && !active" causes warning

On Tue, Aug 16, 2016 at 11:42:05AM -0400, Tejun Heo wrote:
> Hello, Peter.
> 
> On Tue, Aug 16, 2016 at 05:29:49PM +0200, Peter Zijlstra wrote:
> > On Tue, Aug 16, 2016 at 11:20:27AM -0400, Tejun Heo wrote:
> > > As long as the mapping doesn't change after the first onlining of the
> > > CPU, the workqueue side shouldn't be too difficult to fix up.  I'll
> > > look into it.  For memory allocations, as long as the cpu <-> node
> > > mapping is established before any memory allocation for the cpu takes
> > > place, it should be fine too, I think.
> > 
> > Don't we allocate per-cpu memory for 'cpu_possible_map' on boot? There's
> > a whole bunch of per-cpu memory users that does things like:
> > 
> > 
> > 	for_each_possible_cpu(cpu) {
> > 		struct foo *foo = per_cpu_ptr(&per_cpu_var, cpu);
> > 
> > 		/* muck with foo */
> > 	}
> > 
> > 
> > Which requires a cpu->node map for all possible cpus at boot time.
> 
> Ah, right.  If cpu -> node mapping is dynamic, there isn't much that
> we can do about allocating per-cpu memory on the wrong node.  And it
> is problematic that percpu allocations can race against an onlining
> CPU switching its node association.
> 
> One way to keep the mapping stable would be reserving per-node
> possible CPU slots so that the CPU number assigned to a new CPU is on
> the right node.  It'd be a simple solution but would get really
> expensive with increasing number of nodes.
> 
> Heiko, do you have any ideas?

I think the easiest solution would be to simply assign all cpus for which
we do not have any topology information to an arbitrary node, e.g. round
robin.
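
To illustrate the round robin idea, here is a minimal userspace sketch. It
is not actual s390 code: NODE_UNKNOWN, the cpu_to_node table and the helper
name are all made up for this example. It only assumes that a cpu without
topology information is marked with a sentinel and that a node, once
assigned, is never changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for "no topology information for this cpu". */
#define NODE_UNKNOWN (-1)

/*
 * Give every cpu that has no node yet a node, cycling through the
 * available nodes. Entries that already carry a node are left alone,
 * so an established cpu -> node mapping never changes.
 */
static void assign_unknown_cpus_round_robin(int *cpu_to_node,
					    size_t nr_cpus, int nr_nodes)
{
	int next = 0;
	size_t cpu;

	assert(nr_nodes > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu_to_node[cpu] != NODE_UNKNOWN)
			continue;
		cpu_to_node[cpu] = next;
		next = (next + 1) % nr_nodes;
	}
}
```

If something like this ran once at boot over all possible cpus, a cpu that
is added later would already have its node, so per-cpu allocations done at
boot would not have to move.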

After all, the case that cpus are added later is rare, and the s390 fake
numa implementation does not know about the memory topology. All it does is
distribute the memory to several nodes in order to avoid a single huge
node. So that should be sort of ok.

Unless somebody has a better idea?

Michael, Martin?