linux-kernel - Re: [PATCH 2/4] sched: implement __set_cpus_allowed()


Message-ID: <4C03879A.8030505@kernel.org>
Date:	Mon, 31 May 2010 11:55:38 +0200
From:	Tejun Heo <tj@...nel.org>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	mingo@...e.hu, linux-kernel@...r.kernel.org,
	Rusty Russell <rusty@...tcorp.com.au>,
	Mike Galbraith <efault@....de>
Subject: Re: [PATCH 2/4] sched: implement __set_cpus_allowed()

Hello,

On 05/31/2010 10:01 AM, Peter Zijlstra wrote:
> On Thu, 2010-05-13 at 12:48 +0200, Tejun Heo wrote:
>> Concurrency managed workqueue needs to be able to migrate tasks to a
>> cpu which is online but !active for the following two purposes.
>>
>> p1. To guarantee forward progress during cpu down sequence.  Each
>>     workqueue which could be depended upon during memory allocation
>>     has an emergency worker task which is summoned when a pending work
>>     on such workqueue can't be serviced immediately.  cpu hotplug
>>     callbacks expect workqueues to work during cpu down sequence
>>     (usually so that they can flush them), so, to guarantee forward
>>     progress, it should be possible to summon emergency workers to
>>     !active but online cpus.
> 
> If we do the thing suggested in the previous patch, that is move
> clearing active and rebuilding the sched domains until right after
> DOWN_PREPARE, this goes away, right?

Hmmm... yeah, if the usual set_cpus_allowed_ptr() keeps working
throughout CPU_DOWN_PREPARE, this probably goes away.  I'll give it a
shot.
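
To make the ordering argument concrete, here is a toy user-space model
(not kernel code).  It assumes, as the discussion above implies, that
ordinary migration only accepts a target cpu which is both online and
active.  The names cpu_state, can_migrate_to() and
summon_during_down_prepare() are hypothetical and exist only for this
sketch.

```c
#include <stdbool.h>

/* Toy model of one cpu's hotplug-related state (hypothetical type). */
struct cpu_state {
	bool online;
	bool active;
};

/*
 * Assumption for this sketch: ordinary migration is only allowed to a
 * cpu that is both online and active.
 */
static bool can_migrate_to(const struct cpu_state *cpu)
{
	return cpu->online && cpu->active;
}

enum down_order {
	CLEAR_ACTIVE_BEFORE_NOTIFIERS,	/* active cleared first */
	CLEAR_ACTIVE_AFTER_NOTIFIERS,	/* active cleared after DOWN_PREPARE */
};

/*
 * Walk a simplified cpu down sequence and report whether a worker
 * could have been migrated to the cpu while the DOWN_PREPARE
 * notifiers were running.
 */
static bool summon_during_down_prepare(enum down_order order)
{
	struct cpu_state cpu = { .online = true, .active = true };
	bool summoned;

	if (order == CLEAR_ACTIVE_BEFORE_NOTIFIERS)
		cpu.active = false;

	/* DOWN_PREPARE notifiers run here and may need a worker. */
	summoned = can_migrate_to(&cpu);

	if (order == CLEAR_ACTIVE_AFTER_NOTIFIERS)
		cpu.active = false;

	cpu.online = false;
	return summoned;
}
```

In this model the worker can only be summoned in the
CLEAR_ACTIVE_AFTER_NOTIFIERS ordering, which is why a special
migration path for !active cpus would become unnecessary for p1.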

>> p2. To migrate back unbound workers when a cpu comes back online.
>>     When a cpu goes down, existing workers are unbound from the cpu
>>     and allowed to run on other cpus if there still are pending or
>>     running works.  If the cpu comes back online while those workers
>>     are still around, those workers are migrated back and re-bound to
>>     the cpu.  This isn't strictly required for correctness as long as
>>     those unbound workers don't execute works which are newly
>>     scheduled after the cpu comes back online; however, migrating back
>>     the workers has the advantage of making the behavior more
>>     consistent thus avoiding surprises which are difficult to expect
>>     and reproduce, and being actually cleaner and easier to implement.
> 
> I still don't like this much, if you mark these tasks to simply die when
> the queue is exhausted, and flush the queue explicitly on
> CPU_UP_PREPARE, you should never need to do this.

I don't think flushing from CPU_UP_PREPARE would be a good idea.
There is no guarantee that those works will finish in a short
(human-scale) time.  However, we can update the cpu_active mask before
the other CPU_UP_PREPARE notifiers are executed, so that it's
symmetrical to the cpu down path, and then this problem goes away the
exact same way, right?
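
The symmetrical up-path ordering can be sketched the same way, again as
a toy user-space model rather than kernel code.  It relies on one real
property of notifier chains, namely that higher-priority callbacks run
first.  Everything else (the struct names, the two callbacks and the
priority values) is hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy per-cpu state for the up path (hypothetical type). */
struct up_state {
	bool active;
	bool workers_rebound;
};

typedef void (*notifier_fn)(struct up_state *);

struct notifier {
	int priority;		/* higher priority runs first */
	notifier_fn fn;
};

/* Stand-in for the scheduler marking the cpu active. */
static void sched_set_active(struct up_state *cpu)
{
	cpu->active = true;
}

/* Stand-in for workqueue re-binding; only works on an active cpu. */
static void wq_rebind_workers(struct up_state *cpu)
{
	if (cpu->active)
		cpu->workers_rebound = true;
}

/* qsort() comparator: sort notifiers by descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
	const struct notifier *x = a, *y = b;

	return (y->priority > x->priority) - (y->priority < x->priority);
}

/* Run the chain in priority order; report whether re-binding worked. */
static bool bring_up(struct notifier *chain, size_t n)
{
	struct up_state cpu = { false, false };
	size_t i;

	qsort(chain, n, sizeof(*chain), by_priority_desc);
	for (i = 0; i < n; i++)
		chain[i].fn(&cpu);

	return cpu.workers_rebound;
}
```

If sched_set_active() is given the higher priority, bring_up() reports
that the workers were re-bound using only the ordinary migration check.
With the priorities swapped, re-binding fails in this model.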

Thanks.

-- 
tejun