linux-kernel - Re: [PATCH 00/10] workqueue: break affinity initiatively

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20201215084914.GD3040@hirez.programming.kicks-ass.net>
Date:   Tue, 15 Dec 2020 09:49:14 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Lai Jiangshan <jiangshanlai@...il.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Lai Jiangshan <laijs@...ux.alibaba.com>,
        Hillf Danton <hdanton@...a.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Qian Cai <cai@...hat.com>,
        Vincent Donnefort <vincent.donnefort@....com>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH 00/10] workqueue: break affinity initiatively

On Tue, Dec 15, 2020 at 04:14:26PM +0800, Lai Jiangshan wrote:
> On Tue, Dec 15, 2020 at 3:50 PM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Tue, Dec 15, 2020 at 01:44:53PM +0800, Lai Jiangshan wrote:
> > > I don't know how the scheduler distinguishes all these
> > > different cases under the "new assumption".
> >
> > The special case is:
> >
> >   (p->flags & PF_KTHREAD) && p->nr_cpus_allowed == 1
> >
> >
> 
> So unbound per-node workers can possibly match this test. So there is code
> needed to handle for unbound workers/pools which is done by this patchset.

Curious; how could a per-node worker match this? Only if the node is a
single CPU, or otherwise too?

> Is this the code of is_per_cpu_kthread()? I think I should have also
> used this function in workqueue and don't break affinity for unbound
> workers have more than 1 cpu.

Yes, that function captures it. If you want to use it, feel free to move
it to include/linux/sched.h.

This class of threads is 'special', since it needs to violate the
regular hotplug rules, and migrate_disable() made it just this little
bit more special. It basically comes down to how we need certain per-cpu
kthreads to run on a CPU while it's brought up, before userspace is
allowed on, and similarly they need to run on the CPU after userspace is
no longer allowed on in order to bring it down.

(IOW, they must be allowed to violate the active mask)

Due to migrate_disable() we had to move the migration code from the very
last cpu-down stage, to earlier. This in turn brought the expectation
(which is normally met) that per-cpu kthreads will stop/park or
otherwise make themselves scarce when the CPU goes down. We can no
longer force migrate them.

Workqueues are the sole exception to that, they've got some really
'dodgy' hotplug behaviour.