linux-kernel - Re: [PATCH] kthread_worker: re-set CPU affinities if CPU come online


Message-ID: <20201026165311.GA97873@mtj.duckdns.org>
Date:   Mon, 26 Oct 2020 12:53:11 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Petr Mladek <pmladek@...e.com>
Cc:     qiang.zhang@...driver.com, akpm@...ux-foundation.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] kthread_worker: re-set CPU affinities if CPU come online

Hello, Petr.

On Mon, Oct 26, 2020 at 05:45:55PM +0100, Petr Mladek wrote:
> > I don't think this works. The kthread may have changed its binding while
> > running using set_cpus_allowed_ptr() as you're doing above. Besides, when a
> > cpu goes offline, the bound kthread can fall back to other cpus but its cpu
> > mask isn't cleared, is it?
> 
> If I get it correctly, select_fallback_rq() calls
> do_set_cpus_allowed() explicitly or in cpuset_cpus_allowed_fallback().
> It seems that the original mask gets lost.

Oh, I see.

> It would make sense to assume that kthread_worker API will take care of
> the affinity when it was set by kthread_create_worker_on_cpu().

I was for some reason thinking this was for all kthreads. Yeah, for
kthread_workers it does make sense.

> But is it safe to assume that the work can also safely proceed
> on another CPU? We should probably add a warning to
> kthread_worker_fn() when it detects the wrong CPU.

Per-cpu workqueues behave like that too. When the CPU goes down, per-cpu
workers on that CPU are unbound and may run anywhere. They get rebound when
the CPU comes back up.

> BTW: kthread_create_worker_on_cpu() is currently used only by
>      start_power_clamp_worker(). And it has its own CPU hotplug
>      handling. The kthreads are stopped and started again
>      in powerclamp_cpu_predown() and powerclamp_cpu_online().

And users that have a hard dependency on CPU binding are expected to
handle hotplug events so that, e.g., per-cpu work items are flushed when
the CPU goes down and scheduled again when it comes back online.

There are pros and cons to the current workqueue behavior but it'd be a good
idea to keep kthread_worker's behavior in sync.
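
For illustration only, the behavior discussed above (a bound worker loses its
mask when its CPU goes offline and gets rebound when the CPU returns) can be
sketched as a small userspace toy model. This is not kernel code and not the
patch under review: every name below (model_worker, model_cpu_offline(),
model_cpu_online(), and so on) is hypothetical, and widening the mask to all
CPUs is only a stand-in for what select_fallback_rq() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace toy model, NOT kernel code. All names here are hypothetical. */
#define MODEL_NR_CPUS 4u
#define MODEL_ALL_CPUS ((1u << MODEL_NR_CPUS) - 1u)

struct model_worker {
    int home_cpu;      /* CPU requested at creation, remembered by the worker */
    uint32_t allowed;  /* current affinity mask, one bit per CPU */
};

/* Set of CPUs that are currently online in the model. */
static uint32_t model_online_mask = MODEL_ALL_CPUS;

/* Create a worker bound to one CPU and remember that CPU for later. */
static void model_worker_create_on_cpu(struct model_worker *w, int cpu)
{
    assert(cpu >= 0 && (unsigned int)cpu < MODEL_NR_CPUS);
    w->home_cpu = cpu;
    w->allowed = 1u << cpu;
}

/*
 * CPU goes down. If the worker has no online CPU left in its mask, the
 * mask is widened so the worker can run anywhere. The original single-CPU
 * mask is overwritten, which is the "mask gets lost" case from the thread.
 */
static void model_cpu_offline(struct model_worker *w, int cpu)
{
    model_online_mask &= ~(1u << cpu);
    if ((w->allowed & model_online_mask) == 0)
        w->allowed = MODEL_ALL_CPUS;
}

/* CPU comes back. Re-set the affinity of the worker that belongs to it. */
static void model_cpu_online(struct model_worker *w, int cpu)
{
    model_online_mask |= 1u << cpu;
    if (w->home_cpu == cpu)
        w->allowed = 1u << cpu;
}

/* True when the worker is restricted to exactly its home CPU. */
static bool model_worker_is_bound(const struct model_worker *w)
{
    return w->allowed == (1u << w->home_cpu);
}
```

In this model a worker created on CPU 2 stops being bound after
model_cpu_offline(&w, 2) and is bound again after model_cpu_online(&w, 2).
Work that must never run on another CPU would still need its own hotplug
handling, as described above.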

> I haven't checked all details yet. But in principle, the patch looks
> sane to me.

Yeah, agreed.

Thanks.

-- 
tejun