Message-ID: <00914d25-f0ae-4b00-9608-2457821c071c@suse.cz>
Date: Mon, 5 Aug 2024 16:53:51 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>, Kees Cook <kees@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>,
Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org,
"Paul E. McKenney" <paulmck@...nel.org>,
Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
Joel Fernandes <joel@...lfernandes.org>, Boqun Feng <boqun.feng@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, rcu@...r.kernel.org
Subject: Re: [RFC PATCH 12/20] kthread: Implement preferred affinity
On 8/5/24 16:28, Frederic Weisbecker wrote:
> On Tue, Jul 30, 2024 at 05:49:51PM +0200, Vlastimil Babka wrote:
>> Acked-by: Vlastimil Babka <vbabka@...e.cz>
>>
>> Nit:
>>
>> > +int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask)
>> > +{
>> > + struct kthread *kthread = to_kthread(p);
>> > + cpumask_var_t affinity;
>> > + unsigned long flags;
>> > + int ret;
>> > +
>> > + if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) {
>> > + WARN_ON(1);
>> > + return -EINVAL;
>> > + }
>> > +
>>
>> Should we also fail if kthread->preferred_affinity already exists, in
>> case somebody calls this twice?
>
> Good point!
>
>>
>> Also for some of the use cases (kswapd, kcompactd) it would make sense
>> to be able to add the CPUs of a node as they are onlined. We don't seem
>> to do that, except for some corner-case handling in kcompactd, but maybe
>> we should? I also wonder whether onlining a completely new node with
>> CPUs currently ends up with the right affinity as a result of the
>> individual onlining operations, or whether we end up affined to a single
>> CPU (or none).
>>
>> But that would need some kind of kthread_affine_preferred_update()
>> implementation?
>
> So you mean that the "for_each_node_state()" loop in kcompactd doesn't
> handle all possible nodes but only those online when it's called? Or
> am I confused?
If you mean the loop in kcompactd_init() then indeed, but we also have a
hook in online_pages() to start new threads on newly onlined nodes, so
that's not a problem.
The problem (I think) I see is that cpumask_of_node(pgdat->node_id) is a
snapshot of the CPUs running on the NUMA node at the time, and is never
updated later as new CPUs are brought up.
kcompactd_cpu_online() does try to update that when CPUs are onlined (in a
clumsy way); there was never anything like that for kswapd, and after your
series this update is removed for kcompactd as well.
> If all users of preferred affinity were to use NUMA nodes, it could be
> a good idea to do a flavour of kernel/smpboot.c which would handle
> per-node kthreads instead of per-cpu kthreads. I initially thought
> about that. It would have handled all the lifecycle of those kthreads,
> including creation, against hotplug. Unfortunately RCU doesn't rely on
> per-NUMA-node kthreads but rather uses its own tree.
>
> If there were more users of real per-NUMA-node kthreads than kswapd and
> kcompactd, that would of course be well worth considering.
Yeah, it's not that compelling, but a way to update the preferred affinity
mask in response to CPU hotplug events, which kswapd and kcompactd could use,
would be sufficient. And maybe more widely useful.
I guess a callback could be defined for a kthread to provide a new
preferred_affinity, which you'd call from kthreads_hotplug_update()?
And kcompactd and kswapd could both use the same callback, which interprets
kthread_data() as the pgdat and fetches a new cpumask from it?
> Thanks.