Message-ID: <CAOBoifgp=oEC9SSgFC+4_fYgDgSH_Z_TMgwhOxxaNZmyD-ijig@mail.gmail.com>
Date: Wed, 7 May 2025 10:23:24 -0700
From: Xi Wang <xii@...gle.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org, cgroups@...r.kernel.org, 
	Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, 
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, David Rientjes <rientjes@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, Waiman Long <longman@...hat.com>, 
	Johannes Weiner <hannes@...xchg.org>, Michal Koutný <mkoutny@...e.com>, 
	Vlastimil Babka <vbabka@...e.cz>, Dan Carpenter <dan.carpenter@...aro.org>, Chen Yu <yu.c.chen@...el.com>, 
	Kees Cook <kees@...nel.org>, Yu-Chun Lin <eleanor15x@...il.com>, 
	Thomas Gleixner <tglx@...utronix.de>, Mickaël Salaün <mic@...ikod.net>, 
	jiangshanlai@...il.com
Subject: Re: [RFC/PATCH] sched: Support moving kthreads into cpuset cgroups

On Wed, May 7, 2025 at 7:11 AM Frederic Weisbecker <frederic@...nel.org> wrote:
>
> Le Tue, May 06, 2025 at 08:43:57PM -0700, Xi Wang a écrit :
> > On Tue, May 6, 2025 at 5:17 PM Tejun Heo <tj@...nel.org> wrote:
> > For the use cases, there are two major requirements at the moment:
> >
> > Dynamic cpu affinity based isolation: CPUs running latency sensitive threads
> > (vcpu threads) can change over time. We'd like to configure kernel thread
> > affinity at run time too.
>
> I would expect such latency-sensitive applications to run on isolated
> partitions. And those already don't pull unbound kthreads.
>
> > Changing cpu affinity at run time requires cpumask
> > calculations and thread migrations. Sharing cpuset code would be nice.
>
> There is already some (recent) affinity management in the kthread subsystem.
> A list of kthreads having a preferred affinity (but !PF_NO_SETAFFINITY)
> is maintained and automatically handled against hotplug events and housekeeping
> state.
>
> >
> > Support numa based memory daemon affinity: We'd like to restrict kernel memory
> > daemons but maintain their numa affinity at the same time. cgroup hierarchies
> > can be helpful, e.g. create kernel, kernel/node0 and kernel/node1 and move the
> > daemons to the right cgroup.
>
> The kthread subsystem also handles node affinity. See kswapd / kcompactd. And it
> takes care of that while still honouring isolated / isolcpus partitions:
>
>       d1a89197589c ("kthread: Default affine kthread to its preferred NUMA node")
>
> >
> > Workqueue coverage is optional. kworker threads can use their separate
> > mechanisms too.
> >
> > Since the goal is isolation, we'd like to restrict as many kthreads as possible,
> > even the ones that don't directly interact with user applications.
> >
> > The kthreadd case is handled - a new kthread can be forked inside a non root
> > cgroup, but based on flags it can move itself to the root cgroup before threadfn
> > is called.
>
> kthreadd and other kthreads that don't have a preferred affinity are also
> affine outside isolcpus/nohz_full. And since isolated cpuset partitions
> create NULL domains, those kthreads won't run there either.
>
> What am I missing?

Overall, I think your arguments assume that kernel and application threads are
significantly different when it comes to cpu affinity management, but there
isn't enough evidence for that. If cpuset is a bad idea for kernel threads, it's
probably not a good idea for user threads either. Maybe we should just remove
cpuset from the kernel and let application threads go with boot time global
variables and set their own cpu affinities.
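
For concreteness, "set their own cpu affinities" would mean something like the
sketch below, where each thread manages its own mask with no cpuset involved.
This is only an illustration, assuming Linux and Python's os.sched_getaffinity /
os.sched_setaffinity wrappers; pin_self_to_one_cpu is a hypothetical helper name
and is not part of the patch.

```python
import os


def pin_self_to_one_cpu():
    """Pin the caller to the lowest-numbered CPU it is already allowed on.

    Hypothetical helper for illustration only. Returns the chosen CPU and
    the affinity mask as read back after the change.
    """
    allowed = os.sched_getaffinity(0)   # pid 0 refers to the caller
    target = min(allowed)               # pick one CPU from the current mask
    os.sched_setaffinity(0, {target})   # shrink the mask to that single CPU
    return target, os.sched_getaffinity(0)


if __name__ == "__main__":
    cpu, mask = pin_self_to_one_cpu()
    print(f"pinned to cpu {cpu}, mask is now {sorted(mask)}")
```

Every application would need to carry and coordinate logic like this on its
own, which is the work cpuset does centrally today.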

-Xi
