Message-ID: <aRYGeduETy3RPnFK@slm.duckdns.org>
Date: Thu, 13 Nov 2025 06:25:29 -1000
From: Tejun Heo <tj@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Waiman Long <llong@...hat.com>, Johannes Weiner <hannes@...xchg.org>,
	Michal Koutný <mkoutny@...e.com>,
	Clark Williams <clrkwllms@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
	cgroups@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
	Chen Ridong <chenridong@...wei.com>, Pingfan Liu <piliu@...hat.com>,
	Juri Lelli <juri.lelli@...hat.com>
Subject: Re: [cgroup/for-6.19 PATCH] cgroup/cpuset: Make callback_lock a
 raw_spinlock_t

Hello,

On Thu, Nov 13, 2025 at 08:53:56AM +0100, Sebastian Andrzej Siewior wrote:
> On 2025-11-12 13:21:12 [-0500], Waiman Long wrote:
> > On 11/12/25 3:51 AM, Sebastian Andrzej Siewior wrote:
> > > On 2025-11-11 22:57:59 [-0500], Waiman Long wrote:
> > > > The callback_lock is a spinlock_t which is acquired either to read
> > > > a stable set of cpu or node masks or to modify those masks when
> > > > cpuset_mutex is also acquired. Sometimes it may need to go up the
> > > > cgroup hierarchy while holding the lock to find the right set of masks
> > > > to use. Assuming that the depth of the cgroup hierarchy is finite and
> > > > typically small, the lock hold time should be limited.
> > > We can't assume that, can we?
> > We can theoretically create a cgroup hierarchy with many levels, but no sane
> > users will actually do that. If this is a concern to you, I can certainly
> > drop this patch.
> 
> Someone will think this is sane and will wonder. We usually don't impose
> limits but make sure things are preemptible so it does not matter.

It's always better to be scalable but note that there are cases where the
overhead of nesting can't be hidden completely without significant
sacrifices in other areas and we don't want to overindex on depth
scalability at the cost of practical capabilities. This is also why cgroup
depth is a limited resource controlled by the cgroup.max.depth knob.

If something works well with, say, 16 levels of nesting, it's already mostly
acceptable.

Thanks.

-- 
tejun
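
[Editorial illustration, not part of the original message or the patch.]
The depth concern discussed in the thread can be pictured with a minimal
sketch. It assumes a toy model in which a node with an empty mask inherits
from its nearest ancestor that has one; struct node and
find_effective_cpus() are hypothetical names and do not correspond to the
kernel's cpuset code. The point is only that a walk performed while a lock
is held costs time proportional to the nesting depth.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cpuset; NOT the kernel's struct cpuset. */
struct node {
    struct node *parent;   /* NULL at the root of the hierarchy */
    unsigned long cpus;    /* toy CPU mask; 0 means "nothing set here" */
};

/*
 * Walk towards the root until a node with a non-empty mask is found
 * (or the root is reached). In the scenario under discussion this walk
 * would happen with the lock held, so *steps approximates how the lock
 * hold time grows with hierarchy depth.
 */
static unsigned long find_effective_cpus(const struct node *n,
                                         unsigned int *steps)
{
    *steps = 0;
    while (n->parent && n->cpus == 0) {
        n = n->parent;
        (*steps)++;
    }
    return n->cpus;
}
```

With a chain of 16 empty nodes below a root that has a mask, the walk takes
16 steps, which is the order of nesting the message calls "mostly
acceptable".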
