Message-ID: <aLk3Fftch9lUMJTv@slm.duckdns.org>
Date: Wed, 3 Sep 2025 20:52:05 -1000
From: Tejun Heo <tj@...nel.org>
To: Chen Ridong <chenridong@...weicloud.com>
Cc: Michal Koutný <mkoutny@...e.com>,
	Yi Tao <escape@...ux.alibaba.com>, hannes@...xchg.org,
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cgroup: replace global percpu_rwsem with
 signal_struct->group_rwsem when writing cgroup.procs/threads

Hello,

On Thu, Sep 04, 2025 at 09:40:12AM +0800, Chen Ridong wrote:
...
> > Sorry, I was confused. We no longer need to write lock threadgroup rwsem
> > when CLONE_INTO_CGROUP'ing into an empty cgroup. We do still need
> > cgroup_mutex.
> > 
> >   671c11f0619e ("cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree")
> > 
> > Thanks.
> > 
> 
> I'm still a bit confused. Commit 671c11f0619e ("cgroup: Elide write-locking threadgroup_rwsem when
> updating csses on an empty subtree") only applies to CSS updates. However, cloning with
> CLONE_INTO_CGROUP still requires acquiring the threadgroup_rwsem.
> 
> cgroup_can_fork
>   cgroup_css_set_fork
>     if (kargs->flags & CLONE_INTO_CGROUP)
>       cgroup_lock();
>     cgroup_threadgroup_change_begin(current);

Ah, yeah, I'm misremembering things, sorry. What got elided in that commit
is down_write of threadgroup_rwsem when enabling controllers on empty
cgroups, which was the only operation which still needed to down_write the
rwsem. Here's an excerpt from the commit message:

    After this optimization, the usage pattern of creating a cgroup, enabling
    the necessary controllers, and then seeding it with CLONE_INTO_CGROUP and
    then removing the cgroup after it becomes empty doesn't need to write-lock
    threadgroup_rwsem at all.

It's true that cgroup_threadgroup_change_begin() down_reads the
threadgroup_rwsem but that is a percpu_rwsem whose read operations are
percpu inc/dec. This doesn't add any noticeable overhead or raise any
scalability concerns.
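
To illustrate why the read side is cheap, here is a toy, single-threaded
userspace model of the idea. It is only a sketch, not the kernel's
percpu_rwsem implementation: every toy_* name and TOY_NR_CPUS are made up
for this example, and there are no atomics, RCU or real per-CPU storage.
The point is that a reader touches only its own slot, while only the
writer side has to look at every slot.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4 /* hypothetical CPU count for the model */

/* Toy stand-in for a percpu rwsem: one reader counter per CPU slot. */
struct toy_percpu_rwsem {
	long read_count[TOY_NR_CPUS];
	bool writer_pending;
};

/*
 * Reader fast path: bump only the local slot. Returns false when a
 * writer is pending; real code would take a slow path there instead.
 */
static bool toy_down_read(struct toy_percpu_rwsem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	if (sem->writer_pending)
		return false;
	sem->read_count[cpu]++;
	return true;
}

/* Reader release: decrement whichever slot the caller is on now. */
static void toy_up_read(struct toy_percpu_rwsem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	sem->read_count[cpu]--;
}

/* Writer side: the only place that has to sum every slot. */
static long toy_readers_active(const struct toy_percpu_rwsem *sem)
{
	long sum = 0;

	for (int i = 0; i < TOY_NR_CPUS; i++)
		sum += sem->read_count[i];
	return sum;
}
```

A reader may release on a different slot than it acquired on, so an
individual slot can go negative in this model; only the sum is meaningful.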

So, if you follow the "recommended" workflow, the only remaining possible
scalability bottleneck is cgroup_mutex.
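
For reference, the seeding step of that workflow can be sketched from
userspace roughly as below. This is a minimal sketch under assumptions: the
helper names fill_into_cgroup_args() and spawn_into_cgroup() are made up
for illustration, cgroup_fd is assumed to be an open file descriptor for an
already created cgroup v2 directory, and the uapi headers are assumed to be
new enough to provide clone3 and CLONE_INTO_CGROUP. Error handling is
omitted.

```c
#include <linux/sched.h>	/* struct clone_args, CLONE_INTO_CGROUP */
#include <signal.h>		/* SIGCHLD */
#include <string.h>
#include <sys/syscall.h>	/* SYS_clone3 */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: prepare clone3 args targeting the given cgroup fd. */
static void fill_into_cgroup_args(struct clone_args *args, int cgroup_fd)
{
	memset(args, 0, sizeof(*args));
	args->flags = CLONE_INTO_CGROUP;
	args->exit_signal = SIGCHLD;
	args->cgroup = (__u64)cgroup_fd;
}

/*
 * Hypothetical helper: create a child directly inside the cgroup.
 * Returns 0 in the child, the child's pid in the parent, -1 on error.
 */
static pid_t spawn_into_cgroup(int cgroup_fd)
{
	struct clone_args args;

	fill_into_cgroup_args(&args, cgroup_fd);
	return (pid_t)syscall(SYS_clone3, &args, sizeof(args));
}
```

The child starts out in the target cgroup, so no separate write to
cgroup.procs is needed for it.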

Thanks.

-- 
tejun
