lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aLpKr6_r5exdc3EQ@slm.duckdns.org>
Date: Thu, 4 Sep 2025 16:27:59 -1000
From: Tejun Heo <tj@...nel.org>
To: escape <escape@...ux.alibaba.com>
Cc: hannes@...xchg.org, mkoutny@...e.com, cgroups@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] cgroup: replace global percpu_rwsem with
 signal_struct->group_rwsem when writing cgroup.procs/threads

Hello,

On Fri, Sep 05, 2025 at 10:16:30AM +0800, escape wrote:
> > > +	if (have_favordynmods)
> > > +		up_read(&tsk->signal->group_rwsem);
> > >   	percpu_up_read(&cgroup_threadgroup_rwsem);
> > Hmm... I wonder whether turning on/off the flag is racy. ie. what prevents
> > have_favordynmods flipping while a task is between change_begin and end?
>
> have_favordynmods is read-only after initialization and will not change
> during runtime.

I don't think that's necessarily true. favordynmods can also be specified as
a mount option and mount can race against forks, execs and exits. Also,
IIRC, cgroup2 doesn't allow remounts but there's nothing preventing someone
from unmounting and mounting it again with different options.

> > > @@ -3010,15 +3008,27 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
> > >   	 */
> > >   	if (tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY)) {
> > >   		tsk = ERR_PTR(-EINVAL);
> > > -		goto out_unlock_threadgroup;
> > > +		goto out_unlock_rcu;
> > >   	}
> > > -
> > >   	get_task_struct(tsk);
> > > -	goto out_unlock_rcu;
> > > -out_unlock_threadgroup:
> > > -	cgroup_attach_unlock(*threadgroup_locked);
> > > -	*threadgroup_locked = false;
> > > +	rcu_read_unlock();
> > > +
> > > +	/*
> > > +	 * If we migrate a single thread, we don't care about threadgroup
> > > +	 * stability. If the thread is `current`, it won't exit(2) under our
> > > +	 * hands or change PID through exec(2). We exclude
> > > +	 * cgroup_update_dfl_csses and other cgroup_{proc,thread}s_write
> > > +	 * callers by cgroup_mutex.
> > > +	 * Therefore, we can skip the global lock.
> > > +	 */
> > > +	lockdep_assert_held(&cgroup_mutex);
> > > +	*threadgroup_locked = pid || threadgroup;
> > > +
> > > +	cgroup_attach_lock(tsk, *threadgroup_locked);
> > I'm not sure this relocation is safe. What prevents e.g. @tsk changing its
> > group leader or signal struct before lock is grabbed?
>
> When a non-leader thread in a thread group executes the exec system call,
> the thread group leader is updated, but the signal_struct remains unchanged,
> so this part is safe.

But the leader can change, right? So, we can end up in a situation where
threadgroup is set but the task is not the leader which I think can lead to
really subtle incorrect behaviors like write succeeding but nothing
happening when racing against exec.

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ