Message-ID: <20250410-barhocker-weinhandel-8ed2f619899b@brauner>
Date: Thu, 10 Apr 2025 12:43:14 +0200
From: Christian Brauner <brauner@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: linux-fsdevel@...r.kernel.org, Jeff Layton <jlayton@...nel.org>, 
	Lennart Poettering <lennart@...ttering.net>, Daan De Meyer <daan.j.demeyer@...il.com>, 
	Mike Yuan <me@...dnzj.com>, linux-kernel@...r.kernel.org, 
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH] pidfs: ensure consistent ENOENT/ESRCH reporting

On Thu, Apr 10, 2025 at 12:18:01PM +0200, Oleg Nesterov wrote:
> On 04/09, Oleg Nesterov wrote:
> >
> > Christian,
> >
> > I will actually read your patch tomorrow, but at first glance
> >
> > On 04/09, Christian Brauner wrote:
> > >
> > > The seqcounter might be
> > > useful independent of pidfs.
> >
> > Are you sure? ;) To me, the new pid->pid_seq needs more justification...

Yeah, pretty much. I'd make use of this in other cases where we need to
detect concurrent changes to struct pid without having to take any
locks. Multi-threaded exec in de_thread() comes to mind as well.

> > Again, can't we use pid->wait_pidfd->lock if we want to avoid the
> > (minor) problem with the wrong ENOENT?
> 
> I mean
> 
> 	int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
> 	{
> 		int err = 0;
> 
> 		spin_lock_irq(&pid->wait_pidfd->lock);
> 
> 		if (!pid_has_task(pid, PIDTYPE_PID))
> 			err = -ESRCH;
> 		else if (!(flags & PIDFD_THREAD) && !pid_has_task(pid, PIDTYPE_TGID))
> 			err = -ENOENT;
> 
> 		spin_unlock_irq(&pid->wait_pidfd->lock);
> 
> 		return err ?: __pidfd_prepare(pid, flags, ret);
> 	}
> 
> To remind, detach_pid(pid, PIDTYPE_PID) does wake_up_all(&pid->wait_pidfd) and
> takes pid->wait_pidfd->lock.
> 
> So if pid_has_task(PIDTYPE_PID) succeeds, __unhash_process() -> detach_pid(TGID)
> is not possible until we drop pid->wait_pidfd->lock.
> 
> If detach_pid(PIDTYPE_PID) was already called and has passed wake_up_all(),
> pid_has_task(PIDTYPE_PID) can't succeed.

I know. I was trying to avoid having to take the lock and make this path
lockless. But if you think we should use this lock here instead, I'm
willing to do so. I just find the sequence counter more elegant than
the spin_lock_irq().

And note that it doesn't grow struct pid. There's a 4-byte hole just
before the struct dentry * member that I would place it into. So there's
no downside to this imho, and it would give pidfds a reliable way to
detect relevant concurrent changes locklessly without penalizing other
critical paths in the kernel (e.g., those under tasklist_lock).
