Message-ID: <20250410-barhocker-weinhandel-8ed2f619899b@brauner>
Date: Thu, 10 Apr 2025 12:43:14 +0200
From: Christian Brauner <brauner@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: linux-fsdevel@...r.kernel.org, Jeff Layton <jlayton@...nel.org>, 
	Lennart Poettering <lennart@...ttering.net>, Daan De Meyer <daan.j.demeyer@...il.com>, 
	Mike Yuan <me@...dnzj.com>, linux-kernel@...r.kernel.org, 
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH] pidfs: ensure consistent ENOENT/ESRCH reporting

On Thu, Apr 10, 2025 at 12:18:01PM +0200, Oleg Nesterov wrote:
> On 04/09, Oleg Nesterov wrote:
> >
> > Christian,
> >
> > I will actually read your patch tomorrow, but at first glance
> >
> > On 04/09, Christian Brauner wrote:
> > >
> > > The seqcounter might be
> > > useful independent of pidfs.
> >
> > Are you sure? ;) To me, the new pid->pid_seq needs more justification...

Yeah, pretty much. I'd make use of this in other cases where we need to
detect concurrent changes to struct pid without having to take any
locks. Multi-threaded exec in de_thread() comes to mind as well.

> > Again, can't we use pid->wait_pidfd->lock if we want to avoid the
> > (minor) problem with the wrong ENOENT?
> 
> I mean
> 
> 	int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
> 	{
> 		int err = 0;
> 
> 		spin_lock_irq(&pid->wait_pidfd->lock);
> 
> 		if (!pid_has_task(pid, PIDTYPE_PID))
> 			err = -ESRCH;
> 		else if (!(flags & PIDFD_THREAD) && !pid_has_task(pid, PIDTYPE_TGID))
> 			err = -ENOENT;
> 
> 		spin_unlock_irq(&pid->wait_pidfd->lock);
> 
> 		return err ?: __pidfd_prepare(pid, flags, ret);
> 	}
> 
> To remind, detach_pid(pid, PIDTYPE_PID) does wake_up_all(&pid->wait_pidfd) and
> takes pid->wait_pidfd->lock.
> 
> So if pid_has_task(PIDTYPE_PID) succeeds, __unhash_process() -> detach_pid(TGID)
> is not possible until we drop pid->wait_pidfd->lock.
> 
> If detach_pid(PIDTYPE_PID) was already called and has passed wake_up_all(),
> pid_has_task(PIDTYPE_PID) can't succeed.

I know. I was trying to avoid having to take the lock and make this path
lockless. But if you think we should use this lock here instead, I'm
willing to do so. I just find the sequence counter more elegant than
the spin_lock_irq().

And note that it doesn't grow struct pid. There's a 4-byte hole just
before the struct dentry * member that I would place it into. So there's
no downside to this imho, and it would give pidfds a reliable way to
detect relevant concurrent changes locklessly without penalizing other
critical paths in the kernel (e.g., those under tasklist_lock).
