[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250829-therapieren-datteln-13c31741c856@brauner>
Date: Fri, 29 Aug 2025 13:07:30 +0200
From: Christian Brauner <brauner@...nel.org>
To: Alexander Monakov <amonakov@...ras.ru>
Cc: linux-fsdevel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>, Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org
Subject: Re: ETXTBSY window in __fput
On Fri, Aug 29, 2025 at 01:17:16PM +0300, Alexander Monakov wrote:
>
> On Fri, 29 Aug 2025, Christian Brauner wrote:
>
> > On Fri, Aug 29, 2025 at 10:21:35AM +0300, Alexander Monakov wrote:
> > >
> > > On Wed, 27 Aug 2025, Alexander Monakov wrote:
> > >
> > > > Dear fs hackers,
> > > >
> > > > I suspect there's an unfortunate race window in __fput where file locks are
> > > > dropped (locks_remove_file) prior to decreasing writer refcount
> > > > (put_file_access). If I'm not mistaken, this window is observable and it
> > > > breaks a solution to ETXTBSY problem on exec'ing a just-written file, explained
> > > > in more detail below.
> > >
> > > The race in __fput is a problem irrespective of how the testcase triggers it,
> > > right? It's just showing a real-world scenario. But the issue can be
> > > demonstrated without a multithreaded fork: imagine one process placing an
> > > exclusive lock on a file and writing to it, another process waiting on that
> > > lock and immediately execve'ing when the lock is released.
> > >
> > > Can put_file_access be moved prior to locks_remove_file in __fput?
> >
> > Even if we fix this there's no guarantee that the kernel will give that
> > letting the close() of a writably opened file race against a concurrent
> > exec of the same file will not result in EBUSY in some arcane way
> > currently or in the future.
>
> Forget Go and execve. Take the two-process scenario from my last email.
> The program waiting on flock shouldn't be able to observe elevated
> refcounts on the file after the lock is released. It matters not only
> for execve, but also for unmounting the underlying filesystem, right?
What? No. How?: with details, please.
> And maybe other things too. So why not fix the ordering issue in __fput
> and if there are other bugs breaking valid uses of flock, fix them too?
For locks_remove_file() to be movable after put_file_access() we'd have
to prove that no filesystem implementing f_op->lock() doesn't rely on
f_op->release() to not have run. It is fundamentally backwards to have
run f_ops after f_op->release() ran.
Random quick look into 9pfs:
static int v9fs_file_do_lock(struct file *filp, int cmd, struct file_lock *fl)
{
struct p9_flock flock;
struct p9_fid *fid;
uint8_t status = P9_LOCK_ERROR;
int res = 0;
struct v9fs_session_info *v9ses;
fid = filp->private_data;
BUG_ON(fid == NULL);
This relies on filp->private_data to be valid which it wouldn't be
anymore after f_op->release().
Moving put_file_access() before f_op->release() is also wrong and would
require to prove that no filesystem depends on file access not having
changed before f_op->release() has run. So no, not a trivial thing to
move around.
And you are explicitly advertising this as a fix to the go execve
problem; both in the bugtracker and here. And it's simply not a good
solution. The problem remains exec's deny-write mechanism. But hooking
into file locks to serialize exec against writable openers isn't a good
solution. It surely is creative though.
We can give userspace a simple way to sidestep this problem though as I
tried to show.
Powered by blists - more mailing lists