Message-ID: <20250826220033.GW39973@ZenIV>
Date: Tue, 26 Aug 2025 23:00:33 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Alexander Monakov <amonakov@...ras.ru>
Cc: linux-fsdevel@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
	Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org
Subject: Re: ETXTBSY window in __fput

On Wed, Aug 27, 2025 at 12:05:38AM +0300, Alexander Monakov wrote:
> Dear fs hackers,
> 
> I suspect there's an unfortunate race window in __fput where file locks are
> dropped (locks_remove_file) prior to decreasing writer refcount
> (put_file_access). If I'm not mistaken, this window is observable and it
> breaks a solution to the ETXTBSY problem on exec'ing a just-written file,
> explained in more detail below.
> 
> The program demonstrating the problem is attached (a slightly modified version
> of the demo given by Russ Cox on the Go issue tracker, see URL in first line).
> It makes 20 threads, each executing an infinite loop doing the following:
> 
> 1) open an fd for writing with O_CLOEXEC
> 2) write executable code into it
> 3) close it
> 4) fork
> 5) in the child, attempt to execve the just-written file
> 
> If you compile it with -DNOWAIT, you'll see that execve often fails with
> ETXTBSY. This happens if another thread forked while we were holding an open fd
> between steps 1 and 3, our fd "leaked" in that child, and then we reached our
> step 5 before that child did execve (at which point the leaked fd would be
> closed thanks to O_CLOEXEC).

Egads...  Let me get it straight - you have a bunch of threads sharing descriptor
tables and some of them are forking (or cloning without shared descriptor tables)
while that is going on?

Frankly, in such a situation I would spawn a thread for that, do unshare(CLONE_FILES)
in it, replace the binary and bugger off, with the parent waiting for it to complete.
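
For illustration, one possible reading of that suggestion is sketched below. It is not code from this thread: struct write_job, writer_thread() and write_binary_isolated() are hypothetical names, and error handling is kept minimal. The helper thread gets a private descriptor table via unshare(CLONE_FILES), so the writable fd it opens afterwards is never present in the table that the other threads' fork() calls copy.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical job description; not part of the original message. */
struct write_job {
    const char *path;   /* file to (re)write */
    const void *data;   /* new contents */
    size_t len;         /* length of the contents */
    int ok;             /* set to 1 by the helper on success */
};

static void *writer_thread(void *arg)
{
    struct write_job *job = arg;
    job->ok = 0;

    /* Detach this thread from the shared descriptor table.  Anything
     * opened from here on exists only in this thread's private copy. */
    if (unshare(CLONE_FILES) != 0)
        return NULL;

    int fd = open(job->path, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0755);
    if (fd < 0)
        return NULL;

    const char *p = job->data;
    size_t left = job->len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {           /* treat short-circuit and errors alike */
            close(fd);
            return NULL;
        }
        p += n;
        left -= (size_t)n;
    }

    job->ok = (close(fd) == 0);
    return NULL;
}

/* Write the file from a helper thread and wait for it to finish.
 * Returns 0 on success, -1 on failure. */
int write_binary_isolated(const char *path, const void *data, size_t len)
{
    struct write_job job = { path, data, len, 0 };
    pthread_t t;

    if (pthread_create(&t, NULL, writer_thread, &job) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return job.ok ? 0 : -1;
}
```

The caller would invoke write_binary_isolated() in place of steps 1-3 and only then fork and execve. Note that the private table is a copy, so it still holds duplicates of whatever descriptors the other threads had open at the time of the unshare() call; those go away when the helper thread exits.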