Message-ID: <lhuwm1ji7bl.fsf@oldenburg.str.redhat.com>
Date: Thu, 15 Jan 2026 09:55:10 +0100
From: Florian Weimer <fweimer@...hat.com>
To: Christian Brauner <brauner@...nel.org>
Cc: linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
linux-kernel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>, David
Howells <dhowells@...hat.com>, DJ Delorie <dj@...hat.com>
Subject: Re: O_CLOEXEC use for OPEN_TREE_CLOEXEC
* Christian Brauner:
> On Tue, Jan 13, 2026 at 11:40:55PM +0100, Florian Weimer wrote:
>> In <linux/mount.h>, we have this:
>>
>> #define OPEN_TREE_CLOEXEC O_CLOEXEC /* Close the file on execve() */
>>
>> This causes a few pain points for us on the glibc side when we mirror
>> this into <sys/mount.h> because O_CLOEXEC is defined in <fcntl.h>,
>> which is one of the headers that's completely incompatible with the UAPI
>> headers.
>>
>> The reason why this is painful is because O_CLOEXEC has at least three
>> different values across architectures: 0x80000, 0x200000, 0x400000
>>
>> Even for the UAPI this isn't ideal because it effectively burns three
>> open_tree flags, unless the flags are made architecture-specific, too.
>
> I think that just got cargo-culted... A long time ago some API defined
> its flag as O_CLOEXEC and now a lot of APIs have done the same.
Yes, it looks like inotify is in the same boat.
> I'm pretty sure we can't change that now but we can document that this
> shouldn't be ifdefed and instead be a separate per-syscall bit. But I
> think that's the best we can do right now.
Maybe add something like this as a safety measure, to ensure that the
flags don't overlap?
diff --git a/fs/namespace.c b/fs/namespace.c
index c58674a20cad..5bbfd379ec44 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3069,6 +3069,9 @@ static struct file *vfs_open_tree(int dfd, const char __user *filename, unsigned
bool detached = flags & OPEN_TREE_CLONE;
BUILD_BUG_ON(OPEN_TREE_CLOEXEC != O_CLOEXEC);
+ BUILD_BUG_ON(O_CLOEXEC & OPEN_TREE_CLONE);
+ BUILD_BUG_ON((AT_EMPTY_PATH | AT_NO_AUTOMOUNT | AT_RECURSIVE | AT_SYMLINK_NOFOLLOW) &
+ (O_CLOEXEC | OPEN_TREE_CLONE));
if (flags & ~(AT_EMPTY_PATH | AT_NO_AUTOMOUNT | AT_RECURSIVE |
AT_SYMLINK_NOFOLLOW | OPEN_TREE_CLONE |
@@ -3100,7 +3103,7 @@ static struct file *vfs_open_tree(int dfd, const char __user *filename, unsigned
SYSCALL_DEFINE3(open_tree, int, dfd, const char __user *, filename, unsigned, flags)
{
- return FD_ADD(flags, vfs_open_tree(dfd, filename, flags));
+ return FD_ADD(flags & O_CLOEXEC, vfs_open_tree(dfd, filename, flags));
}
/*
(Completely untested.)
Passing the mix of flags to FD_ADD isn't really future-proof if FD_ADD
ever recognizes more than just O_CLOEXEC.
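For illustration only, the same overlap check can be sketched in user space. This is a minimal sketch, not kernel code: the X_* names are hypothetical, the flag values are copied in by hand (the AT_* and OPEN_TREE_CLONE values as I understand the current UAPI headers, the three O_CLOEXEC values from the list quoted above), and real code would take them from the headers instead.

```c
#include <assert.h>   /* static_assert (C11), assert */
#include <stdint.h>

/* Hypothetical local copies of the flag values, for illustration only.
   Real code would get these from the UAPI headers. */
#define X_AT_SYMLINK_NOFOLLOW 0x100u
#define X_AT_NO_AUTOMOUNT     0x800u
#define X_AT_EMPTY_PATH       0x1000u
#define X_AT_RECURSIVE        0x8000u
#define X_OPEN_TREE_CLONE     0x1u

/* The three O_CLOEXEC values mentioned earlier in the thread. */
#define X_CLOEXEC_A 0x80000u
#define X_CLOEXEC_B 0x200000u
#define X_CLOEXEC_C 0x400000u

#define X_AT_MASK (X_AT_SYMLINK_NOFOLLOW | X_AT_NO_AUTOMOUNT | \
                   X_AT_EMPTY_PATH | X_AT_RECURSIVE)

/* Nonzero if the given close-on-exec bit collides with OPEN_TREE_CLONE,
   or if either of them collides with one of the AT_* flags. */
#define X_OVERLAPS(cloexec)                                   \
    ((((cloexec) & X_OPEN_TREE_CLONE) != 0) ||                \
     ((X_AT_MASK & ((cloexec) | X_OPEN_TREE_CLONE)) != 0))

/* Compile-time checks: the build fails if any value overlaps. */
static_assert(!X_OVERLAPS(X_CLOEXEC_A), "0x80000 overlaps another flag");
static_assert(!X_OVERLAPS(X_CLOEXEC_B), "0x200000 overlaps another flag");
static_assert(!X_OVERLAPS(X_CLOEXEC_C), "0x400000 overlaps another flag");

/* Bits that a future architecture-independent open_tree flag would have
   to avoid, because some architecture already uses them for O_CLOEXEC. */
static inline uint32_t x_reserved_cloexec_bits(void)
{
    return X_CLOEXEC_A | X_CLOEXEC_B | X_CLOEXEC_C;
}
```

With these values, x_reserved_cloexec_bits() covers three distinct bits, which is the "burns three flags" concern from my earlier message.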
Thanks,
Florian