Message-ID: <20250818205606.GD222315@ZenIV>
Date: Mon, 18 Aug 2025 21:56:06 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: linux-fsdevel@...r.kernel.org
Cc: brauner@...nel.org, jack@...e.cz, linux-kernel@...r.kernel.org,
linux-kernel-mentees@...ts.linux.dev,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ryan Chung <seokwoo.chung130@...il.com>
Subject: Re: [RFC] {do_,}lock_mount() behaviour wrt races and move_mount(2)
with empty to_path (was Re: [PATCH] fs/namespace.c: fix mountpath handling
in do_lock_mount())
On Mon, Aug 18, 2025 at 09:14:28PM +0100, Al Viro wrote:
> Alternative would be to treat these races as "act as if we'd won and
> the other guy had overmounted ours", i.e. *NOT* follow mounts. Again,
> for old syscalls that's fine - if another thread has raced with us and
> mounted something on top of the place we want to mount on, it could just
> as easily have come *after* we'd completed mount(2) and mounted their
> stuff on top of ours. If userland is not fine with such outcome, it needs
> to provide serialization between the callers. For move_mount(2)... again,
> the only real question is empty to_path case.
>
> Comments?
Thinking about it a bit more... Unfortunately, there's another corner
case: "." as mountpoint. That would affect the old syscalls as well
and I'm not sure that there's no userland code that relies upon the
current behaviour.
Background: pathname resolution does *NOT* follow mounts on the starting
point and it does not follow mounts after ".":
; mkdir /tmp/foo
; mount -t tmpfs none /tmp/foo
; cd /tmp/foo
; echo under > a
; cat /tmp/foo/a
under
; mount -t tmpfs none /tmp/foo
; cat a
under
; cat /tmp/foo/a
cat: /tmp/foo/a: no such file or directory
; echo under > b
; cat b
under
; cat /tmp/foo/b
cat: /tmp/foo/b: no such file or directory
;
It's been a bad decision (if it can be called that - it's been more
of an accident, AFAICT), but it's decades too late to change it.
And interaction with mount is also fun: mount(2) *DOES* follow mounts
on the end of any pathname, no matter what. So in the case when we are
standing in an overmounted directory, ls . will show the contents of
that directory, but mount <something> . will mount on top of whatever's
mounted there.
So the alternative I've mentioned above would change the behaviour of
old syscalls in a corner case that just might be actually used in userland
code - including the scripts run at the boot time, of all things ;-/
IOW, it probably falls under "can't touch that, no matter how much we'd
like to" ;-/ Pity, that...
That leaves the question of MOVE_MOUNT_BENEATH with empty pathname -
do we want a variant that would say "slide precisely under the opened
directory I gave you, no matter what might overmount it"?
At the very least this corner case needs to be documented in move_mount(2)
- behaviour of
move_mount(_, _, dir_fd, "",
MOVE_MOUNT_T_EMPTY_PATH | MOVE_MOUNT_BENEATH)
has two a priori reasonable variants ("slide right under the top of
whatever pile there might be over dir_fd" and "slide right under dir_fd
itself, no matter what pile might be on top of that") and leaving it
unspecified is not good, IMO...
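
For reference, a sketch of what such a call looks like from userland.
Assumptions: the source mount is also passed as an fd with an empty
pathname (the call above leaves the first two arguments unspecified), the
flag values are the ones from the uapi header (where the empty-to_path flag
is spelled MOVE_MOUNT_T_EMPTY_PATH), and the syscall number fallback is the
one used on most architectures. The wrapper name slide_beneath() is
hypothetical. The sketch says nothing about which of the two variants the
kernel picks; that is exactly the part that needs documenting:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older headers; values as in the uapi <linux/mount.h>. */
#ifndef SYS_move_mount
#define SYS_move_mount 429	/* assumption: differs on some architectures */
#endif
#ifndef MOVE_MOUNT_F_EMPTY_PATH
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
#endif
#ifndef MOVE_MOUNT_T_EMPTY_PATH
#define MOVE_MOUNT_T_EMPTY_PATH 0x00000040
#endif
#ifndef MOVE_MOUNT_BENEATH
#define MOVE_MOUNT_BENEATH 0x00000200
#endif

/*
 * Hypothetical wrapper: ask for the mount referred to by from_fd to be
 * placed beneath "whatever dir_fd designates", with an empty to_path.
 * Returns 0 on success, -1 with errno set on failure.
 */
int slide_beneath(int from_fd, int dir_fd)
{
	return (int)syscall(SYS_move_mount, from_fd, "", dir_fd, "",
			    MOVE_MOUNT_F_EMPTY_PATH |
			    MOVE_MOUNT_T_EMPTY_PATH |
			    MOVE_MOUNT_BENEATH);
}
```

Here dir_fd would be an opened directory and from_fd something referring
to the mount being moved; with invalid descriptors or insufficient
privileges the call simply fails.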