lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250818201428.GC222315@ZenIV>
Date: Mon, 18 Aug 2025 21:14:28 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Ryan Chung <seokwoo.chung130@...il.com>
Cc: brauner@...nel.org, jack@...e.cz, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-kernel-mentees@...ts.linux.dev,
	kernel test robot <lkp@...el.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: [RFC] {do_,}lock_mount() behaviour wrt races and move_mount(2) with
 empty to_path (was Re: [PATCH] fs/namespace.c: fix mountpath handling in
 do_lock_mount())

On Tue, Aug 19, 2025 at 02:22:35AM +0900, Ryan Chung wrote:
> Updates documentation for do_lock_mount() in fs/namespace.c
> to clarify its parameters and return description to fix
> warning reported by syzbot.
> 
> Reported-by: kernel test robot <lkp@...el.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202506301911.uysRaP8b-lkp@intel.com/
> Signed-off-by: Ryan Chung <seokwoo.chung130@...il.com>
> ---
>  fs/namespace.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index ddfd4457d338..577fdff9f1a8 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2741,6 +2741,7 @@ static int attach_recursive_mnt(struct mount *source_mnt,
>  /**
>   * do_lock_mount - lock mount and mountpoint
>   * @path:    target path
> + * @pinned: on success, holds a pin guarding the mountpoint

I'm not sure if 'pin' is suitable here and in any case, that's not the
only problem in that description - take a look at "Return:" part in there.

The underlying problem is the semantics of function itself.  lock_mount()
assumed that it was called on the result of pathname resolution; the
question is what to do if we race with somebody mounting something
on top of the same location while we had been grabbing namespace_sem?
"Follow through to the root of whatever's been mounted on top, same as
we'd done if pathname resolution happened slightly later" used to be a
reasonable answer, but these days we have move_mount(2), where we have
	* MOVE_MOUNT_T_EMPTY_PATH combined with empty pathname, which
will have us start with whatever the descriptor is pointing to, mounts
or no mounts.  Choosing to treat that as "follow mounts anyway" is not
a big deal.
	* MOVE_MOUNT_BENEATH - treated as "follow mounts and slip the
damn thing under the topmost one".  Again, OK for non-empty pathname,
but... for empty ones the rationale is weaker.

Alternative would be to treat these races as "act as if we'd won and
the other guy had overmounted ours", i.e. *NOT* follow mounts.  Again,
for old syscalls that's fine - if another thread has raced with us and
mounted something on top of the place we want to mount on, it could just
as easily have come *after* we'd completed mount(2) and mounted their
stuff on top of ours.  If userland is not fine with such outcome, it needs
to provide serialization between the callers.  For move_mount(2)... again,
the only real question is empty to_path case.

Comments?

Note, BTW, that attach_recursive_mnt() used to require dest_mnt/dest_mp
to be on the very top; since 6.16 it treats that as "slip it under
whatever's on top of that" - that's exactly what happens in 'beneath'
case.  So the second alternative is easily doable these days.  And
it would really simplify the lock_mount()/do_lock_mount()...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ