lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <q43d4llud37qniopiaejx5p6hyjhaubvwchnekhekzfbgtbybg@zhunmybqs3dr>
Date: Mon, 17 Nov 2025 13:57:03 -0500
From: Aiden Lambert <alambert48@...ech.edu>
To: Trond Myklebust <trondmy@...nel.org>
Cc: anna@...nel.org, chuck.lever@...cle.org, jlayton@...nel.org, 
	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] NFS: ensure nfs_safe_remove() atomic nlink drop

On Mon, Nov 17, 2025 at 01:24:54PM -0500, Trond Myklebust wrote:
> On Mon, 2025-11-17 at 13:03 -0500, Aiden Lambert wrote:
> > A race condition occurs when both unlink() and link() are running
> > concurrently on the same inode, and the nlink count from the nfs
> > server
> > received in link()->nfs_do_access() clobbers the nlink count of the
> > inode in nfs_safe_remove() after the "remove" RPC is made to the
> > server
> > but before we decrement the link count. If the nlink value from
> > nfs_do_access() reflects the decremented nlink of the "remove" RPC, a
> > double decrement occurs, which can lead to the dropping of the client
> > side inode, causing the link call to return ENOENT. To fix this, we
> > record an expected nlink value before the "remove" RPC and compare it
> > with the value afterwards---if these two are the same, the drop is
> > performed. Note that this does not take into account nlink values
> > that
> > are a result of multi-client (un)link operations as these are not
> > guaranteed to be atomic by the NFS spec.
> 
> 
> Why do we end up running nfs_do_access() at all in the above test? That
> sounds like a bug. We shouldn't ever need to validate if we can create
> or delete things using ACCESS. That just ends up producing an
> unnecessary TOCTOU race.
> 
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trondmy@...nel.org, trond.myklebust@...merspace.com

It seems that the call chain is
1. vfs_link()
2. may_create()
3. inode_permission()/nfs_permission() which fails to get a cached value as the cache is
invalidated by the (un)link operations
4. nfs_do_access()

My initial thought for the fix was to elevate the inode rwsem to
exclusive write access in nfs_do_access if the thread does not already
have it, but I figured this design of not locking was a conscious one.

A second thought was to add a new lock around all operations talking to
nfsd and their corresponding client side metadata updates, but that may
not be the way to go.

I have a proof of concept script that catches the symptoms of this race
if you would like me to upload it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ