linux-kernel - [PATCH] fs/9p: fix inode nlink accounting

Message-Id: <20240107-fix-nlink-handling-v1-1-8b1f65ebc9b2@kernel.org>
Date: Sun, 07 Jan 2024 19:07:52 +0000
From: Eric Van Hensbergen <ericvh@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: v9fs@...ts.linux.dev, linux_oss@...debyte.com, asmadeus@...ewreck.org, 
 rminnich@...il.com, lucho@...kov.net, 
 Eric Van Hensbergen <ericvh@...nel.org>
Subject: [PATCH] fs/9p: fix inode nlink accounting

While running regression tests I noticed a racy kernel warning that fires
when nlink would drop below zero.  Looking through the code, we do not
consistently take the inode lock (i_lock) when manipulating nlink, and
some code added several years ago to guard against bugs in underlying
file systems' nlink handling did not look quite right either.  I looked
at what NFS does and followed a similar approach in the 9p code.

Signed-off-by: Eric Van Hensbergen <ericvh@...nel.org>
---
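For readers who want to see the shape of the guarded decrement outside the
kernel, here is a minimal userspace sketch.  It is an illustration only and
not part of the patch: struct demo_inode, demo_inode_init(), demo_dec_count()
and demo_inc_count() are hypothetical names, and a pthread mutex stands in
for the inode's i_lock spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct inode; not the kernel type. */
struct demo_inode {
	pthread_mutex_t lock;	/* plays the role of inode->i_lock */
	unsigned int nlink;	/* plays the role of inode->i_nlink */
};

static void demo_inode_init(struct demo_inode *inode, unsigned int nlink)
{
	pthread_mutex_init(&inode->lock, NULL);
	inode->nlink = nlink;
}

/*
 * Drop the link count only if it is still positive, holding the lock
 * across both the check and the update so that two racing callers
 * cannot both decrement from 1.  Returns 1 if the count was dropped,
 * 0 if it was already zero (in which case we only log).
 */
static int demo_dec_count(struct demo_inode *inode)
{
	int dropped = 0;

	pthread_mutex_lock(&inode->lock);
	if (inode->nlink > 0) {
		inode->nlink--;
		dropped = 1;
	} else {
		fprintf(stderr, "nlink is already 0, inode %p\n",
			(void *)inode);
	}
	pthread_mutex_unlock(&inode->lock);

	return dropped;
}

/* Increment under the same lock, mirroring the locked inc_nlink() calls. */
static void demo_inc_count(struct demo_inode *inode)
{
	pthread_mutex_lock(&inode->lock);
	inode->nlink++;
	pthread_mutex_unlock(&inode->lock);
}
```

Calling demo_dec_count() on an inode whose count is 1 drops it to 0 and
returns 1; a second call leaves the count at 0, logs to stderr, and
returns 0 instead of underflowing.
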
 fs/9p/vfs_inode.c      | 32 ++++++++++++++++++++++++--------
 fs/9p/vfs_inode_dotl.c |  2 ++
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index b845ee18a80b..9723c3cbae38 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -508,9 +508,12 @@ static int v9fs_at_to_dotl_flags(int flags)
 /**
  * v9fs_dec_count - helper functon to drop i_nlink.
  *
- * If a directory had nlink <= 2 (including . and ..), then we should not drop
- * the link count, which indicates the underlying exported fs doesn't maintain
- * nlink accurately. e.g.
+ * Put a guard around this so we only drop nlink if doing so
+ * would be valid.  This prevents bugs in underlying filesystem
+ * implementations from triggering kernel WARNs.  We'll still
+ * print 9p debug messages if the underlying filesystem is wrong.
+ *
+ * Known underlying filesystems which might exhibit this issue:
  * - overlayfs sets nlink to 1 for merged dir
  * - ext4 (with dir_nlink feature enabled) sets nlink to 1 if a dir has more
  *   than EXT4_LINK_MAX (65000) links.
@@ -519,8 +522,13 @@ static int v9fs_at_to_dotl_flags(int flags)
  */
 static void v9fs_dec_count(struct inode *inode)
 {
-	if (!S_ISDIR(inode->i_mode) || inode->i_nlink > 2)
+	spin_lock(&inode->i_lock);
+	if (inode->i_nlink > 0)
 		drop_nlink(inode);
+	else
+		p9_debug(P9_DEBUG_ERROR, "WARNING: nlink is already 0 inode %p\n",
+			inode);
+	spin_unlock(&inode->i_lock);
 }
 
 /**
@@ -566,8 +574,9 @@ static int v9fs_remove(struct inode *dir, struct dentry *dentry, int flags)
 		 * link count
 		 */
 		if (flags & AT_REMOVEDIR) {
+			spin_lock(&inode->i_lock);
 			clear_nlink(inode);
-			v9fs_dec_count(dir);
+			spin_unlock(&inode->i_lock);
 		} else
 			v9fs_dec_count(inode);
 
@@ -713,7 +722,9 @@ static int v9fs_vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 		err = PTR_ERR(fid);
 		fid = NULL;
 	} else {
+		spin_lock(&dir->i_lock);
 		inc_nlink(dir);
+		spin_unlock(&dir->i_lock);
 		v9fs_invalidate_inode_attr(dir);
 	}
 
@@ -962,14 +973,19 @@ v9fs_vfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 error_locked:
 	if (!retval) {
 		if (new_inode) {
-			if (S_ISDIR(new_inode->i_mode))
+			if (S_ISDIR(new_inode->i_mode)) {
+				spin_lock(&new_inode->i_lock);
 				clear_nlink(new_inode);
-			else
+				spin_unlock(&new_inode->i_lock);
+			} else
 				v9fs_dec_count(new_inode);
 		}
 		if (S_ISDIR(old_inode->i_mode)) {
-			if (!new_inode)
+			if (!new_inode) {
+				spin_lock(&new_dir->i_lock);
 				inc_nlink(new_dir);
+				spin_unlock(&new_dir->i_lock);
+			}
 			v9fs_dec_count(old_dir);
 		}
 		v9fs_invalidate_inode_attr(old_inode);
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index c7319af2f471..6cc037f726e7 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -427,7 +427,9 @@ static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap,
 		v9fs_set_create_acl(inode, fid, dacl, pacl);
 		d_instantiate(dentry, inode);
 	}
+	spin_lock(&dir->i_lock);
 	inc_nlink(dir);
+	spin_unlock(&dir->i_lock);
 	v9fs_invalidate_inode_attr(dir);
 error:
 	p9_fid_put(fid);

---
base-commit: 5254c0cbc92d2a08e75443bdb914f1c4839cdf5a
change-id: 20240107-fix-nlink-handling-3c0646f5d927

Best regards,
-- 
Eric Van Hensbergen <ericvh@...nel.org>