linux-kernel - Re: [PATCH RFC 00/12] Allow concurrent directory updates.

Message-id: <165534094600.26404.4349155093299535793@noble.neil.brown.name>
Date:   Thu, 16 Jun 2022 10:55:46 +1000
From:   "NeilBrown" <neilb@...e.de>
To:     "Daire Byrne" <daire@...g.com>
Cc:     "Al Viro" <viro@...iv.linux.org.uk>,
        "Trond Myklebust" <trond.myklebust@...merspace.com>,
        "Chuck Lever" <chuck.lever@...cle.com>,
        "Linux NFS Mailing List" <linux-nfs@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org,
        "LKML" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC 00/12] Allow concurrent directory updates.

On Wed, 15 Jun 2022, Daire Byrne wrote:
...
> With the patch, the aggregate increases to 15 creates/s for 10 clients
> which again matches the results of a single patched client. Not quite
> a x10 increase but a healthy improvement nonetheless.

Great!

> 
> However, it is at this point that I started to experience some
> stability issues with the re-export server that are not present with
> the vanilla unpatched v5.19-rc2 kernel. In particular the knfsd
> threads start to lock up with stack traces like this:
> 
> [ 1234.460696] INFO: task nfsd:5514 blocked for more than 123 seconds.
> [ 1234.461481]       Tainted: G        W   E     5.19.0-1.dneg.x86_64 #1
> [ 1234.462289] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1234.463227] task:nfsd            state:D stack:    0 pid: 5514 ppid:     2 flags:0x00004000
> [ 1234.464212] Call Trace:
> [ 1234.464677]  <TASK>
> [ 1234.465104]  __schedule+0x2a9/0x8a0
> [ 1234.465663]  schedule+0x55/0xc0
> [ 1234.466183]  ? nfs_lookup_revalidate_dentry+0x3a0/0x3a0 [nfs]
> [ 1234.466995]  __nfs_lookup_revalidate+0xdf/0x120 [nfs]

I can see the cause of this: I forgot a wakeup.  This patch should fix
it, though I hope to find a better solution.

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 54c2c7adcd56..072130d000c4 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2483,17 +2483,16 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry)
 	if (!(dentry->d_flags & DCACHE_PAR_UPDATE)) {
 		/* Must have exclusive lock on parent */
 		did_set_par_update = true;
+		lock_acquire_exclusive(&dentry->d_update_map, 0,
+				       0, NULL, _THIS_IP_);
 		dentry->d_flags |= DCACHE_PAR_UPDATE;
 	}
 
 	spin_unlock(&dentry->d_lock);
 	error = nfs_safe_remove(dentry);
 	nfs_dentry_remove_handle_error(dir, dentry, error);
-	if (did_set_par_update) {
-		spin_lock(&dentry->d_lock);
-		dentry->d_flags &= ~DCACHE_PAR_UPDATE;
-		spin_unlock(&dentry->d_lock);
-	}
+	if (did_set_par_update)
+		d_unlock_update(dentry);
 out:
 	trace_nfs_unlink_exit(dir, dentry, error);
 	return error;
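
For anyone following along who wants the shape of the bug outside the
kernel: below is a minimal userspace sketch of a "busy flag plus wait
queue" lock, using pthreads. It is only an analogy. The names
(struct entry, entry_lock_update, entry_unlock_update) are hypothetical
and this is not how d_unlock_update() is implemented; the assumption is
merely that the unlock helper both clears the flag and wakes waiters,
which is what the open-coded flag clear removed above did not do.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object guarded by an "update in progress" flag. */
struct entry {
	pthread_mutex_t lock;     /* protects 'updating' */
	pthread_cond_t  waitq;    /* tasks sleeping until 'updating' clears */
	bool            updating; /* analogous to a PAR_UPDATE-style flag */
};

#define ENTRY_INITIALIZER \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Sleep until nobody else is updating, then claim the flag. */
static void entry_lock_update(struct entry *e)
{
	pthread_mutex_lock(&e->lock);
	while (e->updating)
		pthread_cond_wait(&e->waitq, &e->lock);
	e->updating = true;
	pthread_mutex_unlock(&e->lock);
}

/*
 * Release the flag AND wake anyone sleeping in entry_lock_update().
 * Clearing 'updating' without the broadcast is the missed-wakeup bug:
 * waiters that are already asleep never re-check the flag.
 */
static void entry_unlock_update(struct entry *e)
{
	pthread_mutex_lock(&e->lock);
	assert(e->updating);	/* caller must currently hold the flag */
	e->updating = false;
	pthread_cond_broadcast(&e->waitq);
	pthread_mutex_unlock(&e->lock);
}
```

Routing every release through one helper, rather than clearing the flag
by hand at each call site, makes it harder to drop the wakeup.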

> 
> So all in all, the performance improvements in the knfsd re-export
> case is looking great and we have real world use cases that this helps
> with (batch processing workloads with latencies >10ms). If we can
> figure out the hanging knfsd threads, then I can test it more heavily.

Hopefully the above patch will allow the heavier testing to continue.
In any case, thanks a lot for the testing so far,

NeilBrown