Message-ID: <25352.56248.283092.213037@quad.stoffel.home>
Date: Fri, 26 Aug 2022 10:42:00 -0400
From: "John Stoffel" <john@...ffel.org>
To: NeilBrown <neilb@...e.de>
Cc: Al Viro <viro@...iv.linux.org.uk>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Daire Byrne <daire@...g.com>,
Trond Myklebust <trond.myklebust@...merspace.com>,
Chuck Lever <chuck.lever@...cle.com>,
Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
linux-fsdevel@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH/RFC 00/10 v5] Improve scalability of directory operations
>>>>> "NeilBrown" == NeilBrown <neilb@...e.de> writes:
NeilBrown> [I made up "v5" - I haven't been counting]
My first comments, but I'm not a serious developer...
NeilBrown> VFS currently holds an exclusive lock on the directory while making
NeilBrown> changes: add, remove, rename.
NeilBrown> When multiple threads make changes in the one directory, the contention
NeilBrown> can be noticeable.
NeilBrown> In the case of NFS with a high latency link, this can easily be
NeilBrown> demonstrated. NFS doesn't really need VFS locking as the server ensures
NeilBrown> correctness.
NeilBrown> Lustre uses a single(?) directory for object storage, and has patches
NeilBrown> for ext4 to support concurrent updates (Lustre accesses ext4 directly,
NeilBrown> not via the VFS).
NeilBrown> XFS (it is claimed) doesn't its own locking and doesn't need the VFS to
NeilBrown> help at all.
This sentence makes no sense to me... I assume you meant to say "...does
its own locking..."
NeilBrown> This patch series allows filesystems to request a shared lock on
NeilBrown> directories and provides serialisation on just the affected name, not the
NeilBrown> whole directory. It changes both the VFS and NFSD to use shared locks
NeilBrown> when appropriate, and changes NFS to request shared locks.
Are there any performance results? Why wouldn't we just use a shared
lock across all VFS-based filesystems?
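To make the "serialisation on just the affected name" idea concrete, here
is a toy userspace model in Python. It is only a sketch, not kernel code:
ToyDirectory, name_lock() and updates_overlap() are made-up names, and an
ordinary mutex per name stands in for whatever the patches actually do.
The point it illustrates is that two updates to different names can be in
flight together, while two updates to the same name cannot.

```python
import threading
from collections import defaultdict


class ToyDirectory:
    """Userspace toy: serialise updates per name, not per directory."""

    def __init__(self):
        # Protects only the table of per-name locks, never held during an update.
        self._table_lock = threading.Lock()
        # One lock per name (hypothetical stand-in for name-level serialisation).
        self._name_locks = defaultdict(threading.Lock)

    def name_lock(self, name):
        with self._table_lock:
            return self._name_locks[name]


def updates_overlap(name_a, name_b, timeout=0.5):
    """Return True if updates to name_a and name_b can be in progress at once."""
    directory = ToyDirectory()
    barrier = threading.Barrier(2)
    results = []

    def update(name):
        lock = directory.name_lock(name)
        if not lock.acquire(timeout=timeout):
            results.append(False)
            return
        try:
            # Both threads reach the barrier only if both hold their locks together.
            barrier.wait(timeout=timeout)
            results.append(True)
        except threading.BrokenBarrierError:
            results.append(False)
        finally:
            lock.release()

    threads = [threading.Thread(target=update, args=(n,)) for n in (name_a, name_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(results)
```

Calling updates_overlap("a", "b") models two creates of different names in
one directory; updates_overlap("a", "a") models two threads fighting over
the same name, which this toy serialises.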
NeilBrown> The central enabling feature is a new dentry flag DCACHE_PAR_UPDATE
NeilBrown> which acts as a bit-lock. The ->d_lock spinlock is taken to set/clear
NeilBrown> it, and wait_var_event() is used for waiting. This flag is set on all
NeilBrown> dentries that are part of a directory update, not just when a shared
NeilBrown> lock is taken.
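A rough userspace model of the bit-lock described in the paragraph above,
as I read it: a flag that is set and cleared under a lock, with waiters
sleeping until it clears. This is an assumption-laden sketch in Python, not
the patch's code. ToyDentry, lock_update(), unlock_update() and run_demo()
are hypothetical names; the Condition's internal lock plays the role of
->d_lock and Condition.wait() plays the role of wait_var_event().

```python
import threading


class ToyDentry:
    """Toy stand-in for a dentry; not the kernel structure."""

    def __init__(self, name):
        self.name = name
        self.par_update = False              # models the DCACHE_PAR_UPDATE flag
        self._cond = threading.Condition()   # its lock models ->d_lock

    def lock_update(self):
        with self._cond:
            # Sleep until no other update holds the flag, then claim it.
            while self.par_update:
                self._cond.wait()
            self.par_update = True

    def unlock_update(self):
        with self._cond:
            self.par_update = False
            self._cond.notify_all()          # wake anyone waiting on the flag


def run_demo(workers=4, rounds=100):
    """Hammer one ToyDentry from several threads; return (count, max_holders, flag)."""
    dentry = ToyDentry("foo")
    state = {"holders": 0, "max_holders": 0, "count": 0}

    def worker():
        for _ in range(rounds):
            dentry.lock_update()
            state["holders"] += 1
            state["max_holders"] = max(state["max_holders"], state["holders"])
            state["count"] += 1
            state["holders"] -= 1
            dentry.unlock_update()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"], state["max_holders"], dentry.par_update
```

run_demo() reports how many updates ran, the largest number of threads that
ever held the flag at once, and whether the flag was left set at the end.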
NeilBrown> When a shared lock is taken we must use alloc_dentry_parallel() which
NeilBrown> needs a wq which must remain until the update is completed. To make use
NeilBrown> of concurrent create, kern_path_create() would need to be passed a wq.
NeilBrown> Rather than the churn required for that, we use exclusive locking when
NeilBrown> no wq is provided.
Is this a per-operation wq or a per-directory wq? Can there be issues
if someone does something silly like having 1,000 directories, all of
which have multiple processes making parallel changes?
Does it degrade gracefully if a wq can't be allocated?
NeilBrown> One interesting consequence of this is that silly-rename becomes a
NeilBrown> little more complex. As the directory may not be exclusively locked,
NeilBrown> the new silly-name needs to be locked (DCACHE_PAR_UPDATE) as well.
NeilBrown> A new LOOKUP_SILLY_RENAME is added which helps implement this using
NeilBrown> common code.
NeilBrown> While testing I found some odd behaviour that was caused by
NeilBrown> d_revalidate() racing with rename(). To resolve this I used
NeilBrown> DCACHE_PAR_UPDATE to ensure they cannot race any more.
NeilBrown> Testing, review, or other comments would be most welcome,