Open Source and information security mailing list archives
Message-id: <175513507565.2234665.11138440093783281571@noble.neil.brown.name>
Date: Thu, 14 Aug 2025 11:31:15 +1000
From: "NeilBrown" <neil@...wn.name>
To: "Al Viro" <viro@...iv.linux.org.uk>
Cc: "Christian Brauner" <brauner@...nel.org>, "Jan Kara" <jack@...e.cz>,
 "David Howells" <dhowells@...hat.com>,
 "Marc Dionne" <marc.dionne@...istor.com>, "Xiubo Li" <xiubli@...hat.com>,
 "Ilya Dryomov" <idryomov@...il.com>, "Tyler Hicks" <code@...icks.com>,
 "Miklos Szeredi" <miklos@...redi.hu>, "Richard Weinberger" <richard@....at>,
 "Anton Ivanov" <anton.ivanov@...bridgegreys.com>,
 "Johannes Berg" <johannes@...solutions.net>,
 "Trond Myklebust" <trondmy@...nel.org>, "Anna Schumaker" <anna@...nel.org>,
 "Chuck Lever" <chuck.lever@...cle.com>, "Jeff Layton" <jlayton@...nel.org>,
 "Amir Goldstein" <amir73il@...il.com>, "Steve French" <sfrench@...ba.org>,
 "Namjae Jeon" <linkinjeon@...nel.org>, "Carlos Maiolino" <cem@...nel.org>,
 linux-fsdevel@...r.kernel.org, linux-afs@...ts.infradead.org,
 netfs@...ts.linux.dev, ceph-devel@...r.kernel.org, ecryptfs@...r.kernel.org,
 linux-um@...ts.infradead.org, linux-nfs@...r.kernel.org,
 linux-unionfs@...r.kernel.org, linux-cifs@...r.kernel.org,
 linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject:
 Re: [PATCH 09/11] VFS: use global wait-queue table for d_alloc_parallel()

On Wed, 13 Aug 2025, Al Viro wrote:
> On Tue, Aug 12, 2025 at 12:25:12PM +1000, NeilBrown wrote:
> 
> > +** mandatory**
> > +
> > +d_alloc_parallel() no longer requires a waitqueue_head.  It uses one
> > +from an internal table when needed.
> 
> Misleading, IMO - that sounds like "giving it a wq is optional, it will
> pick one if needed" when reality is "calling conventions have changed,
> no more passing it a waitqueue at all".

I'll rephrase it.

> 
> > +#define	PAR_LOOKUP_WQ_BITS	8
> > +#define PAR_LOOKUP_WQS (1 << PAR_LOOKUP_WQ_BITS)
> > +static wait_queue_head_t par_wait_table[PAR_LOOKUP_WQS] __cacheline_aligned;
> 
> I wonder how hot these cachelines will be...

Are you questioning the __cacheline_aligned?  I confess I just copied
the annotation from bit_wait_table.

My guess is that concurrent attempts to add the same name to the dcache
are rare, so these wait_queue_heads will be rarely used.

> 
> > +static int __init par_wait_init(void)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < PAR_LOOKUP_WQS; i++)
> > +		init_waitqueue_head(&par_wait_table[i]);
> > +	return 0;
> > +}
> > +fs_initcall(par_wait_init);
> 
> Let's not open _that_ can of worms; just call it from dcache_init().
> 
> > +static inline void d_wake_waiters(struct wait_queue_head *d_wait,
> > +				  struct dentry *dentry)
> > +{
> > +	/* ->d_wait is only set if some thread is actually waiting.
> > +	 * If we find it is NULL - the common case - then there was no
> > +	 * contention and there are no waiters to be woken.
> > +	 */
> > +	if (d_wait)
> > +		__wake_up(d_wait, TASK_NORMAL, 0, dentry);
> 
> Might be worth a note re "this is wake_up_all(), except that key is dentry
> rather than NULL" - or a helper in wait.h to that effect, for that matter.
> I see several other places where we have the same thing (do_notify_pidfd(),
> nfs4_callback_notify_lock(), etc.), so...
> 
> 

As there are no exclusive waiters, any wakeup is effectively a
wake_up_all, so I think that emphasising the "all" adds no value.  I
could equally have used "1" instead of "0", but I chose "0" because the
number is irrelevant.

Having a "wake_up_key()" which does 
   __wake_up(wq, TASK_NORMAL, 1, key)
would be nice.

> > +		struct wait_queue_head *wq;
> > +		if (!dentry->d_wait)
> > +			dentry->d_wait = &par_wait_table[hash_ptr(dentry,
> > +								  PAR_LOOKUP_WQ_BITS)];
> > +		wq = dentry->d_wait;
> 
> Yecchhh...  Cosmetic change: take
> 	&par_wait_table[hash_ptr(dentry, PAR_LOOKUP_WQ_BITS)];
> into an inlined helper, please.
> 
> BTW, while we are at it - one change I have for that function is
> (in the current form)
> static bool d_wait_lookup(struct dentry *dentry,
> 			  struct dentry *parent,
> 			  const struct qstr *name)
> {
> 	bool valid = true;
> 	spin_lock(&dentry->d_lock);
>         if (d_in_lookup(dentry)) {
> 		DECLARE_WAITQUEUE(wait, current);
> 		add_wait_queue(dentry->d_wait, &wait);
> 		do {   
> 			set_current_state(TASK_UNINTERRUPTIBLE);
> 			spin_unlock(&dentry->d_lock);
> 			schedule();
> 			spin_lock(&dentry->d_lock);
> 		} while (d_in_lookup(dentry));
> 	}
> 	/*
> 	 * it's not in-lookup anymore; in principle the caller should repeat
> 	 * everything from dcache lookup, but it's likely to be what
> 	 * d_lookup() would've found anyway.  If so, they can use it as-is.
> 	 */
> 	if (unlikely(dentry->d_name.hash != name->hash ||
> 		     dentry->d_parent != parent ||
> 		     d_unhashed(dentry) ||
> 		     !d_same_name(dentry, parent, name)))
> 		valid = false;
> 	spin_unlock(&dentry->d_lock);
> 	return valid;
> }
> 
> with
> 	if (unlikely(d_wait_lookup(dentry, parent, name))) {
>                 dput(dentry);
> 		goto retry;
> 	}
> 	dput(new);
> 	return dentry;
> in the caller (d_alloc_parallel()).  Caller easier to follow and fewer functions
> that are not neutral wrt ->d_lock...  I'm not suggesting to fold that with
> yours - just a heads-up on needing to coordinate.

I see the value in that, but it does mean the function is doing more
than just waiting, and it might make my life a bit harder....

One of the steps toward per-dentry locking involves finding a new
solution to excluding all other accesses when rmdir() is happening.  An
exclusive lock on the directory will no longer be sufficient.

So I set a flag which says "rmdir processing has started" and have
d_alloc_parallel() (and dentry_lock) wait for that flag to clear.
A new rmdir_lock() needs to wait for all current DCACHE_PAR_LOOKUP
dentries to complete the lookup and my code currently uses
d_wait_lookup().  The extra test you've added at the end wouldn't be
harmful exactly but would be unnecessary.
Maybe we could have d_wait_lookup_and_check() for your version and
d_wait_lookup() for mine?

> 
> Anyway, modulo fs_initcall() thing it's all cosmetical; I certainly like
> the simplified callers, if nothing else.
> 
> That's another patch I'd like to see pulled in front of the queue.
> 

Thanks.

NeilBrown
