Message-ID: <20260121215550.GD3183987@ZenIV>
Date: Wed, 21 Jan 2026 21:55:50 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Max Kellermann <max.kellermann@...os.com>
Cc: linux-fsdevel@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 20/21] __dentry_kill(): new locking scheme

On Tue, Jul 08, 2025 at 06:45:14AM +0200, Max Kellermann wrote:

> I believe the busy-wait was accidental.
> I've been trying to make you aware that this is effectively a
> busy-wait, one that can take a long time burning CPU cycles, but I
> have a feeling I can't reach you.
> 
> Al, please confirm that it was your intention to busy-wait until dying
> dentries disappear!

It's not so much an intention as having nothing good to wait on.

Theoretically, there's a way to deal with that - dentry in the middle
of stuck iput() from dentry_unlink_inode() from __dentry_kill() is
guaranteed to be
	* negative
	* unhashed
	* not in-lookup

What we could do is add an hlist_head aliased with ->d_alias, ->d_rcu
and ->d_in_lookup_hash.  Then select_collect2() running into a dentry
with negative refcount would set _that_ as victim and bugger off, same
as we do for ones on shrink lists.
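For concreteness, the supporting pieces might look something like the
below; d_new_field and the new select_data members are placeholder
names, and the select_collect2() hunk is only a sketch of where the
new case would slot in, not a real patch:

	/* struct dentry: the head the waiters hang off; it can share
	 * the union with ->d_alias/->d_in_lookup_hash/->d_rcu, since a
	 * dentry stuck in dentry_unlink_inode() is negative, unhashed
	 * and not in-lookup */
	union {
		struct hlist_node d_alias;
		struct hlist_bl_node d_in_lookup_hash;
		struct rcu_head d_rcu;
		struct hlist_head d_new_field;	/* placeholder name */
	} d_u;

	/* struct select_data (fs/dcache.c): how a waiter parks itself
	 * until the dying dentry gets past dentry_unlist() */
	struct select_data {
		struct dentry *start;
		union {
			long found;
			struct dentry *victim;
		};
		struct list_head dispose;
		struct hlist_node node;		/* new */
		struct completion completion;	/* new */
	};

	/* select_collect2(): treat a dentry with negative refcount the
	 * same way as one on somebody else's shrink list - report it
	 * as victim and quit the walk */
	if (dentry->d_lockref.count < 0) {
		data->victim = dentry;
		return D_WALK_QUIT;
	}
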

shrink_dcache_parent() would do this:
		if (data.victim) {
			struct dentry *v = data.victim;

			spin_lock(&v->d_lock);
			if (v->d_lockref.count < 0 &&
			    !(v->d_flags & DCACHE_DENTRY_KILLED)) {
				/* stuck in __dentry_kill(), not past
				 * dentry_unlist() yet - wait for it */
				init_completion(&data.completion);
				hlist_add_head(&data.node, &v->d_new_field);
				spin_unlock(&v->d_lock);
				rcu_read_unlock();
				wait_for_completion(&data.completion);
			} else if (!lock_for_kill(v)) {
				spin_unlock(&v->d_lock);
				rcu_read_unlock();
			} else {
				shrink_kill(v);
			}
		}

and dentry_unlist() -
	dentry->d_flags |= DCACHE_DENTRY_KILLED;
	/* wake everybody parked on this dentry */
	while (unlikely(dentry->d_new_field.first)) {
		struct select_data *p;

		p = hlist_entry(dentry->d_new_field.first,
				struct select_data,
				node);
		hlist_del_init(&p->node);
		complete(&p->completion);
	}
	...
	...

AFAICS, that ought to be safe and would guarantee progress on each
iteration in shrink_dcache_parent() (note that finding negative
refcount and seeing that it had already been past dentry_unlist()
would mean falling through to lock_for_kill() and instantly
failing there; in any case, that dentry definitely won't be
found on any subsequent d_walk(), so we still get progress there).

Comments?
