[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260107205410.GN1712166@ZenIV>
Date: Wed, 7 Jan 2026 20:54:10 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Breno Leitao <leitao@...ian.org>
Cc: Mateusz Guzik <mjguzik@...il.com>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
jlayton@...nel.org, rostedt@...dmis.org, kernel-team@...a.com
Subject: Re: [PATCH] fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check
in __follow_mount_rcu
On Wed, Jan 07, 2026 at 08:07:50AM -0800, Breno Leitao wrote:
> If I understand correctly, the current code checks the same condition
> twice in succession:
>
> handle_mounts() {
> if (!d_managed(dentry)) /* dentry->d_flags & DCACHE_MANAGED_DENTRY */
> return 0;
> __follow_mount_rcu() { /* Something changes here */
> unsigned int flags = dentry->d_flags;
> if (!(flags & DCACHE_MANAGED_DENTRY))
> return
>
> Is your concern that DCACHE_MANAGED_DENTRY could be cleared between
> these two checks?
It may, but who cares?
> > As in it is possible that by the time __follow_mount_rcu is invoked,
> > DCACHE_MANAGED_DENTRY is no longer set and with the check removed the
> > rest of the routine keeps executing.
>
> I see, but couldn't the same race occur after the second check as
> well?
>
> In other words, whether we have one check or two, DCACHE_MANAGED_DENTRY
> could still be unset immediately after the final check, leading to the
> same situation.
Folks, this is fundamentally a lockless path; there are places on it where
we care about barriers, but this is not one of those. The damn thing
may cease to be a mountpoint under us, or it may turn into a symlink, device
node or a pumpkin for all we care.
> > + unsigned int flags = READ_ONCE(dentry->d_flags);
> > + if (likely(!(dentry->d_flags & DCACHE_MANAGED_DENTRY)))
>
> Minor nit: should this be "if (likely(!(flags & DCACHE_MANAGED_DENTRY)))?"
> Otherwise you're reading d_flags twice but passing the stale value to
> __follow_mount_rcu().
>
> If I understand your intent correctly, you want to read the flags once
> and consistently use that snapshot throughout. Is that right?
TBH, this (starting with READ_ONCE()) is quite pointless; documentation is
badly needed in the area, but asserts will not replace it.
Note that mount traversal *is* wrapped in mount_lock seqcount check; anything
that used to be a mountpoint, but has ceased to be such will automatically
trigger a full repeat of pathwalk in non-rcu mode. What's more, the same
check is done upon the transition from rcu to non-rcu, with the same effect
of a mismatch.
This READ_ONCE() is pointless, with or without a check in the next line
being done the way it's done. We are explicitly OK with the damn thing
changing under us; the check in the beginning of __follow_mount_rcu()
is only a shortcut for very common case, equivalent to what the loop
below would've done if we didn't have that check there.
IOW, correctness is not an issue here; dealing with races belongs higher
in call chain and "let's read the flags into a local variable" has nothing
to do with that - it's getting a decent code generation on the fast
path and nothing beyond that.
There *ARE* places where barriers can be an issue, but those are about
the order of acquisition of inode vs sequence counter of dentry and
placement of rechecking said sequence counter. Not here...
Powered by blists - more mailing lists