Message-ID: <20251213072241.GH1712166@ZenIV>
Date: Sat, 13 Dec 2025 07:22:41 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Hugh Dickins <hughd@...gle.com>
Cc: Miklos Szeredi <miklos@...redi.hu>,
Christian Brauner <brauner@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: 6.19 tmpfs __d_lookup() lockup
On Fri, Dec 12, 2025 at 02:12:17AM -0800, Hugh Dickins wrote:
> Well, more than that: it's exactly the right thing to do, isn't it?
> shmem_mknod() already called d_make_persistent() which called __d_rehash(),
> calling it a second time naturally leads to the __d_lookup() lockup seen.
> And I can't see a place now for shmem_whiteout()'s "Cheat and hash" comment.
>
> Al, may I please leave you to send in the fix to Christian and/or Linus?
> You may have noticed other things on the way, that you might want to add.
>
> But if your patch resembles the below (which has now passed xfstests
> auto runs on tmpfs), please feel free to add or omit any or all of
>
> Reported-by: Hugh Dickins <hughd@...gle.com>
> Acked-by: Hugh Dickins <hughd@...gle.com>
> Tested-by: Hugh Dickins <hughd@...gle.com>
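
For illustration only, a toy model of why hashing the same entry twice
ends in a lockup. This is not the kernel's hlist_bl code; struct toy_node,
toy_hash(), toy_walk_terminates() and toy_demo() are made-up names, and
the only assumption is head insertion with no "already hashed?" check.
Inserting a node that is already on the chain makes the chain cyclic, so
a walk that misses never reaches the end:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an entry on a hash chain; NOT the kernel's hlist_bl. */
struct toy_node {
	struct toy_node *next;
	const char *name;
};

/* Head insertion with no "already hashed?" check, like a blind rehash. */
static void toy_hash(struct toy_node **head, struct toy_node *n)
{
	n->next = *head;
	*head = n;
}

/*
 * Walk the chain, giving up after `limit` steps.
 * Returns true only if the end of the chain (NULL) was reached.
 */
static bool toy_walk_terminates(const struct toy_node *head, size_t limit)
{
	while (head && limit--)
		head = head->next;
	return head == NULL;
}

/* Build a two-entry chain, optionally hashing the same node twice. */
static bool toy_demo(bool hash_twice)
{
	struct toy_node *head = NULL;
	struct toy_node a = { NULL, "a" }, b = { NULL, "b" };

	toy_hash(&head, &a);
	toy_hash(&head, &b);
	if (hash_twice)
		toy_hash(&head, &b);	/* b is the head, so b.next = &b */
	return toy_walk_terminates(head, 1000);
}
```

toy_demo(false) walks off the end of the chain; toy_demo(true) only stops
because of the step limit, which a real lookup does not have.
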
The problem is that the comment is not quite accurate ;-)
What it's trying to say is that we get whiteout and old_dentry
sharing parent and name, with both hashed, but that won't last for
long - as soon as we get to d_move(), old_dentry will change
name and/or parent.
The trouble is, it might not _get_ to that d_move() at
all. It used to be guaranteed back when shmem_whiteout() had
been introduced (shmem_renameat2() used to have no failure
exits past shmem_whiteout() returning success), but it's no longer
true - not since a2e459555c5f "shmem: stable directory offsets"
two years ago.
Failure, AFAICS, requires a severe OOM, but it's still
a bug. What's more, simple_offset_rename() itself does not recover
from a failure, even with no whiteouts involved.
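
A toy sketch of that failure window, under loud assumptions: the
directory is a flat array, and struct toy_dir, toy_add(), toy_count(),
toy_rename_whiteout() and toy_leftover() are all hypothetical names, not
shmem or libfs code. It only models the ordering described above: the
whiteout goes in under the old name, a later step fails, and the rename
of the old entry (the d_move() stand-in) is never reached:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX 8

struct toy_entry {
	char name[16];
	bool is_whiteout;
};

struct toy_dir {
	struct toy_entry ent[TOY_MAX];
	int n;
};

/* Add a visible ("hashed") entry to the toy directory. */
static int toy_add(struct toy_dir *d, const char *name, bool whiteout)
{
	if (d->n == TOY_MAX)
		return -ENOSPC;
	snprintf(d->ent[d->n].name, sizeof(d->ent[d->n].name), "%s", name);
	d->ent[d->n].is_whiteout = whiteout;
	d->n++;
	return 0;
}

/* How many visible entries carry this name? */
static int toy_count(const struct toy_dir *d, const char *name)
{
	int hits = 0;

	for (int i = 0; i < d->n; i++)
		if (strcmp(d->ent[i].name, name) == 0)
			hits++;
	return hits;
}

/* Rename old_name to new_name, leaving a whiteout under old_name. */
static int toy_rename_whiteout(struct toy_dir *d, const char *old_name,
			       const char *new_name, bool offsets_fail)
{
	int idx = -1, err;

	for (int i = 0; i < d->n; i++)
		if (!d->ent[i].is_whiteout &&
		    strcmp(d->ent[i].name, old_name) == 0)
			idx = i;
	if (idx < 0)
		return -ENOENT;

	err = toy_add(d, old_name, true);	/* step 1: whiteout, same name */
	if (err)
		return err;

	if (offsets_fail)			/* step 2: simulated OOM */
		return -ENOMEM;			/* no undo: whiteout stays */

	/* step 3: stand-in for d_move(); the old entry changes its name */
	snprintf(d->ent[idx].name, sizeof(d->ent[idx].name), "%s", new_name);
	return 0;
}

/* Entries still named "a" after attempting to rename "a" to "b". */
static int toy_leftover(bool offsets_fail)
{
	struct toy_dir d = { .n = 0 };

	toy_add(&d, "a", false);
	toy_rename_whiteout(&d, "a", "b", offsets_fail);
	return toy_count(&d, "a");
}
```

On success only the whiteout is left under the old name; on the
simulated failure both the whiteout and the original entry remain
visible under it, which is the state the comment assumed to be
short-lived.
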
What I'm going to do is a couple of patches - one fixing
the regression in this cycle (pretty much what you'd been testing),
then a separate fix for stable offsets failure handling (present
since 2023). I'll feed them to Linus; I hoped to do that with
the older regression fixed first, to reduce the PITA for backports,
but if I don't have that debugged tomorrow, I'll send the recent
regression fix first.