Message-ID: <20260130062848.GA863940@zen.localdomain>
Date: Thu, 29 Jan 2026 22:28:48 -0800
From: Boris Burkov <boris@....io>
To: Matthew Wilcox <willy@...radead.org>
Cc: Qu Wenruo <quwenruo.btrfs@....com>, JP Kobryn <inwardvessel@...il.com>,
clm@...com, dsterba@...e.com, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>
Subject: Re: [RFC PATCH] btrfs: defer freeing of subpage private state to
free_folio

On Fri, Jan 30, 2026 at 05:14:26AM +0000, Matthew Wilcox wrote:
> On Fri, Jan 30, 2026 at 01:46:59PM +1030, Qu Wenruo wrote:
> > Another question is, why only two fses (nfs for dir inode, and orangefs) are
> > utilizing the free_folio() callback.
>
> Alas, secretmem and guest_memfd are also using it. Nevertheless, I'm
> not a fan of this interface existing, and would prefer to not introduce

From the VFS documentation:
"free_folio is called once the folio is no longer visible in the page cache
in order to allow the cleanup of any private data."

That sounds like the perfect solution to exactly the issue JP observed.
So if it is not intended to be used in this way, I think it would be
beneficial to more accurately document the proper (never?) use of
free_folio() to avoid this pitfall.

I will also point out that the usage in orangefs closely matches this
one in spirit: kfree()-ing a longer-lived object that tracks ranges in
the folio.
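
To make that concrete, the shape I have in mind is roughly the sketch
below. The names are made up for illustration; this is not the actual
btrfs or orangefs code.

/* Illustrative sketch only -- hypothetical names throughout. */
#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/slab.h>

/* Per-folio state hung off folio->private (stand-in for the subpage state). */
struct demo_folio_state {
        unsigned long dirty_bitmap;
};

/*
 * free_folio() runs once the folio is no longer visible in the page cache,
 * so freeing the private state here cannot race with someone finding the
 * folio in the mapping and using that state.
 */
static void demo_free_folio(struct folio *folio)
{
        kfree(folio_detach_private(folio));
}

static const struct address_space_operations demo_aops = {
        .free_folio     = demo_free_folio,
};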

As for this patch, do you have any suggestion for how to handle the
problem of freeing allocated memory stored in folio->private? To me, it
feels quite wrong to do it in release_folio() if that can end with a
folio that is still findable in the mapping. Should we have all users
that find and lock the folio detect whether private is set up
appropriately and re-allocate it if not? Or should we look up and use
the private field only from VFS-triggered paths (rather than btrfs's
internal relocation code here)?
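
For the first option, I am imagining something along these lines (sketch
only, hypothetical names, and the caller is assumed to hold the folio
lock):

#include <linux/pagemap.h>
#include <linux/slab.h>

struct demo_folio_state {
        unsigned long dirty_bitmap;
};

/* Re-attach private state to a locked folio that lost it; 0 or -ENOMEM. */
static int demo_ensure_folio_state(struct folio *folio)
{
        struct demo_folio_state *state;

        if (folio_test_private(folio))
                return 0;

        state = kzalloc(sizeof(*state), GFP_NOFS);
        if (!state)
                return -ENOMEM;

        folio_attach_private(folio, state);
        return 0;
}
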
> new users. Like launder_folio, which btrfs has also mistakenly used.
>

For context:
https://lore.kernel.org/linux-btrfs/070b1d025ef6eb292638bb97683cd5c35ffe42eb.1721775142.git.boris@bur.io/

IIRC, the issue was that while cleaning up a failed transaction we call
invalidate_inode_pages2() on the outstanding dirty folios, and that path
calls launder_folio() and release_folio(). At this point, btrfs does not
have private set on the folio, so release_folio() is a no-op. I therefore
used launder_folio() to clean up otherwise leaked state that was 1:1 with
dirty folios, which also felt like a "natural" fit.

I apologize for this undesirable usage, but I was genuinely trying to do
the right thing and take advantage of an abstract interface that very
cleanly modeled my problem. At the time, I interpreted the existence of
an interface that neatly fit my problem as a credit to the design of the
VFS. In the future, I will make sure to cc fsdevel before adding any new
usage of a VFS interface.
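
For reference, the shape of that launder_folio() usage is roughly the
following (hypothetical names, not the actual btrfs implementation):

#include <linux/fs.h>
#include <linux/pagemap.h>

/* Hypothetical stand-in for dropping a per-range (qgroup-style) reservation. */
static void demo_release_range(struct inode *inode, loff_t start, size_t len)
{
}

/*
 * invalidate_inode_pages2() only calls ->launder_folio() for folios that
 * are still dirty, which is exactly the set of folios whose reservation
 * would otherwise leak during a transaction abort.
 */
static int demo_launder_folio(struct folio *folio)
{
        demo_release_range(folio->mapping->host, folio_pos(folio),
                           folio_size(folio));
        return 0;
}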

At a high level, would it be reasonable to mark the folio private when
it is dirtied and the qgroup reservation is made, so that we do go into
release_folio() via invalidate_inode_pages2()? I can look into that.
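
Concretely, I am thinking of something like the sketch below (hypothetical
names; a real implementation would need to check more state before agreeing
to release the folio):

#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/slab.h>

/* Hypothetical per-folio record of the reservation made at dirty time. */
struct demo_reservation {
        u64 bytes;
};

/* Called at the point where the folio is dirtied and the reservation made. */
static int demo_reserve_and_dirty(struct folio *folio, u64 bytes)
{
        struct demo_reservation *res;

        res = kzalloc(sizeof(*res), GFP_NOFS);
        if (!res)
                return -ENOMEM;
        res->bytes = bytes;

        folio_attach_private(folio, res);       /* sets PG_private */
        folio_mark_dirty(folio);
        return 0;
}

/*
 * With PG_private set, the invalidate_inode_pages2() path reaches
 * ->release_folio() via filemap_release_folio(), so the reservation can
 * be dropped here instead of in ->launder_folio().
 */
static bool demo_release_folio(struct folio *folio, gfp_t gfp)
{
        kfree(folio_detach_private(folio));
        return true;
}
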
Ultimately, cleaning up qgroup reservation accounting while aborting a
transaction is hardly the most important thing in the world. It's just
something that looks messy while cleaning up the filesystem. So if we
want to get rid of launder_folio() even if I can't figure out how to
make it use release_folio(), all we really need is for someone to ask
nicely and I can probably come up with a hack. :)

Thanks,
Boris