Message-ID: <20260120065242.GA3436@lst.de>
Date: Tue, 20 Jan 2026 07:52:42 +0100
From: Christoph Hellwig <hch@....de>
To: Gao Xiang <hsiangkao@...ux.alibaba.com>
Cc: Christoph Hellwig <hch@....de>, Hongbo Li <lihongbo22@...wei.com>,
chao@...nel.org, djwong@...nel.org, amir73il@...il.com,
linux-fsdevel@...r.kernel.org, linux-erofs@...ts.ozlabs.org,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Christian Brauner <brauner@...nel.org>,
oliver.yang@...ux.alibaba.com
Subject: Re: [PATCH v15 5/9] erofs: introduce the page cache share feature
On Tue, Jan 20, 2026 at 11:07:48AM +0800, Gao Xiang wrote:
>
> Hi Christoph,
>
> Sorry I didn't phrase things clearly earlier, but I'd still
> like to explain the whole idea, as this feature is clearly
> useful for containerization. I hope we can reach agreement
> on the page cache sharing feature: Christian agreed on this
> feature (and I hope still):
>
> https://lore.kernel.org/linux-fsdevel/20260112-begreifbar-hasten-da396ac2759b@brauner
He has to ultimately decide. I do have an uneasy feeling about this.
It's not super informed as I can't keep up, and I'm not the one in charge,
but I hope it is helpful to share my perspective.
> First, let's separate this feature from mounting in user
> namespaces (i.e., unprivileged mounts), because this feature
> is designed specifically for privileged mounts.
Ok.
> The EROFS page cache sharing feature stems from a current
> limitation in the page cache: a file-based folio cannot be
> shared across different inode mappings (or the different
> page index within the same mapping; If this limitation
> were resolved, we could implement a finer-grained page
> cache sharing mechanism at the folio level). As you may
> know, this patchset dates back to 2023,
I didn't..
> and as of 2026; I
> still see no indication that the page cache infra will
> change.
It will be very hard to change unless we move to physical indexing of
the page cache, which has all kinds of downsides.
> So that let's face the reality: this feature introduces
> on-disk xattrs called "fingerprints." --- Since they're
> just xattrs, the EROFS on-disk format remains unchanged.
I think the concept of using a backing file of some sort for the shared
pagecache (which I have no problem with at all), vs the imprecise
selection through a free form fingerprint are quite different aspects,
that could be easily separated. I.e. one could easily imagine using
the data path approach based purely on exact file system metadata.
But that would of course not work with multiple images, which I think
is a key feature here if I'm reading between the lines correctly.
> - Let's not focusing entirely on the random human bugs,
> because I think every practical subsystem should have bugs,
> the whole threat model focuses on the system design, and
> less code doesn't mean anything (buggy or even has system
> design flaw)
Yes, threats through malicious actors are much more interesting
here.
> - EROFS only accesses the (meta)data from the source blobs
> specified at mount time, even with multi-device support:
>
> mount -t erofs -odevice=[blob],device=[blob],... [source]
That is an important part that wasn't fully clear to me.