Message-ID: <73f2c243-e029-4f95-aa8e-285c7affacac@linux.alibaba.com>
Date: Mon, 19 Jan 2026 17:38:33 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Christoph Hellwig <hch@....de>
Cc: Hongbo Li <lihongbo22@...wei.com>, chao@...nel.org, brauner@...nel.org,
djwong@...nel.org, amir73il@...il.com, linux-fsdevel@...r.kernel.org,
linux-erofs@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v15 5/9] erofs: introduce the page cache share feature
On 2026/1/19 17:22, Christoph Hellwig wrote:
> On Mon, Jan 19, 2026 at 04:52:54PM +0800, Gao Xiang wrote:
>>> To me this sounds pretty scary, as we have code in the kernel's trust
>>> domain that heavily depends on arbitrary userspace policy decisions.
>>
>> For example, overlayfs metacopy can also point to
>> arbitrary files, what's the difference between them?
>> https://docs.kernel.org/filesystems/overlayfs.html#metadata-only-copy-up
>>
>> By using metacopy, overlayfs can access arbitrary files
>> as long as the metacopy has the pointer, so it should
>> be a privileged operation, which is similar to this feature.
>
> Sounds scary too. But overlayfs' job is to combine underlying files, so
> it is expected. I think it's the mix of erofs being a disk based file
But you could still point to an arbitrary page cache
if metacopy is used.
> system, and reaching out beyond the device(s) assigned to the file system
> instance that makes me feel rather uneasy.
You mean the page cache can be shared with other
filesystems even if they are not backed by these devices/files?

I admit that, yes, there is a difference: but that
is why the new "inode_share" and "domain_id" mount
options are used.

I think they should be regarded as a single super
filesystem if "domain_id" is the same: from the
security perspective, much like subvolumes of
a single super filesystem.

And mounting a new filesystem within a "domain_id"
can be regarded as importing data into the super
"domain_id" filesystem, and I think only trusted
data within the single domain can be mounted/shared.
>
>>>
>>> Similarly the sharing of blocks between different file system
>>> instances opens a lot of questions about trust boundaries and life
>>> time rules. I don't really have good answers, but writing up the
>>
>> Could you give more details about these? You
>> raised the questions, but I have no idea where the threats
>> really come from.
>
> Right now by default we don't allow any unprivileged mounts. Now
> if people think that, say, erofs is safe enough and opt into that,
> it needs to be clear what the boundaries of that are. For a file
> system limited to a single block device those boundaries are
> pretty clear. For file systems reaching out to the entire system
> (or some kind of domain), the scope is much wider.
Why would multiple devices make a difference for an immutable fs?
No filesystem instance can change the primary or
external devices/blobs. All data is immutable.
>
>> As for the lifetime: the blobs themselves are immutable files,
>> so what do the lifetime rules mean?
>
> What happens if the blob gets removed, intentionally or accidentally?
The extra device/blob reference is held during
the whole mount lifetime, much like the primary
(block) device.

And EROFS is an immutable filesystem, so the
inner blocks within the blob won't go away
because of some fs instance either.
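
As a rough userspace analogy only (this is not the EROFS code, and the
helper name read_after_unlink is made up for illustration): on Linux,
data behind an open file reference stays readable after the name has
been unlinked, and is only released when the last reference is dropped.
A mount holding its blob reference behaves similarly in spirit.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it while open, then read it back.

    Hypothetical helper for illustration; assumes Linux/POSIX unlink semantics.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                   # the directory entry is removed
        assert not os.path.exists(path)   # the name is really gone
        os.lseek(fd, 0, os.SEEK_SET)
        # The open descriptor still pins the inode, so the data is intact.
        return os.read(fd, len(payload))
    finally:
        os.close(fd)                      # dropping the last reference frees it


if __name__ == "__main__":
    print(read_after_unlink(b"immutable blob data"))
```

So removing the blob's name, intentionally or accidentally, does not
pull the data out from under whoever still holds the reference.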
>
>> And how do you define trust boundaries? You mean users
>> have no right to access the data?
>>
>> I think it's similar: for blockdevice-based filesystems,
>> you mount the filesystem with a given source, and the
>> mounter should have permission to access it.
>
> Yes.
>
>> For multiple-blob EROFS filesystems, you mount the
>> filesystem with multiple data sources, and the mounters
>> should have permission to access the blockdevices
>> and/or backing files too.
>
> And what prevents others from modifying them, or sneaking
> unexpected data including unexpected comparison blobs in?
I don't think it's different from filesystems with a single
device.

First, EROFS instances never modify any underlying
devices/blobs.

If you mean some other program modifying the device data, yes,
it can be changed externally, but I think that's just like
trusting FUSE daemons: an untrusted FUSE daemon can return
arbitrary (meta)data at random times too.
Thanks,
Gao Xiang