Message-ID: <be558d13-6b41-48b7-9f5c-5da0f1ca1fce@linux.alibaba.com>
Date: Mon, 19 Jan 2026 16:12:28 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Christoph Hellwig <hch@....de>
Cc: Hongbo Li <lihongbo22@...wei.com>, chao@...nel.org, brauner@...nel.org,
 djwong@...nel.org, amir73il@...il.com, linux-fsdevel@...r.kernel.org,
 linux-erofs@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v15 5/9] erofs: introduce the page cache share feature



On 2026/1/19 15:53, Gao Xiang wrote:
> 
> 
> On 2026/1/19 15:29, Christoph Hellwig wrote:
>> On Sat, Jan 17, 2026 at 12:21:16AM +0800, Gao Xiang wrote:
>>> Hi Christoph,
>>>
>>> On 2026/1/16 23:46, Christoph Hellwig wrote:
>>>> I don't really understand the fingerprint idea.  Files with the
>>>> same content will point to the same physical disk blocks, so that
>>>> should be a much better indicator than a finger print?  Also how does
>>>
>>> Page cache sharing should apply to different EROFS
>>> filesystem images on the same machine too, so the
>>> physical disk block number idea cannot be applied
>>> to this.
>>
>> Oh.  That's kinda unexpected and adds another twist to the whole scheme.
>> So in that case the on-disk data actually is duplicated in each image
>> and then de-duplicated in memory only?  Ewwww...
> 
> On-disk deduplication is decoupled from this feature:

First of all:

  - Data within a single EROFS image is deduplicated, of
    course (for example, EROFS supports extent-based
    chunks);

> 
> - EROFS can share the same blocks in blobs (multiple
> devices) among different images, so that on-disk data

   This works like Docker layers: common data/layers
can be kept in separate blobs;

> can be shared by referring to the same blobs;

Both deduplication methods above are applied to the
golden images, which are transferred over the wire.

> 
> - For on-disk data that isn't deduplicated in the image:
> if reflink is enabled on the backing filesystem, userspace
> mounters can trigger background GCs to deduplicate
> identical blocks.

And this method is applied at runtime if the underlying
filesystem supports reflink.

> 
> I just tried to say that EROFS doesn't limit the
> real meaning of `fingerprint` (it can be a serialized
> integer defined by a specific image publisher, for
> example, or a specific secure hash.  Currently,
> "mkfs.erofs" generates a sha256 for each file), but
> leaves that to the image builders:
> 
> 
> 1) if `fingerprint` is distributed as on-disk part of
> signed images, as I said, it could be shared within a
> trusted domain_id (usually the same image builder) --
> that is the top priority thing using dmverity;
> 
> Or
> 
> 2) If `fingerprint` is not distributed in the image
> or images are untrusted (e.g. unknown signatures),
> image fetchers can scan each inode in the golden
> images to generate an extra minimal EROFS
> metadata-only image with locally calculated
> `fingerprint` too, which is quite similar to the
> current ostree way (parse remote files and calculate
> digests).
> 
> Thanks,
> Gao Xiang
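
To make the two cases above concrete, here is a rough userspace
sketch (not the kernel implementation, and not mkfs.erofs code).
It assumes the fingerprint is a SHA-256 of the file data and that
files are candidates for page cache sharing only when both their
domain_id and fingerprint match.  The names `fingerprint`,
`group_shareable` and the in-memory `images` layout are hypothetical
and exist only for this illustration.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """Assumed fingerprint: SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def group_shareable(images):
    """Group files that could share page cache.

    `images` maps an image name to (domain_id, {path: file_bytes}).
    Files are grouped by (domain_id, fingerprint); only groups with
    more than one member are returned, since a single file has
    nothing to share with.
    """
    groups = defaultdict(list)
    for image_name, (domain_id, files) in images.items():
        for path, data in files.items():
            key = (domain_id, fingerprint(data))
            groups[key].append((image_name, path))
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    images = {
        "a.erofs": ("builder-1", {"/lib/libc.so": b"libc", "/etc/a": b"a"}),
        "b.erofs": ("builder-1", {"/usr/lib/libc.so": b"libc"}),
        # Same content, but a different trust domain: kept separate.
        "c.erofs": ("builder-2", {"/lib/libc.so": b"libc"}),
    }
    for (domain_id, fp), members in group_shareable(images).items():
        print(domain_id, fp[:12], members)
```

In this sketch the identical file in "c.erofs" is not grouped with
the other two because its domain_id differs, which mirrors the
trusted-domain restriction in case 1).  Case 2) corresponds to
running `fingerprint` locally over every file of a fetched image.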

