Message-ID: <381c349d-2eb7-419f-a2f8-a41ca6a9e9f0@linux.alibaba.com>
Date: Fri, 11 Oct 2024 11:28:42 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Dave Chinner <david@...morbit.com>
Cc: "Darrick J. Wong" <djwong@...nel.org>,
Matthew Wilcox <willy@...radead.org>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, Goldwyn Rodrigues <rgoldwyn@...e.de>
Subject: Re: [PATCH 06/12] iomap: Introduce read_inline() function hook
Hi Dave,
On 2024/10/11 08:43, Dave Chinner wrote:
> On Thu, Oct 10, 2024 at 02:10:25PM -0400, Goldwyn Rodrigues wrote:
...
>
> .... there is specific ordering needed.
>
> For writes, the ordering is:
>
> 1. pre-write data compression - requires data copy
> 2. pre-write data encryption - requires data copy
> 3. pre-write data checksums - data read only
> 4. write the data
> 5. post-write metadata updates
>
> We cannot usefully perform compression after encryption -
> random data doesn't compress - and the checksum must match what is
> written to disk, so it has to come after all other transformations
> have been done.
>
> For reads, the order is:
>
> 1. read the data
> 2. verify the data checksum
> 3. decrypt the data - requires data copy
> 4. decompress the data - requires data copy
> 5. place the plain text data in the page cache
Just some random notes for reference: currently fsverity builds the
Merkle tree over the plain text, but from the perspective of on-disk
data security/authentication and integrity, the order you mentioned
sounds saner to me, or:
1. read the data
2. pre-verify the encoded data checksum (optional,
   in a dm-verity-like way)
3. decrypt the data - requires data copy
4. decompress the data - requires data copy
5. post-verify the decoded (plain) checksum (optional)
6. place the plain text data in the page cache
Steps 2 and 5 may apply to different use cases, though.
>
...
>
> Compression is where using xattrs gets interesting - the xattrs can
> have a fixed "offset" they belong to, but can store variable sized
> data records for that offset.
>
> If we say we have a 64kB compression block size, we can store the
> compressed data for a 64k block entirely in a remote xattr even if
> compression fails (i.e. we can store the raw data, not the expanded
> "compressed" data). The remote xattr can store any amount of smaller
> data, and we map the compressed data directly into the page cache at
> a high offset. Then decompression can run on the high offset pages
> with the destination being some other page cache offset....
but the compressed data itself can also be referenced multiple times
(reflink-like), so EROFS currently uses a separate pseudo inode when
it decides to index compressed data by physical addresses.
>
> On the write side, compression can be done directly into the high
> offset page cache range for that 64kb offset range, then we can
> map that to a remote xattr block and write the xattr. The xattr
> naturally handles variable size blocks.
Also, unlike plain text, each compressing filesystem may keep its own
encoded data form (e.g. filesystems could add headers or trailers to
the on-disk compressed data, or add more information to the extent
metadata) for its own needs. So unlike the current uniform plain-text
path, a single common encoder/decoder may not be flexible enough.
Thanks,
Gao Xiang