Message-ID: <20251030174015.GC1624@sol>
Date: Thu, 30 Oct 2025 10:40:15 -0700
From: Eric Biggers <ebiggers@...nel.org>
To: Christoph Hellwig <hch@....de>
Cc: Carlos Llamas <cmllamas@...gle.com>, Keith Busch <kbusch@...nel.org>,
Keith Busch <kbusch@...a.com>, linux-block@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-xfs@...r.kernel.org,
linux-ext4@...r.kernel.org, axboe@...nel.dk,
Hannes Reinecke <hare@...e.de>,
"Martin K. Petersen" <martin.petersen@...cle.com>
Subject: Re: [PATCHv4 5/8] iomap: simplify direct io validity check
On Wed, Oct 29, 2025 at 08:06:18AM +0100, Christoph Hellwig wrote:
> I think we need to take a step back and talk about what alignment
> we're talking about here, as there are two dimensions to it.
>
> The first dimension is: disk alignment vs memory alignment.
>
> Disk alignment:
> Direct I/O obviously needs to be aligned to on-disk sectors to have
> a chance to work, as that is the lowest possible granularity of access.
>
> For file systems that write out of place we also need to align writes
> to the logical block size of the file system.
>
> With blk-crypto we need to align to the DUN if it is larger than the
> disk-sector size.
>
> Memory alignment:
>
> This is the alignment of the buffer in-memory. Hardware only really
> cares about this when DMA engines discard the lowest bits, so a typical
> hardware alignment requirement is to only require a dword (4 byte)
> alignment. For drivers that process the payload in software, such
> low alignment tends to cause bugs, as they're not written with it
> in mind. Similarly for any additional processing like
> encryption, parity or checksums.
>
> The second dimension is the entire operation vs individual vectors;
> this has implications both for the disk and memory alignment. Keith
> has done work there recently to relax the alignment of the vectors to
> only require the memory alignment, so that preadv/pwritev-like calls
> can have lots of unaligned segments.
>
> I think it's the latter that's tripping up here now. Hard coding these
> checks in the file systems seems like a bad idea; we really need to
> advertise them in the queue limits, which is complicated by the fact that
> we only want to do that for bios using block layer encryption. i.e., we
> probably need a separate queue limit that mirrors dma_alignment, but only
> for encrypted bios, and which is taken into account in the block layer
> splitting and communicated up by file systems only for encrypted bios.
> For blk-crypto-fallback we'd need DUN alignment so that the algorithms
> just work (assuming the crypto API can't scatter over misaligned
> segments), but for hardware blk-crypto I suspect that the normal DMA
> engine rules apply, and we don't need to restrict alignment.
Allowing DIO segments to be aligned (in memory address and/or length) to
less than crypto_data_unit_size on encrypted files has been attempted
and discussed before. Read the cover letter of
https://lore.kernel.org/linux-fscrypt/20220128233940.79464-1-ebiggers@kernel.org/

We eventually decided to proceed with DIO support without it, since it
would have added a lot of complexity. It would have made the bio
splitting code in the block layer split bios at boundaries where the
length isn't aligned to crypto_data_unit_size, it would have caused a
lot of trouble for blk-crypto-fallback, and it even would have been
incompatible with some of the hardware drivers (e.g. ufs-exynos.c).

It also didn't seem to be all that useful, and it would have introduced
edge cases that don't get tested much. All reachable to unprivileged
userspace code too, of course.

I can't say that the idea seems all that great to me.

We can always reconsider and still add support for this. But it's not
clear to me what's changed.

- Eric
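
For illustration only, the per-segment restriction discussed above (every
DIO segment aligned, in both memory address and length, to
crypto_data_unit_size) can be sketched in userspace C roughly as follows.
The function name dio_segments_aligned() is hypothetical and is not a kernel
API; the sketch assumes the data unit size is a power of two and ignores the
separate file-offset alignment requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper, not a kernel API: returns true only if every
 * segment's start address and length are multiples of data_unit_size.
 */
static bool dio_segments_aligned(const struct iovec *iov, int iovcnt,
				 size_t data_unit_size)
{
	/* Assumption: the data unit size is a nonzero power of two. */
	assert(data_unit_size != 0 &&
	       (data_unit_size & (data_unit_size - 1)) == 0);

	uintptr_t mask = data_unit_size - 1;

	for (int i = 0; i < iovcnt; i++) {
		/* Any low bit set in the address or the length means misaligned. */
		if (((uintptr_t)iov[i].iov_base | iov[i].iov_len) & mask)
			return false;
	}
	return true;
}
```

Relaxing this to a smaller memory alignment would mean a single data unit
could span several segments, which is where the bio splitting and
blk-crypto-fallback complications mentioned above come from.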