[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b7a6f380-c6fa-45e0-b727-ba804c6684e4@acm.org>
Date: Mon, 2 Oct 2023 12:12:49 -0700
From: Bart Van Assche <bvanassche@....org>
To: John Garry <john.g.garry@...cle.com>, axboe@...nel.dk,
kbusch@...nel.org, hch@....de, sagi@...mberg.me,
jejb@...ux.ibm.com, martin.petersen@...cle.com, djwong@...nel.org,
viro@...iv.linux.org.uk, brauner@...nel.org,
chandan.babu@...cle.com, dchinner@...hat.com
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-nvme@...ts.infradead.org, linux-xfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, tytso@....edu, jbongio@...gle.com,
linux-api@...r.kernel.org
Subject: Re: [PATCH 10/21] block: Add fops atomic write support
On 10/2/23 03:10, John Garry wrote:
> On 29/09/2023 18:51, Bart Van Assche wrote:
>> On 9/29/23 03:27, John Garry wrote:
> > + if (pos % atomic_write_unit_min_bytes)
> > + return false;
>
> See later rules.
Is atomic_write_unit_min_bytes always equal to the logical block size?
If so, can the above test be left out?
> > + if (iov_iter_count(iter) % atomic_write_unit_min_bytes)
> > + return false;
>
> For SCSI, there is an atomic write granularity, which dictates
> atomic_write_unit_min_bytes. So here we need to ensure that the length
> is a multiple of this value.
Are there any SCSI devices that we care about that report an ATOMIC
TRANSFER LENGTH GRANULARITY that is larger than a single logical block?
I'm wondering whether we really have to support such devices.
> > + if (!is_power_of_2(iov_iter_count(iter)))
> > + return false;
>
> This rule comes from FS block alignment and NVMe atomic boundary.
>
> FSes (XFS) have discontiguous extents. We need to ensure that an atomic
> write does not cross discontiguous extents. To do this we ensure extent
> length and alignment and limit atomic_write_unit_max_bytes to that.
>
> For NVMe, an atomic write boundary is a boundary in LBA space which an
> atomic write should not cross. We limit atomic_write_unit_max_bytes such
> that it is evenly divisible into this atomic write boundary.
>
> To ensure that the write does not cross these alignment boundaries we
> say that it must be naturally aligned and a power-of-2 in length.
>
> We may be able to relax this rule but I am not sure it buys us anything
> - typically we want to be writing a 64KB block aligned to 64KB, for
> example.
It seems to me that the requirement is_power_of_2(iov_iter_count(iter))
is necessary for some filesystems but not for all filesystems.
Restrictions that are specific to a single filesystem (XFS) should not
occur in code that is intended to be used by all filesystems
(blkdev_atomic_write_valid()).
Thanks,
Bart.
Powered by blists - more mailing lists