Message-ID: <34c08488-a288-45f9-a28f-a514a408541d@acm.org>
Date: Wed, 4 Oct 2023 10:22:23 -0700
From: Bart Van Assche <bvanassche@....org>
To: "Martin K. Petersen" <martin.petersen@...cle.com>
Cc: John Garry <john.g.garry@...cle.com>, axboe@...nel.dk,
kbusch@...nel.org, hch@....de, sagi@...mberg.me,
jejb@...ux.ibm.com, djwong@...nel.org, viro@...iv.linux.org.uk,
brauner@...nel.org, chandan.babu@...cle.com, dchinner@...hat.com,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-nvme@...ts.infradead.org, linux-xfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, tytso@....edu, jbongio@...gle.com,
linux-api@...r.kernel.org
Subject: Re: [PATCH 10/21] block: Add fops atomic write support
On 10/3/23 19:53, Martin K. Petersen wrote:
>
> Bart,
>
>> I'm still wondering whether we really should support storage
>> devices that report an ATOMIC TRANSFER LENGTH GRANULARITY that is
>> larger than the logical block size.
>
> We should. The common case is that the device reports an ATOMIC
> TRANSFER LENGTH GRANULARITY matching the reported physical block
> size. I.e. a logical block size of 512 bytes and a physical block
> size of 4KB. In that scenario a write of a single logical block would
> require read-modify-write of a physical block.
Block devices must internally serialize the read-modify-write operations
that occur when there are multiple logical blocks per physical block.
Otherwise there is no guarantee that a READ command returns the data
most recently written to the same LBA. I think we can ignore concurrent,
overlapping writes in this discussion since these can be considered
bugs in host software.

In other words, even in the above example, writes of a single logical
block (512 bytes) are guaranteed to be atomic, no matter what value
is reported as the ATOMIC TRANSFER LENGTH GRANULARITY.
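To illustrate the serialization argument, here is a toy model, not a
description of any real firmware. It assumes 512-byte logical blocks, a
4 KB physical block, and a per-physical-block lock; the names ToyDevice,
write_logical, read_logical and demo are all made up for this sketch.

```python
import threading

LOGICAL = 512    # logical block size in bytes (from the example above)
PHYSICAL = 4096  # physical block size in bytes (from the example above)


class ToyDevice:
    """Hypothetical device model: one lock per physical block."""

    def __init__(self, physical_blocks: int) -> None:
        self._media = [bytes(PHYSICAL) for _ in range(physical_blocks)]
        self._locks = [threading.Lock() for _ in range(physical_blocks)]

    def write_logical(self, lba: int, data: bytes) -> None:
        if len(data) != LOGICAL:
            raise ValueError("exactly one logical block expected")
        pba, offset = divmod(lba * LOGICAL, PHYSICAL)
        # The whole read-modify-write cycle runs under the lock, so two
        # writers that share a physical block cannot lose each other's data.
        with self._locks[pba]:
            block = bytearray(self._media[pba])      # read
            block[offset:offset + LOGICAL] = data    # modify
            self._media[pba] = bytes(block)          # write

    def read_logical(self, lba: int) -> bytes:
        pba, offset = divmod(lba * LOGICAL, PHYSICAL)
        with self._locks[pba]:
            return self._media[pba][offset:offset + LOGICAL]


def demo() -> list:
    """Write 8 distinct logical blocks of one physical block concurrently."""
    dev = ToyDevice(physical_blocks=1)
    threads = [
        threading.Thread(target=dev.write_logical,
                         args=(lba, bytes([lba + 1]) * LOGICAL))
        for lba in range(PHYSICAL // LOGICAL)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # First byte of each logical block after all writers have finished.
    return [dev.read_logical(lba)[0] for lba in range(PHYSICAL // LOGICAL)]
```

With the lock in place, demo() always observes every writer's data in
its own 512-byte slice. Without it, one writer could write back a stale
copy of the physical block and silently undo a neighbor's update.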
>> Is my understanding correct that the NVMe specification makes it
>> mandatory to support single logical block atomic writes since the
>> smallest value that can be reported as the AWUN parameter is one
>> logical block because this parameter is a 0's based value? Is my
>> understanding correct that SCSI devices that report an ATOMIC
>> TRANSFER LENGTH GRANULARITY that is larger than the logical block
>> size are not able to support the NVMe protocol?
>
> That's correct. There are obviously things you can express in SCSI
> that you can't in NVMe. And the other way around. Our intent is to
> support both protocols.
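To make the difference concrete, here is a small sketch of the
arithmetic. The function names are made up for illustration. The SCSI
helper assumes that atomic transfer lengths must be a multiple of the
reported granularity and that a value of 0 means no granularity
requirement is reported; that assumption should be checked against SBC.

```python
def awun_to_blocks(awun_field: int) -> int:
    """NVMe AWUN is a 0's based value: field value 0 means 1 logical block."""
    if not 0 <= awun_field <= 0xFFFF:
        raise ValueError("AWUN is a 16-bit field")
    return awun_field + 1


def awun_to_bytes(awun_field: int, logical_block_size: int) -> int:
    """Atomic write unit in bytes for a given logical block size."""
    return awun_to_blocks(awun_field) * logical_block_size


def single_block_atomic_write_allowed(granularity_blocks: int) -> bool:
    """Assumption: a SCSI atomic write length must be a multiple of the
    granularity, with 0 meaning no requirement is reported."""
    if granularity_blocks < 0:
        raise ValueError("granularity cannot be negative")
    return granularity_blocks in (0, 1)
```

For example, awun_to_bytes(0, 512) is 512, so the smallest AWUN an NVMe
device can report still covers one logical block, whereas
single_block_atomic_write_allowed(8) is False under the stated
assumption.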
How about aligning the features of the two protocols as much as
possible? My understanding is that all long-term T10 contributors are
in favor of this.
Thanks,
Bart.