[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2ddb92d2-97e8-4eb3-9c76-8c5438bb2a44@oracle.com>
Date: Mon, 17 Jun 2024 19:04:23 +0100
From: John Garry <john.g.garry@...cle.com>
To: Kanchan Joshi <joshi.k@...sung.com>, axboe@...nel.dk, kbusch@...nel.org,
hch@....de, sagi@...mberg.me, jejb@...ux.ibm.com,
martin.petersen@...cle.com, viro@...iv.linux.org.uk,
brauner@...nel.org, dchinner@...hat.com, jack@...e.cz
Cc: djwong@...nel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
linux-fsdevel@...r.kernel.org, tytso@....edu, jbongio@...gle.com,
linux-scsi@...r.kernel.org, ojaswin@...ux.ibm.com, linux-aio@...ck.org,
linux-btrfs@...r.kernel.org, io-uring@...r.kernel.org,
nilay@...ux.ibm.com, ritesh.list@...il.com, willy@...radead.org,
agk@...hat.com, snitzer@...nel.org, mpatocka@...hat.com,
dm-devel@...ts.linux.dev, hare@...e.de,
Alan Adamson <alan.adamson@...cle.com>
Subject: Re: [PATCH v8 10/10] nvme: Atomic write support
On 17/06/2024 18:24, Kanchan Joshi wrote:
> On 6/10/2024 4:13 PM, John Garry wrote:
>> +static bool nvme_valid_atomic_write(struct request *req)
>> +{
>> + struct request_queue *q = req->q;
>> + u32 boundary_bytes = queue_atomic_write_boundary_bytes(q);
>> +
>> + if (blk_rq_bytes(req) > queue_atomic_write_unit_max_bytes(q))
>> + return false;
>> +
>> + if (boundary_bytes) {
>> + u64 mask = boundary_bytes - 1, imask = ~mask;
>> + u64 start = blk_rq_pos(req) << SECTOR_SHIFT;
>> + u64 end = start + blk_rq_bytes(req) - 1;
>> +
>> + /* If greater then must be crossing a boundary */
>> + if (blk_rq_bytes(req) > boundary_bytes)
>> + return false;
>
> Nit: I'd cache blk_rq_bytes(req), since that is repeating and this
> function is called for each atomic IO.
blk_rq_bytes() is just a wrapper for rq->__data_len. I suppose that we
could cache that value to stop re-reading that memory, but I would
hope/expect that memory to be in the CPU cache anyway.
>
>> +
>> + if ((start & imask) != (end & imask))
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
>> struct request *req, struct nvme_command *cmnd,
>> enum nvme_opcode op)
>> @@ -941,6 +965,12 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
>>
>> if (req->cmd_flags & REQ_RAHEAD)
>> dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
>> + /*
>> + * Ensure that nothing has been sent which cannot be executed
>> + * atomically.
>> + */
>> + if (req->cmd_flags & REQ_ATOMIC && !nvme_valid_atomic_write(req))
>> + return BLK_STS_INVAL;
>>
>
> Is this validity check specific to NVMe or should this be moved up to
> block layer as it also knows the limits?
Only NVMe supports an LBA space boundary, so that part is specific to NVMe.
Regardless, the block layer already should ensure that the atomic write
length and boundary is respected. nvme_valid_atomic_write() is just an
insurance policy against the block layer or some other component not
doing its job.
For SCSI, the device would error - for example - if the atomic write
length was larger than the device supported. NVMe silently just does not
execute the write atomically in that scenario, which we must avoid.
Thanks,
John
Powered by blists - more mailing lists