lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2ddb92d2-97e8-4eb3-9c76-8c5438bb2a44@oracle.com>
Date: Mon, 17 Jun 2024 19:04:23 +0100
From: John Garry <john.g.garry@...cle.com>
To: Kanchan Joshi <joshi.k@...sung.com>, axboe@...nel.dk, kbusch@...nel.org,
        hch@....de, sagi@...mberg.me, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, viro@...iv.linux.org.uk,
        brauner@...nel.org, dchinner@...hat.com, jack@...e.cz
Cc: djwong@...nel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
        linux-fsdevel@...r.kernel.org, tytso@....edu, jbongio@...gle.com,
        linux-scsi@...r.kernel.org, ojaswin@...ux.ibm.com, linux-aio@...ck.org,
        linux-btrfs@...r.kernel.org, io-uring@...r.kernel.org,
        nilay@...ux.ibm.com, ritesh.list@...il.com, willy@...radead.org,
        agk@...hat.com, snitzer@...nel.org, mpatocka@...hat.com,
        dm-devel@...ts.linux.dev, hare@...e.de,
        Alan Adamson <alan.adamson@...cle.com>
Subject: Re: [PATCH v8 10/10] nvme: Atomic write support

On 17/06/2024 18:24, Kanchan Joshi wrote:
> On 6/10/2024 4:13 PM, John Garry wrote:
>> +static bool nvme_valid_atomic_write(struct request *req)
>> +{
>> +	struct request_queue *q = req->q;
>> +	u32 boundary_bytes = queue_atomic_write_boundary_bytes(q);
>> +
>> +	if (blk_rq_bytes(req) > queue_atomic_write_unit_max_bytes(q))
>> +		return false;
>> +
>> +	if (boundary_bytes) {
>> +		u64 mask = boundary_bytes - 1, imask = ~mask;
>> +		u64 start = blk_rq_pos(req) << SECTOR_SHIFT;
>> +		u64 end = start + blk_rq_bytes(req) - 1;
>> +
>> +		/* If greater then must be crossing a boundary */
>> +		if (blk_rq_bytes(req) > boundary_bytes)
>> +			return false;
> 
> Nit: I'd cache blk_rq_bytes(req), since that is repeating and this
> function is called for each atomic IO.

blk_rq_bytes() is just a wrapper for rq->__data_len. I suppose that we 
could cache that value to stop re-reading that memory, but I would 
hope/expect that memory to be in the CPU cache anyway.

> 
>> +
>> +		if ((start & imask) != (end & imask))
>> +			return false;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>>    static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
>>    		struct request *req, struct nvme_command *cmnd,
>>    		enum nvme_opcode op)
>> @@ -941,6 +965,12 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
>>    
>>    	if (req->cmd_flags & REQ_RAHEAD)
>>    		dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
>> +	/*
>> +	 * Ensure that nothing has been sent which cannot be executed
>> +	 * atomically.
>> +	 */
>> +	if (req->cmd_flags & REQ_ATOMIC && !nvme_valid_atomic_write(req))
>> +		return BLK_STS_INVAL;
>>    
> 
> Is this validity check specific to NVMe or should this be moved up to
> block layer as it also knows the limits?

Only NVMe supports an LBA space boundary, so that part is specific to NVMe.

Regardless, the block layer already should ensure that the atomic write 
length and boundary is respected. nvme_valid_atomic_write() is just an 
insurance policy against the block layer or some other component not 
doing its job.

For SCSI, the device would error - for example - if the atomic write 
length was larger than the device supported. NVMe silently just does not 
execute the write atomically in that scenario, which we must avoid.

Thanks,
John


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ