linux-kernel - Re: [PATCH v7 4/9] block: Add core atomic write support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ee20a47d-3131-41c2-a2fc-39017f535527@suse.de>
Date: Mon, 3 Jun 2024 14:31:04 +0200
From: Hannes Reinecke <hare@...e.de>
To: John Garry <john.g.garry@...cle.com>, axboe@...nel.dk, kbusch@...nel.org,
 hch@....de, sagi@...mberg.me, jejb@...ux.ibm.com,
 martin.petersen@...cle.com, djwong@...nel.org, viro@...iv.linux.org.uk,
 brauner@...nel.org, dchinner@...hat.com, jack@...e.cz
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-nvme@...ts.infradead.org, linux-fsdevel@...r.kernel.org,
 tytso@....edu, jbongio@...gle.com, linux-scsi@...r.kernel.org,
 ojaswin@...ux.ibm.com, linux-aio@...ck.org, linux-btrfs@...r.kernel.org,
 io-uring@...r.kernel.org, nilay@...ux.ibm.com, ritesh.list@...il.com,
 willy@...radead.org, Himanshu Madhani <himanshu.madhani@...cle.com>
Subject: Re: [PATCH v7 4/9] block: Add core atomic write support

On 6/3/24 13:38, John Garry wrote:
> On 03/06/2024 10:26, Hannes Reinecke wrote:
>>>
>>> +static bool rq_straddles_atomic_write_boundary(struct request *rq,
>>> +                    unsigned int front_adjust,
>>> +                    unsigned int back_adjust)
>>> +{
>>> +    unsigned int boundary = queue_atomic_write_boundary_bytes(rq->q);
>>> +    u64 mask, start_rq_pos, end_rq_pos;
>>> +
>>> +    if (!boundary)
>>> +        return false;
>>> +
>>> +    start_rq_pos = blk_rq_pos(rq) << SECTOR_SHIFT;
>>> +    end_rq_pos = start_rq_pos + blk_rq_bytes(rq) - 1;
>>> +
>>> +    start_rq_pos -= front_adjust;
>>> +    end_rq_pos += back_adjust;
>>> +
>>> +    mask = ~(boundary - 1);
>>> +
>>> +    /* Top bits are different, so crossed a boundary */
>>> +    if ((start_rq_pos & mask) != (end_rq_pos & mask))
>>> +        return true;
>>> +
>>> +    return false;
>>> +}
>>
>> But isn't that precisely what 'chunk_sectors' is doing?
>> IE ensuring that requests never cross that boundary?
>>
> 
>> Q1: Shouldn't we rather use/modify/adapt chunk_sectors for this thing?
> 
> So you are saying that we can re-use blk_chunk_sectors_left() to 
> determine whether merging a bio/req would cross the boundary, right?
> 
> It seems ok in principle - we would just need to ensure that it is 
> watertight.
> 

We currently use chunk_sectors for quite some different things, most 
notably zones boundaries, NIOIB, raid stripes etc.
So I don't have an issue adding another use-case for it.

>> Q2: If we don't, shouldn't we align the atomic write boundary to the 
>> chunk_sectors setting to ensure both match up?
> 
> Yeah, right. But we can only handle what HW tells.
> 
> The atomic write boundary is only relevant to NVMe. NVMe NOIOB - which 
> we use to set chunk_sectors - is an IO optimization hint, AFAIK. However 
> the atomic write boundary is a hard limit. So if NOIOB is not aligned 
> with the atomic write boundary - which seems unlikely - then the atomic 
> write boundary takes priority.
> 
Which is what I said; we need to check. And I would treat a NOIOB value 
not aligned to the atomic write boundary as an error.

But the real issue here is that the atomic write boundary only matters
for requests, and not for the entire queue.
So using chunk_sectors is out of question as this would affect all 
requests, and my comment was actually wrong.
I'll retract it.

Cheers,

Hannes