lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 15 Dec 2023 13:55:22 +0000
From: John Garry <john.g.garry@...cle.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: axboe@...nel.dk, kbusch@...nel.org, hch@....de, sagi@...mberg.me,
        jejb@...ux.ibm.com, martin.petersen@...cle.com, djwong@...nel.org,
        viro@...iv.linux.org.uk, brauner@...nel.org, dchinner@...hat.com,
        jack@...e.cz, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
        linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        tytso@....edu, jbongio@...gle.com, linux-scsi@...r.kernel.org,
        bvanassche@....org
Subject: Re: [PATCH v2 08/16] block: Limit atomic write IO size according to
 atomic_write_max_sectors

On 15/12/2023 02:27, Ming Lei wrote:
> On Tue, Dec 12, 2023 at 11:08:36AM +0000, John Garry wrote:
>> Currently an IO size is limited to the request_queue limits max_sectors.
>> Limit the size for an atomic write to queue limit atomic_write_max_sectors
>> value.
>>
>> Signed-off-by: John Garry <john.g.garry@...cle.com>
>> ---
>>   block/blk-merge.c | 12 +++++++++++-
>>   block/blk.h       |  3 +++
>>   2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-merge.c b/block/blk-merge.c
>> index 0ccc251e22ff..8d4de9253fe9 100644
>> --- a/block/blk-merge.c
>> +++ b/block/blk-merge.c
>> @@ -171,7 +171,17 @@ static inline unsigned get_max_io_size(struct bio *bio,
>>   {
>>   	unsigned pbs = lim->physical_block_size >> SECTOR_SHIFT;
>>   	unsigned lbs = lim->logical_block_size >> SECTOR_SHIFT;
>> -	unsigned max_sectors = lim->max_sectors, start, end;
>> +	unsigned max_sectors, start, end;
>> +
>> +	/*
>> +	 * We ignore lim->max_sectors for atomic writes simply because
>> +	 * it may less than bio->write_atomic_unit, which we cannot
>> +	 * tolerate.
>> +	 */
>> +	if (bio->bi_opf & REQ_ATOMIC)
>> +		max_sectors = lim->atomic_write_max_sectors;
>> +	else
>> +		max_sectors = lim->max_sectors;

Please note that I mentioned this issue in the cover letter, so you can 
see some discussion there (if missed).

> 
> I can understand the trouble for write atomic from bio split, which
> may simply split in the max_sectors boundary, however this change is
> still too fragile:
> 
> 1) ->max_sectors may be set from userspace
> - so this change simply override userspace setting

yes

> 
> 2) otherwise ->max_sectors is same with ->max_hw_sectors:
> 
> - then something must be wrong in device side or driver side because
> ->write_atomic_unit conflicts with ->max_hw_sectors, which is supposed
> to be figured out before device is setup
> 

Right, so I think that it is proper to limit atomic_write_unit_max et al 
to max_hw_sectors or whatever DMA engine device limits. I can make that 
change.

> 3) too big max_sectors may break driver or device, such as nvme-pci
> aligns max_hw_sectors with DMA optimized mapping size
> 
> And there might be more(better) choices:
> 
> 1) make sure atomic write limit is respected when userspace updates
> ->max_sectors

My mind has been changed and I would say no and we can treat 
atomic_write_unit_max as special. Indeed, max_sectors and 
atomic_write_unit_max are complementary. If someone sets max_sectors to 
4KB and then tries an atomic write of 16KB then they don't know what 
they are doing.

My original idea was to dynamically limit atomic_unit_unit_max et al to 
max_sectors (so that max_sectors is always respected for atomic writes).

As an alternative, how about we keep the value of atomic_unit_unit_max 
static, but reject an atomic write if it exceeds max_sectors? It's not 
too different than dynamically limiting atomic_unit_unit_max. But as 
mentioned, it may be asking for trouble....

> 
> 2) when driver finds that atomic write limits conflict with other
> existed hardware limits, fail or solve(such as reduce write atomic unit) the
> conflict before queue is started; With single write atomic limits update API,
> the conflict can be figured out earlier by block layer too.

I think that we can do this, but I am not sure what other queue limits 
may conflict (apart from max_sectors / max_sectors_hw). There is max 
segment size, but we just rely on a single PAGE per iovec to evaluate 
atomic_unit_unit_max, so should not be an issue.

Thanks,
John

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ