lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8edcdef6-8749-aa45-e7d2-ada677645d76@huaweicloud.com>
Date: Wed, 30 Jul 2025 10:03:50 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Jan Kara <jack@...e.cz>, Yu Kuai <yukuai1@...weicloud.com>
Cc: axboe@...nel.dk, akpm@...ux-foundation.org, yang.yang@...o.com,
 dlemoal@...nel.org, ming.lei@...hat.com, linux-block@...r.kernel.org,
 linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
 johnny.chenyi@...wei.com, Omar Sandoval <osandov@...com>,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v2 1/2] lib/sbitmap: convert shallow_depth from one word
 to the whole sbitmap

Hi,

在 2025/07/29 18:16, Jan Kara 写道:
> On Tue 29-07-25 11:19:05, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Currently elevators will record internal 'async_depth' to throttle
>> asynchronous requests, and they both calculate shallow_dpeth based on
>> sb->shift, with the respect that sb->shift is the available tags in one
>> word.
>>
>> However, sb->shift is not the availbale tags in the last word, see
>> __map_depth:
>>
>> if (index == sb->map_nr - 1)
>>    return sb->depth - (index << sb->shift);
>>
>> For consequence, if the last word is used, more tags can be get than
>> expected, for example, assume nr_requests=256 and there are four words,
>> in the worst case if user set nr_requests=32, then the first word is
>> the last word, and still use bits per word, which is 64, to calculate
>> async_depth is wrong.
>>
>> One the other hand, due to cgroup qos, bfq can allow only one request
>> to be allocated, and set shallow_dpeth=1 will still allow the number
>> of words request to be allocated.
>>
>> Fix this problems by using shallow_depth to the whole sbitmap instead
>> of per word, also change kyber, mq-deadline and bfq to follow this.
>>
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
> 
> I agree with these problems but AFAIU this implementation of shallow depth
> has been done for a reason. Omar can chime in here as the original author
> or perhaps Jens but the idea of current shallow depth implementation is
> that each sbitmap user regardless of used shallow depth has a chance to
> allocate from each sbitmap word which evenly distributes pressure among
> available sbitmap words. With the implementation you've chosen there will
> be higher pressure (and thus contention) on words with low indices.

Yes, this make sense. However, consider that shallow depth is only used
by elevator, this higher pressure should be negligible for deadline and
bfq. As for kyber, this might be a problem.
> 
> So I think we would be good to fix issues with shallow depth for small
> number of sbitmap words (because that's where these buggy cornercases may
> matter in practice) but I believe the logic which constrains number of used
> bits from each *word* when shallow_depth is specified should be kept.  It
> might make sense to change the API so that shallow_depth is indeed
> specified compared to the total size of the bitmap, not to the size of the
> word (because that's confusing practically everybody I've met and is a
> constant source of bugs) if it can be made to perform well.

Do you think will it be ok to add a new shallow depth API to use the
total size, and convert bfq and deadline to use it?

Thanks,
Kuai

> 
> 								Honza


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ