Message-Id: <20250904063209.12586-1-xue01.he@samsung.com>
Date: Thu, 4 Sep 2025 06:32:09 +0000
From: Xue He <xue01.he@...sung.com>
To: yukuai1@...weicloud.com, axboe@...nel.dk
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
yukuai3@...wei.com
Subject: Re: [PATCH] block: plug attempts to batch allocate tags multiple
times
On 2025/09/03/18:35PM, Yu Kuai wrote:
>On 2025/09/03 16:41 PM, Xue He wrote:
>> On 2025/09/02 08:47 AM, Yu Kuai wrote:
>>> On 2025/09/01 16:22 PM, Xue He wrote:
>> ......
>>
>> The information about my nvme setup is as follows:
>> number of CPU: 16
>> memory: 16G
>> nvme nvme0: 16/0/16 default/read/poll queue
>> cat /sys/class/nvme/nvme0/nvme0n1/queue/nr_requests
>> 1023
>>
>> In more precise terms, I think it is not that the tags are fully exhausted,
>> but rather that after scanning the bitmap for free bits, the remaining
>> contiguous bits are insufficient to meet the requirement (some, but not enough).
>> The specific function involved is __sbitmap_queue_get_batch() in lib/sbitmap.c:
>>	get_mask = ((1UL << nr_tags) - 1) << nr;
>>	if (nr_tags > 1) {
>>		printk("before %ld\n", get_mask);
>>	}
>>	while (!atomic_long_try_cmpxchg(ptr, &val,
>>					get_mask | val))
>>		;
>>	get_mask = (get_mask & ~val) >> nr;
>>
>> Here, during the batch acquisition of contiguous free bits, the mask is
>> applied with an atomic operation, so the tag_mask actually obtained can
>> differ from the one originally requested.
>
>Yes, so this function is likely to obtain fewer tags than nr_tags: the
>mask always starts from the first zero bit and spans nr_tags bits, and
>sbitmap_deferred_clear() is called unconditionally, so it is likely that
>there are non-zero bits within this range.
>
>Just wondering, have you considered fixing this directly in
>__blk_mq_alloc_requests_batch()?
>
> - call sbitmap_deferred_clear() and retry on allocation failure, so
>that the whole word can be used even if previously allocated requests are
>done, especially for nvme with huge tag depths;
> - retry blk_mq_get_tags() until data->nr_tags is zero;
>
I haven't tried this yet, as I'm concerned that if it spins here, it might
introduce more latency. Anyway, I may try to implement this idea and run some
tests to observe the results.
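
For reference, a rough user-space sketch of the two points above: the
shortfall caused by the mask arithmetic, and a retry that is capped so it
cannot spin. This is only a model of one bitmap word, not kernel code.
batch_get(), batch_get_retry() and max_tries are names made up for
illustration, and a plain store stands in for the atomic cmpxchg loop.

```c
#include <limits.h>

#define WORD_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Index of the first zero bit in v, or -1 if every bit is set. */
static int first_zero_bit(unsigned long v)
{
	for (int i = 0; i < WORD_BITS; i++)
		if (!(v & (1UL << i)))
			return i;
	return -1;
}

/* Number of set bits in v. */
static int count_bits(unsigned long v)
{
	int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Model of the quoted mask arithmetic: the mask starts at the first
 * zero bit and spans nr_tags bits, so bits already set inside that
 * span are not handed out. Returns the number of tags obtained and
 * stores the newly claimed bits (unshifted) in *got.
 */
static int batch_get(unsigned long *word, int nr_tags, unsigned long *got)
{
	unsigned long val = *word, get_mask;
	int nr = first_zero_bit(val);

	*got = 0;
	/* Assumption of this model: give up if the span does not fit. */
	if (nr < 0 || nr_tags <= 0 || nr_tags >= WORD_BITS ||
	    nr + nr_tags > WORD_BITS)
		return 0;

	get_mask = ((1UL << nr_tags) - 1) << nr;
	*word = val | get_mask;		/* stands in for the cmpxchg loop */
	*got = get_mask & ~val;		/* only bits that were free before */
	return count_bits(*got);
}

/*
 * Hypothetical bounded retry: keep asking for the remaining tags, but
 * stop after max_tries attempts or when an attempt yields nothing.
 */
static int batch_get_retry(unsigned long *word, int nr_tags, int max_tries,
			   unsigned long *got_all)
{
	int total = 0;

	*got_all = 0;
	for (int i = 0; i < max_tries && total < nr_tags; i++) {
		unsigned long got;
		int n = batch_get(word, nr_tags - total, &got);

		if (!n)
			break;
		*got_all |= got;
		total += n;
	}
	return total;
}
```

With a word of 0x34 (bits 2, 4 and 5 busy) and nr_tags = 4, batch_get()
covers bits 0-3 and hands back only bits 0, 1 and 3. A second attempt
through batch_get_retry() picks up bit 6 to reach the full four. Whether a
cap like max_tries keeps the latency acceptable in the real path is exactly
what the tests would need to show.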
Thanks.