[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <81214347-3806-4F54-B60F-3E5A1A5EC84D@unimore.it>
Date: Tue, 31 May 2022 11:19:15 +0200
From: Paolo Valente <paolo.valente@...more.it>
To: Yu Kuai <yukuai3@...wei.com>
Cc: Jan Kara <jack@...e.cz>, Jens Axboe <axboe@...nel.dk>,
Tejun Heo <tj@...nel.org>, cgroups@...r.kernel.org,
linux-block <linux-block@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, yi.zhang@...wei.com
Subject: Re: [PATCH -next v7 2/3] block, bfq: refactor the counting of
'num_groups_with_pending_reqs'
> Il giorno 31 mag 2022, alle ore 11:06, Yu Kuai <yukuai3@...wei.com> ha scritto:
>
> 在 2022/05/31 16:36, Paolo VALENTE 写道:
>>> Il giorno 30 mag 2022, alle ore 10:40, Yu Kuai <yukuai3@...wei.com> ha scritto:
>>>
>>> 在 2022/05/30 16:34, Yu Kuai 写道:
>>>> 在 2022/05/30 16:10, Paolo Valente 写道:
>>>>>
>>>>>
>>>>>> Il giorno 28 mag 2022, alle ore 11:50, Yu Kuai <yukuai3@...wei.com> ha scritto:
>>>>>>
>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>> are not issued from root group. This is because
>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>> bfq_asymmetric_scenario().
>>>>>>
>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>
>>>>>> Before this patch:
>>>>>> 1) root group will never be counted.
>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>>>>>
>>>>>> After this patch:
>>>>>> 1) root group is counted.
>>>>>> 2) Count if bfqg have at least one bfqq that is marked busy.
>>>>>> 3) Don't count if bfqg doesn't have any busy bfqqs.
>>>>>
>>>>> Unfortunately, I see a last problem here. I see a double change:
>>>>> (1) a bfqg is now counted only as a function of the state of its child
>>>>> queues, and not of also its child bfqgs
>>>>> (2) the state considered for counting a bfqg moves from having pending
>>>>> requests to having busy queues
>>>>>
>>>>> I'm ok with with (1), which is a good catch (you are lady explained
>>>>> the idea to me some time ago IIRC).
>>>>>
>>>>> Yet I fear that (2) is not ok. A bfqq can become non busy even if it
>>>>> still has in-flight I/O, i.e. I/O being served in the drive. The
>>>>> weight of such a bfqq must still be considered in the weights_tree,
>>>>> and the group containing such a queue must still be counted when
>>>>> checking whether the scenario is asymmetric. Otherwise service
>>>>> guarantees are broken. The reason is that, if a scenario is deemed as
>>>>> symmetric because in-flight I/O is not taken into account, then idling
>>>>> will not be performed to protect some bfqq, and in-flight I/O may
>>>>> steal bandwidth to that bfqq in an uncontrolled way.
>>>> Hi, Paolo
>>>> Thanks for your explanation.
>>>> My orginal thoughts was using weights_tree insertion/removal, however,
>>>> Jan convinced me that using bfq_add/del_bfqq_busy() is ok.
>>>> From what I see, when bfqq dispatch the last request,
>>>> bfq_del_bfqq_busy() will not be called from __bfq_bfqq_expire() if
>>>> idling is needed, and it will delayed to when such bfqq get scheduled as
>>>> in-service queue again. Which means the weight of such bfqq should still
>>>> be considered in the weights_tree.
>>>> I also run some tests on null_blk with "irqmode=2
>>>> completion_nsec=100000000(100ms) hw_queue_depth=1", and tests show
>>>> that service guarantees are still preserved on slow device.
>>>> Do you this is strong enough to cover your concern?
>> Unfortunately it is not. Your very argument is what made be believe
>> that considering busy queues was enough, in the first place. But, as
>> I found out, the problem is caused by the queues that do not enjoy
>> idling. With your patch (as well as in my initial version) they are
>> not counted when they remain without requests queued. And this makes
>> asymmetric scenarios be considered erroneously as symmetric. The
>> consequence is that idling gets switched off when it had to be kept
>> on, and control on bandwidth is lost for the victim in-service queues.
>
> Hi,Paolo
>
> Thanks for your explanation, are you thinking that if bfqq doesn't enjoy
> idling, then such bfqq will clear busy after dispatching the last
> request?
>
> Please kindly correct me if I'm wrong in the following process:
>
> If there are more than one bfqg that is activatied, then bfqqs that are
> not enjoying idle are still left busy after dispatching the last
> request.
>
> details in __bfq_bfqq_expire:
>
> if (RB_EMPTY_ROOT(&bfqq->sort_list) &&
> ┊ !(reason == BFQQE_PREEMPTED &&
> ┊ idling_needed_for_service_guarantees(bfqd, bfqq))) {
> -> idling_needed_for_service_guarantees will always return true,
It returns true only is the scenario is symmetric. Not counting bfqqs
with in-flight requests makes an asymmetric scenario be considered
wrongly symmetric. See function bfq_asymmetric_scenario().
Paolo
> bfqq(whether or not enjoy idling) will stay busy.
> if (bfqq->dispatched == 0)
> /*
> ┊* Overloading budget_timeout field to store
> ┊* the time at which the queue remains with no
> ┊* backlog and no outstanding request; used by
> ┊* the weight-raising mechanism.
> ┊*/
> bfqq->budget_timeout = jiffies;
>
> bfq_del_bfqq_busy(bfqd, bfqq, true);
>
> Thanks,
> Kuai
Powered by blists - more mailing lists