linux-kernel - Re: [PATCH -next v7 2/3] block, bfq: refactor the counting of 'num_groups_with_pending

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8ffa050c-1254-0974-1457-4ce4cb39dcb4@huawei.com>
Date:   Tue, 31 May 2022 17:24:05 +0800
From:   Yu Kuai <yukuai3@...wei.com>
To:     Paolo Valente <paolo.valente@...more.it>
CC:     Jan Kara <jack@...e.cz>, Jens Axboe <axboe@...nel.dk>,
        Tejun Heo <tj@...nel.org>, <cgroups@...r.kernel.org>,
        linux-block <linux-block@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>, <yi.zhang@...wei.com>
Subject: Re: [PATCH -next v7 2/3] block, bfq: refactor the counting of
 'num_groups_with_pending_reqs'

在 2022/05/31 17:19, Paolo Valente 写道:
> 
> 
>> Il giorno 31 mag 2022, alle ore 11:06, Yu Kuai <yukuai3@...wei.com> ha scritto:
>>
>> 在 2022/05/31 16:36, Paolo VALENTE 写道:
>>>> Il giorno 30 mag 2022, alle ore 10:40, Yu Kuai <yukuai3@...wei.com> ha scritto:
>>>>
>>>> 在 2022/05/30 16:34, Yu Kuai 写道:
>>>>> 在 2022/05/30 16:10, Paolo Valente 写道:
>>>>>>
>>>>>>
>>>>>>> Il giorno 28 mag 2022, alle ore 11:50, Yu Kuai <yukuai3@...wei.com> ha scritto:
>>>>>>>
>>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>>> are not issued from root group. This is because
>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>> bfq_asymmetric_scenario().
>>>>>>>
>>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>
>>>>>>> Before this patch:
>>>>>>> 1) root group will never be counted.
>>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>>>>>>
>>>>>>> After this patch:
>>>>>>> 1) root group is counted.
>>>>>>> 2) Count if bfqg have at least one bfqq that is marked busy.
>>>>>>> 3) Don't count if bfqg doesn't have any busy bfqqs.
>>>>>>
>>>>>> Unfortunately, I see a last problem here. I see a double change:
>>>>>> (1) a bfqg is now counted only as a function of the state of its child
>>>>>>       queues, and not of also its child bfqgs
>>>>>> (2) the state considered for counting a bfqg moves from having pending
>>>>>>       requests to having busy queues
>>>>>>
>>>>>> I'm ok with with (1), which is a good catch (you are lady explained
>>>>>> the idea to me some time ago IIRC).
>>>>>>
>>>>>> Yet I fear that (2) is not ok.  A bfqq can become non busy even if it
>>>>>> still has in-flight I/O, i.e.  I/O being served in the drive.  The
>>>>>> weight of such a bfqq must still be considered in the weights_tree,
>>>>>> and the group containing such a queue must still be counted when
>>>>>> checking whether the scenario is asymmetric.  Otherwise service
>>>>>> guarantees are broken.  The reason is that, if a scenario is deemed as
>>>>>> symmetric because in-flight I/O is not taken into account, then idling
>>>>>> will not be performed to protect some bfqq, and in-flight I/O may
>>>>>> steal bandwidth to that bfqq in an uncontrolled way.
>>>>> Hi, Paolo
>>>>> Thanks for your explanation.
>>>>> My orginal thoughts was using weights_tree insertion/removal, however,
>>>>> Jan convinced me that using bfq_add/del_bfqq_busy() is ok.
>>>>>  From what I see, when bfqq dispatch the last request,
>>>>> bfq_del_bfqq_busy() will not be called from __bfq_bfqq_expire() if
>>>>> idling is needed, and it will delayed to when such bfqq get scheduled as
>>>>> in-service queue again. Which means the weight of such bfqq should still
>>>>> be considered in the weights_tree.
>>>>> I also run some tests on null_blk with "irqmode=2
>>>>> completion_nsec=100000000(100ms) hw_queue_depth=1", and tests show
>>>>> that service guarantees are still preserved on slow device.
>>>>> Do you this is strong enough to cover your concern?
>>> Unfortunately it is not.  Your very argument is what made be believe
>>> that considering busy queues was enough, in the first place.  But, as
>>> I found out, the problem is caused by the queues that do not enjoy
>>> idling.  With your patch (as well as in my initial version) they are
>>> not counted when they remain without requests queued.  And this makes
>>> asymmetric scenarios be considered erroneously as symmetric.  The
>>> consequence is that idling gets switched off when it had to be kept
>>> on, and control on bandwidth is lost for the victim in-service queues.
>>
>> Hi，Paolo
>>
>> Thanks for your explanation, are you thinking that if bfqq doesn't enjoy
>> idling, then such bfqq will clear busy after dispatching the last
>> request?
>>
>> Please kindly correct me if I'm wrong in the following process:
>>
>> If there are more than one bfqg that is activatied, then bfqqs that are
>> not enjoying idle are still left busy after dispatching the last
>> request.
>>
>> details in __bfq_bfqq_expire:
>>
>>         if (RB_EMPTY_ROOT(&bfqq->sort_list) &&
>>         ┊   !(reason == BFQQE_PREEMPTED &&
>>         ┊     idling_needed_for_service_guarantees(bfqd, bfqq))) {
>> -> idling_needed_for_service_guarantees will always return true,
> 
> It returns true only is the scenario is symmetric.  Not counting bfqqs
> with in-flight requests makes an asymmetric scenario be considered
> wrongly symmetric.  See function bfq_asymmetric_scenario().
Hi,

Yes, with this patchset, If there are more than one bfqg that is
activatied(contain busy bfqq), bfq_asymmetric_scenario() will return
true:

bfq_asymmetric_scenario()
  	return varied_queue_weights || multiple_classes_busy
  #ifdef CONFIG_BFQ_GROUP_IOSCHED
	       || bfqd->num_groups_with_busy_queues > 1
  #endif

 From what I see, bfqd->num_groups_with_busy_queues > 1 is always true...
> 
> Paolo
> 
>> bfqq(whether or not enjoy idling) will stay busy.
>>                 if (bfqq->dispatched == 0)
>>                         /*
>>                         ┊* Overloading budget_timeout field to store
>>                         ┊* the time at which the queue remains with no
>>                         ┊* backlog and no outstanding request; used by
>>                         ┊* the weight-raising mechanism.
>>                         ┊*/
>>                         bfqq->budget_timeout = jiffies;
>>
>>                 bfq_del_bfqq_busy(bfqd, bfqq, true);
>>
>> Thanks,
>> Kuai
> 
> .
>