linux-kernel - Re: [PATCH v4 5/8] sched/fair: Take into account latency priority at wakeup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b02452b2-900c-89da-c7b7-40a61268065e@arm.com>
Date:   Tue, 20 Sep 2022 15:18:16 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, parth@...ux.ibm.com,
        qais.yousef@....com, chris.hyser@...cle.com,
        valentin.schneider@....com, patrick.bellasi@...bug.net,
        David.Laight@...lab.com, pjt@...gle.com, pavel@....cz,
        tj@...nel.org, qperret@...gle.com, tim.c.chen@...ux.intel.com,
        joshdon@...gle.com
Subject: Re: [PATCH v4 5/8] sched/fair: Take into account latency priority at
 wakeup

On 19/09/2022 17:39, Vincent Guittot wrote:
> On Mon, 19 Sept 2022 at 12:05, Dietmar Eggemann
> <dietmar.eggemann@....com> wrote:
>>
>> On 16/09/2022 10:03, Vincent Guittot wrote:
>>
>> [...]
>>
>>> @@ -4512,7 +4519,7 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
>>>       p->prio = current->normal_prio;
>>>
>>>       /* Propagate the parent's latency requirements to the child as well */
>>> -     p->latency_nice = current->latency_nice;
>>> +     p->latency_prio = current->latency_prio;
>>
>> Isn't here a `set_latency_offset(p)` missing here?
> 
> Hmm, I think it's the opposite and the line above is a nop from the
> beginning (i.e. patch 2).

Yeah, you're right! It looked suspicious ...

[...]

>>> +      * the idle thread and don't set next buddy as a candidate for being
>>> +      * picked in priority.
>>> +      * In case of simultaneous wakeup from idle, the latency sensitive tasks
>>> +      * lost opportunity to preempt non sensitive tasks which woke up
>>> +      * simultaneously.
>>> +      */
>>
>> The position of this comment block within this function is somehow
>> misleading since it describes the reason for the function rather then a
>> particular condition within this function. Wouldn't it be more readable
>> when it would be a function header comment instead?
> 
> I put it after the usual early return tests to put the comment close
> to the useful part: the use of next buddy and __pick_first_entity()

So you want to have the `wakeup_preempt_entity(se, pse) == 1` condition
from check_preempt_wakeup() also for cfs_task woken up by others.

[...]

>>> +      * requirement that needs to be evaluated versus other entity.
>>> +      * Otherwise, use the latency weight to evaluate how much scheduling
>>> +      * delay is acceptable by se.
>>> +      */
>>> +     if ((se->latency_offset < 0) || (curr->latency_offset < 0))
>>> +             latency_offset -= curr->latency_offset;
>>
>> I still don't get the rationale behind why when either one (se or curr)
>> of the latency_nice values is negative, we use the diff between them but
>> if not, we only care about se's value. Why don't you always use the diff
>> between se and curr? Since we have a range [-20 ... 19] why shouldn't we
>> use the difference between let's say se = 19 and curr = 5?
>> You discussed this with Tao Zhou on the v1 but I didn't understand it fully.
> 
> Let say that current has a latency nice prio of 19 and a task A with a
> latency nice of 10 wakes up. Both tasks don't care about scheduling
> latency (current more than task A). If we use the diff, the output of
> wakeup_latency_gran() would be negative (-10ms) which reflects the
> fact that the waking task is sensitive to the latency and wants to
> preempt current even if its vruntime is after. But obviously both
> current and task A don't care to preempt at wakeup.

OK, I understand but there is a certain level of unsteadiness here.

If p has >0 it gets treated differently in case current has >=0 and case
current has <0.

Do we expect that tasks set their value to [1..19] in this case, when
the default 0 already indicates a 'don't care'?