Message-ID: <CAKfTPtA1UFRGsDRVRNn_M2OEQ2G8A5MzWP_3GpmSsf1=B9+DdA@mail.gmail.com>
Date:   Thu, 22 Sep 2022 09:12:19 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, parth@...ux.ibm.com,
        qais.yousef@....com, chris.hyser@...cle.com,
        valentin.schneider@....com, patrick.bellasi@...bug.net,
        David.Laight@...lab.com, pjt@...gle.com, pavel@....cz,
        tj@...nel.org, qperret@...gle.com, tim.c.chen@...ux.intel.com,
        joshdon@...gle.com
Subject: Re: [PATCH v4 5/8] sched/fair: Take into account latency priority at wakeup

On Thu, 22 Sept 2022 at 00:41, Dietmar Eggemann
<dietmar.eggemann@....com> wrote:
>
> On 20/09/2022 17:49, Vincent Guittot wrote:
> > On Tue, 20 Sept 2022 at 15:18, Dietmar Eggemann
> > <dietmar.eggemann@....com> wrote:
> >>
> >> On 19/09/2022 17:39, Vincent Guittot wrote:
> >>> On Mon, 19 Sept 2022 at 12:05, Dietmar Eggemann
> >>> <dietmar.eggemann@....com> wrote:
> >>>>
> >>>> On 16/09/2022 10:03, Vincent Guittot wrote:
>
> [...]
>
> >>>>> +      * the idle thread and don't set next buddy as a candidate for being
> >>>>> +      * picked in priority.
> >>>>> +      * In case of simultaneous wakeup from idle, the latency sensitive tasks
> >>>>> +      * lost opportunity to preempt non sensitive tasks which woke up
> >>>>> +      * simultaneously.
> >>>>> +      */
> >>>>
> >>>> The position of this comment block within this function is somewhat
> >>>> misleading since it describes the reason for the function rather than a
> >>>> particular condition within this function. Wouldn't it be more readable
> >>>> as a function header comment instead?
> >>>
> >>> I put it after the usual early return tests to put the comment close
> >>> to the useful part: the use of next buddy and __pick_first_entity()
> >>
> >> So you want to have the `wakeup_preempt_entity(se, pse) == 1` condition
> >> from check_preempt_wakeup() also for cfs_task woken up by others.
> >
> > I use the wakeup_preempt_entity(cfs_rq->next, left) < 1 check in
> > pick_next_entity() to pick the task with the highest latency constraint
> > when another class is running while waking up
>
> That's correct. This is where you potentially pick this task since it is
> the next_buddy.
> All I wanted to say is that check_preempt_from_others() and its `next &&
> wakeup_preempt_entity(next, se) == 1` is the counterpart of the
> `wakeup_preempt_entity(se, pse) == 1` in check_preempt_wakeup(), so that
> next_buddy can be set even if curr is from a class other than CFS.
>
> [...]
>
> >>>> I still don't get the rationale behind why when either one (se or curr)
> >>>> of the latency_nice values is negative, we use the diff between them but
> >>>> if not, we only care about se's value. Why don't you always use the diff
> >>>> between se and curr? Since we have a range [-20 ... 19] why shouldn't we
> >>>> use the difference between let's say se = 19 and curr = 5?
> >>>> You discussed this with Tao Zhou on the v1 but I didn't understand it fully.
> >>>
> >>> Let's say that current has a latency nice prio of 19 and a task A with a
> >>> latency nice of 10 wakes up. Neither task cares about scheduling
> >>> latency (current even less so than task A). If we use the diff, the
> >>> output of wakeup_latency_gran() would be negative (-10ms), which would
> >>> mean that the waking task is sensitive to latency and wants to
> >>> preempt current even if its vruntime is later. But obviously neither
> >>> current nor task A cares about preempting at wakeup.
> >>
> >> OK, I understand but there is a certain level of unsteadiness here.
> >>
> >> If p has >0 it gets treated differently in case current has >=0 and case
> >
> > "If p >=0"; 0 has the same behavior as [1..19]
> >
> >> current has <0.
>
> Not quite. It depends on curr. With sysctl_sched_latency = 24ms:

I thought you were speaking about priority 0 vs [1..19], since you made a
distinction between them in your previous comment below

>
> (1) p = 10 curr =  19 -> wakeup_latency_gran() returns 12ms
>
> (2) p = 10 curr = -10 -> wakeup_latency_gran() returns 24ms
>
> In (1) only p's own latency counts whereas in (2) we take the diff,

Yes, because curr is latency sensitive in (2) whereas it is not in (1)
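
To make (1) and (2) concrete, here is a minimal userspace sketch of the
rule as discussed in this thread. It is not the kernel code: the helper
name latency_gran_us() is hypothetical, the weights are passed in directly
on a 1024 scale (512 for latency nice 10 and 973 for 19 are the values
quoted in this thread, -512 for -10 is inferred from the 24ms result in
(2)), and sysctl_sched_latency is fixed at 24ms.

```c
/* Illustration only: a userspace model, not the patch itself. */
#define LATENCY_SCALE     1024L   /* assumed full-scale latency weight */
#define SCHED_LATENCY_US  24000L  /* sysctl_sched_latency = 24ms, as in this thread */

/*
 * Hypothetical helper. se_w and curr_w are latency weights in
 * [-1024, 1024]; the result is the latency offset in microseconds.
 */
long latency_gran_us(long se_w, long curr_w)
{
	long w = se_w;

	/*
	 * Only compare se against curr when at least one of them is
	 * latency sensitive (negative weight). Otherwise only se's own
	 * tolerance counts.
	 */
	if (se_w < 0 || curr_w < 0)
		w -= curr_w;

	return w * SCHED_LATENCY_US / LATENCY_SCALE;
}
```

Under these assumptions, latency_gran_us(512, 973) gives 12000us for (1)
and latency_gran_us(512, -512) gives 24000us for (2).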

>
> In (A) we 'punish' p even though it competes against curr which has an
> even lower latency requirement than p,

What is (A)? Assuming you meant (1): having a positive latency nice
means that you have no latency requirement and are tolerant to
scheduling delay, so we don't 'punish' p. p will preempt curr if we are
above the tolerance

>
> >> Do we expect that tasks set their value to [1..19] in this case, when
> >> the default 0 already indicates a 'don't care'?
> >
> > I'm not sure that I understand your concern, as [0..19] are treated in
> > the same way. Only tasks (curr or se) with offset <0 need a relative
> > comparison to the other. If curr and se both have a latency nice of
> > -19, se should not blindly preempt curr but only if curr has already run
> > for its amount of time
>
> With p = -19 and curr = -19 we would take the diff, so 0ms.
>
> With p = 19 and curr = 19, if we would use `latency_offset -=
> curr->latency_offset` wakeup_latency_gran() would return 973/1024*24ms -
> 973/1024*24ms = 0ms and nothing will shift.
>
> OTOH, in case (1) wakeup_latency_gran() would return 512/1024*24ms -
> 973/1024*24ms = -10.80ms. So p would gain an advantage here instead of
> a penalty.

And that's the whole point. A priority >= 0 means that you don't care
about scheduling delays, so there is no reason to be more aggressive
with a task that is also latency tolerant. We only have to ensure that
the delay stays in the acceptable range
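
The two options can be put side by side with a small userspace sketch. As
before this is only an illustration under stated assumptions, not the
patch: both function names are hypothetical, weights are on a 1024 scale
(512 for latency nice 10, 973 for 19, as quoted above; the negative values
are assumed symmetric), and sysctl_sched_latency is taken as 24ms.

```c
/* Illustration only: a userspace model of the two rules being compared. */
#define LATENCY_SCALE     1024L   /* assumed full-scale latency weight */
#define SCHED_LATENCY_US  24000L  /* sysctl_sched_latency = 24ms */

/* Conditional rule: take the diff only if se or curr is latency sensitive. */
long gran_conditional_us(long se_w, long curr_w)
{
	long w = se_w;

	if (se_w < 0 || curr_w < 0)
		w -= curr_w;

	return w * SCHED_LATENCY_US / LATENCY_SCALE;
}

/* Alternative from the quoted text: always take the diff. */
long gran_always_diff_us(long se_w, long curr_w)
{
	return (se_w - curr_w) * SCHED_LATENCY_US / LATENCY_SCALE;
}
```

For p = 10 and curr = 19, gran_conditional_us(512, 973) gives 12000us,
while gran_always_diff_us(512, 973) gives -10804us (the -10.80ms above,
truncated by integer division). With the second rule a latency-tolerant p
would be favored against an even more tolerant curr, which is what the
conditional rule avoids.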

>
> Essentially using the full [-20 .. 19] nice scope for `se vs. curr`
> comparison.
