Message-ID: <CAKfTPtCTe7pc=fahynt1kTffUXk5B18usEE_Ay40vE-yjVt0=A@mail.gmail.com>
Date:   Thu, 1 Jun 2023 15:55:18 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...nel.org, linux-kernel@...r.kernel.org,
        juri.lelli@...hat.com, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, corbet@....net, qyousef@...alina.io,
        chris.hyser@...cle.com, patrick.bellasi@...bug.net, pjt@...gle.com,
        pavel@....cz, qperret@...gle.com, tim.c.chen@...ux.intel.com,
        joshdon@...gle.com, timj@....org, kprateek.nayak@....com,
        yu.c.chen@...el.com, youssefesmat@...omium.org,
        joel@...lfernandes.org, efault@....de, tglx@...utronix.de
Subject: Re: [RFC][PATCH 15/15] sched/eevdf: Use sched_attr::sched_runtime to
 set request/slice

On Wed, 31 May 2023 at 14:47, Peter Zijlstra <peterz@...radead.org> wrote:
>
> As an alternative to the latency-nice interface; allow applications to
> directly set the request/slice using sched_attr::sched_runtime.
>
> The implementation clamps the value to: 0.1[ms] <= slice <= 100[ms]
> which is 1/10 the size of HZ=1000 and 10 times the size of HZ=100.

There were some discussions about the latency interface and setting a
raw time value. The problems with using a raw time value are:
- What does this raw time value mean, and how does it apply to the
task's scheduling latency? Typically, what does setting sched_runtime
to 1ms mean? (A minimal userspace sketch of such a call follows this
list.) Regarding latency, users would expect to be scheduled in less
than 1ms, but that is not what will (always) happen with a sched_slice
set to 1ms, whereas with the deadline scheduler we do ensure that the
task runs for sched_runtime within sched_period (and before
sched_deadline). So this will be confusing.
- More than a runtime, we want to set a scheduling latency hint, which
would be more aligned with a deadline.
- The user will then complain that he set 1ms but his task is sometimes
scheduled after several (or even dozens of) ms. Also, you will probably
end up with everybody setting 0.1ms and expecting 0.1ms latency.
Latency nice, like nice, gives an opaque weight relative to others,
without implying a determinism that we can't provide.
- How do you express that you don't want to preempt others, but still
want to keep your allocated running time?
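
As an aside (not part of the patch): here is a minimal userspace sketch
of what "setting sched_runtime to 1ms" looks like with this interface.
struct sched_attr is defined locally because glibc does not wrap
sched_setattr(); the layout matches the VER0 layout in
include/uapi/linux/sched/types.h.

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Local copy of the VER0 sched_attr layout (48 bytes). */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;   /* with this patch: requested slice, in ns */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

int main(void)
{
	struct sched_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.sched_policy = 0;               /* SCHED_OTHER, i.e. the fair class */
	attr.sched_runtime = 1000 * 1000;    /* 1ms; the kernel clamps to [0.1ms, 100ms] */

	/* pid 0 == calling task, flags 0 */
	if (syscall(SYS_sched_setattr, 0, &attr, 0))
		perror("sched_setattr");
	return 0;
}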

>
> Applications should strive to use their periodic runtime at a high
> confidence interval (95%+) as the target slice. Using a smaller slice
> will introduce undue preemptions, while using a larger value will
> increase latency.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
>  kernel/sched/core.c |   24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7494,10 +7494,18 @@ static void __setscheduler_params(struct
>
>         p->policy = policy;
>
> -       if (dl_policy(policy))
> +       if (dl_policy(policy)) {
>                 __setparam_dl(p, attr);
> -       else if (fair_policy(policy))
> +       } else if (fair_policy(policy)) {
>                 p->static_prio = NICE_TO_PRIO(attr->sched_nice);
> +               if (attr->sched_runtime) {
> +                       p->se.slice = clamp_t(u64, attr->sched_runtime,
> +                                             NSEC_PER_MSEC/10,   /* HZ=1000 * 10 */
> +                                             NSEC_PER_MSEC*100); /* HZ=100  / 10 */
> +               } else {
> +                       p->se.slice = sysctl_sched_base_slice;
> +               }
> +       }
>
>         /*
>          * __sched_setscheduler() ensures attr->sched_priority == 0 when
> @@ -7689,7 +7697,9 @@ static int __sched_setscheduler(struct t
>          * but store a possible modification of reset_on_fork.
>          */
>         if (unlikely(policy == p->policy)) {
> -               if (fair_policy(policy) && attr->sched_nice != task_nice(p))
> +               if (fair_policy(policy) &&
> +                   (attr->sched_nice != task_nice(p) ||
> +                    (attr->sched_runtime && attr->sched_runtime != p->se.slice)))
>                         goto change;
>                 if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
>                         goto change;
> @@ -8017,12 +8027,14 @@ static int sched_copy_attr(struct sched_
>
>  static void get_params(struct task_struct *p, struct sched_attr *attr)
>  {
> -       if (task_has_dl_policy(p))
> +       if (task_has_dl_policy(p)) {
>                 __getparam_dl(p, attr);
> -       else if (task_has_rt_policy(p))
> +       } else if (task_has_rt_policy(p)) {
>                 attr->sched_priority = p->rt_priority;
> -       else
> +       } else {
>                 attr->sched_nice = task_nice(p);
> +               attr->sched_runtime = p->se.slice;
> +       }
>  }
>
>  /**
>
>
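
Again not part of the patch: the read-back side that the get_params()
change above enables. After this patch, sched_getattr() reports the
effective slice in sched_attr::sched_runtime for fair tasks. This
fragment assumes the struct sched_attr definition and includes from the
earlier sketch (glibc has no sched_getattr() wrapper either):

	struct sched_attr attr;

	/* pid 0 == calling task; pass the userspace struct size and flags 0. */
	if (syscall(SYS_sched_getattr, 0, &attr, sizeof(attr), 0) == 0)
		printf("effective slice: %llu ns\n",
		       (unsigned long long)attr.sched_runtime);
	else
		perror("sched_getattr");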
