lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 4 Nov 2022 14:57:11 +0000
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Qais Yousef <qyousef@...alina.io>, mingo@...hat.com,
        peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, parth@...ux.ibm.com,
        qais.yousef@....com, chris.hyser@...cle.com,
        patrick.bellasi@...bug.net, David.Laight@...lab.com,
        pjt@...gle.com, pavel@....cz, tj@...nel.org, qperret@...gle.com,
        tim.c.chen@...ux.intel.com, joshdon@...gle.com, timj@....org,
        kprateek.nayak@....com, yu.c.chen@...el.com,
        youssefesmat@...omium.org
Subject: Re: [PATCH v7 6/9] sched/fair: Add sched group latency support

On Fri, Nov 04, 2022 at 03:24:23PM +0100, Vincent Guittot wrote:
> On Fri, 4 Nov 2022 at 12:21, Qais Yousef <qyousef@...alina.io> wrote:
> >
> > On 11/03/22 18:02, Vincent Guittot wrote:
> > > On Thu, 3 Nov 2022 at 15:27, Qais Yousef <qyousef@...alina.io> wrote:
> > > >
> > > > On 11/03/22 09:46, Vincent Guittot wrote:
> > > > > On Tue, 1 Nov 2022 at 20:28, Qais Yousef <qyousef@...alina.io> wrote:
> > > > > >
> > > > > > On 10/28/22 11:34, Vincent Guittot wrote:
> > > > > > > Task can set its latency priority with sched_setattr(), which is then used
> > > > > > > to set the latency offset of its sched_enity, but sched group entities
> > > > > > > still have the default latency offset value.
> > > > > > >
> > > > > > > Add a latency.nice field in cpu cgroup controller to set the latency
> > > > > > > priority of the group similarly to sched_setattr(). The latency priority
> > > > > > > is then used to set the offset of the sched_entities of the group.
> > > > > > >
> > > > > > > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> > > > > > > ---
> > > > > > >  Documentation/admin-guide/cgroup-v2.rst |  8 ++++
> > > > > > >  kernel/sched/core.c                     | 52 +++++++++++++++++++++++++
> > > > > > >  kernel/sched/fair.c                     | 33 ++++++++++++++++
> > > > > > >  kernel/sched/sched.h                    |  4 ++
> > > > > > >  4 files changed, 97 insertions(+)
> > > > > > >
> > > > > > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > > > > > > index be4a77baf784..d8ae7e411f9c 100644
> > > > > > > --- a/Documentation/admin-guide/cgroup-v2.rst
> > > > > > > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > > > > > > @@ -1095,6 +1095,14 @@ All time durations are in microseconds.
> > > > > > >          values similar to the sched_setattr(2). This maximum utilization
> > > > > > >          value is used to clamp the task specific maximum utilization clamp.
> > > > > > >
> > > > > > > +  cpu.latency.nice
> > > > > > > +     A read-write single value file which exists on non-root
> > > > > > > +     cgroups.  The default is "0".
> > > > > > > +
> > > > > > > +     The nice value is in the range [-20, 19].
> > > > > > > +
> > > > > > > +     This interface file allows reading and setting latency using the
> > > > > > > +     same values used by sched_setattr(2).
> > > > > >
> > > > > > I'm still not sure about this [1].
> > > > >
> > > > > I'm still not sure about what you are trying to say here ...
> > > > >
> > > > > This is about setting a latency nice prio to a group level.
> > > > >
> > > > > >
> > > > > > In some scenarios we'd like to get the effective latency_nice of the task. How
> > > > > > will the task inherit the cgroup value or be impacted by it?
> > > > > >
> > > > > > For example if there are tasks that belong to a latency sensitive cgroup; and
> > > > > > I'd like to skip some searches in EAS to improve that latency sensitivity - how
> > > > > > would I extract this info in EAS path given these tasks are using default
> > > > > > latency_nice value? And if should happen if their latency_nice is set to
> > > > > > something else other than default?
> > > > > >
> > > > > > [1] https://lore.kernel.org/lkml/20221012160734.hrkb5jcjdq7r23pr@wubuntu/
> > > > >
> > > > > Hmm so you are speaking about something that is not part of the patch.
> > > > > Let focus on the patchset for now
> > > >
> > > > I am focusing on this patchset. Isn't this an essential part of the design?
> > > > Once the interface is out we can't change it. As it stands, I can't see how it
> > >
> > > So, are you speaking about the interface i.e. setting a value between [-20:19]
> >
> > About how the cgroup and per task interface interact.
> >
> > How to get the effective value of latency_nice for a task that belongs to
> > a hierarchy?
> 
> At the common parents level of the 2 entities that you want to compare
> and root level if there no other entity to compare with
> 
> >
> > If I have a task that has p->latency_nice = 20 but it belongs to a cgroup that
> > has tg->cpu.latency.nice = -19
> 
> according to what i said above, latency_nice = 20 inside the group and
> -19 when comparing at the parent level
> 
> >
> > And I want to use this interface in EAS; how should I interpret these values?
> > How should I walk up the hierarchy and decide the _effective_ latency_nice
> > value?
> 
> The current use of latency_nice doesn't need to walk the hierarchy
> because it applies at each scheduling level so the childs
> automatically follow parents' latency.

Not really, I don't see how that will work that way in the wake up path. The
wake up path (EAS in particular) does not walk through CPU controller group
hierarchy from top level, it only cares about cpuset/affinities and the
"effective" values of tasks.

So when you wake up a task, how will you retrieve the attribute for 'prefer
idle' in the wakeup path using this patchset? The only way is to aggregate
the CGroup hierarchy information to get a per-task effective value; say using
a min function.

If you see uclamp_rq_util_with(), that also is using doing uclamp
aggregation similarly.

So I think Qais is asking about the aggregation function in the EAS wakeup
path.

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ