linux-kernel - Re: [PATCH v8 6/9] sched/fair: Add sched group latency support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y3Jq4gBB5+Qg67u7@google.com>
Date:   Mon, 14 Nov 2022 17:20:50 +0100
From:   Patrick Bellasi <patrick.bellasi@...bug.net>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, parth@...ux.ibm.com,
        qyousef@...alina.io, chris.hyser@...cle.com,
        David.Laight@...lab.com, pjt@...gle.com, pavel@....cz,
        tj@...nel.org, qperret@...gle.com, tim.c.chen@...ux.intel.com,
        joshdon@...gle.com, timj@....org, kprateek.nayak@....com,
        yu.c.chen@...el.com, youssefesmat@...omium.org,
        joel@...lfernandes.org
Subject: Re: [PATCH v8 6/9] sched/fair: Add sched group latency support

Hi Vincent,

On 10-Nov 18:50, Vincent Guittot wrote:

[...]

> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index be4a77baf784..a4866cd4e58c 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1095,6 +1095,16 @@ All time durations are in microseconds.
>          values similar to the sched_setattr(2). This maximum utilization
>          value is used to clamp the task specific maximum utilization clamp.
>  
> +  cpu.latency.nice
> +	A read-write single value file which exists on non-root
> +	cgroups.  The default is "0".
> +
> +	The nice value is in the range [-20, 19].
> +
> +	This interface file allows reading and setting latency using the
> +	same values used by sched_setattr(2). The latency_nice of a group is
> + used to limit the impact of the latency_nice of a task outside the
> + group.

This control model is not clear to me.

It does not seem matching what we have for uclamp, where the cgroup values are
used to restrict how much a task can ask or give (in terms of latency here).

in the uclamp's requested-vs-effective values model:

A) a task can "request" (or give up) latency as much as it likes

B) the cgroup in which the task is in any moment limits wthe maximum
   latency a task can "request" (or give up)

C) the system wide knob set the "request" limit for the root cgroup an any task
   not in a cgroup.

This model seems to be what we should use here too.

IOW, for each task compute an "effective" latency_nice value which is defined
starting for a task "requested" latency value and by restricting this value
based on the (B) cgroup value and the (C) system wide value.

That's what we do in uclamp_eff_get():
   https://elixir.bootlin.com/linux/v6.0/source/kernel/sched/core.c#L1484

Why such a model cannot be used for latency_nice too?
Am I missing something?


Best,
patrick

-- 
#include <best/regards.h>

Patrick Bellasi