Message-ID: <f48b5233-ce60-7e1a-02e6-1bfbcc852271@arm.com>
Date:   Tue, 15 Jun 2021 12:06:57 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Josh Don <joshdon@...gle.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Paul Turner <pjt@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Oleg Rombakh <olegrom@...gle.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Steve Sistare <steven.sistare@...cle.com>,
        Tejun Heo <tj@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: cgroup SCHED_IDLE support

On 12/06/2021 01:34, Josh Don wrote:
> On Fri, Jun 11, 2021 at 9:43 AM Dietmar Eggemann
> <dietmar.eggemann@....com> wrote:
>>
>> On 10/06/2021 21:14, Josh Don wrote:
>>> Hey Dietmar,
>>>
>>> On Thu, Jun 10, 2021 at 5:53 AM Dietmar Eggemann
>>> <dietmar.eggemann@....com> wrote:
>>>>
>>>> Any reason why this should only work on cgroup-v2?
>>>
>>> My (perhaps incorrect) assumption that new development should not
>>> extend v1. I'd actually prefer making this work on v1 as well; I'll
>>> add that support.
>>>
>>>> struct cftype cpu_legacy_files[] vs. cpu_files[]
>>>>
>>>> [...]
>>>>
>>>>> @@ -11340,10 +11408,14 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
>>>>>
>>>>>  static DEFINE_MUTEX(shares_mutex);
>>>>>
>>>>> -int sched_group_set_shares(struct task_group *tg, unsigned long shares)
>>>>> +#define IDLE_WEIGHT sched_prio_to_weight[ARRAY_SIZE(sched_prio_to_weight) - 1]
>>>>
>>>> Why not 3 ? Like for tasks (WEIGHT_IDLEPRIO)?
>>>>
>>>> [...]
>>>
>>> Went back and forth on this; on second look, I do think it makes sense
>>> to use the IDLEPRIO weight of 3 here. This gets converted to a 0,
>>> rather than a 1 for display of cpu.weight, which is also actually a
>>> nice property.
>>
>> I'm struggling to see the benefit here.
>>
>> For a taskgroup A: Why setting A/cpu.idle=1 to force a minimum A->shares
>> when you can set it directly via A/cpu.weight (to 1 (minimum))?
>>
>> WEIGHT     cpu.weight   tg->shares
>> 3          0            3072
>> 15         1            15360
>>            1            10240
>>
>> `A/cpu.weight` follows cgroup-v2's `weights` `resource distribution
>> model`* but I can only see `A/cpu.idle` as a layer on top of it forcing
>> `A/cpu.weight` to get its minimum value?
>>
>> *Documentation/admin-guide/cgroup-v2.rst
> 
> Setting cpu.idle carries additional properties in addition to just the
> weight. Currently, it primarily includes (a) special wakeup preemption
> handling, and (b) contribution to idle_h_nr_running for the purpose of
> marking a cpu as a sched_idle_cpu(). Essentially, the current
> SCHED_IDLE mechanics. I've also discussed with Peter a potential
> extension to SCHED_IDLE to manipulate vruntime.

Right, I forgot about (b).

But IMHO, (a) could be handled with this special tg->shares value for
SCHED_IDLE.

If there were a way to open up `cpu.weight`, `cpu.weight.nice` (and
`cpu.shares` for v1) to take a special value for SCHED_IDLE, then you
wouldn't need cpu.idle.
And you could handle the functionality from sched_group_set_idle()
directly in sched_group_set_shares().
In this case sched_group_set_shares() wouldn't have to be rejected on an
idle tg.
A tg would just become !idle by writing a different cpu.weight value.
Currently, if you !idle a tg it gets the default NICE_0_LOAD.
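
As a rough sketch of that idea (a userspace toy model, not kernel code:
`struct toy_tg`, `toy_tg_set_weight()` and the `TOY_*` constants are
hypothetical names, and the 10-bit shift assumes the 64-bit scale_load()),
the idle state could simply follow from the weight that is written:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WEIGHT_IDLEPRIO  3UL  /* mirrors WEIGHT_IDLEPRIO for tasks */
#define TOY_FIXEDPOINT_SHIFT 10   /* assumed scale_load() shift on 64-bit */

/* Hypothetical stand-in for the relevant bits of struct task_group. */
struct toy_tg {
	unsigned long shares;
	bool idle;
};

/*
 * Toy model of a set_shares path that also covers what a separate
 * set_idle path would do: writing the special weight marks the group
 * idle, writing any other weight clears the idle state again.
 * Nothing is rejected just because the group is currently idle.
 */
int toy_tg_set_weight(struct toy_tg *tg, unsigned long weight)
{
	assert(tg != NULL);

	if (weight == 0)
		return -1;	/* 0 is not a valid unscaled weight */

	tg->shares = weight << TOY_FIXEDPOINT_SHIFT;
	tg->idle = (weight == TOY_WEIGHT_IDLEPRIO);
	return 0;
}
```

In this model a group leaves the idle state with whatever weight was
written, rather than falling back to a default.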


I guess cpu.weight [1, 10000] would be easy: 0 could be taken for that
and mapped to weight = WEIGHT_IDLEPRIO (3, 3072) to call
sched_group_set_shares(..., scale_load(weight)).
cpu.weight = 1 maps to (10, 10240).

cpu.weight.nice [-20, 19] would already be more complicated; maybe 20?

And for cpu.shares [2, 1 << 18], 0 could be used. The issue here is that
WEIGHT_IDLEPRIO (3, 3072) is already a valid value for shares.
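
To make the numbers above easy to check, here is a small userspace
sketch of the cpu.weight <-> tg->shares arithmetic. It assumes the v2
default weight of 100, a nice-0 weight of 1024 and a 10-bit scale_load()
shift (64-bit); the `toy_*` names are hypothetical, and treating 0 as the
SCHED_IDLE value is only the proposal discussed here, not existing
behaviour:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CGROUP_WEIGHT_DFL 100ULL   /* assumed default cpu.weight */
#define TOY_NICE_0_WEIGHT     1024ULL  /* assumed unscaled nice-0 weight */
#define TOY_WEIGHT_IDLEPRIO   3ULL     /* weight used for SCHED_IDLE tasks */
#define TOY_FIXEDPOINT_SHIFT  10       /* assumed scale_load() shift */

/* Round-to-nearest unsigned division. */
uint64_t toy_div_round_closest(uint64_t x, uint64_t d)
{
	return (x + d / 2) / d;
}

/* cpu.weight as written by the user -> scaled tg->shares. */
uint64_t toy_cpu_weight_to_shares(uint64_t cpu_weight)
{
	uint64_t weight;

	assert(cpu_weight <= 10000);

	if (cpu_weight == 0)	/* proposed special value for SCHED_IDLE */
		weight = TOY_WEIGHT_IDLEPRIO;
	else
		weight = toy_div_round_closest(cpu_weight * TOY_NICE_0_WEIGHT,
					       TOY_CGROUP_WEIGHT_DFL);

	return weight << TOY_FIXEDPOINT_SHIFT;
}

/* Scaled tg->shares -> cpu.weight as it would be displayed. */
uint64_t toy_shares_to_cpu_weight(uint64_t shares)
{
	uint64_t weight = shares >> TOY_FIXEDPOINT_SHIFT;

	return toy_div_round_closest(weight * TOY_CGROUP_WEIGHT_DFL,
				     TOY_NICE_0_WEIGHT);
}
```

With these assumptions, writing 0 gives shares = 3072 and reads back as
0, while writing 1 gives shares = 10240 and reads back as 1, so the
round trip stays unambiguous for cpu.weight.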

> We set the cgroup weight here, since by definition SCHED_IDLE entities
> have the least scheduling weight. From the perspective of your
> question, the analogous statement for tasks would be that we set task
> weight to the min when doing setsched(SCHED_IDLE), even though we
> already have a renice mechanism.

I agree. `cpu.idle = 1` is like setting the task policy to SCHED_IDLE.
And there is even the `cpu.weight.nice` to support the `task - tg`
analogy on nice values.

I'm just wondering if integrating this into `cpu.weight` and friends
would be better to make the code behind this easier to grasp.
