lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YMobzbLecaFYuLtq@slm.duckdns.org>
Date:   Wed, 16 Jun 2021 11:42:05 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Josh Don <joshdon@...gle.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Paul Turner <pjt@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Oleg Rombakh <olegrom@...gle.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Steve Sistare <steven.sistare@...cle.com>,
        linux-kernel@...r.kernel.org, Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH] sched: cgroup SCHED_IDLE support

Hello,

On Tue, Jun 08, 2021 at 04:11:32PM -0700, Josh Don wrote:
> This extends SCHED_IDLE to cgroups.
> 
> Interface: cgroup/cpu.idle.
>  0: default behavior
>  1: SCHED_IDLE
> 
> Extending SCHED_IDLE to cgroups means that we incorporate the existing
> aspects of SCHED_IDLE; a SCHED_IDLE cgroup will count all of its
> descendant threads towards the idle_h_nr_running count of all of its
> ancestor cgroups. Thus, sched_idle_rq() will work properly.
> Additionally, SCHED_IDLE cgroups are configured with minimum weight.
> 
> There are two key differences between the per-task and per-cgroup
> SCHED_IDLE interface:
> 
> - The cgroup interface allows tasks within a SCHED_IDLE hierarchy to
> maintain their relative weights. The entity that is "idle" is the
> cgroup, not the tasks themselves.
> 
> - Since the idle entity is the cgroup, our SCHED_IDLE wakeup preemption
> decision is not made by comparing the current task with the woken task,
> but rather by comparing their matching sched_entity.
> 
> A typical use-case for this is a user that creates an idle and a
> non-idle subtree. The non-idle subtree will dominate competition vs
> the idle subtree, but the idle subtree will still be high priority
> vs other users on the system. The latter is accomplished via comparing
> matching sched_entity in the waken preemption path (this could also be
> improved by making the sched_idle_rq() decision dependent on the
> perspective of a specific task).

A high-level problem that I see with the proposal is that this would bake
the current recursive implementation into the interface. The semantics of
the currently exposed interface, at least the weight based part, is abstract
and doesn't necessarily dictate how the scheduling is actually performed.
Adding this would mean that we're now codifying the current behavior of
fully nested scheduling into the interface.

There are several practical challenges with the current implementation
caused by the full nesting - e.g. nesting levels are expensive for context
switch heavy applicaitons often going over >1% per level, and heuristics
which assume global queue may behave unexpectedly - ie. we can create
conditions where things like idle-wakeup boost behave very differently
depending on whether tasks are inside a cgroup or not even when the eventual
relative weights and past usages are similar.

Can you please give more details on why this is beneficial? Is the benefit
mostly around making configuration easy or are there actual scheduling
behaviors that you can't achieve otherwise?

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ