linux-kernel - Task group cleanups and optimizations (was: Re: [RFC 00/60] Coscheduling for Linux)


Message-ID: <282230fe-b8de-01f9-c19b-6070717ba5f8@amazon.de>
Date:   Sat, 15 Sep 2018 10:48:20 +0200
From:   "Jan H. Schönherr" <jschoenh@...zon.de>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        Paul Turner <pjt@...gle.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Task group cleanups and optimizations (was: Re: [RFC 00/60]
 Coscheduling for Linux)

On 09/14/2018 06:25 PM, Jan H. Schönherr wrote:
> On 09/14/2018 01:12 PM, Peter Zijlstra wrote:
>>
>> There are known scalability problems with the existing cgroup muck; you
>> just made things a ton worse. The existing cgroup overhead is
>> significant, you also made that many times worse.
>>
>> The cgroup stuff needs cleanups and optimization, not this.

[...]

> With respect to the need for cleanups and optimizations: I agree that
> task groups are a bit messy. For example, here's my current wish list
> off the top of my head:
> 
> a) lazy scheduler operations; for example: when dequeuing a task, don't bother
>    walking up the task group hierarchy to dequeue all the SEs -- do it lazily
>    when encountering an empty CFS RQ during picking when we hold the lock anyway.
> 
> b) ability to move CFS RQs between CPUs: someone changed the affinity of
>    a cpuset? No problem, just attach the runqueue with all the tasks elsewhere.
>    No need to touch each and every task.
> 
> c) light-weight task groups: don't allocate a runqueue for every CPU in the
>    system, when it is known that tasks in the task group will only ever run
>    on at most two CPUs, or so. (And while there is of course a use case for
>    VMs in this, another class of use cases is auxiliary tasks; see, e.g., [1-5].)
> 
> Is this the level of optimization you're thinking about? Or do you want
> to throw away the whole nested CFS RQ experience in the code?
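
To make item a) from the quoted wish list concrete, here is a toy model in
Python. It is not kernel code: CfsRq, dequeue_task_lazy() and pick_next()
are hypothetical names, and a real runqueue picks by vruntime rather than
by list position. The sketch only shows where the cleanup work moves:
dequeuing touches one runqueue, and empty group entities are dropped later
during picking.

```python
class CfsRq:
    """Toy runqueue: holds task names (str) or child CfsRq objects."""

    def __init__(self, name):
        self.name = name
        self.entities = []


def dequeue_task_lazy(rq, task):
    # Only remove the task from its own runqueue. Deliberately do not
    # walk up the hierarchy to dequeue group entities that became empty.
    rq.entities.remove(task)


def pick_next(rq):
    # Walk down from the given runqueue. Whenever a group entity turns
    # out to be empty, dequeue it now and try the next entity.
    while rq.entities:
        se = rq.entities[0]
        if not isinstance(se, CfsRq):
            return se  # found a runnable task
        picked = pick_next(se)
        if picked is not None:
            return picked
        rq.entities.pop(0)  # lazy dequeue of the empty group entity
    return None
```

In this model, the cost of the hierarchy walk is only paid for groups that
the picker actually runs into.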

I guess it would be possible to flatten the task group hierarchy that is usually
created when nesting cgroups. That is, always enqueue task group SEs directly
within the root task group.

That should take away much of the runtime overhead, no?

The calculation of shares would become complex in a different way than it is
now. But that might be manageable.
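
As a rough illustration of what the shares calculation might look like in a
flattened hierarchy, here is a toy Python sketch. Everything in it is an
assumption for illustration: Group and flat_fraction() are hypothetical
names, every group is assumed to be runnable, and per-CPU load is ignored.
The idea is that a group enqueued directly in the root would need a weight
equal to the product of its share fractions along the path to the root.

```python
from fractions import Fraction


class Group:
    """Toy task group with a shares value and a parent link."""

    def __init__(self, name, shares, parent=None):
        self.name = name
        self.shares = shares
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def flat_fraction(group):
    # Multiply the group's fraction among its siblings at each level
    # on the way up to the root.
    frac = Fraction(1)
    while group.parent is not None:
        total = sum(child.shares for child in group.parent.children)
        frac *= Fraction(group.shares, total)
        group = group.parent
    return frac


# Example hierarchy: root -> {A, B}, A -> {A1, A2}
root = Group("root", 1024)
a = Group("A", 1024, root)
b = Group("B", 1024, root)
a1 = Group("A1", 1024, a)
a2 = Group("A2", 3072, a)
```

Here A1 ends up with 1/8, A2 with 3/8, and B with 1/2 of the total. The
catch is that these products change whenever any group on the path becomes
runnable or idle, which is where the different kind of complexity comes in.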

CFS bandwidth control would also need to change significantly as we would now
have to dequeue/enqueue nested cgroups below a throttled/unthrottled hierarchy.
Unless *those* task groups don't participate in this flattening.
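
A toy Python sketch of that throttling problem, under the assumption that
all group SEs sit in one flat root runqueue. Group, FlatRunqueue and their
methods are hypothetical names, not kernel interfaces. With a flat queue,
throttling a group has to visit its whole subtree, because the descendants
are no longer hidden behind the throttled group's own SE.

```python
class Group:
    """Toy task group; nr_running counts tasks directly in the group."""

    def __init__(self, name, parent=None, nr_running=0):
        self.name = name
        self.parent = parent
        self.nr_running = nr_running
        self.throttled = False
        self.children = []
        if parent is not None:
            parent.children.append(self)


def subtree(group):
    # Yield the group itself and all of its descendants.
    yield group
    for child in group.children:
        yield from subtree(child)


def under_throttle(group):
    # True if the group or any of its ancestors is throttled.
    while group is not None:
        if group.throttled:
            return True
        group = group.parent
    return False


class FlatRunqueue:
    """All group SEs are enqueued directly here, whatever their depth."""

    def __init__(self):
        self.queued = set()

    def enqueue(self, group):
        if group.nr_running > 0 and not under_throttle(group):
            self.queued.add(group)

    def throttle(self, group):
        group.throttled = True
        for g in subtree(group):  # every nested group must be dequeued
            self.queued.discard(g)

    def unthrottle(self, group):
        group.throttled = False
        for g in subtree(group):  # and re-enqueued again afterwards
            self.enqueue(g)
```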

(And probably lots of other stuff that I haven't thought about yet.)

Regards
Jan