Message-ID: <da011fa3-6021-5012-4ef7-7c99c4c65c5b@amazon.de>
Date: Wed, 19 Sep 2018 11:23:22 +0200
From: "Jan H. Schönherr" <jschoenh@...zon.de>
To: Rik van Riel <riel@...riel.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
Paul Turner <pjt@...gle.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Morten Rasmussen <morten.rasmussen@....com>,
Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: Task group cleanups and optimizations (was: Re: [RFC 00/60]
Coscheduling for Linux)
On 09/18/2018 04:35 PM, Rik van Riel wrote:
> On Tue, 2018-09-18 at 15:22 +0200, Jan H. Schönherr wrote:
[...]
> Task priorities in a flat runqueue are relatively straightforward, with
> vruntime scaling just like done for nice levels, but I have to admit
> that throttled groups provide a challenge.
Do you know/have an idea how the flat approach would further skew the
approximations currently done in calc_group_shares()/calc_group_runnable()?
With the nested hierarchy, the (shared) task group SE is updated
whenever something changes. With the flat approach, you'd only be
able to update a task SE when you have to touch the task anyway.
Just from thinking briefly about it, it feels like the values would
be out of date for longer periods of time.
>> However, the only efficient way that I can currently think of, is a
>> hybrid model between the "full nesting" that is currently there, and
>> the "no nesting" you were describing above.
>>
>> It would flatten all task groups that do not actively contribute some
>> function, which would be all task groups purely for accounting
>> purposes and those for *unthrottled* CFS hierarchies (and those for
>> coscheduling that contain exactly one SE in a runqueue). The nesting
>> would still be kept for *throttled* hierarchies (and the coscheduling
>> stuff). (And if you wouldn't have mentioned a way to get rid of
>> nesting completely, I would have kept a single level of nesting for
>> accounting purposes as well.)
>>
>> This would allow us to lazily dequeue SEs that have run out of
>> bandwidth when we encounter them, and already enqueue them in the
>> nested task group (whose SE is not enqueued at the moment). That way,
>> it's still a O(1) operation to re-enable all tasks, once runtime is
>> available again. And O(1) to throttle a repeat offender.
>
> I suspect most systems will have a number of runnable tasks no larger
> than the number of CPUs most of the time.
>
> That makes "re-enable all the tasks" often equivalent to "re-enable one
> task".
>
> Can we handle the re-enabling (or waking up!) of one task almost as fast
> as we can without the cpu controller?
We could start by transparently special-casing the "just one SE in a
runqueue" case, where that single SE is enqueued directly into the next
parent, and everything falls back to nesting, the moment a second SE pops
up.
That might allow reaping the benefits for many cases without hurting
other cases. It would also be a more gradual conversion of the code.
Regards
Jan