Message-ID: <CAKfTPtA763zLxToVJpOCKc8TAgD3aZwpwhMZbbzrKiok+UHFaA@mail.gmail.com>
Date: Mon, 7 Oct 2019 17:27:10 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Rik van Riel <riel@...riel.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Phil Auld <pauld@...hat.com>,
Valentin Schneider <valentin.schneider@....com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Quentin Perret <quentin.perret@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <Morten.Rasmussen@....com>,
Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH v3 09/10] sched/fair: use load instead of runnable load in
wakeup path
On Mon, 7 Oct 2019 at 17:14, Rik van Riel <riel@...riel.com> wrote:
>
> On Thu, 2019-09-19 at 09:33 +0200, Vincent Guittot wrote:
> > Runnable load was introduced to take into account the case where
> > blocked load biases the wake up path, which may end up selecting an
> > overloaded CPU with a large number of runnable tasks instead of an
> > underutilized CPU with a huge blocked load.
> >
> > The wake up path now starts to look for idle CPUs before comparing
> > runnable load, and it's worth aligning the wake up path with
> > load_balance.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>
> On a single socket system, patches 9 & 10 have the
> result of driving a woken up task (when wake_wide is
> true) to the CPU core with the lowest blocked load,
> even when there is an idle core the task could run on
> right now.
>
> With the whole series applied, I see a 1-2% regression
> in CPU use due to that issue.
>
> With only patches 1-8 applied, I see a 1% improvement in
> CPU use for that same workload.
Thanks for testing.
Patches 8-9 have just replaced runnable load with blocked load and then
removed the duplicated metrics in find_idlest_group().
I'm preparing an additional patch that reworks find_idlest_group() to
behave similarly to find_busiest_group(). It gathers statistics, which
it already does, then classifies the groups and finally selects the
idlest one. This should fix the problem that you mentioned above, where
it selects the group with the lowest blocked load even though there are
idle CPUs in another group with a high blocked load.
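
The three steps described above (gather statistics, classify the groups,
select the idlest one) can be sketched as follows. This is only a
standalone user-space illustration, not the patch itself: struct
group_stats, enum group_type and its values, classify_group() and
pick_idlest_group() are hypothetical names, and the tie-break rules are
assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical classes, ordered from most to least attractive for a wakeup. */
enum group_type {
	GROUP_HAS_IDLE_CPUS = 0,	/* at least one CPU is idle */
	GROUP_FULLY_BUSY,		/* no idle CPU, no more tasks than CPUs */
	GROUP_OVERLOADED,		/* more runnable tasks than CPUs */
};

/* Hypothetical per-group statistics; the kernel's real structures differ. */
struct group_stats {
	unsigned int nr_cpus;
	unsigned int idle_cpus;
	unsigned int nr_running;
	unsigned long load;		/* includes blocked load */
};

/* Step 2: classify a group from the statistics gathered in step 1. */
static enum group_type classify_group(const struct group_stats *gs)
{
	if (gs->idle_cpus > 0)
		return GROUP_HAS_IDLE_CPUS;
	if (gs->nr_running > gs->nr_cpus)
		return GROUP_OVERLOADED;
	return GROUP_FULLY_BUSY;
}

/*
 * Step 3: select the idlest group. The class is compared first, so load
 * is only a tie-break between groups of the same class (assumed policy).
 * Returns the index of the chosen group, or -1 if there are no groups.
 */
static int pick_idlest_group(const struct group_stats *groups, size_t nr_groups)
{
	int idlest = -1;
	enum group_type idlest_type = GROUP_OVERLOADED;

	for (size_t i = 0; i < nr_groups; i++) {
		enum group_type type = classify_group(&groups[i]);
		int better;

		if (idlest < 0)
			better = 1;
		else if (type != idlest_type)
			better = type < idlest_type;
		else if (type == GROUP_HAS_IDLE_CPUS)
			better = groups[i].idle_cpus > groups[idlest].idle_cpus;
		else
			better = groups[i].load < groups[idlest].load;

		if (better) {
			idlest = (int)i;
			idlest_type = type;
		}
	}
	return idlest;
}
```

With one group that has the lowest load but no idle CPU, and another that
has a high blocked load but two idle CPUs, this sketch picks the second
group, which is the behaviour Rik's report asks for.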
>
> Given that it looks like select_idle_sibling and
> find_idlest_group_cpu do roughly the same thing, I
> wonder if it is enough to simply add an additional
> test to find_idlest_group to have it return the
> LLC sg, if it is called on the LLC sd on a single
> socket system.
That makes sense to me.
>
> That way find_idlest_group_cpu can still find an
> idle core like it does today.
>
> Does that seem like a reasonable thing?
That's worth testing
>
> I can run tests with that :)
>
> --
> All Rights Reversed.