Message-ID: <CAKfTPtBJb-f+LVLuQy4dbQrjGOiGzMe_3+wcwBiFgJttameCaQ@mail.gmail.com>
Date: Wed, 28 Aug 2019 17:02:44 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Rik van Riel <riel@...riel.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Kernel Team <kernel-team@...com>, Paul Turner <pjt@...gle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Morten Rasmussen <morten.rasmussen@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [PATCH 03/15] sched,fair: redefine runnable_load_avg as the sum of task_h_load

On Wed, 28 Aug 2019 at 16:48, Rik van Riel <riel@...riel.com> wrote:
>
> On Wed, 2019-08-28 at 15:50 +0200, Vincent Guittot wrote:
> > Hi Rik,
> >
> > On Thu, 22 Aug 2019 at 04:18, Rik van Riel <riel@...riel.com> wrote:
> > > The runnable_load magic is used to quickly propagate information
> > > about
> > > runnable tasks up the hierarchy of runqueues. The runnable_load_avg
> > > is
> > > mostly used for the load balancing code, which only examines the
> > > value at
> > > the root cfs_rq.
> > >
> > > Redefine the root cfs_rq runnable_load_avg to be the sum of
> > > task_h_loads
> > > of the runnable tasks. This works because the hierarchical
> > > runnable_load of
> > > a task is already equal to the task_se_h_load today. This provides
> > > enough
> > > information to the load balancer.
> > >
> > > The runnable_load_avg of the cgroup cfs_rqs does not appear to be
> > > used for anything, so don't bother calculating those.
> > >
> > > This removes one of the things that the code currently traverses
> > > the
> > > cgroup hierarchy for, and getting rid of it brings us one step
> > > closer
> > > to a flat runqueue for the CPU controller.
> >
> > I like your proposal but just wanted to clarify one thing with this
> > patch.
> > Although you removed the computation of runnable_load_avg of the
> > cgroup cfs_rq, we are still traversing the hierarchy to update the
> > root cfs_rq runnable_load_avg because we are traversing the hierarchy
> > for computing the task_h_loads.
>
> The task_h_load hierarchy traversal in update_cfs_rq_h_load
> is rate limited to once a jiffy, though. Rate limiting the
Ah yes. I forgot that it is jiffies, not clock_task, that is used
to rate limit the update.
> hierarchy traversal significantly reduces overhead.
>
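For readers following along, here is a minimal userspace sketch of the
rate limiting discussed above: the hierarchy walk is skipped when the
cached value was already refreshed during the current tick. Everything
named toy_* is hypothetical and only stands in for jiffies, the cfs_rq
fields and update_cfs_rq_h_load; it is an illustration under those
assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's jiffies tick counter. */
static unsigned long toy_jiffies = 1;

/* Counts how many times the hierarchy is actually walked. */
static unsigned long toy_walks;

/* Toy runqueue; field names are illustrative, not the kernel layout. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;        /* NULL for the root runqueue */
	unsigned long load_avg;           /* load of this runqueue */
	unsigned long se_load_avg;        /* load of this group's entity in its parent */
	unsigned long h_load;             /* cached hierarchical load */
	unsigned long last_h_load_update; /* tick of the last refresh */
};

/* Recompute h_load from the root down to cfs_rq (the expensive part). */
static void toy_walk(struct toy_cfs_rq *cfs_rq)
{
	if (!cfs_rq->parent) {
		cfs_rq->h_load = cfs_rq->load_avg;
	} else {
		toy_walk(cfs_rq->parent);
		/* Scale the parent's h_load by this group's share of it. */
		cfs_rq->h_load = cfs_rq->parent->h_load * cfs_rq->se_load_avg /
				 (cfs_rq->parent->load_avg + 1);
	}
	cfs_rq->last_h_load_update = toy_jiffies;
}

/* Rate-limited entry point: at most one walk per tick per runqueue. */
static void toy_update_h_load(struct toy_cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);

	if (cfs_rq->last_h_load_update == toy_jiffies)
		return;	/* already refreshed during this tick */

	toy_walks++;
	toy_walk(cfs_rq);
}
```

Calling toy_update_h_load() repeatedly within one tick walks the
hierarchy only once; the next walk happens after toy_jiffies advances.
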
> > That being said, if we manage to remove the need for
> > runnable_load_avg, we will skip this traversal completely. I have a
> > proposal to remove it from the load balance and wake up paths, but I
> > haven't looked at the NUMA stats, which also use it.
>
> --
> All Rights Reversed.