Date:   Fri, 28 Jun 2019 15:36:15 -0400
From:   Rik van Riel <riel@...riel.com>
To:     Dietmar Eggemann <dietmar.eggemann@....com>, peterz@...radead.org
Cc:     mingo@...hat.com, linux-kernel@...r.kernel.org, kernel-team@...com,
        morten.rasmussen@....com, tglx@...utronix.de,
        mgorman@...hsingularity.com, vincent.guittot@...aro.org
Subject: Re: [PATCH 8/8] sched,fair: flatten hierarchical runqueues

On Fri, 2019-06-28 at 12:26 +0200, Dietmar Eggemann wrote:
> On 6/12/19 9:32 PM, Rik van Riel wrote:
> > Flatten the hierarchical runqueues into just the per CPU rq.cfs
> > runqueue.
> > 
> > Iteration of the sched_entity hierarchy is rate limited to once
> > per jiffy per sched_entity, which is a smaller change than it
> > seems, because load average adjustments were already rate limited
> > to once per jiffy before this patch series.
> > 
> > This patch breaks CONFIG_CFS_BANDWIDTH. The plan for that is to
> > park tasks from throttled cgroups onto their cgroup runqueues,
> > and slowly (using the GENTLE_FAIR_SLEEPERS code) wake them back
> > up, in vruntime order, once the cgroup gets unthrottled, to
> > prevent thundering herd issues.
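
For illustration, the once-per-jiffy rate limit described above comes
down to a check of roughly this shape, presumably piggybacking on the
existing once-per-jiffy load average updates (sketch only; the field
name ->last_update_jiffies is a stand-in invented here, not taken from
the patch):

	static inline bool hierarchy_update_due(struct sched_entity *se)
	{
		/* Walk the cgroup hierarchy at most once per jiffy. */
		if (se->last_update_jiffies == jiffies)
			return false;	/* already walked this jiffy */
		se->last_update_jiffies = jiffies;
		return true;
	}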
> > 
> > Signed-off-by: Rik van Riel <riel@...riel.com>
> > ---
> >  include/linux/sched.h |   2 +
> >  kernel/sched/fair.c   | 478 +++++++++++++++++-------------------------
> >  kernel/sched/pelt.c   |   6 +-
> >  kernel/sched/pelt.h   |   2 +-
> >  kernel/sched/sched.h  |   2 +-
> >  5 files changed, 194 insertions(+), 296 deletions(-)
> > 
> 
> [...]
> 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> 
> [...]
> 
> > @@ -3491,7 +3544,7 @@ static inline bool update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
> >  	 * track group sched_entity load average for task_h_load calc in migration
> >  	 */
> >  	if (se->avg.last_update_time && !(flags & SKIP_AGE_LOAD))
> > -		updated = __update_load_avg_se(now, cfs_rq, se);
> > +		updated = __update_load_avg_se(now, cfs_rq, se, curr, curr);
> 
> I wonder if task migration is still working correctly.
> 
> migrate_task_rq_fair(p, ...) -> remove_entity_load_avg(&p->se) would
> use cfs_rq = se->cfs_rq (i.e. the root cfs_rq), so load (and util)
> will not propagate through the taskgroup hierarchy.
> 
> [...]

Good point. This should be the group cfs_rq, and then on the next
tick the load change will be propagated up.

Let me add that change in for v2 as well.
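
Roughly along these lines (untested sketch; group_cfs_rq_of() is a
stand-in name for whatever helper ends up returning the cgroup cfs_rq
the se is accounted on, and the body just mirrors today's
remove_entity_load_avg()):

	static void remove_entity_load_avg(struct sched_entity *se)
	{
		/*
		 * With the flat runqueue, se->cfs_rq is the root
		 * cfs_rq, so charge the removed load to the group
		 * cfs_rq instead; the next tick then propagates the
		 * change up the hierarchy.
		 */
		struct cfs_rq *cfs_rq = group_cfs_rq_of(se);
		unsigned long flags;

		sync_entity_load_avg(se);

		raw_spin_lock_irqsave(&cfs_rq->removed.lock, flags);
		++cfs_rq->removed.nr;
		cfs_rq->removed.load_avg	+= se->avg.load_avg;
		cfs_rq->removed.util_avg	+= se->avg.util_avg;
		cfs_rq->removed.runnable_sum	+= se->avg.load_sum;
		raw_spin_unlock_irqrestore(&cfs_rq->removed.lock, flags);
	}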

-- 
All Rights Reversed.
