Message-ID: <YdxeoRUeZhl2D+dK@FVFF7649Q05P>
Date: Mon, 10 Jan 2022 16:29:19 +0000
From: Vincent Donnefort <vincent.donnefort@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: peterz@...radead.org, mingo@...hat.com,
linux-kernel@...r.kernel.org, dietmar.eggemann@....com,
Valentin.Schneider@....com, Morten.Rasmussen@....com,
qperret@...gle.com
Subject: Re: [PATCH 2/3] sched/fair: Fix newidle_balance() for overutilized
systems
[...]
>
> > can spuriously maintain overutilized for a long period of time.
> >
> > We then need newidle_balance() to proceed with balancing if the system is
> > overutilized.
>
> Always triggering a costly newidle_balance when you are already
> overutilized for the sole purpose of clearing overutilized seems to be
> a bit overkill.

But the only cases where newidle_balance() would now run when it previously
did not are those where overutilized is set but overload is not. That is either
a transient state, for which we do not anticipate more than one stat update, or
the situation where one of the biggest CPUs is overutilized while having
nr_running < 2.
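
To make the claim above concrete, here is a minimal sketch that models only the
two root-domain flags in the early-exit test, before and after the patch. The
function names are made up for illustration, and the
avg_idle < max_newidle_lb_cost clause is deliberately left out, so this is a
simplified model and not the kernel code itself.

```c
#include <stdbool.h>

/*
 * Simplified model of the early-exit test in newidle_balance().
 * Hypothetical helper names; the max_newidle_lb_cost clause is omitted.
 */

/* Before the patch: skip balancing whenever overload is clear. */
static bool skips_balance_before(bool overload, bool overutilized)
{
	(void)overutilized; /* not consulted by the old condition */
	return !overload;
}

/* After the patch: skip only when both flags are clear. */
static bool skips_balance_after(bool overload, bool overutilized)
{
	return !overload && !overutilized;
}

/* Count the flag combinations where the patch turns a skip into a balance. */
static int count_new_balance_cases(void)
{
	int n = 0;

	for (int ol = 0; ol <= 1; ol++)
		for (int ou = 0; ou <= 1; ou++)
			if (skips_balance_before(ol, ou) &&
			    !skips_balance_after(ol, ou))
				n++;
	return n;
}
```

Of the four flag combinations, only overload clear with overutilized set changes
behaviour, which is the case discussed above.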

It can indeed add some costly calls to newidle_balance(), but they will not be
plentiful, especially with the other patch from this series:

  "sched/fair: Do not raise overutilized for idle CPUs"

>
> Furthermore, nothing prevents us to abort newidle_balance before
> reaching the root domain

should_we_balance() always returns true in the newidle case, so I suppose you
are referring to max_newidle_lb_cost?

>
> So this doesn't seem like the good way to proceed

What are our other options?

Resolving it in the nohz balancer would require changing should_we_balance().

I also tried only not raising overutilized when the CPU is idle, but this is
not a solution either: when a task migration is pending, you can end up with a
!idle CPU that has nr_running < 2, so once again overutilized is set and
overload is not.
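
For illustration, here is a rough sketch of how a single CPU can raise
overutilized without raising overload. The struct and helper names are
hypothetical, and the margin (utilization fits when util * 1280 < capacity *
1024, i.e. roughly 20% headroom) is an assumption about how fits_capacity()
behaves in this kernel version, so treat it as a model only.

```c
#include <stdbool.h>

/* Hypothetical per-CPU snapshot; field names are illustrative only. */
struct cpu_stat {
	unsigned int nr_running;  /* runnable tasks on the CPU */
	unsigned long util;       /* utilization estimate */
	unsigned long capacity;   /* CPU capacity, same scale as util */
};

/* Assumed ~20% headroom: util fits if util * 1280 < capacity * 1024. */
static bool util_fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* Overload is modelled as "more than one runnable task". */
static bool raises_overload(const struct cpu_stat *s)
{
	return s->nr_running > 1;
}

/* Overutilized is modelled as "utilization does not fit the capacity". */
static bool raises_overutilized(const struct cpu_stat *s)
{
	return !util_fits_capacity(s->util, s->capacity);
}
```

With nr_running = 1, util = 900 and capacity = 1024, this model raises
overutilized but not overload, which is the state described above.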
>
> >
> > Fixes: 2802bf3cd936 ("sched/fair: Add over-utilization/tipping point indicator")
> > Signed-off-by: Vincent Donnefort <vincent.donnefort@....com>
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index e2f6fa14e5e7..51f6f55abb37 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -10849,7 +10849,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
> >  	rcu_read_lock();
> >  	sd = rcu_dereference_check_sched_domain(this_rq->sd);
> >
> > -	if (!READ_ONCE(this_rq->rd->overload) ||
> > +	if ((!READ_ONCE(this_rq->rd->overload) &&
> > +	     !READ_ONCE(this_rq->rd->overutilized)) ||
> >  	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost)) {
> >
> >  		if (sd)
> > --
> > 2.25.1
> >