Message-ID: <CAKfTPtC157Z2vsnW3MLqKcMBYB-0D255rYr1Y-vD5xYDLBNoVQ@mail.gmail.com>
Date: Fri, 25 Jun 2021 10:50:12 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Qais Yousef <qais.yousef@....com>,
Joel Fernandes <joel@...lfernandes.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Paul McKenney <paulmck@...nel.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Dietmar Eggeman <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Mel Gorman <mgorman@...e.de>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
"Uladzislau Rezki (Sony)" <urezki@...il.com>,
Neeraj upadhyay <neeraj.iitr10@...il.com>,
Aubrey Li <aubrey.li@...ux.intel.com>
Subject: Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages()
for NOHZ
On Fri, 18 Jun 2021 at 18:14, Tim Chen <tim.c.chen@...ux.intel.com> wrote:
>
>
>
> On 6/18/21 3:28 AM, Vincent Guittot wrote:
>
> >>
> >> The current logic is when a CPU becomes idle, next_balance occur very
> >> shortly (usually in the next jiffie) as get_sd_balance_interval returns
> >> the next_balance in the next jiffie if the CPU is idle. However, in
> >> reality, I saw most CPUs are 95% busy on average for my workload and
> >> a task will wake up on an idle CPU shortly. So having frequent idle
> >> balancing towards shortly idle CPUs is counter productive and simply
> >> increase overhead and does not improve performance.
> >
> > Just to make sure that I understand your problem correctly: Your problem is:
> > - that we have an ilb happening on the idle CPU and consume cycle
>
> That's right. The cycles are consumed heavily in update_blocked_averages()
> when cgroup is enabled.
But those cycles are normally consumed on an idle CPU, and the ILB checks
need_resched() before running load balance for the next idle CPU.
Does that mean your problem comes from update_blocked_averages()
spending a long time under rq_lock_irqsave and increasing the wakeup
latency of your short-running tasks?
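
To make the need_resched() point concrete, here is a simplified user-space
sketch. It is not the kernel code: struct sim_cpu, sim_idle_balance() and the
wake_after parameter are hypothetical names invented for illustration. It
assumes the wakeup check happens only between idle CPUs, so the work done for
one CPU (where the rq lock would be held) cannot be cut short by a wakeup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct sim_cpu {
	bool idle;	/* CPU is a candidate for idle load balancing */
	bool balanced;	/* set once this sketch has "balanced" the CPU */
};

/*
 * Walk the idle CPUs on behalf of the CPU running the ILB.
 *
 * wake_after simulates need_resched(): once that many CPUs have been
 * handled, a task is assumed to have woken up on the ILB CPU and the
 * walk stops. A negative value means no wakeup ever arrives.
 *
 * Returns the number of CPUs that were balanced.
 */
size_t sim_idle_balance(struct sim_cpu *cpus, size_t nr, int wake_after)
{
	size_t done = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!cpus[i].idle)
			continue;

		/* Checked once per CPU, before starting work on it. */
		if (wake_after >= 0 && done >= (size_t)wake_after)
			break;

		/* The per-CPU work runs to completion once started. */
		cpus[i].balanced = true;
		done++;
	}

	return done;
}
```

With four idle CPUs and wake_after set to 2, the walk handles two CPUs and
leaves the rest untouched, so any latency seen by the waking task comes from
the per-CPU step that was already in progress, not from the remaining CPUs.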
>
> > - or that the ilb will pull a task on an idle CPU on which a task will
> > shortly wakeup which ends to 2 tasks competing for the same CPU.
> >
>
> Because for the OLTP workload I'm looking at, we have tasks that sleep
> for a short while and wake again very shortly (i.e. the CPU actually
> is ~95% busy on average), pulling tasks to such a CPU is really not
> helpful to improve overall CPU utilization in the system. So my
> intuition is for such almost fully busy CPU, we should defer load
> balancing to it (see prototype patch 3).
Note that this is the opposite of what you said earlier:
"
Though in our test environment, sysctl_sched_migration_cost was kept
much lower (25000) compared to the default (500000), to encourage
migrations to idle cpu and reduce latency.
"
But it will be quite hard to find a value that fits everybody's
requirements, and some will have use cases for which they want to pull
tasks even if the CPU is 95% busy. You can have 2ms of idle time while
still having a utilization above 95%, and an ILB inside a core or at the
LLC level is fairly cheap and would take advantage of those 2ms.
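
As a rough illustration of the 2ms figure, the sketch below treats
utilization as a plain busy/period ratio. That is an assumption made only for
the arithmetic: the scheduler's utilization signal is a decayed average, not
a simple duty cycle, and busy_permille() is a hypothetical helper, not a
kernel function.

```c
#include <stdint.h>

/*
 * Busy share of a period, in per-mille, for a CPU that is idle for
 * idle_us microseconds out of every period_us microseconds.
 * Assumes period_us > 0 and idle_us <= period_us.
 */
uint32_t busy_permille(uint32_t idle_us, uint32_t period_us)
{
	uint64_t busy_us = (uint64_t)period_us - idle_us;

	return (uint32_t)(busy_us * 1000 / period_us);
}
```

Under this simplification, 2ms of idle time in a 40ms period is exactly 95%
busy (busy_permille(2000, 40000) is 950), and any longer period with the same
2ms idle window is above 95% (busy_permille(2000, 50000) is 960).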
>
> Tim
>
>
>
>