Message-ID: <CAKfTPtCtR7Q6PxRRXGxfKnhyPTODBGs5cFRVL6A0nHx_GnpA9w@mail.gmail.com>
Date: Wed, 3 Sep 2025 16:14:04 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Valentin Schneider <vschneid@...hat.com>
Cc: "Christoph Lameter (Ampere)" <cl@...two.org>, Adam Li <adamli@...amperecomputing.com>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, frederic@...nel.org,
linux-kernel@...r.kernel.org, patches@...erecomputing.com
Subject: Re: [PATCH] sched/nohz: Fix NOHZ imbalance by adding options for ILB CPU
On Wed, 3 Sept 2025 at 14:35, Valentin Schneider <vschneid@...hat.com> wrote:
>
> On 28/08/25 08:44, Christoph Lameter (Ampere) wrote:
> > On Thu, 28 Aug 2025, Valentin Schneider wrote:
> >
> >> > Yes, binding the threads to CPU can work around the performance
> >> > issue caused by load imbalance. Should we document that 'nohz_full' may cause
> >> > the scheduler load balancing not working well and CPU affinity is preferred?
> >> >
> >>
> >> Yeah I guess we could highlight that.
> >
> > We need to make sure that the idle cpus are used when available and
> > needed. Otherwise the scheduler is buggy.
> >
> > Such a load balancing action means that there is a cpu that is running
> > multiple processes. Therefore the timer interrupt and the scheduler
> > processing is active on at least one cpu. We can therefore do something
> > about the situation.
> >
> > The scheduler needs to move one of the processes onto the idle cpu.
>
> AFAICT we have (at least) two options:
> 1) Trigger NOHZ balancing on a busy housekeeping CPU (what this patch does)
>
> This is somewhat against idle load balancing rules (only spend CPU time
> on that if there is no "genuine" work to run), but I guess from a CPU
> isolation PoV this can be tallied as just another housekeeping activity
In that case, this should only be done for the nohz_full case and not
for the others, because the ILB overhead is not negligible on a busy
CPU, and I don't see anything that enables 1) only for nohz_full.
>
> 2) Trigger NOHZ balancing on an idle NOHZ_FULL CPU
This patch also does 2) for the non-SMT case.
I wonder why this happens only for the non-SMT case? If the sibling is
used by another thread running with nohz_full, it already interferes
with this one.
But we might want to use is_core_idle() instead.
>
> That doesn't steal useful CPU time, but that also potentially causes
> interference, albeit only if racing with the NOHZ_FULL workload spawning
> (which shouldn't be the steady state).
>
> The more I think about it the more I'm leaning towards 1), but I'd like
> other folks' opinion.
>
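To make the two fallbacks discussed above concrete, here is a toy
userspace sketch. It is not kernel code and is not taken from the patch:
struct toy_cpu, toy_core_idle() and toy_find_ilb() are hypothetical names
made up for illustration, and the SMT check only loosely mirrors the idea
of testing the whole core rather than a single CPU. It assumes an idle
housekeeping CPU is always preferred, and that the two options differ
only in what is picked when none exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a CPU for ILB selection. */
struct toy_cpu {
	bool housekeeping;	/* true if the CPU is NOT in the nohz_full set */
	bool idle;		/* true if nothing is currently running on it */
	int sibling;		/* index of the SMT sibling, or -1 without SMT */
};

/* Which fallback to use when no idle housekeeping CPU exists. */
enum toy_fallback {
	TOY_BUSY_HK,		/* option 1: busy housekeeping CPU */
	TOY_IDLE_NOHZ_FULL,	/* option 2: idle nohz_full CPU */
};

/* A core counts as idle only if the CPU and its sibling (if any) are idle. */
static bool toy_core_idle(const struct toy_cpu *cpus, int cpu)
{
	int sib = cpus[cpu].sibling;

	return cpus[cpu].idle && (sib < 0 || cpus[sib].idle);
}

/* Return the index of the CPU chosen to run the idle load balance, or -1. */
static int toy_find_ilb(const struct toy_cpu *cpus, int n, enum toy_fallback fb)
{
	int i;

	assert(cpus != NULL && n >= 0);

	/* Preferred case: an idle housekeeping CPU. */
	for (i = 0; i < n; i++)
		if (cpus[i].housekeeping && cpus[i].idle)
			return i;

	if (fb == TOY_BUSY_HK) {
		/* Option 1: accept a busy housekeeping CPU. */
		for (i = 0; i < n; i++)
			if (cpus[i].housekeeping)
				return i;
	} else {
		/* Option 2: idle nohz_full CPU whose whole core is idle. */
		for (i = 0; i < n; i++)
			if (!cpus[i].housekeeping && toy_core_idle(cpus, i))
				return i;
	}

	return -1;
}
```

With one busy housekeeping CPU and several nohz_full CPUs, TOY_BUSY_HK
returns the housekeeping CPU, while TOY_IDLE_NOHZ_FULL skips any idle
nohz_full CPU whose sibling is busy and returns the first one on a fully
idle core.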