Message-ID: <CAKfTPtCtR7Q6PxRRXGxfKnhyPTODBGs5cFRVL6A0nHx_GnpA9w@mail.gmail.com>
Date: Wed, 3 Sep 2025 16:14:04 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Valentin Schneider <vschneid@...hat.com>
Cc: "Christoph Lameter (Ampere)" <cl@...two.org>, Adam Li <adamli@...amperecomputing.com>, mingo@...hat.com, 
	peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com, 
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, frederic@...nel.org, 
	linux-kernel@...r.kernel.org, patches@...erecomputing.com
Subject: Re: [PATCH] sched/nohz: Fix NOHZ imbalance by adding options for ILB CPU

On Wed, 3 Sept 2025 at 14:35, Valentin Schneider <vschneid@...hat.com> wrote:
>
> On 28/08/25 08:44, Christoph Lameter (Ampere) wrote:
> > On Thu, 28 Aug 2025, Valentin Schneider wrote:
> >
> >> > Yes, binding the threads to CPU can work around the performance
> >> > issue caused by load imbalance. Should we document that 'nohz_full' may
> >> > keep scheduler load balancing from working well and that CPU affinity is preferred?
> >> >
> >>
> >> Yeah I guess we could highlight that.
> >
> > We need to make sure that the idle cpus are used when available and
> > needed. Otherwise the scheduler is buggy.
> >
> > Such a load balancing action means that there is a cpu that is running
> > multiple processes. Therefore the timer interrupt and the scheduler
> > processing is active on at least one cpu. We can therefore do something
> > about the situation.
> >
> > The scheduler needs to move one of the processes onto the idle cpu.
>
> AFAICT we have (at least) two options:
> 1) Trigger NOHZ balancing on a busy housekeeping CPU (what this patch does)
>
>    This is somewhat against idle load balancing rules (only spend CPU time
>    on that if there is no "genuine" work to run), but I guess from a CPU
>    isolation PoV this can be tallied as just another housekeeping activity

In this case, this should only be done for the full nohz case and not
for other cases, because the ILB overhead is not negligible on a busy
CPU, and I don't see anything that enables 1) only for full nohz.
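
To make the gating concrete, here is a minimal userspace sketch of the
decision being asked for. It is not code from the patch: every name
below (toy_ilb_choice, toy_pick_ilb and its parameters) is hypothetical,
and the nohz_full_enabled flag only models what tick_nohz_full_enabled()
would report in the kernel.

```c
#include <stdbool.h>

/* Hypothetical outcome of choosing where to run the idle load balance. */
enum toy_ilb_choice {
	TOY_ILB_NONE,     /* do not kick anyone */
	TOY_ILB_IDLE_HK,  /* idle housekeeping CPU: the usual ILB target */
	TOY_ILB_BUSY_HK,  /* busy housekeeping CPU: option 1) */
};

/*
 * Toy model: falling back to a busy housekeeping CPU is only allowed
 * when full nohz is in use, so other configurations never pay the ILB
 * overhead on a busy CPU.
 */
static enum toy_ilb_choice toy_pick_ilb(bool nohz_full_enabled,
					bool have_idle_hk_cpu)
{
	if (have_idle_hk_cpu)
		return TOY_ILB_IDLE_HK;

	if (nohz_full_enabled)
		return TOY_ILB_BUSY_HK;

	return TOY_ILB_NONE;
}
```

With this shape, a system without nohz_full and without an idle
housekeeping CPU gets TOY_ILB_NONE, i.e. the behaviour stays as it is
today.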

>
> 2) Trigger NOHZ balancing on an idle NOHZ_FULL CPU

This patch also does 2) for the no-SMT case.

I wonder why this happens only for the no-SMT case? If the sibling is
used by another thread with full nohz, it already interferes with this
one.

But we might want to use is_core_idle() instead.
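
As a rough illustration of what a core-idle check would mean here, the
following is a self-contained userspace model, not kernel code. The
topology (NR_CPUS, SMT_WIDTH, siblings numbered contiguously) and all
toy_* names are assumptions made for the example; it only mirrors the
idea of is_core_idle(), i.e. that every SMT sibling of the candidate
CPU is idle.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   8	/* hypothetical toy topology */
#define SMT_WIDTH 2	/* hypothetical: two hardware threads per core */

/* Toy per-CPU state; not a kernel structure. */
struct toy_cpu {
	bool idle;
};

/* True if every SMT sibling of @cpu (excluding @cpu itself) is idle. */
static bool toy_core_idle(const struct toy_cpu *cpus, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	int first = cpu - (cpu % SMT_WIDTH);

	for (int sibling = first; sibling < first + SMT_WIDTH; sibling++) {
		if (sibling == cpu)
			continue;
		if (!cpus[sibling].idle)
			return false;
	}
	return true;
}

/*
 * Pick the first idle CPU whose siblings are all idle too, so that
 * running the balance there does not disturb a busy sibling.
 * Returns -1 if there is no such CPU.
 */
static int toy_find_ilb_cpu(const struct toy_cpu *cpus)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpus[cpu].idle && toy_core_idle(cpus, cpu))
			return cpu;
	}
	return -1;
}
```

With SMT_WIDTH set to 1 the sibling loop has nothing to check, so the
same test covers both the SMT and the no-SMT case instead of special
casing the latter.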

>
>    That doesn't steal useful CPU time, but that also potentially causes
>    interference, albeit only if racing with the NOHZ_FULL workload spawning
>    (which shouldn't be the steady state).
>
> The more I think about it the more I'm leaning towards 1), but I'd like
> other folks' opinion.
>
