Message-ID: <aLm6q5-4bZ78cM5P@localhost.localdomain>
Date: Thu, 4 Sep 2025 18:13:31 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: "Christoph Lameter (Ampere)" <cl@...two.org>
Cc: Valentin Schneider <vschneid@...hat.com>,
	Adam Li <adamli@...amperecomputing.com>, mingo@...hat.com,
	peterz@...radead.org, juri.lelli@...hat.com,
	vincent.guittot@...aro.org, dietmar.eggemann@....com,
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
	linux-kernel@...r.kernel.org, patches@...erecomputing.com
Subject: Re: [PATCH] sched/nohz: Fix NOHZ imbalance by adding options for ILB
 CPU

On Thu, Sep 04, 2025 at 08:34:34AM -0700, Christoph Lameter (Ampere) wrote:
> On Thu, 4 Sep 2025, Frederic Weisbecker wrote:
> 
> > On Wed, Aug 20, 2025 at 10:31:24AM -0700, Christoph Lameter (Ampere) wrote:
> > > On Wed, 20 Aug 2025, Valentin Schneider wrote:
> > >
> > > > My first question would be: is NOHZ_FULL really right for your workload?
> > >
> > > Yes, performance is improved. AI workloads are like HPC workloads in that
> > > they need to do compute and then rendezvous for data exchange.
> >
> > Ok, I was about to say that this is the first known (for me) usecase of
> > nohz_full that is about performance and doesn't strictly require a
> > low-latency guarantee. But...
> 
> For me it was always about both. Low latency is required for a high number
> of compute cycles in HPC apps. It is a requirement for high-performance
> parallelized compute.

Right, it's just that until now I was used to workloads that would even
be broken if the occasional jitter reached some threshold, which doesn't
appear to be your case.

> 
> > > The more frequently the rendezvous can be performed, the better the
> > > performance numbers will be.
> >
> > ...that is a low-latency requirement...for performance :-)
> 
> Yeah, that's why we want this in HPC/HFT and AI applications.

Ok.

> > That's an argument _not_ in favour of dynamic balancing such as ILB, even for
> > this usecase in nohz_full (all the other usecases of nohz_full I know really
> > want static affinity and no balancing at all).
> >
> > So I have to ask, what would be wrong with static affinities to these tasks?
> 
> Static affinities are great, but they keep the tick active and thus the
> rendezvous can be off for one or the other compute thread.

How do static affinities keep the tick active?
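
(To make sure we are talking about the same thing: by static affinity I
mean something along the lines of the rough, untested sketch below, where
each compute thread pins itself to its own nohz_full CPU from userspace;
the CPU number is of course just an illustration.)

	#define _GNU_SOURCE
	#include <sched.h>

	/* Pin the calling thread to a single CPU, e.g. a nohz_full one. */
	static int pin_self_to_cpu(int cpu)
	{
		cpu_set_t set;

		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		/* pid 0 means the calling thread */
		return sched_setaffinity(0, sizeof(set), &set);
	}

Once every compute thread is pinned like this, the load balancer has
nothing left to do for them, hence my question about the tick.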

> 
> > > nohz full has been reworked somewhat since the early days and works in a
> > > more general way today.
> >
> > Not sure about that. Although it was not initially intended to, it has
> > been very single-purpose since the early days, i.e. run a single task in
> > userspace without being disturbed.
> 
> The restrictions have been reduced from what I see in the code, and
> syscalls are possible without incurring a 2-second penalty of ticks.

Yes, the isolation has been improved overall, but the basic constraints remain.
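
(For context, what I have in mind when talking about a nohz_full setup is
roughly the usual boot-time isolation, with the CPU list below being just
an illustration:

	nohz_full=1-7 rcu_nocbs=1-7 irqaffinity=0

i.e. CPU 0 keeps the tick and the housekeeping work, while CPUs 1-7 can
stop their tick as long as they run at most one task.)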

> > > > Here AIUI you're relying on the scheduler load balancing to distribute work
> > > > to the NOHZ_FULL CPUs, so you're going to be penalized a lot by the
> > > > NOHZ_FULL context switch overheads. What's the point? Wouldn't you have
> > > > less overhead with just NOHZ_IDLE?
> > >
> > > The benchmarks show a regression of 10-20% if the tick is operational.
> >
> > Impressive!
> >
> > > The context switch overhead is negligible since the cpus are doing compute
> > > and not system calls.
> >
> > And not many syscalls, right?
> 
> Periodically the data needs to be saved but that can be done from special
> threads or after a large number of compute cycles is complete.

Got it!
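
(And the "special threads" approach makes sense to me. Here is a rough,
untested sketch of what I would expect: the writer thread is created with
its affinity already restricted to a housekeeping CPU, CPU 0 here as an
example, so the isolated CPUs never see the I/O.)

	#define _GNU_SOURCE
	#include <pthread.h>
	#include <sched.h>

	/* Start the periodic save worker confined to housekeeping CPU 0. */
	static int start_saver(pthread_t *tid, void *(*save_fn)(void *))
	{
		pthread_attr_t attr;
		cpu_set_t hk;
		int ret;

		CPU_ZERO(&hk);
		CPU_SET(0, &hk);

		pthread_attr_init(&attr);
		pthread_attr_setaffinity_np(&attr, sizeof(hk), &hk);
		ret = pthread_create(tid, &attr, save_fn, NULL);
		pthread_attr_destroy(&attr);

		return ret;
	}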

Thanks.

-- 
Frederic Weisbecker
SUSE Labs
