Message-ID: <aLmaJEU-WwVmVdYI@localhost.localdomain>
Date: Thu, 4 Sep 2025 15:54:44 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: "Christoph Lameter (Ampere)" <cl@...two.org>
Cc: Valentin Schneider <vschneid@...hat.com>,
Adam Li <adamli@...amperecomputing.com>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
linux-kernel@...r.kernel.org, patches@...erecomputing.com
Subject: Re: [PATCH] sched/nohz: Fix NOHZ imbalance by adding options for ILB CPU
On Wed, Aug 20, 2025 at 10:31:24AM -0700, Christoph Lameter (Ampere) wrote:
> On Wed, 20 Aug 2025, Valentin Schneider wrote:
>
> > My first question would be: is NOHZ_FULL really right for your workload?
>
> Yes performance is improved. AI workloads are like HPC workloads in that
> they need to do compute and then rendezvous for data exchange.
Ok, I was about to say that this is the first use case of nohz_full I know of
that is about performance and doesn't strictly require a low-latency
guarantee. But...
> Variations
> in the runtime due to timer ticks cause idle periods where the rendezvous
> cannot be completed because some cpus are delayed.
>
> The more frequently rendezvous can be performed, the better the performance
> numbers will be.
...that is a low-latency requirement... for performance :-)
That's an argument _not_ in favour of dynamic balancing such as ILB, even for
this use case in nohz_full (all the other use cases of nohz_full I know of
really want static affinity and no balancing at all).
So I have to ask, what would be wrong with giving these tasks static affinities?
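For illustration only, a minimal sketch of what static affinity could look like from userspace, assuming Linux and Python's os.sched_setaffinity(). The helper name pin_to_cpu is hypothetical and does not come from this thread; which CPUs to pick (for example, those listed in nohz_full=) is left to the administrator.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Pin `pid` (0 means the calling thread) to one CPU; return the new mask.

    Hypothetical helper for illustration: it refuses CPUs that are not in
    the task's current affinity mask instead of letting the syscall fail.
    """
    allowed = os.sched_getaffinity(pid)
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    # Once the mask holds a single CPU, the scheduler has nowhere else to
    # place this task, so load balancing cannot migrate it.
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

Each worker would call pin_to_cpu() once at startup with its own CPU number, so that every isolated CPU runs exactly one task and nothing depends on ILB to spread the work.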
>
> > It's mainly designed to be used with always-running userspace
> > tasks, generally affined to a CPU by the system administrator.
>
> nohz_full has been reworked somewhat since the early days and works in a
> more general way today.
Not sure about that. Although it was not initially intended to be, it has
been very single-purpose since the early days, i.e. run a single task in
userspace without being disturbed.
> > Here AIUI you're relying on the scheduler load balancing to distribute work
> > to the NOHZ_FULL CPUs, so you're going to be penalized a lot by the
> > NOHZ_FULL context switch overheads. What's the point? Wouldn't you have
> > less overhead with just NOHZ_IDLE?
>
> The benchmarks show a regression of 10-20% if the tick is operational.
Impressive!
> The context switch overhead is negligible since the cpus are doing compute
> and not system calls.
And not many syscalls, right?
Thanks.
--
Frederic Weisbecker
SUSE Labs