Message-ID: <xhsmhfrdblnp3.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Thu, 28 Aug 2025 12:56:08 +0200
From: Valentin Schneider <vschneid@...hat.com>
To: Adam Li <adamli@...amperecomputing.com>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, cl@...ux.com, frederic@...nel.org,
linux-kernel@...r.kernel.org, patches@...erecomputing.com
Subject: Re: [PATCH] sched/nohz: Fix NOHZ imbalance by adding options for
ILB CPU
On 21/08/25 19:18, Adam Li wrote:
> On 8/20/2025 7:46 PM, Valentin Schneider wrote:
>> Right. So other than the NO_HZ_FULL vs NO_HZ_IDLE considerations above, you
>> could manually affine the threads of the workload. Depending on how much
>> control you have over how many threads it spawns, you could either pin one
>> thread per CPU, or just spawn the workload into a cpuset covering the
>> NO_HZ_FULL CPUs.
>>
>
> Yes, binding the threads to CPUs can work around the performance
> issue caused by load imbalance. Should we document that 'nohz_full' may
> prevent scheduler load balancing from working well, and that CPU affinity
> is preferred?
>
Yeah I guess we could highlight that.

I think it's kind of a gray area; technically we could change load
balancing to make NO_HZ_FULL CPUs better at pulling tasks, but that only
works up to the point where, if you have N NO_HZ_FULL CPUs, you have pulled
N tasks. So there is an underlying assumption that the workload threading
matches your NO_HZ_FULL topology; and if that's the case, you might as well
affine the tasks by hand and avoid any surprises.

Put another way: yes, we can probably make load balancing better
for NO_HZ_FULL CPUs, but that only really works if we have one task to pull
per NO_HZ_FULL CPU, in which case manual affinity binding works just as
well, and I prefer that approach since it means we don't have to add
NO_HZ_FULL-specific load balancing logic which may end up interfering with
NO_HZ_FULL itself. At least, that is my opinion.
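
For illustration, here is a minimal userspace sketch of the "affine the
tasks by hand" approach: one thread per NO_HZ_FULL CPU, refusing to
oversubscribe. It is not taken from any patch in this thread. The helper
names (parse_cpu_list, plan_pinning, apply_pinning) are made up for the
example, the parser only handles plain "a-b,c" lists (no stride syntax),
and it assumes a Linux system where os.sched_setaffinity() is available and
is handed thread IDs.

```python
import os


def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "1-3,8" into a sorted list."""
    cpus = set()
    for chunk in text.strip().split(","):
        chunk = chunk.strip()
        # An empty or "(null)" list means no CPUs are listed.
        if not chunk or chunk == "(null)":
            continue
        if "-" in chunk:
            lo, hi = chunk.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


def plan_pinning(thread_ids, cpus):
    """Give each thread its own CPU; fail if there are more threads than CPUs."""
    if len(thread_ids) > len(cpus):
        raise ValueError("more threads than NO_HZ_FULL CPUs")
    return dict(zip(thread_ids, cpus))


def apply_pinning(plan):
    """Restrict each thread to its single assigned CPU (Linux only)."""
    for tid, cpu in plan.items():
        os.sched_setaffinity(tid, {cpu})


if __name__ == "__main__":
    # Hypothetical input: the CPU list passed as nohz_full= and three
    # made-up thread IDs. Only the plan is printed; nothing is pinned.
    nohz_full_cpus = parse_cpu_list("1-3")
    print(plan_pinning([1001, 1002, 1003], nohz_full_cpus))
```

On a running system the CPU list could be read from
/sys/devices/system/cpu/nohz_full instead of being hard-coded, and the
ValueError is the "workload threading does not match the NO_HZ_FULL
topology" case mentioned above.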
>> Having the scheduler do the balancing is a bit of a precarious
>> situation. Your single housekeeping CPU is pretty much going to be always
>> running things, does it make sense to have it run the NOHZ idle balance
>> when there are available idle NOHZ_FULL CPUs? And in the same sense, does
>> it make sense to disturb an idle NOHZ_FULL CPU to get it to spread load on
>> other NOHZ_FULL CPUs? Admins that manually affine their threads will
>> probably say no.
>>
>
> I think when the NOHZ_FULL CPU is added to nohz.idle_cpus_mask and
> its tick is stopped, the CPU is 'very' idle. We can safely assign some work to it.
>
>> 9b019acb72e4 ("sched/nohz: Run NOHZ idle load balancer on HK_FLAG_MISC CPUs")
>> also mentions SMT being an issue.
>>
>
> From the commit message of 9b019acb72e4:
> "The problem was observed with increased jitter on an application
> running on CPU0, caused by NOHZ idle load balancing being run on
> CPU1 (an SMT sibling)."
>
> Can we say that, with *no* SMT, it is safe to run NOHZ idle load balancing
> on a CPU in nohz.idle_cpus_mask? My patch checks '!sched_smt_active()' when
> searching from nohz.idle_cpus_mask.
>
I suppose we could still make this work for SMT with e.g. is_core_idle(),
but see my point above.
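
To make the SMT idea concrete, here is a toy model of the selection, not
kernel code: pick an idle CPU for the idle load balance only if all of its
SMT siblings are idle too. The function names and the dict-based topology
are invented for the example, and "idle" is simplified to membership in a
set rather than whatever idle_cpu() and nohz.idle_cpus_mask track.

```python
def core_is_idle(cpu, siblings, idle_cpus):
    """True if every SMT sibling of `cpu`, other than `cpu` itself, is idle."""
    return all(s in idle_cpus for s in siblings.get(cpu, ()) if s != cpu)


def pick_ilb_cpu(idle_cpus, siblings):
    """Return the lowest-numbered idle CPU whose whole core is idle, or None."""
    for cpu in sorted(idle_cpus):
        if core_is_idle(cpu, siblings, idle_cpus):
            return cpu
    return None


if __name__ == "__main__":
    # Hypothetical 2-core, 4-thread topology: CPUs 0/1 and 2/3 are siblings.
    topology = {0: [0, 1], 1: [0, 1], 2: [2, 3], 3: [2, 3]}
    # CPU 0 is busy, so its idle sibling CPU 1 is skipped.
    print(pick_ilb_cpu({1, 2, 3}, topology))
```

Without SMT the sibling map is empty and any idle CPU qualifies, which is
the case the '!sched_smt_active()' check covers.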
> Thanks,
> -adam