[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3c1e6b45-79b8-463a-8c41-565d9ed8f76d@redhat.com>
Date: Wed, 29 Oct 2025 22:56:06 -0400
From: Waiman Long <llong@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, Gabriele Monaco <gmonaco@...hat.com>,
 Frederic Weisbecker <frederic@...nel.org>,
 Anna-Maria Behnsen <anna-maria@...utronix.de>
Subject: Re: [RESEND PATCH v13 0/9] timers: Exclude isolated cpus from timer
 migration
On 10/20/25 7:27 AM, Gabriele Monaco wrote:
> The timer migration mechanism allows active CPUs to pull timers from
> idle ones to improve the overall idle time. This is however undesired
> when CPU intensive workloads run on isolated cores, as the algorithm
> would move the timers from housekeeping to isolated cores, negatively
> affecting the isolation.
>
> Exclude isolated cores from the timer migration algorithm, extend the
> concept of unavailable cores, currently used for offline ones, to
> isolated ones:
> * A core is unavailable if isolated or offline;
> * A core is available if non isolated and online;
>
> A core is considered unavailable as isolated if it belongs to:
> * the isolcpus (domain) list
> * an isolated cpuset
> Except if it is:
> * in the nohz_full list (already idle for the hierarchy)
> * the nohz timekeeper core (must be available to handle global timers)
>
> CPUs are added to the hierarchy during late boot, excluding isolated
> ones, the hierarchy is also adapted when the cpuset isolation changes.
>
> Due to how the timer migration algorithm works, any CPU part of the
> hierarchy can have their global timers pulled by remote CPUs and have to
> pull remote timers, only skipping pulling remote timers would break the
> logic.
> For this reason, prevent isolated CPUs from pulling remote global
> timers, but also the other way around: any global timer started on an
> isolated CPU will run there. This does not break the concept of
> isolation (global timers don't come from outside the CPU) and, if
> considered inappropriate, can usually be mitigated with other isolation
> techniques (e.g. IRQ pinning).
>
> This effect was noticed on a 128 cores machine running oslat on the
> isolated cores (1-31,33-63,65-95,97-127). The tool monopolises CPUs,
> and the CPU with lowest count in a timer migration hierarchy (here 1
> and 65) appears as always active and continuously pulls global timers,
> from the housekeeping CPUs. This ends up moving driver work (e.g.
> delayed work) to isolated CPUs and causes latency spikes:
>
> before the change:
>
>   # oslat -c 1-31,33-63,65-95,97-127 -D 62s
>   ...
>    Maximum:     1203 10 3 4 ... 5 (us)
>
> after the change:
>
>   # oslat -c 1-31,33-63,65-95,97-127 -D 62s
>   ...
>    Maximum:      10 4 3 4 3 ... 5 (us)
>
> The same behaviour was observed on a machine with as few as 20 cores /
> 40 threads with isocpus set to: 1-9,11-39 with rtla-osnoise-top.
>
> The first 5 patches are preparatory work to change the concept of
> online/offline to available/unavailable, keep track of those in a
> separate cpumask cleanup the setting/clearing functions and change a
> function name in cpuset code.
>
> Patch 6 and 7 adapt isolation and cpuset to prevent domain isolated and
> nohz_full from covering all CPUs not leaving any housekeeping one. This
> can lead to problems with the changes introduced in this series because
> no CPU would remain to handle global timers.
>
> Patch 9 extends the unavailable status to domain isolated CPUs, which
> is the main contribution of the series.
>
> This series is equivalent to v13 but rebased on v6.18-rc2.
Thomas,
This patch series have undergone multiple round of reviews. Do you think 
it is good enough to be merged into tip?
It does contain some cpuset code, but most of the changes are in the 
timer code. So I think it is better to go through the tip tree. It does 
have some minor conflicts with the current for-6.19 branch of the cgroup 
tree, but it can be easily resolved during merge.
What do you think?
Thanks,
Longman
Powered by blists - more mailing lists
 
