Message-ID: <c111cc52-4133-4f57-b753-139a5ff2b395@arm.com>
Date: Mon, 5 Jan 2026 11:45:56 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
Mel Gorman <mgorman@...hsingularity.net>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
Aishwarya TCV <Aishwarya.TCV@....com>
Subject: Re: [REGRESSION] sched/fair: Reimplement NEXT_BUDDY to align with
EEVDF goals

On 02/01/2026 15:52, Dietmar Eggemann wrote:
> On 02.01.26 13:38, Ryan Roberts wrote:
>> Hi, I appreciate I sent this report just before Xmas, so most likely you
>> haven't had a chance to look, but I wanted to bring it back to the top of
>> your mailbox in case it was missed.
>>
>> Happy new year!
>>
>> Thanks,
>> Ryan
>>
>> On 22/12/2025 10:57, Ryan Roberts wrote:
>>> Hi Mel, Peter,
>>>
>>> We are building out a kernel performance regression monitoring lab at Arm, and
>>> I've noticed some fairly large performance regressions in real-world workloads,
>>> for which bisection has fingered this patch.
>>>
>>> We are looking at performance changes between v6.18 and v6.19-rc1, and by
>>> reverting this patch on top of v6.19-rc1 many regressions are resolved. (We plan
>>> to move the testing to linux-next over the next couple of quarters so hopefully
>>> we will be able to deliver this sort of news prior to merging in future).
>>>
>>> All testing is done on AWS Graviton3 (arm64) bare metal systems. (R)/(I) mean
>>> statistically significant regression/improvement, where "statistically
>>> significant" means the 95% confidence intervals do not overlap.
>
> You mentioned that you reverted patch 2/2 'sched/fair: Reimplement
> NEXT_BUDDY to align with EEVDF goals'.
>
> Does this mean NEXT_BUDDY is still enabled, i.e. you haven't reverted
> patch 1/2 'sched/fair: Enable scheduler feature NEXT_BUDDY' as well?
Yes, that's correct; patch 1 is still present. I could revert that as well
and rerun if that would be useful?
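
FWIW, since patch 1 just flips the feature default, my understanding is that
NEXT_BUDDY could also be toggled at runtime instead of reverting. A sketch,
assuming CONFIG_SCHED_DEBUG and a mounted debugfs (which, as noted below, I
can't easily get on these systems right now):

  # show the current feature flags (NEXT_BUDDY vs NO_NEXT_BUDDY)
  cat /sys/kernel/debug/sched/features
  # disable the buddy heuristic without rebuilding the kernel
  echo NO_NEXT_BUDDY > /sys/kernel/debug/sched/features
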
>
> ---
>
> Mel mentioned that he tested on a 2-socket machine. So I guess something
> like my Intel Xeon Silver 4314:
>
> cpu0 0 0
> domain0 SMT 00000001,00000001
> domain1 MC 55555555,55555555
> domain2 NUMA ffffffff,ffffffff
>
> node distances:
> node 0 1
> 0: 10 20
> 1: 20 10
>
> Whereas I assume the Graviton3 has 64 CPUs (cores) flat in a single MC
> domain? I guess topology has an influence on the benchmark numbers here as
> well.
I can't easily enable scheduler debugging right now (which I think is needed
to get this info directly?), but that's what I'd expect, yes. lscpu confirms
there is a single NUMA node, and the topology for cpu0 gives this, if it
helps:
/sys/devices/system/cpu/cpu0/topology$ grep "" -r .
./cluster_cpus:ffffffff,ffffffff
./cluster_cpus_list:0-63
./physical_package_id:0
./core_cpus_list:0
./core_siblings:ffffffff,ffffffff
./cluster_id:0
./core_siblings_list:0-63
./package_cpus:ffffffff,ffffffff
./package_cpus_list:0-63
./thread_siblings_list:0
./core_id:0
./core_cpus:00000000,00000001
./thread_siblings:00000000,00000001
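
If I do manage to get scheduler debugging enabled, I believe the domain
hierarchy should also be readable straight from debugfs. Untested on this
box, and assumes CONFIG_SCHED_DEBUG plus the verbose knob:

  # populate the per-cpu domain directories
  echo Y > /sys/kernel/debug/sched/verbose
  # dump name/flags etc. for each domain level of cpu0
  grep "" -r /sys/kernel/debug/sched/domains/cpu0/

I'd expect that to show a single MC-like level spanning 0-63, rather than the
SMT/MC/NUMA stack on your Xeon.
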
>
> ---
>
> There was also a lot of improvement on schbench (wakeup latency) on
> higher percentiles (>= 99.0th) on the 2-socket machine with those 2
> patches. I guess you haven't seen those on Grav3?
>
I don't have schbench results for 6.18, but I do have them for 6.19-rc1 and
for revert-next-buddy. The means have moved a bit, but there are only a
handful of cases that we consider statistically significant (marked
(R)egression / (I)mprovement):
+----------------------------+------------------------------------------------------+-------------+-------------------+
| Benchmark | Result Class | 6-19-0-rc1 | revert-next-buddy |
+============================+======================================================+=============+===================+
| schbench/thread-contention | -m 16 -t 1 -r 10 -s 1000, avg_rps (req/sec) | 1263.97 | -6.43% |
| | -m 16 -t 1 -r 10 -s 1000, req_latency_p99 (usec) | 15088.00 | -0.28% |
| | -m 16 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec) | 3.00 | 0.00% |
| | -m 16 -t 4 -r 10 -s 1000, avg_rps (req/sec) | 6433.07 | -10.99% |
| | -m 16 -t 4 -r 10 -s 1000, req_latency_p99 (usec) | 15088.00 | -0.39% |
| | -m 16 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec) | 4.17 | (R) -16.67% |
| | -m 16 -t 16 -r 10 -s 1000, avg_rps (req/sec) | 1458.33 | -1.57% |
| | -m 16 -t 16 -r 10 -s 1000, req_latency_p99 (usec) | 813056.00 | 15.46% |
| | -m 16 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec) | 14240.00 | -5.97% |
| | -m 16 -t 64 -r 10 -s 1000, avg_rps (req/sec) | 434.22 | 3.21% |
| | -m 16 -t 64 -r 10 -s 1000, req_latency_p99 (usec) | 11354112.00 | 2.92% |
| | -m 16 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec) | 63168.00 | -2.87% |
| | -m 32 -t 1 -r 10 -s 1000, avg_rps (req/sec) | 2828.63 | 2.58% |
| | -m 32 -t 1 -r 10 -s 1000, req_latency_p99 (usec) | 15088.00 | 0.00% |
| | -m 32 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec) | 3.00 | 0.00% |
| | -m 32 -t 4 -r 10 -s 1000, avg_rps (req/sec) | 3182.15 | 5.18% |
| | -m 32 -t 4 -r 10 -s 1000, req_latency_p99 (usec) | 116266.67 | 8.22% |
| | -m 32 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec) | 6186.67 | (R) -5.34% |
| | -m 32 -t 16 -r 10 -s 1000, avg_rps (req/sec) | 749.20 | 2.91% |
| | -m 32 -t 16 -r 10 -s 1000, req_latency_p99 (usec) | 3702784.00 | (I) 13.76% |
| | -m 32 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec) | 33514.67 | 0.24% |
| | -m 32 -t 64 -r 10 -s 1000, avg_rps (req/sec) | 392.23 | 3.42% |
| | -m 32 -t 64 -r 10 -s 1000, req_latency_p99 (usec) | 16695296.00 | (I) 5.82% |
| | -m 32 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec) | 120618.67 | -3.22% |
| | -m 64 -t 1 -r 10 -s 1000, avg_rps (req/sec) | 5951.15 | 5.02% |
| | -m 64 -t 1 -r 10 -s 1000, req_latency_p99 (usec) | 15157.33 | 0.42% |
| | -m 64 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec) | 3.67 | -4.35% |
| | -m 64 -t 4 -r 10 -s 1000, avg_rps (req/sec) | 1510.23 | -1.38% |
| | -m 64 -t 4 -r 10 -s 1000, req_latency_p99 (usec) | 802816.00 | 13.73% |
| | -m 64 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec) | 14890.67 | -10.44% |
| | -m 64 -t 16 -r 10 -s 1000, avg_rps (req/sec) | 458.87 | 4.60% |
| | -m 64 -t 16 -r 10 -s 1000, req_latency_p99 (usec) | 11348650.67 | (I) 2.67% |
| | -m 64 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec) | 63445.33 | (R) -5.48% |
| | -m 64 -t 64 -r 10 -s 1000, avg_rps (req/sec) | 541.33 | 2.65% |
| | -m 64 -t 64 -r 10 -s 1000, req_latency_p99 (usec) | 36743850.67 | (I) 10.95% |
| | -m 64 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec) | 211370.67 | -1.94% |
+----------------------------+------------------------------------------------------+-------------+-------------------+
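
(For reference, each "Result Class" above maps to a schbench invocation with
those literal flags, e.g.:

  schbench -m 16 -t 4 -r 10 -s 1000

i.e. 16 message threads with 4 workers each over a 10s runtime; the exact
meaning of -s varies between schbench versions. The revert-next-buddy column
is the % delta vs 6.19-rc1, normalised per metric so that positive is always
better, which is why the (R) marks sit on negative deltas.)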
I could get the results for 6.18 if useful, but I think what I have probably
shows enough of the picture: this patch has not had much impact on schbench
on this HW.

Thanks,
Ryan