Message-ID: <d5cb15bd-1096-45a8-9da6-a37ff490714c@linux.ibm.com>
Date: Wed, 9 Jul 2025 22:16:14 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, clm@...a.com,
Madhavan Srinivasan <maddy@...ux.ibm.com>
Subject: Re: [PATCH v2 00/12] sched: Address schbench regression
On 7/9/25 00:32, Peter Zijlstra wrote:
> On Mon, Jul 07, 2025 at 11:49:17PM +0530, Shrikanth Hegde wrote:
>
>> Git bisect points to
>> # first bad commit: [dc968ba0544889883d0912360dd72d90f674c140] sched: Add ttwu_queue support for delayed tasks
>
> Moo.. Are IPIs particularly expensive on your platform?
>
> The 5 cores makes me think this is a partition of sorts, but IIRC the
> power LPAR stuff was fixed physical, so routing interrupts shouldn't be
> much more expensive vs native hardware.
>
Yes, we call it a dedicated LPAR. (The hypervisor optimises such that the overhead is minimal;
I think that is true for interrupts too.)
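
As I understand it, the bisected commit routes more wakeups through the ttwu_queue
wake-list path, where the waker pushes the task onto the target CPU's lockless wake
list and sends an IPI only when that list was previously empty. A rough standalone
sketch of that idea (illustrative names only, not the actual kernel code or the patch):

/*
 * Standalone sketch (not kernel code) of the wake-list idea behind
 * ttwu_queue: instead of the waker taking the remote runqueue lock, it
 * pushes the task on the target CPU's lockless wake list and sends an
 * IPI only when the list was previously empty.
 */
#include <stdatomic.h>
#include <stdio.h>

struct task { struct task *next; int pid; };

struct cpu_rq {
	_Atomic(struct task *) wake_list;	/* lockless singly-linked list */
};

static struct cpu_rq rqs[8];

/* stand-in for smp_send_reschedule(cpu) */
static void send_ipi(int cpu)
{
	printf("IPI -> cpu%d\n", cpu);
}

/* stand-in for queueing a remote wakeup */
static void queue_wakeup(int cpu, struct task *p)
{
	struct task *old = atomic_load(&rqs[cpu].wake_list);

	do {
		p->next = old;
	} while (!atomic_compare_exchange_weak(&rqs[cpu].wake_list, &old, p));

	/* only the first enqueue onto an empty list pays for an IPI */
	if (!old)
		send_ipi(cpu);
}

int main(void)
{
	struct task a = { .pid = 1 }, b = { .pid = 2 };

	queue_wakeup(3, &a);	/* list was empty -> IPI */
	queue_wakeup(3, &b);	/* list non-empty -> no IPI */
	return 0;
}

The design trades taking the remote runqueue lock for at most one IPI per batch of
queued wakeups, so the per-IPI cost on the platform decides whether it wins.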
Some more variations of testing and numbers:
The system had some config options which I had messed up, such as CONFIG_SCHED_SMT=n. I copied the default
distro config back and ran the benchmark again. The numbers are slightly better than earlier, but it is
still a major regression. I also collected mpstat numbers; the CPU utilization percentages are much lower
than earlier.
--------------------------------------------------------------------------
base: 8784fb5fa2e0 (tip/master)
Wakeup Latencies percentiles (usec) runtime 30 (s) (41567569 total samples)
50.0th: 11 (10767158 samples)
90.0th: 22 (16782627 samples)
* 99.0th: 36 (3347363 samples)
99.9th: 52 (344977 samples)
min=1, max=731
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 1443840 (31 samples)
* 50.0th: 1443840 (0 samples)
90.0th: 1443840 (0 samples)
min=1433480, max=1444037
average rps: 1442889.23
CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
all 3.24 0.00 11.39 0.00 37.30 0.00 0.00 0.00 0.00 48.07
all 2.59 0.00 11.56 0.00 37.62 0.00 0.00 0.00 0.00 48.23
base + clm's patch + series:
Wakeup Latencies percentiles (usec) runtime 30 (s) (27166787 total samples)
50.0th: 57 (8242048 samples)
90.0th: 120 (10677365 samples)
* 99.0th: 182 (2435082 samples)
99.9th: 262 (241664 samples)
min=1, max=89984
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 896000 (8 samples)
* 50.0th: 902144 (10 samples)
90.0th: 928768 (10 samples)
min=881548, max=971101
average rps: 907530.10 <<< close to 40% drop in RPS.
CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
all 1.95 0.00 7.67 0.00 14.84 0.00 0.00 0.00 0.00 75.55
all 1.61 0.00 7.91 0.00 13.53 0.05 0.00 0.00 0.00 76.90
-----------------------------------------------------------------------------
- To be sure, I tried on another system. That system had 30 cores.
base:
Wakeup Latencies percentiles (usec) runtime 30 (s) (40339785 total samples)
50.0th: 12 (12585268 samples)
90.0th: 24 (15194626 samples)
* 99.0th: 44 (3206872 samples)
99.9th: 59 (320508 samples)
min=1, max=1049
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 1320960 (14 samples)
* 50.0th: 1333248 (2 samples)
90.0th: 1386496 (12 samples)
min=1309615, max=1414281
base + clm's patch + series:
Wakeup Latencies percentiles (usec) runtime 30 (s) (34318584 total samples)
50.0th: 23 (10486283 samples)
90.0th: 64 (13436248 samples)
* 99.0th: 122 (3039318 samples)
99.9th: 166 (306231 samples)
min=1, max=7255
RPS percentiles (requests) runtime 30 (s) (31 total samples)
20.0th: 1006592 (8 samples)
* 50.0th: 1239040 (9 samples)
90.0th: 1259520 (11 samples)
min=852462, max=1268841
average rps: 1144229.23 << close to a 10-15% drop in RPS
- Then I resized that 30 core LPAR into a 5 core LPAR to see if the issue pops up in a smaller
config. It did. I see a similar regression of a 40-50% drop in RPS.
- Then I made it a 6 core system, to see if this is due to any ping-pong caused by the odd number
of cores. Numbers are similar to the 5 core case.
- Maybe the regression is higher in smaller configurations.