Message-ID: <874iqpkkid.ffs@tglx>
Date: Thu, 20 Nov 2025 12:31:54 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Prakash Sangappa <prakash.sangappa@...cle.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
    Peter Zijlstra <peterz@...radead.org>,
    Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
    "Paul E. McKenney" <paulmck@...nel.org>,
    Boqun Feng <boqun.feng@...il.com>,
    Jonathan Corbet <corbet@....net>,
    Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
    K Prateek Nayak <kprateek.nayak@....com>,
    Steven Rostedt <rostedt@...dmis.org>,
    Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
    Arnd Bergmann <arnd@...db.de>,
    "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [patch V3 07/12] rseq: Implement syscall entry work for time slice extensions
On Thu, Nov 20 2025 at 07:37, Prakash Sangappa wrote:
>> On Nov 19, 2025, at 7:25 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> Something like the uncompiled and untested below should work. Though I
>> hate it with a passion.
>
> That works. It addresses DB issue.
>
> With this change, here are the 'swingbench' performance results I received from our Database team.
> https://www.dominicgiles.com/swingbench/
>
> Kernel based on rseq/slice v3 + above change.
> System: 2 socket AMD.
> Cached DB config - i.e. DB files cached on tmpfs.
>
> Response from Database performance engineer:-
>
> Overall the results are very positive and consistent with the earlier
> findings, we see a clear benefit from the optimization running the
> same tests as earlier.
>
> • The sgrant figure in /sys/kernel/debug/rseq/stats increases with the
> DB side optimization enabled, while it stays flat when disabled. I
> believe this indicates that both the kernel-side code & the DB side
> triggers are working as expected.
Correct.
> • Due to the contentious nature of the workload these tests produce
> highly erratic results, but the optimization is showing improved
> performance across 3x tests with/without use of time slice extension.
>
> • Swingbench throughput with use of time slice optimization
> • Run 1: 50,008.10
> • Run 2: 59,160.60
> • Run 3: 67,342.70
> • Swingbench throughput without use of time slice optimization
> • Run 1: 36,422.80
> • Run 2: 33,186.00
> • Run 3: 44,309.80
> • The application performs 55% better on average with the optimization.
55% is insane.
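For reference, the figure can be reproduced from the six quoted runs alone. The snippet below is only a quick sanity check, not part of the original measurement: the throughput values are copied from the quote above, while the helper name `relative_gain` and the variable names are made up for illustration.

```python
from statistics import mean

# Throughput figures copied verbatim from the quoted swingbench runs.
with_slice = [50008.10, 59160.60, 67342.70]
without_slice = [36422.80, 33186.00, 44309.80]


def relative_gain(new, old):
    """Return the percentage by which mean(new) exceeds mean(old)."""
    return (mean(new) / mean(old) - 1.0) * 100.0


# Average improvement across the three runs of each configuration.
gain = relative_gain(with_slice, without_slice)

# Pessimistic comparison: slowest run with the extension against the
# fastest run without it.
worst_case_gain = (min(with_slice) / max(without_slice) - 1.0) * 100.0

print(f"mean with slice:    {mean(with_slice):.2f}")
print(f"mean without slice: {mean(without_slice):.2f}")
print(f"average gain:       {gain:.1f}%")
print(f"worst-case gain:    {worst_case_gain:.1f}%")
```

With only three erratic runs per configuration the average is a rough indicator at best, but even the slowest run with the extension is faster than the fastest run without it.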
Could you please ask your performance guys to provide numbers for the
below configurations to see how the different parts of this work are
affecting the overall result:

 1) Linux 6.17 (no rseq rework, no slice)

 2) Linux 6.17 + your initial attempt to enable slice extension

We already have the numbers for the full new stack above (with and
without slice), so that should give us the full picture.

Thanks,

        tglx