Message-ID: <DB4125C7-F0C5-4413-9320-60543F0B20A6@oracle.com>
Date: Wed, 26 Nov 2025 22:02:58 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Jonathan Corbet <corbet@....net>,
        Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
        K Prateek Nayak <kprateek.nayak@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Arnd Bergmann <arnd@...db.de>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [patch V3 07/12] rseq: Implement syscall entry work for time
 slice extensions



> On Nov 20, 2025, at 4:12 PM, Prakash Sangappa <prakash.sangappa@...cle.com> wrote:
> 
> 
> 
>> On Nov 20, 2025, at 3:31 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> 
>> On Thu, Nov 20 2025 at 07:37, Prakash Sangappa wrote:
>>>> On Nov 19, 2025, at 7:25 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>>>> Something like the uncompiled and untested below should work. Though I
>>>> hate it with a passion.
>>> 
>>> That works. It addresses DB issue.
>>> 
>>> With this change, here are the ’swingbench’ performance results I received from our Database team.
>>> https://www.dominicgiles.com/swingbench/
>>> 
>>> Kernel based on rseq/slice v3 + above change.
>>> System: 2 socket AMD.
>>> Cached DB config - i.e DB files cached on tmpfs.
>>> 
>>> Response from Database performance engineer:-
>>> 
>>> Overall the results are very positive and consistent with the earlier
>>> findings, we see a clear benefit from the optimization running the
>>> same tests as earlier.
>>> 
>>> • The sgrant figure in /sys/kernel/debug/rseq/stats increases with the
>>> DB side optimization enabled, while it stays flat when disabled.  I
>>> believe this indicates that both the kernel-side code & the DB side
>>> triggers are working as expected.
>> 
>> Correct.
>> 
>>> • Due to the contentious nature of the workload these tests produce
>>> highly erratic results, but the optimization is showing improved
>>> performance across 3x tests with/without use of time slice extension.
>>> 
>>> • Swingbench throughput with use of time slice optimization
>>> • Run 1: 50,008.10
>>> • Run 2: 59,160.60
>>> • Run 3: 67,342.70
>>> • Swingbench throughput without use of time slice optimization
>>> • Run 1: 36,422.80
>>> • Run 2: 33,186.00
>>> • Run 3: 44,309.80
>>> • The application performs 55% better on average with the optimization.
>> 
>> 55% is insane.
>> 
>> Could you please ask your performance guys to provide numbers for the
>> below configurations to see how the different parts of this work are
>> affecting the overall result:
>> 
>> 1) Linux 6.17 (no rseq rework, no slice)
>> 
>> 2) Linux 6.17 + your initial attempt to enable slice extension
>> 
>> We already have the numbers for the full new stack above (with and
>> without slice), so that should give us the full picture.
>> 
> 

My previous (initial) implementation on the v6.17 kernel was showing higher numbers.
So, to keep things comparable to the rseq/slice kernel, I got the following numbers from the DB engineer
with the previous implementation built on a v6.18-rc4 kernel.

Swingbench throughput with use of slice extension (previous implementation):
	* Run 1: 50824.10
	* Run 2: 54058.30
	* Run 3: 30212.50
Swingbench throughput without use of the optimization:
	* Run 1: 33036.50
	* Run 2: 35939.60
	* Run 3: 40461.70 
The application performs 23% better on average with the time slice optimization.
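
For reference, the 23% figure can be reproduced from the runs above by comparing the means of the two sets. A minimal sketch in Python (the variable and function names are illustrative only, not part of any tool used by the DB team):

```python
from statistics import mean

# Throughput figures copied from the runs listed above
# (previous implementation, built on v6.18-rc4).
with_slice = [50824.10, 54058.30, 30212.50]
without_slice = [33036.50, 35939.60, 40461.70]


def improvement_pct(with_opt, without_opt):
    """Relative gain of mean(with_opt) over mean(without_opt), in percent."""
    return (mean(with_opt) / mean(without_opt) - 1.0) * 100.0


gain = improvement_pct(with_slice, without_slice)
print(f"mean with slice:    {mean(with_slice):.2f}")
print(f"mean without slice: {mean(without_slice):.2f}")
print(f"improvement:        {gain:.1f}%")
```

With only three runs per configuration and this much spread, the mean is a rough indicator rather than a precise measurement.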

The workload shows a lot of variability. However, the overall trend seems consistent (i.e. we see
an improvement with slice extension).
I think the above should give an idea of the potential gains the underlying rseq framework optimization adds.

Thanks,
-Prakash

> Ok, will ask him to run these. 
> -Prakash.
> 
>> Thanks,
>> 
>>       tglx
> 
