Message-ID: <008AE921-4D24-44B4-9244-814D65FC3416@oracle.com>
Date: Wed, 13 Nov 2024 19:56:12 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: K Prateek Nayak <kprateek.nayak@....com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "peterz@...radead.org"
	<peterz@...radead.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        Daniel
 Jordan <daniel.m.jordan@...cle.com>
Subject: Re: [RFC PATCH 0/4] Scheduler time slice extension



> On Nov 12, 2024, at 9:43 PM, K Prateek Nayak <kprateek.nayak@....com> wrote:
> 
> Hello Prakash,
> 
> Few questions around the benchmarks.
> 
> On 11/13/2024 5:31 AM, Prakash Sangappa wrote:
>> [..snip..] Test results:
>> =============
>> Test system 2 socket AMD Genoa
>> Lock table test: a simple database test that grabs a table lock (spin lock).
>>   Simulates sql query executions.
>>   300 clients + 400 cpu hog tasks to generate load.
> 
> Have you tried running the 300 clients with a nice value of -20 and 400
> CPU hogs with the default nice value / nice 19? Does that help this
> particular case?

Have not tried this with the database. Will have to try it.
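
(For when we do try it: a rough sketch of how the 300 client processes could be bumped to nice -20 from inside the benchmark harness. Just an illustration; negative nice values need CAP_SYS_NICE or a suitable RLIMIT_NICE, and the 400 CPU hogs would be left at the default nice or reniced to 19.)

#include <sys/resource.h>
#include <stdio.h>

/* Raise the calling client process to nice -20; the CPU hog tasks are
 * left at nice 0 (or lowered to nice 19). */
static int make_client_high_priority(void)
{
        /* Negative nice values require CAP_SYS_NICE (or an RLIMIT_NICE
         * that permits them). */
        if (setpriority(PRIO_PROCESS, 0 /* self */, -20) != 0) {
                perror("setpriority");
                return -1;
        }
        return 0;
}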


> 
>> Without extension : 182K SQL exec/sec
>> With extension    : 262K SQL exec/sec
>>   44% improvement.
>> Swingbench - standard database benchmark
>>   Cached(database files on tmpfs) run, with 1000 clients.
> 
> In this case, how does the performance fare when running the clients
> under SCHED_BATCH? What does the "TASK_PREEMPT_DELAY_REQ" count vs
> "TASK_PREEMPT_DELAY_GRANTED" count look like for the benchmark run?

Have not tried SCHED_BATCH yet.
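
(When we do try it, putting the clients under SCHED_BATCH would presumably look something like the sketch below; SCHED_BATCH only accepts a static priority of 0.)

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Move the calling client process to SCHED_BATCH so the scheduler treats
 * it as CPU-bound and disfavors it slightly for wakeup preemption. */
static int make_client_sched_batch(void)
{
        struct sched_param sp = { .sched_priority = 0 };  /* must be 0 */

        if (sched_setscheduler(0 /* self */, SCHED_BATCH, &sp) != 0) {
                perror("sched_setscheduler(SCHED_BATCH)");
                return -1;
        }
        return 0;
}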

With this run there were on average about 166 TASK_PREEMPT_DELAY_GRANTED grants per task, collected from the scheduler stats captured at the end of the run; the test runs for about 5 minutes. We don't have a count of how many times a preempt delay was requested. When a task completes its critical section it clears the TASK_PREEMPT_DELAY_REQ flag, so in many cases the kernel never sees the request, since the critical section usually does not straddle the end of the time slice. We would have to capture that count in the application.
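
To illustrate why the kernel rarely sees the request: the usage pattern in the application is essentially the sketch below. The bit names and the per-thread state word are stand-ins here, not the real ABI; the actual code goes through whatever shared structure the patch set maps into userspace, and an application-side counter like the one shown is where we would have to capture the request count.

#include <pthread.h>
#include <sched.h>

/* Illustrative stand-ins for the state shared between the task and the
 * kernel; the real layout/ABI is whatever the RFC patches define. */
#define PREEMPT_DELAY_REQ      0x1   /* task asks to defer preemption */
#define PREEMPT_DELAY_GRANTED  0x2   /* kernel granted an extension   */

static __thread volatile unsigned int preempt_delay_state; /* shared with kernel */
static __thread unsigned long preempt_delay_requests;      /* app-side counter   */

static pthread_spinlock_t table_lock;

static void lock_table(void)
{
        preempt_delay_state |= PREEMPT_DELAY_REQ;   /* request extra time  */
        preempt_delay_requests++;                   /* count requests here */
        pthread_spin_lock(&table_lock);
}

static void unlock_table(void)
{
        pthread_spin_unlock(&table_lock);
        preempt_delay_state &= ~PREEMPT_DELAY_REQ;  /* critical section done */

        /* Only if the time slice expired inside the critical section does
         * the kernel see the request and set GRANTED; the task is then
         * expected to give the CPU back promptly. */
        if (preempt_delay_state & PREEMPT_DELAY_GRANTED) {
                preempt_delay_state &= ~PREEMPT_DELAY_GRANTED;
                sched_yield();
        }
}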


> 
> I'm trying to understand what the performance looks like when using
> existing features that inhibit preemption vs putting forward the
> preemption when the userspace is holding a lock. Feel free to quote
> the latency comparisons too if using the existing features lead to
> unacceptable avg/tail latencies.
> 
>> Without extension : 99K SQL exec/sec
>> with extension    : 153K SQL exec/sec
>>   55% improvement in throughput.
>> [..snip..]
> 
> -- 
> Thanks and Regards,
> Prateek
