Message-ID: <20250609165532.3265e142@gandalf.local.home>
Date: Mon, 9 Jun 2025 16:55:32 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Prakash Sangappa <prakash.sangappa@...cle.com>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "peterz@...radead.org" <peterz@...radead.org>,
 "mathieu.desnoyers@...icios.com" <mathieu.desnoyers@...icios.com>,
 "tglx@...utronix.de" <tglx@...utronix.de>, "kprateek.nayak@....com"
 <kprateek.nayak@....com>, "vineethr@...ux.ibm.com" <vineethr@...ux.ibm.com>
Subject: Re: [PATCH V5 1/6] Sched: Scheduler time slice extension

On Wed, 4 Jun 2025 21:23:27 +0200
Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:

> On 2025-06-04 17:29:44 [+0000], Prakash Sangappa wrote:
> > Don’t know if there were benefits mentioned when RT tasks are involved.
> > 
> > I had shared some benchmark results in this thread showing benefit of using scheduler time extension.
> > https://lore.kernel.org/all/20241113000126.967713-1-prakash.sangappa@oracle.com/
> > The workload did not include RT tasks.  
> 
> I don't question the mechanism/ approach. I just don't want RT tasks
> delayed.
> 

So I applied your patches and fixed up my "extend-sched.c" program to use
your method. I booted on bare-metal PREEMPT_RT and ran:

~# cyclictest --smp -p95 -m -s --system -l 100000  -b 1000
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.35 0.75 0.38 1/549 4219          

T: 0 ( 4213) P:95 I:1000 C:   5163 Min:      3 Act:    3 Avg:    3 Max:      12
T: 1 ( 4214) P:95 I:1500 C:   3444 Min:      3 Act:    4 Avg:    3 Max:       9
T: 2 ( 4215) P:95 I:2000 C:   2582 Min:      3 Act:    3 Avg:    3 Max:       8
T: 3 ( 4216) P:95 I:2500 C:   2066 Min:      3 Act:    4 Avg:    3 Max:       9
T: 4 ( 4217) P:95 I:3000 C:   1721 Min:      3 Act:    4 Avg:    3 Max:       7
T: 5 ( 4218) P:95 I:3500 C:   1474 Min:      3 Act:    4 Avg:    4 Max:      11
T: 6 ( 4219) P:95 I:4000 C:   1290 Min:      3 Act:    3 Avg:    3 Max:       9

In another window, I ran "extend-sched", and cyclictest immediately turned into:

T: 0 ( 4372) P:95 I:1000 C:  33235 Min:      3 Act:    4 Avg:    3 Max:      36
T: 1 ( 4373) P:95 I:1500 C:  22182 Min:      3 Act:    4 Avg:    3 Max:      39
T: 2 ( 4374) P:95 I:2000 C:  16647 Min:      3 Act:    5 Avg:    3 Max:      35
T: 3 ( 4375) P:95 I:2500 C:  13321 Min:      3 Act:    5 Avg:    3 Max:      36
T: 4 ( 4376) P:95 I:3000 C:  11103 Min:      3 Act:    4 Avg:    3 Max:      35
T: 5 ( 4377) P:95 I:3500 C:   9518 Min:      3 Act:    5 Avg:    3 Max:      36
T: 6 ( 4378) P:95 I:4000 C:   8330 Min:      3 Act:    5 Avg:    3 Max:      35

It went from 12us to 39us. That's more than triple the max latency.

I noticed that the delay was set to 30, so I switched it to 5 and tried again:

~# cat /proc/sys/kernel/sched_preempt_delay_us 
30
~# echo 5 > /proc/sys/kernel/sched_preempt_delay_us 
~# cat /proc/sys/kernel/sched_preempt_delay_us 
5

T: 0 ( 4296) P:95 I:1000 C:  15324 Min:      3 Act:    3 Avg:    4 Max:      21
T: 1 ( 4297) P:95 I:1500 C:  10228 Min:      3 Act:    3 Avg:    4 Max:      21
T: 2 ( 4298) P:95 I:2000 C:   7676 Min:      3 Act:    3 Avg:    4 Max:      21
T: 3 ( 4299) P:95 I:2500 C:   6143 Min:      3 Act:    3 Avg:    4 Max:      20
T: 4 ( 4300) P:95 I:3000 C:   5119 Min:      3 Act:    3 Avg:    4 Max:      21
T: 5 ( 4301) P:95 I:3500 C:   4388 Min:      3 Act:    3 Avg:    4 Max:      20
T: 6 ( 4302) P:95 I:4000 C:   3840 Min:      3 Act:    3 Avg:    4 Max:      19

It went from a max of 12us to 21us. That's almost double, and that's with just 5us.

The point here is that this is NOT NOISE! It's an addition to the worst-case
scenario.

If we have 30us as the worst-case latency, using this with a 5us delay will
make the worst-case latency 35us (or more, as there's some overhead with this).

You cannot say "oh, the system causes 5us latency in general, so we can
just make it 5us", because this adds on top of it. If the system has 5us
latency in general and you set the extended scheduler slice to 5us, then
the system now has a 10us latency in general.

This is why it should be turned off with PREEMPT_RT.

-- Steve
