Message-ID: <20250609165532.3265e142@gandalf.local.home>
Date: Mon, 9 Jun 2025 16:55:32 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Prakash Sangappa <prakash.sangappa@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"mathieu.desnoyers@...icios.com" <mathieu.desnoyers@...icios.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "kprateek.nayak@....com"
<kprateek.nayak@....com>, "vineethr@...ux.ibm.com" <vineethr@...ux.ibm.com>
Subject: Re: [PATCH V5 1/6] Sched: Scheduler time slice extension
On Wed, 4 Jun 2025 21:23:27 +0200
Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
> On 2025-06-04 17:29:44 [+0000], Prakash Sangappa wrote:
> > Don’t know if there were benefits mentioned when RT tasks are involved.
> >
> > I had shared some benchmark results in this thread showing benefit of using scheduler time extension.
> > https://lore.kernel.org/all/20241113000126.967713-1-prakash.sangappa@oracle.com/
> > The workload did not include RT tasks.
>
> I don't question the mechanism/ approach. I just don't want RT tasks
> delayed.
>
So I applied your patches and fixed up my "extend-sched.c" program to use
your method. I booted on bare-metal PREEMPT_RT and ran:
~# cyclictest --smp -p95 -m -s --system -l 100000 -b 1000
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.35 0.75 0.38 1/549 4219
T: 0 ( 4213) P:95 I:1000 C: 5163 Min: 3 Act: 3 Avg: 3 Max: 12
T: 1 ( 4214) P:95 I:1500 C: 3444 Min: 3 Act: 4 Avg: 3 Max: 9
T: 2 ( 4215) P:95 I:2000 C: 2582 Min: 3 Act: 3 Avg: 3 Max: 8
T: 3 ( 4216) P:95 I:2500 C: 2066 Min: 3 Act: 4 Avg: 3 Max: 9
T: 4 ( 4217) P:95 I:3000 C: 1721 Min: 3 Act: 4 Avg: 3 Max: 7
T: 5 ( 4218) P:95 I:3500 C: 1474 Min: 3 Act: 4 Avg: 4 Max: 11
T: 6 ( 4219) P:95 I:4000 C: 1290 Min: 3 Act: 3 Avg: 3 Max: 9
In another window, I ran the "extend-sched" program, and the cyclictest output immediately turned into:
T: 0 ( 4372) P:95 I:1000 C: 33235 Min: 3 Act: 4 Avg: 3 Max: 36
T: 1 ( 4373) P:95 I:1500 C: 22182 Min: 3 Act: 4 Avg: 3 Max: 39
T: 2 ( 4374) P:95 I:2000 C: 16647 Min: 3 Act: 5 Avg: 3 Max: 35
T: 3 ( 4375) P:95 I:2500 C: 13321 Min: 3 Act: 5 Avg: 3 Max: 36
T: 4 ( 4376) P:95 I:3000 C: 11103 Min: 3 Act: 4 Avg: 3 Max: 35
T: 5 ( 4377) P:95 I:3500 C: 9518 Min: 3 Act: 5 Avg: 3 Max: 36
T: 6 ( 4378) P:95 I:4000 C: 8330 Min: 3 Act: 5 Avg: 3 Max: 35
It went from 12us to 39us. That's more than triple the max latency.
I noticed that the delay was set to 30, so I switched it to 5 and tried again:
~# cat /proc/sys/kernel/sched_preempt_delay_us
30
~# echo 5 > /proc/sys/kernel/sched_preempt_delay_us
~# cat /proc/sys/kernel/sched_preempt_delay_us
5
T: 0 ( 4296) P:95 I:1000 C: 15324 Min: 3 Act: 3 Avg: 4 Max: 21
T: 1 ( 4297) P:95 I:1500 C: 10228 Min: 3 Act: 3 Avg: 4 Max: 21
T: 2 ( 4298) P:95 I:2000 C: 7676 Min: 3 Act: 3 Avg: 4 Max: 21
T: 3 ( 4299) P:95 I:2500 C: 6143 Min: 3 Act: 3 Avg: 4 Max: 20
T: 4 ( 4300) P:95 I:3000 C: 5119 Min: 3 Act: 3 Avg: 4 Max: 21
T: 5 ( 4301) P:95 I:3500 C: 4388 Min: 3 Act: 3 Avg: 4 Max: 20
T: 6 ( 4302) P:95 I:4000 C: 3840 Min: 3 Act: 3 Avg: 4 Max: 19
It went from a max of 12us to 21us. That's almost double. And that's with just 5us.
The point here is that this is NOT NOISE! It adds directly to the worst-case
scenario.
If the worst-case latency is 30us, setting this to 5 makes the worst-case
latency 35us (or more, since the mechanism itself has some overhead).
You cannot say "oh, the system causes 5us latency in general, so we can
just make it 5us", because this adds on top of it. If the system has 5us
latency in general and you set the extended scheduler slice to 5us, then
the system now has a 10us latency in general.
This is why it should be turned off with PREEMPT_RT.
-- Steve