Date:   Wed, 6 Feb 2019 14:25:18 +0100
From:   Frédéric Mathieu 
        <frederic.mathieu@...lis.com>
To:     "'Dietmar Eggemann'" <dietmar.eggemann@....com>,
        <linux-kernel@...r.kernel.org>
Subject: RE: Kernel 4.9: strange behavior with fifo scheduler

Hi Dietmar,

Please note: these tests were executed on a kernel with the RT patch and the option CONFIG_PREEMPT_RT_FULL=y. I can confirm that my priority settings are correct.
On a vanilla kernel, I get the same results as you.
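
For reference, the RT configuration can be confirmed on the running kernel (assuming CONFIG_IKCONFIG_PROC is enabled, so that the config is exposed under /proc):

# zcat /proc/config.gz | grep CONFIG_PREEMPT_RT_FULL
CONFIG_PREEMPT_RT_FULL=y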

After talking with Mike Galbraith, I turned my attention to the priority of kernel threads. The following link explains the behavior of the scheduler:
https://wiki.linuxfoundation.org/realtime/documentation/technical_details/hr_timers

Contrary to what I thought, there is no dynamic adjustment of these threads' priority according to the priority of the calling task.
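
As I understand that page, on PREEMPT_RT_FULL most hrtimer callbacks are deferred to the per-CPU softirq timer threads (ktimersoftd/N on this -rt series), which run at a fixed SCHED_FIFO priority (1 by default, I believe) regardless of who armed the timer. That would explain the spikes: a FIFO prio-1 load keeps the equal-priority timer thread off the CPU, so cyclictest's wakeup is delayed until the load sleeps. The thread priorities can be inspected, and raised for testing, with standard tools:

# ps -eo pid,policy,rtprio,comm | grep ktimersoftd
# chrt -f -p 60 <pid of ktimersoftd/0>   # test only: run the timer thread above the load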

-----Original Message-----
From: linux-kernel-owner@...r.kernel.org [mailto:linux-kernel-owner@...r.kernel.org] On Behalf Of Dietmar Eggemann
Sent: Wednesday, February 6, 2019 11:55
To: Frédéric Mathieu <frederic.mathieu@...lis.com>; linux-kernel@...r.kernel.org
Subject: Re: Kernel 4.9: strange behavior with fifo scheduler

Hi Frédéric,

On 2/5/19 11:47 AM, Frédéric Mathieu wrote:
> Hi,
> 
> on an X86_64 architecture (Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz), 
> I use the linux kernel 4.9.146 with patch rt 125.
> uname -a: Linux 4.9.146-rt125 #1 SMP PREEMPT RT Tue Jan 29 14:17:55 CET 2019 x86_64 GNU/Linux
> 
> I observed a strange behavior of the scheduler when several tasks are
> executed in FIFO mode on a CPU core under significant CPU load.
> 
> first test (reference: cpu load=0%):
> cyclictest -m -D 5 -i 1000 -p 50 -a 0
>        	# /dev/cpu_dma_latency set to 0us
>        	policy: fifo: loadavg: 1.95 1.06 0.43 1/159 14305
>        	T: 0 (14145) P: 50 I: 1000 C: 4997 Min: 7 Act: 7 Avg: 7 Max: 18
> This works fine.
> 
> Now, I'm loading the system on CPU core 0 with a homemade process:
>        cpu load 60%, sched FIFO prio 1, cpu 0

Are you sure that your test app runs with prio 1? Is this in the range of the SCHED_FIFO (userspace) priorities shown by chrt -m?

...
SCHED_FIFO min/max priority	: 1/99
...
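
As a quick cross-check, the load can be started pinned to CPU0 with an explicit FIFO priority and verified afterwards, e.g.:

# taskset -c 0 chrt -f 1 ./test &
# chrt -p $!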

If I run your setup (test and cyclictest affine to CPU0) on 4.15.0-43
(i7-4750HQ) with:

(1) test prio > cyclictest prio

# chrt -p $PID_TEST

pid 28489's current scheduling policy: SCHED_FIFO
pid 28489's current scheduling priority: 51

# cat /proc/$PID_TEST/stat

28489 (test) R 28488 28487 8664 34828 28487 4194304 86 0 0 0 0 0 0 0 -52
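
Field 18 of /proc/<pid>/stat is the kernel's internal priority value; per proc(5), for a real-time task it is the negated RT priority minus one, so the -52 above corresponds to FIFO priority 51. To convert it directly (assuming the comm field contains no spaces):

# awk '{ print -$18 - 1 }' /proc/$PID_TEST/stat
51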

I get your behaviour:

# /dev/cpu_dma_latency set to 0us
Thread 0 Interval: 1500
        0:       0:       6
        0:       1:       5
        0:       2:       2
        0:       3:    5419
        0:       4:       3
        0:       5:       2
        0:       6:       2
        0:       7:       2
        0:       8:    5422
        0:       9:       3
...

whereas with:

(2) test prio < cyclictest prio

# chrt -p $PID_TEST

pid 28811's current scheduling policy: SCHED_FIFO
pid 28811's current scheduling priority: 49

# cat /proc/$PID_TEST/stat

28811 (test) S 28810 28809 8664 34828 28809 1077936128 90 0 0 0 0 0 0 0 -50

I get:

# /dev/cpu_dma_latency set to 0us
Thread 0 Interval: 1500
        0:       0:       7
        0:       1:       4
        0:       2:       3
        0:       3:       5
        0:       4:       4
        0:       5:       2
        0:       6:       2
        0:       7:       2
        0:       8:       2
        0:       9:       3
...
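
To see what is running when such a spike occurs, cyclictest can stop the tracer once a latency threshold is exceeded (assuming ftrace is enabled in the kernel), e.g.:

# cyclictest -m -i 1000 -p 50 -a 0 -b 1000
# cat /sys/kernel/debug/tracing/trace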

[...]

> In this case, cyclictest detects very long latencies:
>        cyclictest -m -D 5 -i 1000 -p 50 -a 0 -v
> 
> Max CPUs = 2
> # /dev/cpu_dma_latency set to 0us
> Thread 0 Interval: 1500
>         0:       0:      13
>         0:       1:       8
>         0:       2:       7
>         0:       3:    5648
>         0:       4:       8
>         0:       5:       7
>         0:       6:       7
>         0:       7:       7
>         0:       8:    5649

[...]

> After verification, although no other process is running with a
> real-time scheduling policy, I see a latency of about 5.6 ms at regular intervals.
> 
> It seems that the priority-1 (FIFO) task is not preempted by the
> cyclictest process at priority 50 (FIFO) while the low-priority task is active.
> This corresponds to the cycle recorded in the file: 6 ms of latency
> followed by 4 "normal" latencies of 7 us.
> 
> Does anyone have any idea about this problem?
> 
> Best regards
> Frederic MATHIEU
