linux-kernel - Re: Latency spikes on V6.15.1 Preempt RT and maybe related to intel IGB

Message-ID: <20250613145434.T2x2ML8_@linutronix.de>
Date: Fri, 13 Jun 2025 16:54:34 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Marc Strämke <marc.straemke@...ropuls.de>
Cc: linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org
Subject: Re: Latency spikes on V6.15.1 Preempt RT and maybe related to intel
 IGB

On 2025-06-10 13:23:13 [+0200], Marc Strämke wrote:
> Hello Everyone, I am reposting to LKML as I am not sure the rt-users
Hi,

> mailing list is read by many people (I hope that is okay).
> 
> On an AMD Ryzen Embedded machine I am experiencing strange Latency spikes in
> cyclictest and need some hints how to debug that further.
> 
> The system typically has max latencies of 88 us and averages of 4-8 us, which
> is more than sufficient for my application, but I saw some spikes of many
> hundred us in testing.
> 
> I can provoke latencies of more than 500-1000 us by invoking "ip l set
> enp1s0 promisc off" on the first network interface. The network interface
> is an "Intel Corporation I210 Gigabit Network Connection" using the IGB
> driver.
> 
> I tried more or less all tracers but am not knowledgeable enough to make
> sense of the output. IRQSoff and wakeup_rt trace output attached.

I'm not sure what your two traces captured. The irqsoff_trace captured
208us and this looks like a regular top of the run_ktimerd() invocation.
There is not much going on.

wakeup_rt_trace shows the wakeup of cyclictest. It records 411us. Most
of it is the scheduler itself, with some 100us from
flush_smp_call_function_queue().
 
> Can anyone point me in the right direction? I am not sure how to interpret
> the function tracer's and function_graph tracer's output in a meaningful way.
> As mainly a user of the kernel I am a bit overwhelmed by the interaction
> of the scheduler, RCU and so on.


Could you please try one of the following:
- enable sched tracing events and tell cyclictest to break & stop
  tracing? The options for cyclictest would be -b --tracemark.
  The idea is to see the scheduler events before the delay happens. So
  if your latency spike is 500us then you can try -b 490 or so.

- use rtla. This might be easier and give you the backtrace of what you
  are looking for
  	https://bristot.me/linux-scheduling-latency-debug-and-analysis/
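
Both suggestions above could be sketched roughly as follows. This is only
an illustration: it assumes tracefs is mounted at /sys/kernel/tracing, that
it is run as root, and that cyclictest (rt-tests) and rtla are installed.
The helper names, the priority (-p 90) and the interval (-i 200) are
made-up values and not part of the original report.

```shell
#!/bin/sh
# Sketch only. Assumptions: tracefs at /sys/kernel/tracing, root privileges,
# cyclictest and rtla installed. Function names and -p/-i values are
# hypothetical and chosen for illustration.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# Run a command, or just print it when DRY_RUN is set (handy for review).
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Option 1: enable sched events, then let cyclictest stop tracing once the
# latency exceeds a threshold slightly below the observed spike
# (e.g. a 500us spike gives -b 490).
trace_with_cyclictest() {
    break_us=$(( ${1:-500} - 10 ))
    run sh -c "echo 1 > $TRACEFS/events/sched/enable"
    run cyclictest -m -p 90 -i 200 -b "$break_us" --tracemark
    # After the break, the events leading up to it are in $TRACEFS/trace.
}

# Option 2: rtla timerlat in automatic trace mode with the same threshold.
trace_with_rtla() {
    break_us=$(( ${1:-500} - 10 ))
    run rtla timerlat top -a "$break_us"
}
```

For example, "DRY_RUN=1; trace_with_cyclictest 500" only prints the
commands, while "trace_with_cyclictest 500" runs them. The spike itself
still has to be provoked from a second shell, e.g. with the "ip l set
enp1s0 promisc off" command from the report.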

> I have attached my config for reference.

There is nothing wrong with it. You might want to disable NO_HZ (and use
PERIODIC) and use HZ=250 (or less).
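
One possible way to apply that config change is the scripts/config helper
from the kernel source tree. The sketch below assumes it is called from the
top of the kernel tree and that the current .config selects one of the
usual HZ choices; the function name is hypothetical, and which symbols are
actually set in the attached config is not known here.

```shell
#!/bin/sh
# Sketch only: switch a kernel .config to a periodic tick at HZ=250.
# Assumes the kernel tree's scripts/config tool; set_periodic_tick is a
# hypothetical helper name.
set_periodic_tick() {
    cfg_tool=${1:-scripts/config}   # path to the scripts/config helper
    cfg_file=${2:-.config}          # config file to modify
    "$cfg_tool" --file "$cfg_file" \
        --disable NO_HZ --disable NO_HZ_IDLE --disable NO_HZ_FULL \
        --enable HZ_PERIODIC \
        --disable HZ_1000 --disable HZ_300 --disable HZ_100 \
        --enable HZ_250
}
```

After calling set_periodic_tick, running "make olddefconfig" lets Kconfig
resolve the dependent symbols (such as CONFIG_HZ) before the kernel is
rebuilt.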

> Kind Regards
> 
> Marc
> 

Sebastian