Message-ID: <97638b0b-cd96-40e2-9dc2-5e6f767b90a4@eltropuls.de>
Date: Sat, 14 Jun 2025 10:52:36 +0200
From: Marc Strämke <marc.straemke@...ropuls.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org
Subject: Re: Latency spikes on V6.15.1 Preempt RT and maybe related to intel? IGB
Sebastian,
I tried that in the past (rtla top auto analysis), but I do not really
understand the result:
rtla timerlat hit stop tracing
## CPU 1 hit stop tracing, analyzing it ##
IRQ handler delay: 0.00 us (0.00 %)
IRQ latency: 709.25 us
Blocking thread:
ip:3567
Blocking thread stack trace
-> timerlat_irq
-> __hrtimer_run_queues
-> hrtimer_interrupt
-> __sysvec_apic_timer_interrupt
-> sysvec_apic_timer_interrupt
-> asm_sysvec_apic_timer_interrupt
-> igb_update_mc_addr_list
-> igb_set_rx_mode
-> __dev_change_flags
-> netif_change_flags
-> do_setlink.constprop.0
-> rtnl_newlink
-> rtnetlink_rcv_msg
-> netlink_rcv_skb
-> netlink_unicast
-> netlink_sendmsg
-> ____sys_sendmsg
-> ___sys_sendmsg
-> __sys_sendmsg
-> do_syscall_64
-> entry_SYSCALL_64_after_hwframe
------------------------------------------------------------------------
IRQ latency: 709.25 us (100%)
I do not really understand where the IRQ/Preemption disabling is
happening. What would the next thing be to do? Function (graph?) tracing
on all the functions visible in the backtrace?
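One possible way to set up such a function graph trace, as a minimal sketch only: it assumes tracefs is mounted at /sys/kernel/tracing, that the kernel was built with the function graph tracer and dynamic ftrace, and that igb_set_rx_mode is listed in available_filter_functions. The helper names graph_trace_plan and apply_plan are made up for illustration. The control files used (tracing_on, tracing_cpumask, set_graph_function, current_tracer) are the standard ftrace ones.

```python
from pathlib import Path

# Assumption: tracefs mount point; older setups use /sys/kernel/debug/tracing.
TRACEFS = Path("/sys/kernel/tracing")


def graph_trace_plan(func, cpu):
    """Return the (file, value) writes for a function_graph trace of `func`,
    restricted to one CPU. Nothing is written here."""
    return [
        ("tracing_on", "0"),                          # stop tracing while reconfiguring
        ("tracing_cpumask", format(1 << cpu, "x")),   # hex CPU mask, e.g. CPU 1 -> "2"
        ("set_graph_function", func),                 # graph only this function and its callees
        ("current_tracer", "function_graph"),
        ("tracing_on", "1"),                          # start tracing
    ]


def apply_plan(plan, root=TRACEFS):
    """Write the plan into tracefs. Needs root privileges on a real system."""
    for name, value in plan:
        (root / name).write_text(value + "\n")


if __name__ == "__main__":
    # Print the plan instead of applying it, so this is safe to run anywhere.
    for name, value in graph_trace_plan("igb_set_rx_mode", cpu=1):
        print(f"echo {value} > {TRACEFS / name}")
```

Graphing from igb_set_rx_mode downwards keeps the trace small while still showing the duration of every callee, which is usually enough to see which call accounts for the delay.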
I tried to look at the event trace output starting with the call to
igb_set_rx_mode. I have attached the trace with all events enabled and a
function filter on igb, restricted to the CPU executing ip. I cannot
understand what is happening between timestamp 700.149995 and the IRQ
disable event at 700.150795...
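The two timestamps above are 800 us apart (700.150795 - 700.149995 = 0.000800 s), which is in the same range as the 709.25 us reported by rtla. A minimal sketch for locating such a gap in a text trace automatically is shown below. It assumes the default ftrace line layout ("task-pid [cpu] flags timestamp: message"). The sample lines are placeholders made up for illustration, only the two timestamps are taken from this mail, and the helper names parse and largest_gap are hypothetical.

```python
import re

# Matches "[cpu] <optional flags> seconds.microseconds: message" in an ftrace text line.
LINE_RE = re.compile(r"\[(\d+)\]\s+(?:\S+\s+)?(\d+)\.(\d{6}):\s*(.*)")


def parse(line):
    """Return (cpu, timestamp in microseconds, message) or None for non-event lines."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu, sec, usec, msg = m.groups()
    # Integer microseconds avoid floating-point rounding on the timestamps.
    return int(cpu), int(sec) * 1_000_000 + int(usec), msg


def largest_gap(lines, cpu):
    """Return (gap_us, message_before, message_after) for the largest gap
    between consecutive events on `cpu`, or None if fewer than two events."""
    prev = None
    best = None
    for line in lines:
        rec = parse(line)
        if rec is None or rec[0] != cpu:
            continue
        if prev is not None:
            gap = rec[1] - prev[1]
            if best is None or gap > best[0]:
                best = (gap, prev[2], rec[2])
        prev = rec
    return best


# Placeholder trace: messages are invented, timestamps 700.149995 and
# 700.150795 are the ones quoted in the mail.
SAMPLE = """\
# tracer: function
              ip-3567    [001] .....   700.149990: <some earlier event>
              ip-3567    [001] .....   700.149995: <last event before the gap>
              ip-3567    [001] d....   700.150795: <irq disable event>
"""

if __name__ == "__main__":
    gap_us, before, after = largest_gap(SAMPLE.splitlines(), cpu=1)
    print(f"largest gap: {gap_us} us between '{before}' and '{after}'")
```

Run against a real trace file by passing its lines instead of SAMPLE. The event printed as message_before is the last thing the tracer saw before the silent period, so that is where finer-grained tracing would need to start.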
Thanks for your help,
Marc
On 13.06.2025 at 21:58, Sebastian Andrzej Siewior wrote:
> On 2025-06-13 17:26:15 [+0200], marc.straemke wrote:
>> Thanks Sebastian, I will do that tomorrow. To confirm: Just pure event
>> tracing without the function tracer? (After enabling the sched
>> events) Regards, Marc
> The event tracing should narrow down which of the tasks cause the spike.
> So if you say it is the ip command then you should see ip.
> Step two would be the function tracer to narrow it down further. You
> could start right away with the function tracer to see where the big gap
> is.
> rtla could speed up the whole process (via the timerlat auto analysis).
>
> Sebastian
View attachment "trace_after_igb_set_rx_mode" of type "text/plain" (11269 bytes)