Message-ID: <CAFwiDX8MWS8WRkvkt=DgEnn6ZxRZWtiyHuc0hHuSzXoGK+Lpig@mail.gmail.com>
Date: Wed, 10 Jul 2024 16:52:41 +0530
From: Neeraj upadhyay <neeraj.iitr10@...il.com>
To: richard clark <richard.xnu.clark@...il.com>
Cc: paulmck@...nel.org, josh@...htriplett.org, 
	Lai Jiangshan <jiangshanlai@...il.com>, mathieu.desnoyers@...icios.com, 
	Steven Rostedt <rostedt@...dmis.org>, Mark Rutland <mark.rutland@....com>, 
	Linus Torvalds <torvalds@...ux-foundation.org>, rcu@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-rt-users@...r.kernel.org
Subject: Re: 'rcu_preempt detected stalls on CPUs/tasks...' issue of
 cyclictest on rt-linux

Hello Richard,

On Wed, Jul 10, 2024 at 1:56 PM richard clark
<richard.xnu.clark@...il.com> wrote:
>
> Hi,
> I am running a Ubuntu 20.04.5 LTS on Nvidia Jetson AGX Orin platform
> with 12-cores as a guestOS, the kernel version is - 6.1.83-rt28.
> Kernel cmdline is:
> 'root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 mminit_loglevel=4
> console=ttyTCU0,115200 console=tty0 firmware_class.path=/etc/firmware
> fbcon=map:0 net.ifnames=0'
>
> The cyclictest command 'cyclictest -Smp99 -H 3000
> --histfile=orin_idle_hyp_4h.hist -D 4h' will hang randomly during the
> test, then the minicom console will show below messages:
> ...
>
> [97619.450889] [CPU11-E] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [97619.450894] [CPU11-E] rcu:   1-...!: (0 ticks this GP)
> idle=dc88/0/0x0 softirq=0/0 fqs=2 (false positive?)
> [97619.450914] [ CPU1-E] NMI backtrace for cpu 1
> [97619.451912] [CPU11-E] rcu: rcu_preempt kthread timer wakeup didn't
> happen for 5251 jiffies! g6029253 f0x0 RCU_GP_WAIT_FQS(5)
> ->state=0x402
> [97619.451916] [CPU11-E] rcu:   Possible timer handling issue on cpu=1
> timer-softirq=342864

This log indicates that jiffies-based timers are not being handled on CPU1,
which is why the RCU grace-period (GP) kthread was not woken up. Can you check
the irq, softirq and timer traces on CPU1 to see whether softirqs and timers
are being serviced on that CPU?
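
As a rough first check before collecting full traces, one option is to take
two snapshots of /proc/softirqs inside the guest and see whether the timer
related counters for CPU1 advance between them. The sketch below is only an
illustration: the helper names (parse_softirqs, stalled_softirqs, watch_cpu)
and the 5 second interval are assumptions, not part of any existing tool.
Note that a counter that does not move is not proof of a problem by itself,
since an idle tickless CPU may legitimately raise few timer softirqs.

```python
import time

def parse_softirqs(text):
    """Parse /proc/softirqs content into {softirq_name: {cpu_id: count}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Header row looks like "CPU0 CPU1 ..."; strip the "CPU" prefix.
    cpus = [int(tok[3:]) for tok in lines[0].split()]
    table = {}
    for ln in lines[1:]:
        name, _, rest = ln.partition(":")
        table[name.strip()] = dict(zip(cpus, (int(x) for x in rest.split())))
    return table

def stalled_softirqs(before, after, cpu,
                     names=("TIMER", "HRTIMER", "SCHED", "RCU")):
    """Return the softirq names whose count on `cpu` did not change."""
    return [n for n in names
            if n in before and n in after
            and after[n][cpu] == before[n][cpu]]

def watch_cpu(cpu, interval=5.0, path="/proc/softirqs"):
    """Hypothetical helper: snapshot twice, `interval` seconds apart."""
    with open(path) as f:
        before = parse_softirqs(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_softirqs(f.read())
    return stalled_softirqs(before, after, cpu)
```

Calling watch_cpu(1) while cyclictest is running would list the softirq
types that made no progress on CPU1 during the interval; an empty list
means all of the checked counters advanced.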


- Neeraj

> [97619.451918] [CPU11-E] rcu: rcu_preempt kthread starved for 5252
> jiffies! g6029253 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
> [97619.451921] [CPU11-E] rcu:   Unless rcu_preempt kthread gets
> sufficient CPU time, OOM is now expected behavior.
> [97619.451923] [CPU11-E] rcu: RCU grace-period kthread stack dump:
> [97619.451966] [CPU11-E] rcu: Stack dump where RCU GP kthread last ran:
> ...
> This issue doesn't show if run the Ubuntu 20.04.5 LTS with the same
> kernel natively on the Orin board.
>
> Any comments about this or what can I do to triage this issue?
>
