Message-ID: <20250219-astonishing-nimble-wolverine-1300f0@leitao>
Date: Wed, 19 Feb 2025 07:36:45 -0800
From: Breno Leitao <leitao@...ian.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	David Wei <dw@...idwei.uk>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, paulmck@...nel.org,
	kernel-team@...a.com
Subject: Re: [PATCH net-next v2] netdevsim: call napi_schedule from a timer
 context

Hello Jakub,

On Mon, Feb 17, 2025 at 11:50:31AM -0800, Jakub Kicinski wrote:
> On Mon, 17 Feb 2025 09:35:29 -0800 Breno Leitao wrote:
> > The netdevsim driver was experiencing NOHZ tick-stop errors during packet
> > transmission due to pending softirq work when calling napi_schedule().
> > This issue was observed when running the netconsole selftest, which
> > triggered the following error message:
> > 
> >   NOHZ tick-stop error: local softirq work is pending, handler #08!!!
> > 
> > To fix this issue, introduce a timer that schedules napi_schedule()
> > from a timer context instead of calling it directly from the TX path.
> > 
> > Create an hrtimer for each queue and kick it from the TX path,
> > which then schedules napi_schedule() from the timer context.
> 
> This crashes in the nl_netdev test.

Yeah, a nasty crash. Looking at the trace, it seems to be disabling the
timer before initializing it, so timer->base was never properly
assigned/set.

> I think you should move the hrtimer init to nsim_queue_alloc()
> and removal to nsim_queue_free()

That seems to make nl_netdev happier. Let me do more tests, and then ask
NIPA to finish the work.
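
To illustrate the lifetime pairing suggested above, here is a minimal
userspace sketch. It is not the netdevsim code: every name in it
(fake_timer, fake_queue, fake_timer_init and so on) is hypothetical, and
the "base" field only models the idea of timer->base being unset until
the timer is initialized. The point is that when init happens in the
alloc path and cancel in the free path, cancel can never see an
uninitialized timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a timer; "base" models timer->base. */
struct fake_timer {
	const char *base;	/* NULL until the timer is initialized */
	bool armed;
};

/* Hypothetical stand-in for a per-queue structure owning one timer. */
struct fake_queue {
	struct fake_timer napi_timer;
};

static void fake_timer_init(struct fake_timer *t)
{
	assert(t != NULL);
	t->base = "clock-monotonic";
	t->armed = false;
}

/* Returns 0 on success, -1 if the timer was never initialized. */
static int fake_timer_cancel(struct fake_timer *t)
{
	if (!t->base)
		return -1;	/* models cancelling before init */
	t->armed = false;
	return 0;
}

/* Init is tied to allocation, so every live queue has a valid timer. */
static struct fake_queue *fake_queue_alloc(void)
{
	struct fake_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	fake_timer_init(&q->napi_timer);
	return q;
}

/* Cancel is tied to free, so it always runs on an initialized timer. */
static int fake_queue_free(struct fake_queue *q)
{
	int ret = fake_timer_cancel(&q->napi_timer);

	free(q);
	return ret;
}
```

In this model, cancelling a zeroed fake_timer returns -1, while any queue
obtained from fake_queue_alloc() can always be freed cleanly.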

Thanks!
