lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5b509b1370d42fd0cc109fc8914272be6dcfcd54.camel@gmx.de>
Date: Fri, 22 Aug 2025 05:54:28 +0200
From: Mike Galbraith <efault@....de>
To: Breno Leitao <leitao@...ian.org>
Cc: Pavel Begunkov <asml.silence@...il.com>, Jakub Kicinski
 <kuba@...nel.org>,  Johannes Berg <johannes@...solutions.net>,
 paulmck@...nel.org, LKML <linux-kernel@...r.kernel.org>, 
 netdev@...r.kernel.org, boqun.feng@...il.com
Subject: Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning

On Thu, 2025-08-21 at 10:35 -0700, Breno Leitao wrote:
> > On Thu, Aug 21, 2025 at 05:51:59AM +0200, Mike Galbraith wrote:
 
> > > > --- a/drivers/net/netconsole.c
> > > > +++ b/drivers/net/netconsole.c
> > > > @@ -1952,12 +1952,12 @@ static void netcon_write_thread(struct c
> > > >  static void netconsole_device_lock(struct console *con, unsigned long *flags)
> > > >  {
> > > >  	/* protects all the targets at the same time */
> > > > -	spin_lock_irqsave(&target_list_lock, *flags);
> > > > +	spin_lock(&target_list_lock);
> > 
> > I personally think this target_list_lock can be moved to an RCU lock.
> > 
> > If that is doable, then we probably make netconsole_device_lock()
> > to a simple `rcu_read_lock()`, which would solve this problem as well.

The bigger issue for the nbcon patch would seem to be the seemingly
required .write_atomic leading to landing here with disabled IRQs.

WRT my patch, seeing a hard RT crash on wired box cleanly logged with
your nbcon patch applied (plus my twiddle mentioned earlier) tells me
my patch has lost its original reason to exist.  It's relevant to this
thread only in that those once thought to be RT specific IRQ disable
spots turned out to actually be RT agnostic wireless sore spots.

> > > > --- a/net/core/netpoll.c
> > > > +++ b/net/core/netpoll.c
> > > > @@ -58,6 +58,29 @@ static void zap_completion_queue(void);
> > > >  static unsigned int carrier_timeout = 4;
> > > >  module_param(carrier_timeout, uint, 0644);
> > > >  
> > > > +DEFINE_PER_CPU(int, _netpoll_tx_running);
> > > > +EXPORT_PER_CPU_SYMBOL(_netpoll_tx_running);
> > > > +
> > > > +#define
> > > > netpoll_tx_begin(flags)					\
> > > > +	do
> > > > {							\
> > > > +		if (IS_ENABLED(CONFIG_PREEMPT_RT)
> > > > ||		\
> > > > +		   
> > > > IS_ENABLED(CONFIG_NETCONSOLE_NBCON))	\
> > > > +			local_bh_disable();		
> > > > 	\
> > > > +		else					
> > > > 	\
> > > > +			local_irq_save(flags);		
> > > > 	\
> > > > +		this_cpu_write(_netpoll_tx_running,
> > > > 1);		\
> > > > +	} while (0)
> > 
> > Why can't we just use local_bh_disable() in both cases?

Yeah, believe so.

> > > 
> > > > @@ -246,7 +269,7 @@ static void refill_skbs(struct netpoll *
> > > >  static void zap_completion_queue(void)
> > > >  {
> > > >  	unsigned long flags;
> > > > -	struct softnet_data *sd = &get_cpu_var(softnet_data);
> > > > +	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
> > 
> > How do I check if this is safe to do ?

Too much water under the bridge, I don't recall my path to conclusion
reached, and it seems to no longer matter.

	-Mike

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ