Message-Id: <1248904097.4545.2934.camel@calx>
Date:	Wed, 29 Jul 2009 16:48:17 -0500
From:	Matt Mackall <mpm@...enic.com>
To:	Neil Horman <nhorman@...driver.com>
Cc:	Herbert Xu <herbert@...dor.apana.org.au>,
	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
	Matt Carlson <mcarlson@...adcom.com>
Subject: Re: netpoll + xmit_lock == deadlock

On Wed, 2009-07-29 at 15:43 -0400, Neil Horman wrote:
> On Wed, Jul 29, 2009 at 02:07:58PM -0500, Matt Mackall wrote:
> > On Wed, 2009-07-29 at 15:35 +0800, Herbert Xu wrote:
> > > Hi:
> > > 
> > > While working on TX mitigation, I noticed that while netpoll
> > > takes care to avoid recursive deadlocks on the NAPI path, it
> > > has no protection against the TX path when calling the poll
> > > function.
> > > 
> > > So if a driver is in the TX path, and a printk occurs, then a
> > > recursive deadlock can occur if that driver tries to take the
> > > xmit lock in its poll function to clean up descriptors.
> > > 
> > > Fortunately not a lot of drivers do this but at least some are
> > > vulnerable to it, e.g., tg3.
> > > 
> > > So we need to make it very clear that the poll function must
> > > not take any locks or they must use try_lock if the driver is
> > > to support netpoll.
> > 
> > What do you propose?
> 
> I think there is actually some recursion protection.  If you look in
> netpoll_send_skb (where all netpoll transmits pass through), we do a
> __netif_tx_trylock, and only continue down the tx path if we obtain the lock.
> If not, we call netpoll_poll, wait a while, and try again.  I think that should
> prevent the deadlock condition you are concerned about.
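
The trylock-and-retry pattern described above can be sketched roughly as follows. This is a single-file userspace model, not the kernel's netpoll_send_skb: struct txq, txq_trylock, poll_device and send_with_trylock are hypothetical names invented for illustration, and the poll stand-in here takes no locks at all. A poll function that itself blocks on the xmit lock is exactly the case under discussion, and this sketch does not model it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device TX queue guarded by an xmit lock. */
struct txq {
    atomic_flag xmit_lock; /* set while someone is in the TX path */
    int sent;              /* packets handed to the "hardware" */
    int polls;             /* times the poll stand-in was invoked */
};

void txq_init(struct txq *q)
{
    atomic_flag_clear(&q->xmit_lock);
    q->sent = 0;
    q->polls = 0;
}

/* Non-blocking acquire: returns true only if the lock was free. */
bool txq_trylock(struct txq *q)
{
    return !atomic_flag_test_and_set(&q->xmit_lock);
}

void txq_unlock(struct txq *q)
{
    atomic_flag_clear(&q->xmit_lock);
}

/* Stand-in for polling the device; assumed here to take no locks. */
void poll_device(struct txq *q)
{
    q->polls++;
}

/*
 * Try to transmit without ever blocking on the xmit lock.
 * If the lock is busy (for example, because we were called from
 * inside the TX path), poll the device and try again, giving up
 * after max_tries attempts. Returns true if the packet was sent.
 */
bool send_with_trylock(struct txq *q, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (txq_trylock(q)) {
            q->sent++;
            txq_unlock(q);
            return true;
        }
        poll_device(q);
    }
    return false;
}
```

With the lock free, send_with_trylock succeeds on the first attempt without polling. With the lock already held by the caller, it polls max_tries times and gives up instead of spinning forever on a lock it already owns.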

Maybe. The general point remains that drivers implementing poll need to
be aware of possible recursion through printk/netconsole in the xmit
path. If there are private locks, netpoll is powerless to prevent
recursive lock attempts.

It occurs to me that we might be able to tell when we've moved from core
kernel into a driver's tx path by wrapping the tx method pointer, or its
call sites, with something that disables netconsole until it exits.
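
A minimal userspace sketch of that wrapping idea, under stated assumptions: struct fake_dev, guarded_xmit, console_write and noisy_xmit are hypothetical names, not kernel APIs, and a single per-device flag stands in for whatever per-CPU or per-queue state a real implementation would need. Messages logged while the flag is set are simply dropped here; deferring them would be another option.

```c
#include <stdbool.h>

struct fake_dev;
typedef int (*xmit_fn)(struct fake_dev *dev, const char *pkt);

/* Hypothetical device model: a tx method plus a "inside driver tx" flag. */
struct fake_dev {
    xmit_fn xmit;          /* the driver's tx method */
    bool in_driver_tx;     /* true while the driver's tx method runs */
    int console_sent;      /* console messages passed to the driver */
    int console_dropped;   /* console messages suppressed by the guard */
};

/* Wrapper used at every tx call site: mark entry, call driver, restore. */
int guarded_xmit(struct fake_dev *dev, const char *pkt)
{
    bool was = dev->in_driver_tx;

    dev->in_driver_tx = true;
    int rc = dev->xmit(dev, pkt);
    dev->in_driver_tx = was;
    return rc;
}

/* Stand-in for netconsole: refuses to transmit from inside the tx path. */
int console_write(struct fake_dev *dev, const char *msg)
{
    if (dev->in_driver_tx) {
        dev->console_dropped++;
        return -1;
    }
    dev->console_sent++;
    return guarded_xmit(dev, msg);
}

/* A driver tx method that logs, which would otherwise recurse. */
int noisy_xmit(struct fake_dev *dev, const char *pkt)
{
    (void)pkt;
    console_write(dev, "tx debug message");
    return 0;
}
```

Calling console_write on a device whose tx method is noisy_xmit sends the outer message once; the nested message issued from inside the driver is dropped rather than re-entering the tx path, so any private driver locks are never taken recursively.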

-- 
http://selenic.com : development and support for Mercurial and Linux

