lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20090804064828.GA32071@gondor.apana.org.au>
Date:	Tue, 4 Aug 2009 16:48:28 +1000
From:	Herbert Xu <herbert@...dor.apana.org.au>
To:	David Miller <davem@...emloft.net>
Cc:	mpm@...enic.com, bhutchings@...arflare.com, netdev@...r.kernel.org,
	mcarlson@...adcom.com
Subject: Re: netpoll + xmit_lock == deadlock

On Mon, Aug 03, 2009 at 09:19:05PM -0700, David Miller wrote:
>
> > So I think we should be clear up front that the solution we're looking
> > for is 'don't crash the box when trying to printk from the tx path'.
> > 
> > I think the most straightforward, driver-agnostic way to do this
> > is to have netpoll wrap a driver's TX entrypoint such that recursion is
> > disabled while in that path. This might actually simplify netpoll's
> > internal locking ugliness. Something like:
> > 
> > int netpoll_xmit_wrapper(struct sk_buff *skb, struct net_device *dev)
> > {
> > 	struct netpoll_info *npinfo = dev->npinfo;
> > 	int ret;
> > 
> > 	npinfo->recurse += 1; /* add appropriate locking */
> > 	ret = npinfo->orig_tx(skb, dev);
> > 	npinfo->recurse -= 1;
> > 	
> > 	return ret;
> > }
> 
> Looks workable.  Probably we even hold the netpoll lock here so
> no additional protection would even be needed.

Yes that should solve the xmit problem.  However, we could still
have similar issues with other parts of the driver.  For instance,
the driver may take a lock in one of the ethtool/control functions
that is also taken on either the xmit or poll path.  If you then
get a printk in the ethtool/control function then you'll be in the
same situation.

Basically anytime we're executing driver code we may be vulnerable
to a netpoll dead-lock.

So I think your suggestion should be extended to cover all entries
into the driver code, and not just the xmit function.  Perhaps
something like

#define netdev_driver_op(op, ...)				\
	({							\
		struct netpoll_info *npinfo = dev->npinfo;	\
		int ret; 					\
								\
		npinfo->recurse++;				\
		ret = op(__VA_ARGS__);				\
		npinfo->recurse--;
								\
		ret;						\
	})

This still leaves driver timers and other async driver code.  We
can add a similar marker for them but someone will have to audit
the drivers.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ