lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20090419001552.GA15667@gondor.apana.org.au>
Date:	Sun, 19 Apr 2009 08:15:52 +0800
From:	Herbert Xu <herbert@...dor.apana.org.au>
To:	"Michael S. Tsirkin" <m.s.tsirkin@...il.com>
Cc:	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Patrick McHardy <kaber@...sh.net>,
	Matias Zabaljauregui <zabaljauregui@...il.com>, odie@...aau.dk,
	Rusty Russell <rusty@...tcorp.com.au>, lguest@...abs.org,
	virtualization@...ts.osdl.org,
	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
	Christian Borntraeger <borntraeger@...ibm.com>
Subject: Re: [1/2] tun: Only free a netdev when all tun descriptors are
	closed

On Sat, Apr 18, 2009 at 10:09:39PM +0300, Michael S. Tsirkin wrote:
> 
> Does this work with TUN_PERSIST off?
> I haven't tested this, but won't unregister_netdev block forever
> waiting for device reference to become 0? Maybe you want
> 
> +		tun_put(tun);
> +		if (!(tun->flags & TUN_PERSIST))
> +			unregister_netdev(tun->dev);
> 
> or is there a race here?

Good point.  I should've just left the old code alone as it wasn't
broken.  So here is yet another update.

I also fixed a bug on the allocation error handling path.

tun: Only free a netdev when all tun descriptors are closed

The commit c70f182940f988448f3c12a209d18b1edc276e33 ("tun: Fix
races between tun_net_close and free_netdev") fixed a race where
an asynchronous deletion of a tun device can hose a poll(2) on
a tun fd attached to that device.

However, this came at the cost of moving the tun wait queue into
the tun file data structure.  The problem with this is that it
imposes restrictions on when and where the tun device can access
the wait queue since the tun file may change at any time due to
detaching and reattaching.

In particular, now that we need to use the wait queue on the
receive path it becomes difficult to properly synchronise this
with the detachment of the tun device.

This patch solves the original race in a different way.  Since
the race is only because the underlying memory gets freed, we
can prevent it simply by ensuring that we don't do that until
all tun descriptors ever attached to the device (even if they
have since be detached because they may still be sitting in poll)
have been closed.

This is done by using reference counting the attached tun file
descriptors.  The refcount in tun->sk has been reappropriated
for this purpose since it was already being used for that, albeit
from the opposite angle.

Note that we no longer zero tfile->tun since tun_get will return
NULL anyway after the refcount on tfile hits zero.  Instead it
represents whether this device has ever been attached to a device.

Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 16716ae..95ae40a 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -156,6 +156,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file)
 	tfile->tun = tun;
 	tun->tfile = tfile;
 	dev_hold(tun->dev);
+	sock_hold(tun->sk);
 	atomic_inc(&tfile->count);
 
 out:
@@ -165,11 +166,8 @@ out:
 
 static void __tun_detach(struct tun_struct *tun)
 {
-	struct tun_file *tfile = tun->tfile;
-
 	/* Detach from net device */
 	netif_tx_lock_bh(tun->dev);
-	tfile->tun = NULL;
 	tun->tfile = NULL;
 	netif_tx_unlock_bh(tun->dev);
 
@@ -339,6 +337,13 @@ static void tun_net_uninit(struct net_device *dev)
 	}
 }
 
+static void tun_free_netdev(struct net_device *dev)
+{
+	struct tun_struct *tun = netdev_priv(dev);
+
+	sock_put(tun->sk);
+}
+
 /* Net device open. */
 static int tun_net_open(struct net_device *dev)
 {
@@ -811,7 +816,7 @@ static void tun_setup(struct net_device *dev)
 	tun->group = -1;
 
 	dev->ethtool_ops = &tun_ethtool_ops;
-	dev->destructor = free_netdev;
+	dev->destructor = tun_free_netdev;
 }
 
 /* Trivial set of netlink ops to allow deleting tun or tap
@@ -848,7 +853,7 @@ static void tun_sock_write_space(struct sock *sk)
 
 static void tun_sock_destruct(struct sock *sk)
 {
-	dev_put(container_of(sk, struct tun_sock, sk)->tun->dev);
+	free_netdev(container_of(sk, struct tun_sock, sk)->tun->dev);
 }
 
 static struct proto tun_proto = {
@@ -920,11 +925,8 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		if (!sk)
 			goto err_free_dev;
 
-		/* This ref count is for tun->sk. */
-		dev_hold(dev);
 		sock_init_data(&tun->socket, sk);
 		sk->sk_write_space = tun_sock_write_space;
-		sk->sk_destruct = tun_sock_destruct;
 		sk->sk_sndbuf = INT_MAX;
 		sk->sk_sleep = &tfile->read_wait;
 
@@ -942,11 +944,13 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		err = -EINVAL;
 		err = register_netdevice(tun->dev);
 		if (err < 0)
-			goto err_free_dev;
+			goto err_free_sk;
+
+		sk->sk_destruct = tun_sock_destruct;
 
 		err = tun_attach(tun, file);
 		if (err < 0)
-			goto err_free_dev;
+			goto failed;
 	}
 
 	DBG(KERN_INFO "%s: tun_set_iff\n", tun->dev->name);
@@ -1284,14 +1288,16 @@ static int tun_chr_close(struct inode *inode, struct file *file)
 		__tun_detach(tun);
 
 		/* If desireable, unregister the netdevice. */
-		if (!(tun->flags & TUN_PERSIST)) {
-			sock_put(tun->sk);
+		if (!(tun->flags & TUN_PERSIST))
 			unregister_netdevice(tun->dev);
-		}
 
 		rtnl_unlock();
 	}
 
+	tun = tfile->tun;
+	if (tun)
+		sock_put(tun->sk);
+
 	put_net(tfile->net);
 	kfree(tfile);
 
Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ