Message-ID: <20090416195745.GA5826@dhcp-1-124.tlv.redhat.com>
Date: Thu, 16 Apr 2009 22:57:45 +0300
From: "Michael S. Tsirkin" <m.s.tsirkin@...il.com>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Patrick McHardy <kaber@...sh.net>,
Matias Zabaljauregui <zabaljauregui@...il.com>, odie@...aau.dk,
Rusty Russell <rusty@...tcorp.com.au>, lguest@...abs.org,
virtualization@...ts.osdl.org,
"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
Christian Borntraeger <borntraeger@...ibm.com>
Subject: Re: [1/2] tun: Only free a netdev when all tun descriptors are
closed
On Thu, Apr 16, 2009 at 01:08:18AM -0000, Herbert Xu wrote:
> On Wed, Apr 15, 2009 at 10:38:34PM +0800, Herbert Xu wrote:
> >
> > So how about this? We replace the dev destructor with our own that
> > doesn't immediately call free_netdev. We only call free_netdev once
> > all tun fd's attached to the device have been closed.
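A rough sketch of that destructor scheme, for illustration only (this is not
the patch below, and struct/field names such as tun_sock, tun->sk and
tun->dev are assumed from the 2.6.29-era drivers/net/tun.c):

/* Custom dev->destructor: drop a reference instead of freeing outright;
 * fds that were ever attached may still hold references on tun->sk. */
static void tun_free_netdev(struct net_device *dev)
{
	struct tun_struct *tun = netdev_priv(dev);

	sock_put(tun->sk);
}

/* Runs only when the last reference on tun->sk goes away, i.e. once
 * every tun fd ever attached to this device has been closed. */
static void tun_sock_destruct(struct sock *sk)
{
	free_netdev(container_of(sk, struct tun_sock, sk)->tun->dev);
}

/* Wired up at creation time, roughly:
 *	dev->destructor = tun_free_netdev;
 *	sk->sk_destruct = tun_sock_destruct;
 */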
>
> Here's the patch. I'd appreciate if everyone can review it
> and see if they can recreate the original race by
>
> 1) Starting a process using tun and polls on it;
> 2) Doing ip tun del tunXXX while the process is polling.
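For step 1, a minimal poller along these lines should do (untested sketch;
assumes CAP_NET_ADMIN, /dev/net/tun, and that "tun-test" is the device you
delete in step 2):

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

int main(void)
{
	struct ifreq ifr;
	struct pollfd pfd;
	int fd = open("/dev/net/tun", O_RDWR);

	if (fd < 0) {
		perror("open /dev/net/tun");
		return 1;
	}

	/* Create/attach the tun device we will poll on. */
	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
	strncpy(ifr.ifr_name, "tun-test", IFNAMSIZ - 1);
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		perror("TUNSETIFF");
		return 1;
	}

	/* Step 2: delete tun-test from another shell while this sleeps. */
	pfd.fd = fd;
	pfd.events = POLLIN;
	poll(&pfd, 1, -1);

	printf("poll returned, revents=0x%x\n", pfd.revents);
	close(fd);
	return 0;
}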
>
> tun: Only free a netdev when all tun descriptors are closed
>
> The commit c70f182940f988448f3c12a209d18b1edc276e33 ("tun: Fix
> races between tun_net_close and free_netdev") fixed a race where
> an asynchronous deletion of a tun device can hose a poll(2) on
> a tun fd attached to that device.
>
> However, this came at the cost of moving the tun wait queue into
> the tun file data structure. The problem with this is that it
> imposes restrictions on when and where the tun device can access
> the wait queue since the tun file may change at any time due to
> detaching and reattaching.
>
> In particular, now that we need to use the wait queue on the
> receive path it becomes difficult to properly synchronise this
> with the detachment of the tun device.
>
> This patch solves the original race in a different way. Since
> the race is only because the underlying memory gets freed, we
> can prevent it simply by ensuring that we don't do that until
> all tun descriptors ever attached to the device (even if they
> have since been detached, because they may still be sitting in poll)
> have been closed.
>
> This is done by reference counting the attached tun file
> descriptors. The refcount in tun->sk has been reappropriated
> for this purpose since it was already being used for that, albeit
> from the opposite angle.
>
> Note that we no longer zero tfile->tun since tun_get will return
> NULL anyway after the refcount on tfile hits zero. Instead it
> represents whether this file has ever been attached to a device.
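For illustration, the shape I assume for the helpers used in the hunk below
(their definitions are in the snipped part of the patch, so field names like
tfile->count and tun->tfile are my guesses, not the authoritative code):

/* Return the device only while the file still counts as attached. */
static struct tun_struct *__tun_get(struct tun_file *tfile)
{
	struct tun_struct *tun = NULL;

	if (atomic_inc_not_zero(&tfile->count))
		tun = tfile->tun;

	return tun;
}

/* Drop that reference; the final put detaches the file from the device. */
static void tun_put(struct tun_struct *tun)
{
	struct tun_file *tfile = tun->tfile;

	if (atomic_dec_and_test(&tfile->count))
		tun_detach(tfile->tun);
}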
>
> Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
>
>
> Cheers,
>
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>
> .....
>
> @@ -1275,20 +1278,18 @@ static int tun_chr_close(struct inode *inode, struct file *file)
> struct tun_file *tfile = file->private_data;
> struct tun_struct *tun = __tun_get(tfile);
>
> -
> if (tun) {
> - DBG(KERN_INFO "%s: tun_chr_close\n", tun->dev->name);
> -
> - rtnl_lock();
> - __tun_detach(tun);
> -
> /* If desireable, unregister the netdevice. */
> - if (!(tun->flags & TUN_PERSIST)) {
> - sock_put(tun->sk);
> - unregister_netdevice(tun->dev);
> - }
> + if (!(tun->flags & TUN_PERSIST))
> + unregister_netdev(tun->dev);
> + else
> + tun_put(tun);
> + } else
> + tun = tfile->tun;
>
> - rtnl_unlock();
> + if (tun) {
> + DBG(KERN_INFO "%s: tun_chr_close\n", tun->dev->name);
> + sock_put(tun->sk);
> }
>
> put_net(tfile->net);
This last bit seems to make a simple test using a non-persistent tap device
deadlock for me: we never drop the reference acquired with __tun_get, so
unregister_netdevice blocks, printing "unregister_netdevice: waiting for tap0
to become free. Usage count = 1".
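To spell out where I think the imbalance is, here is the new tun_chr_close()
logic from the hunk above, restated with comments (illustration only, not a
tested fix):

	struct tun_file *tfile = file->private_data;
	struct tun_struct *tun = __tun_get(tfile);	/* takes a reference
							 * if still attached */

	if (tun) {
		/* If desireable, unregister the netdevice. */
		if (!(tun->flags & TUN_PERSIST))
			unregister_netdev(tun->dev);	/* the __tun_get()
							 * reference is never
							 * dropped on this path,
							 * so the usage count
							 * stays at 1 and this
							 * waits forever */
		else
			tun_put(tun);			/* only the persistent
							 * path puts it */
	} else
		tun = tfile->tun;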