Date:	Tue, 28 Sep 2010 18:45:58 +0200
From:	Nicolas Dichtel <nicolas.dichtel@...nd.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	netdev <netdev@...r.kernel.org>,
	Octavian Purdila <opurdila@...acom.com>
Subject: Re: [PATCH] ipv4: remove all rt cache entries on UNREGISTER event

Eric Dumazet wrote:
> On Tuesday, 28 September 2010 at 17:24 +0200, Nicolas Dichtel wrote:
>> Hi,
>>
>> I hit a problem when I try to remove an interface: 
>> netdev_wait_allrefs() complains about the refcount.
>>
>> Here is a trivial scenario to reproduce the problem:
>> # ip tunnel add mode ipip remote 10.16.0.164 local 10.16.0.72 dev eth0
>> # ./a.out tunl1
>> # ip tunnel del tunl1
>>
>> Note: the a.out binary creates an IPv4 raw socket, binds it to tunl1 
>> (SO_BINDTODEVICE), enables multicast loopback (IP_MULTICAST_LOOP), sets 
>> the multicast interface to tunl1 (IP_MULTICAST_IF), builds the IP header 
>> itself (IP_HDRINCL) and then sends a single packet (192.168.6.1 -> 224.0.0.18).
>>
>> Note2: when a.out is executed, tunl1 has no IP address and is down.
>>
> 
> CC Octavian Purdila, the patch author.
> 
> I am just wondering why this route is created in the first place.
At first I asked myself the same question, but it seems that sending a 
packet through this kind of socket is allowed even if the interface is 
down. The packet will be dropped by the noop qdisc.
But I agree that it is strange to perform a route lookup and everything 
else only to drop the packet at the end ...
Maybe raw_sendmsg() could drop it directly ;-) ... or maybe 
ip_route_output_flow().

Any suggestions welcome.
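
For reference, here is a minimal sketch of what a.out does, following the 
description quoted above. It is not the original source: the function names, 
the protocol number (112, VRRP, guessed from the 224.0.0.18 destination) and 
the TTL are assumptions, and the packet carries no payload.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define TEST_PROTO 112	/* assumption: VRRP, since 224.0.0.18 is the VRRP group */

/* Fill a bare 20-byte IPv4 header, 192.168.6.1 -> 224.0.0.18, no payload.
 * With IP_HDRINCL, Linux fills in the checksum and total length itself. */
void build_ip_header(struct iphdr *iph)
{
	memset(iph, 0, sizeof(*iph));
	iph->version = 4;
	iph->ihl = 5;				/* 5 * 4 = 20 bytes, no options */
	iph->tot_len = htons(sizeof(*iph));
	iph->ttl = 1;				/* assumption: link-local multicast */
	iph->protocol = TEST_PROTO;
	iph->saddr = inet_addr("192.168.6.1");
	iph->daddr = inet_addr("224.0.0.18");
}

/* Send one packet through ifname (e.g. "tunl1"). Needs CAP_NET_RAW.
 * Returns 0 on success, -1 on any failure. */
int send_test_packet(const char *ifname)
{
	struct iphdr iph;
	struct sockaddr_in dst;
	struct ip_mreqn mreq;
	int on = 1, ret = -1;
	int fd = socket(AF_INET, SOCK_RAW, TEST_PROTO);

	if (fd < 0)
		return -1;

	memset(&mreq, 0, sizeof(mreq));
	mreq.imr_ifindex = (int)if_nametoindex(ifname);
	if (mreq.imr_ifindex == 0)
		goto out;

	/* Same socket options as listed in the description, in the same order. */
	if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname,
		       (socklen_t)strlen(ifname) + 1) < 0)
		goto out;
	if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP, &on, sizeof(on)) < 0)
		goto out;
	if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &mreq, sizeof(mreq)) < 0)
		goto out;
	if (setsockopt(fd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0)
		goto out;

	build_ip_header(&iph);
	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_addr.s_addr = iph.daddr;

	if (sendto(fd, &iph, sizeof(iph), 0, (struct sockaddr *)&dst,
		   sizeof(dst)) == (ssize_t)sizeof(iph))
		ret = 0;
out:
	close(fd);
	return ret;
}
```

A small main() that calls send_test_packet(argv[1]) as root, run between 
"ip tunnel add" and "ip tunnel del", should be enough to create the cache 
entry while tunl1 is down and has no address.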

Regards,
Nicolas

> 
> Maybe a fix would be to forbid this ?
> 
> Some machines have a giant route cache, so it's very important to avoid
> expensive scans.
> 
>> Then I get a series of "kernel:[1206699.728010] unregister_netdevice: 
>> waiting for tunl1 to become free. Usage count = 3" messages and, after 
>> some time, the interface is removed.
>>
>> The problem is that route cache entries are only invalidated on the 
>> UNREGISTER event, not removed (behavior introduced by commit 
>> e2ce146848c81af2f6d42e67990191c284bf0c33). We have to wait for 
>> rt_check_expire() to remove the remaining route cache entries.
>>
>> To fix the problem, I propose to revert part of that commit.
>>
>> Regards,
>> Nicolas
>> attachment: file diff
>> (0001-ipv4-remove-all-rt-cache-entries-on-UNREGISTER-even.patch)
>> From 3344e2e0431fe803c4dac8757a8746908357d780 Mon Sep 17 00:00:00 2001
>> From: Nicolas Dichtel <nicolas.dichtel@...nd.com>
>> Date: Tue, 28 Sep 2010 16:38:19 +0200
>> Subject: [PATCH] ipv4: remove all rt cache entries on UNREGISTER event
>>
>> Commit e2ce146848c81af2f6d42e67990191c284bf0c33 (ipv4: factorize cache clearing
>> for batched unregister operations) added a new parameter to fib_disable_ip() to
>> only invalidate route cache entries on the unregister event.
>> This is wrong: we should ensure that all cache entries are removed on the
>> unregister event, otherwise netdev_wait_allrefs() may complain. A cache entry
>> can be created between the DOWN and UNREGISTER events.
>>
>> So, I revert that part of the patch.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@...nd.com>
>> ---
>>  net/ipv4/fib_frontend.c |   10 +++++-----
>>  1 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
>> index 7d02a9f..377e815 100644
>> --- a/net/ipv4/fib_frontend.c
>> +++ b/net/ipv4/fib_frontend.c
>> @@ -917,11 +917,11 @@ static void nl_fib_lookup_exit(struct net *net)
>>  	net->ipv4.fibnl = NULL;
>>  }
>>  
>> -static void fib_disable_ip(struct net_device *dev, int force, int delay)
>> +static void fib_disable_ip(struct net_device *dev, int force)
>>  {
>>  	if (fib_sync_down_dev(dev, force))
>>  		fib_flush(dev_net(dev));
>> -	rt_cache_flush(dev_net(dev), delay);
>> +	rt_cache_flush(dev_net(dev), 0);
>>  	arp_ifdown(dev);
>>  }
>>  
>> @@ -944,7 +944,7 @@ static int fib_inetaddr_event(struct notifier_block *this, unsigned long event,
>>  			/* Last address was deleted from this interface.
>>  			   Disable IP.
>>  			 */
>> -			fib_disable_ip(dev, 1, 0);
>> +			fib_disable_ip(dev, 1);
>>  		} else {
>>  			rt_cache_flush(dev_net(dev), -1);
>>  		}
>> @@ -959,7 +959,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
>>  	struct in_device *in_dev = __in_dev_get_rtnl(dev);
>>  
>>  	if (event == NETDEV_UNREGISTER) {
>> -		fib_disable_ip(dev, 2, -1);
>> +		fib_disable_ip(dev, 2);
>>  		return NOTIFY_DONE;
>>  	}
>>  
>> @@ -977,7 +977,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
>>  		rt_cache_flush(dev_net(dev), -1);
>>  		break;
>>  	case NETDEV_DOWN:
>> -		fib_disable_ip(dev, 0, 0);
>> +		fib_disable_ip(dev, 0);
>>  		break;
>>  	case NETDEV_CHANGEMTU:
>>  	case NETDEV_CHANGE:
> 
> 