Message-Id: <201102231243.23579.alexandre.sidorenko@hp.com>
Date:	Wed, 23 Feb 2011 12:43:23 -0500
From:	Alex Sidorenko <alexandre.sidorenko@...com>
To:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Stale entries in RT_TABLE_LOCAL

Hello,

I have found several scenarios in which, after deleting an IP address from an 
interface, a stale entry is left in RT_TABLE_LOCAL.

All of these scenarios rely on the fact that it is possible to add the same 
address multiple times to the same interface using different masks.

Let us do the following using the dummy0 interface:

ifconfig dummy0 192.168.140.31 netmask 255.255.252.0
ip addr add 192.168.142.109/23 dev dummy0
ip addr add 192.168.142.109/22 dev dummy0
ip addr del 192.168.142.109/22 dev dummy0
ip addr del 192.168.142.109/23 dev dummy0

We add 192.168.142.109/23 and 192.168.142.109/22, then delete them, /22 first 
(the order is important). After that, 192.168.142.109 is no longer shown by 
'ip addr ls', but there are still entries using this addr in RT_TABLE_LOCAL.
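
To see how the two prefixes relate, it helps to work out the network and 
broadcast address each one implies. The following is only a small arithmetic 
sketch: IP4(), prefix_to_mask(), network_of() and broadcast_of() are names I 
made up for illustration, and addresses are kept in host byte order rather 
than the network byte order the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a dotted quad into a host-order 32-bit value (illustration only). */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Netmask for a prefix length in the range 0..32. */
uint32_t prefix_to_mask(unsigned int prefix_len)
{
    assert(prefix_len <= 32);
    /* A shift by 32 would be undefined, so handle /0 separately. */
    return prefix_len == 0 ? 0 : ~(uint32_t)0 << (32 - prefix_len);
}

/* Network address: the host bits are cleared. */
uint32_t network_of(uint32_t addr, unsigned int prefix_len)
{
    return addr & prefix_to_mask(prefix_len);
}

/* Directed broadcast address: the host bits are all set. */
uint32_t broadcast_of(uint32_t addr, unsigned int prefix_len)
{
    return addr | ~prefix_to_mask(prefix_len);
}
```

For 192.168.142.109, network_of() gives 192.168.140.0 with /22 and 
192.168.142.0 with /23, while broadcast_of() gives 192.168.143.255 in both 
cases. So the two instances share both the local address and the broadcast 
address, and differ only in the prefix.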

An attached script demonstrates the problem:

{asid 14:00:57} sudo sh iptest.sh
Tables before the test
13: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 5e:1a:fa:44:90:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.140.31/22 brd 192.168.143.255 scope global dummy0
    inet6 fe80::5c1a:faff:fe44:90f6/64 scope link 
       valid_lft forever preferred_lft forever
 
local 192.168.140.31 dev dummy0  proto kernel  scope host  src 192.168.140.31 
broadcast 192.168.140.0 dev dummy0  proto kernel  scope link  src 192.168.140.31 
broadcast 192.168.143.255 dev dummy0  proto kernel  scope link  src 192.168.140.31 
----------------------
Tables after the test
13: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 5e:1a:fa:44:90:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.140.31/22 brd 192.168.143.255 scope global dummy0
    inet6 fe80::5c1a:faff:fe44:90f6/64 scope link 
       valid_lft forever preferred_lft forever
 
local 192.168.140.31 dev dummy0  proto kernel  scope host  src 192.168.140.31 
local 192.168.142.109 dev dummy0  proto kernel  scope host  src 192.168.140.31 
broadcast 192.168.143.255 dev dummy0  proto kernel  scope link  src 192.168.140.31 
broadcast 192.168.143.255 dev dummy0  proto kernel  scope link  src 192.168.142.109 


As you can see, even though 192.168.142.109 is no longer on the dummy0 address 
list, the entries referring to this addr are still present in RT_TABLE_LOCAL.

Another scenario (adding/deleting two addresses, each one twice with a 
different mask) can lead to stale entries cross-referencing each other, such as

local 192.168.5.8  proto kernel  scope host  src 192.168.5.9 
local 192.168.5.9  proto kernel  scope host  src 192.168.5.8 

Analysis
--------

Both scenarios use the fact that we can add the same address multiple times to 
the same interface, using different masks.

1. When we delete an IP addr, we remove it from the interface addr list and 
send a notifier to routing code (fib_del_ifaddr) asking to delete the 
associated routes.

2. When we enter fib_del_ifaddr(struct in_ifaddr *ifa), the address is already
deleted. But if we add the same IP twice (with different masks), the same
address (even though with different prefix) is present two times. So after the
first deletion we still have its 2nd instance on the list.

3. We do the following in fib_del_ifaddr():

         for (ifa1 = in_dev->ifa_list; ifa1; ifa1 = ifa1->ifa_next) {
                 if (ifa->ifa_local == ifa1->ifa_local)
                         ok |= LOCAL_OK;
                 if (ifa->ifa_broadcast == ifa1->ifa_broadcast)
                         ok |= BRD_OK;
                 if (brd == ifa1->ifa_broadcast)
                         ok |= BRD1_OK;
                 if (any == ifa1->ifa_broadcast)
                         ok |= BRD0_OK;
         }

That is, we loop over all addrs of the interface (in_dev->ifa_list) and compare
the address we have just deleted (passed in 'ifa') with the addresses on the 
list. 

As we compare them without taking prefix (mask) into account, the following 
will be true:

ifa->ifa_local == ifa1->ifa_local
ifa->ifa_broadcast == ifa1->ifa_broadcast

4. As a result, after deleting the first instance of IP (192.168.142.109/22) 
we still have  192.168.142.109/23 on the list. The routing code will find that 
this addr (and broadcast) are still present on the list and will not delete 
the routes.

5. When we delete the second time (192.168.142.109/23), there will be no
192.168.142.109 on the list anymore and the routing code will delete the route 
- but only one out of two entries.
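
Steps 1-5 can be replayed in a small user-space model. This is a sketch under 
simplifying assumptions, not kernel code: struct model_ifaddr, struct 
model_host and the model_*() functions are hypothetical names, only the 
LOCAL_OK part of the check is modelled, and I assume that every address add 
installs its own entry in the local table (in the output above the two 
entries differ in their src).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX 16

/* Hypothetical, simplified stand-ins; these are not kernel structures. */
struct model_ifaddr {
    uint32_t local;          /* plays the role of ifa_local */
    unsigned int prefix_len; /* never looked at by the check below */
};

struct model_host {
    struct model_ifaddr addrs[MODEL_MAX]; /* role of in_dev->ifa_list */
    size_t n_addrs;
    uint32_t local_routes[MODEL_MAX];     /* one slot per local entry */
    size_t n_routes;
};

/* Add an address. Assumption: every add installs its own local entry. */
void model_addr_add(struct model_host *h, uint32_t local,
                    unsigned int prefix_len)
{
    assert(h->n_addrs < MODEL_MAX && h->n_routes < MODEL_MAX);
    h->addrs[h->n_addrs].local = local;
    h->addrs[h->n_addrs].prefix_len = prefix_len;
    h->n_addrs++;
    h->local_routes[h->n_routes++] = local;
}

/* Remove the first local entry for ip, if there is one. */
static void model_route_del_one(struct model_host *h, uint32_t ip)
{
    for (size_t i = 0; i < h->n_routes; i++) {
        if (h->local_routes[i] == ip) {
            h->local_routes[i] = h->local_routes[--h->n_routes];
            return;
        }
    }
}

/* Delete local/prefix_len, then run the mask-blind check. */
void model_addr_del(struct model_host *h, uint32_t local,
                    unsigned int prefix_len)
{
    size_t i;
    int local_ok = 0;

    /* Steps 1 and 2: the address is unlinked before the check runs. */
    for (i = 0; i < h->n_addrs; i++)
        if (h->addrs[i].local == local &&
            h->addrs[i].prefix_len == prefix_len)
            break;
    if (i == h->n_addrs)
        return; /* no such address on the list */
    h->addrs[i] = h->addrs[--h->n_addrs];

    /* Step 3: only the address is compared, never the prefix. */
    for (i = 0; i < h->n_addrs; i++)
        if (h->addrs[i].local == local)
            local_ok = 1;

    /* Steps 4 and 5: nothing is removed while a twin remains, and
     * the last deletion removes a single entry only. */
    if (!local_ok)
        model_route_del_one(h, local);
}

/* Number of local entries that refer to ip. */
size_t model_count_routes(const struct model_host *h, uint32_t ip)
{
    size_t n = 0;
    for (size_t i = 0; i < h->n_routes; i++)
        if (h->local_routes[i] == ip)
            n++;
    return n;
}

/* 1 if ip is still on the address list, 0 otherwise. */
int model_has_addr(const struct model_host *h, uint32_t ip)
{
    for (size_t i = 0; i < h->n_addrs; i++)
        if (h->addrs[i].local == ip)
            return 1;
    return 0;
}
```

Starting from a zeroed struct model_host and replaying the five commands 
(add 192.168.140.31/22, add .142.109/23, add .142.109/22, del .142.109/22, 
del .142.109/23), model_has_addr() reports that 192.168.142.109 is gone while 
model_count_routes() still returns 1 for it, which is the stale entry.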

How this can be fixed
---------------------

I am not sure what the best way to fix this is, but I can think of several 
approaches:

  (a) change the sources so that it would be impossible to add the same IP 
      multiple times, even with different masks. I cannot think of any
      situation where adding the same IP (but with different mask) to the same
      interface could be useful. But maybe I am wrong?

  (b) improve the deletion algorithm in fib_del_ifaddr()

  (c) add a periodic cleanup that purges entries from the 'local' table that
      have no corresponding IP on the interface address list
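
To make (c) a bit more concrete, here is a user-space sketch of what such a 
cleanup pass would have to decide. It is only an illustration: 
purge_stale_local_routes() is a hypothetical helper working on plain arrays, 
not a kernel function, and it looks only at 'local' entries keyed by address 
(stale broadcast entries would need a similar pass).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compact local_routes[] in place, keeping only the entries whose
 * address still appears in addrs[]. Returns the number of entries kept.
 * Hypothetical helper for illustration; not a kernel function.
 */
size_t purge_stale_local_routes(uint32_t *local_routes, size_t n_routes,
                                const uint32_t *addrs, size_t n_addrs)
{
    size_t kept = 0;

    assert(local_routes != NULL || n_routes == 0);
    assert(addrs != NULL || n_addrs == 0);

    for (size_t i = 0; i < n_routes; i++) {
        int owned = 0;

        /* Is this route's address still configured somewhere? */
        for (size_t j = 0; j < n_addrs; j++) {
            if (addrs[j] == local_routes[i]) {
                owned = 1;
                break;
            }
        }
        if (owned)
            local_routes[kept++] = local_routes[i];
    }
    return kept;
}
```

With the 'after' state shown earlier (local entries for 192.168.140.31 and 
192.168.142.109, but only 192.168.140.31 on the address list) this keeps one 
entry and drops the stale one. A periodic pass like this only hides the 
problem after the fact, so (a) or (b) still look preferable to me.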


Impact
------

Stale entries in RT_TABLE_LOCAL make the host reply to ARP requests for those 
IPs, even though these IPs do not belong to any interface.

These scenarios might seem a bit pathological, but in reality they are 
possible on clusters with multiple addresses on several interfaces, where 
addresses are added/deleted for service migration. Address migration can be 
done both by software and by system administrators, and if a wrong mask is 
used by mistake, we can end up in this situation.

And yes, one of HP's customers hit exactly this problem. They saw a 'duplicate 
IP' issue after migrating some services and found that the host replied to 
ARP requests even though 'ip addr ls' did not show this address. It is not 
common knowledge that the ARP implementation uses RT_TABLE_LOCAL to decide 
whether an IP is local, so they were unable to understand what was wrong.

Regards,
Alex

------------------------------------------------------------------
Alexandre Sidorenko             email: asid@...com
WTEC Linux                      Hewlett-Packard (Canada)
------------------------------------------------------------------

Download attachment "iptest.sh" of type "application/x-shellscript" (598 bytes)
