netdev - Re: 3.0: unexpected route cache entry for wrong segment?

Date:	Thu, 09 Feb 2012 18:45:19 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Michael Tokarev <mjt@....msk.ru>
Cc:	netdev <netdev@...r.kernel.org>
Subject: Re: 3.0: unexpected route cache entry for wrong segment?

On Thursday, 09 February 2012 at 21:02 +0400, Michael Tokarev wrote:
> Hello.
> 
> I'm observing a situation where just one single IP
> address from an entirely different segment gets routed
> locally as if it were in a directly-connected network.
> 
> Here's how.  The short version, to show the idea, first:
> 
> A host with single eth0 interface and single IP address
> (not counting loopback interface):
> 
> $ ip addr
> 8: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 52:54:c0:a8:b1:02 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.177.2/26 scope global eth0
> 
> $ ip route
> default via 192.168.177.5 dev eth0
> 192.168.177.0/26 dev eth0  proto kernel  scope link  src 192.168.177.2
> 
> $ ip neigh
> ...
> 192.168.177.5 dev eth0 lladdr 00:90:27:30:6d:1c REACHABLE
> 192.168.177.33 dev eth0 lladdr 38:60:77:25:3f:95 REACHABLE
> 192.168.19.166 dev eth0  FAILED
> 192.168.177.21 dev eth0 lladdr 52:54:c0:a8:b1:15 REACHABLE
> 
> The address in question is this 192.168.19.166 -- it should
> not be tried on locally connected ethernet segment, but instead
> should go to the (default) gateway at 192.168.177.5.
> 
> This machine is running a 3.0.18 kernel.  The gateway (also
> running this kernel) can access the IP in question just fine
> (it is 2 hops away from the gateway, and is directly reachable
> neither from the gw nor from the machine in question).
> 
> After some searching we found a very similar-looking
> issue:
> 
>  http://lists.openwall.net/netdev/2011/11/15/126
>   "Unable to flush ICMP redirect routes in kernel 3.0+"
> 
> with a good reproducer:
> 
>  http://lists.openwall.net/netdev/2011/11/16/138
> 
> The issue, however, is that in our case I can't reproduce
> this problem at all using the method described by Ivan Zahariev
> in the last message: sending redirects from the gateway for
> "random" addresses does not create corresponding "persistent"
> cache entries.  Once the route on the gw gets removed, that
> IP address starts working again from the machine in question.
> 
> So now we have only one IP address that behaves like this,
> and I can't get other addresses to repeat its behaviour.
> 
> The problem appeared suddenly, while the network was in
> use.
> 
> What is also interesting here is that the gateway should
> never send a redirect like that, because it has an explicit
> route for that network pointing to an entirely different
> machine.
> 
> I can work around the _current_ problem we're facing by
> moving the host in question (192.168.19.166) to another
> IP address.  But I'd love to understand what's going on
> here.
> 
> Also, it appears that the patch that emerged from the
> mentioned discussion hasn't been released in any
> stable kernels so far - is there some issue with it?
> 
> And since I can't reproduce the issue here as described
> above, I've one more question: should it be reproducible?
> 
> And finally, here's some more details about our setup.
> It is actually a "bit" more complex, involving bridges,
> vlans, veth and tap devices.
> 
> The "host" in question is an lxc guest on a veth interface.
> Its veth iface is connected to a bridge "tls-br" on the
> host.  I'm still omitting some details (like other lxc
> guests which have very similar config, and also kvm
> guests with tap interfaces).
> 
>  host$ ip addr
>  2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>      link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
>  3: tls-vlan@...0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master tls-br state UP
>      link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
>  4: tls-br: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
>      link/ether 00:1f:c6:ef:e5:1b brd ff:ff:ff:ff:ff:ff
>      inet 192.168.177.15/26 brd 192.168.177.63 scope global tls-br
>  9: veth-tsrv: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master tls-br state UP qlen 1000
>      link/ether 5e:e8:4f:67:80:17 brd ff:ff:ff:ff:ff:ff
> 
> tls-br connects tls-vlan@...0 and veth-tsrv.  It has an
> address from the same 192.168.177/26 segment as the guest
> in question.
> 
>  host$ ip route
>  default via 192.168.177.5 dev tls-br
>  192.168.177.0/26 dev tls-br  proto kernel  scope link  src 192.168.177.15
>  (this is a complete routing table, there's no more routes)
> 
> What is also very interesting is that this problem with
> this single IP address affects ALL lxc machines on this
> host at once, and the host itself:
> 
>  host$ ip neigh
>  192.168.177.35 dev tls-br lladdr 6c:f0:49:9d:f2:0c STALE
>  192.168.19.166 dev tls-br  FAILED
>  192.168.177.38 dev tls-br lladdr 38:60:77:25:3f:9c STALE
>  192.168.177.5 dev tls-br lladdr 00:90:27:30:6d:1c DELAY
>  ...
> 
> (after trying to ping it).
> 
> Each "subdivision" on this host has its own arp table, but
> every subdivision (the host itself or any of its lxc guests, which
> all have similar config) always tries to reach this very
> IP address directly.
> 
>  otherLXCguest$ ip n
>  192.168.19.166 dev eth0  INCOMPLETE
>  192.168.177.15 dev eth0 lladdr 00:1f:c6:ef:e5:1b STALE
>  192.168.177.5 dev eth0 lladdr 00:90:27:30:6d:1c DELAY
> 
> So.. it looks like something does not work right across
> namespaces.
> 
> Any clue what's going on?
> 
> Thank you!

Did you try applying these commits by hand:

7cc9150ebe8ec06cafea9f1c10d92ddacf88d8ae   // added in 3.2
(route: fix ICMP redirect validation)

and
9cc20b268a5a14f5e57b8ad405a83513ab0d78dc
(ipv4: fix redirect handling)
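
For reference, the expectation in the quoted report can be checked with a
small sketch. It models only the two routes shown in the guest's `ip route`
output and does a longest-prefix match; it does not model the route cache,
so it shows what the routing table alone should yield, not what the kernel
actually did. The names ROUTES, NEIGHBOURS, next_hop and off_link_neighbours
are made up for this illustration.

```python
import ipaddress

# Routes copied from the quoted `ip route` output of the guest:
# (prefix, gateway), with gateway None meaning "directly connected".
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.177.5"),
    (ipaddress.ip_network("192.168.177.0/26"), None),
]

# Addresses listed in the quoted `ip neigh` output of the guest.
NEIGHBOURS = [
    "192.168.177.5",
    "192.168.177.33",
    "192.168.19.166",
    "192.168.177.21",
]


def next_hop(dst):
    """Return the address that should be resolved via ARP to reach dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, gw) for net, gw in ROUTES if addr in net]
    # Longest-prefix match: the most specific matching route wins.
    net, gw = max(matches, key=lambda m: m[0].prefixlen)
    return gw if gw is not None else dst


def off_link_neighbours(neigh_ips):
    """Neighbour entries that the routing table alone would never create."""
    return [ip for ip in neigh_ips if next_hop(ip) != ip]


if __name__ == "__main__":
    print(next_hop("192.168.19.166"))
    print(off_link_neighbours(NEIGHBOURS))
```

By this table, 192.168.19.166 should be sent to 192.168.177.5, so a
neighbour entry for 192.168.19.166 itself has to come from somewhere other
than these two routes, which is why the redirect fixes above are the first
thing to try.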



