Message-ID: <20111117111145.252924f5@asterix.rh>
Date: Thu, 17 Nov 2011 11:11:45 -0200
From: Flavio Leitner <fbl@...hat.com>
To: Ivan Zahariev <famzah@...soft.com>
Cc: netdev@...r.kernel.org
Subject: Re: Unable to flush ICMP redirect routes in kernel 3.0+
On Thu, 17 Nov 2011 10:10:08 +0200
Ivan Zahariev <famzah@...soft.com> wrote:
> On 17.11.2011 at 02:33, Flavio Leitner wrote:
> > On Thu, 17 Nov 2011 00:32:18 +0200
> > Ivan Zahariev <famzah@...soft.com> wrote:
> >
> >> On 11/15/2011 11:09 PM, Eric Dumazet wrote:
> >>> On Tuesday, 15 November 2011 at 22:23 +0200, Ivan Zahariev wrote:
> >>>> Hello,
> >>>>
> >>>> We have changed nothing in our network infrastructure but only
> >>>> upgraded from Linux kernel 2.6.36.2 to 3.0.3. Here is the problem
> >>>> we are experiencing:
> >>>>
> >>>> ICMP redirected routes are cached forever, and they can be
> >>>> cleared only by a reboot.
> >>>>
> >> ### (bug #1) even though we flushed the route cache, the
> >> <redirected> route resurrects from somewhere, even without
> >> making any TCP requests
> >> ### this time what "ip" returns is consistent with the real
> >> (incorrect) routing behavior of machine5
> >> root@...hine5:~# ip route flush cache
> >> root@...hine5:~# ip route list cache match 8.8.4.4
> >> root@...hine5:~# ip route get 8.8.4.4
> >> 8.8.4.4 via 192.168.0.120 dev eth0 src 192.168.0.244
> >> cache <redirected> ipid 0x303a
> >>
> >> ### only a reboot clears the cached <redirected> routes
> > IIRC, the cache flush doesn't affect the inetpeer where the
> > redirected gateway is now stored, so even after flushing the
> > route cache, the inetpeer will restore the old info later.
> >
> > fbl
> OK, I guess my questions now are:
> * How to flush the inetpeer (redirected cache info) without having to
> reboot the machine?
It will expire after 10 minutes, provided you don't use that specific
host in the meantime.
> * Why "ip route" returns an incorrect route; example:
I am sorry for not being clear before. It is a bug, indeed.
> ### (bug #2) what "ip route" returns is inconsistent, because we are
> using the <redirected> route 192.168.0.120 in reality
> ### note that the count of the route lines increased with one
> root@...hine5:~# ip route list cache match 8.8.4.4
> 8.8.4.4 from 192.168.0.244 tos lowdelay via 192.168.0.8 dev eth0
> cache ipid 0x303a
> 8.8.4.4 tos lowdelay via 192.168.0.8 dev eth0 src 192.168.0.244
> cache ipid 0x303a
> 8.8.4.4 via 192.168.0.8 dev eth0 src 192.168.0.244
> cache
> 8.8.4.4 from 192.168.0.244 tos lowdelay via 192.168.0.8 dev eth0
> cache ipid 0x303a
>
> ### After "ip route flush cache", the output of "ip route" gets
> consistent with the real routing behavior of machine5
> root@...hine5:~# ip route flush cache
> root@...hine5:~# ip route list cache match 8.8.4.4
> root@...hine5:~# ip route get 8.8.4.4
> 8.8.4.4 via 192.168.0.120 dev eth0 src 192.168.0.244
> cache <redirected> ipid 0x303a
>
The redirected gateway is now stored in the inetpeer, which represents
a specific peer. In your case, you have one for 8.8.4.4.
When you flush the routing cache everything is flushed, except for
the inetpeer as far as I can tell. Later, when you try to access
the host 8.8.4.4 again, the lookup will create a fresh route but
also find the previous 8.8.4.4 inetpeer, so it will re-use the
previous redirected gateway.
Therefore, the routing itself is fine, but there is no way to
invalidate or expire all related inetpeer entries when the flush
happens.
The inetpeer will expire eventually, so waiting before trying again
works around the problem:
1) flush
2) wait to expire (10min)
3) try again
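
The three steps above could be scripted roughly as follows. This is
only a sketch: the function name check_redirect_gone and the WAIT_SECS
variable are made up for illustration, and it assumes the <redirected>
flag shows up in the "ip route get" output as in the examples quoted
earlier. It checks only once, after the wait, because the entry expires
only if the host is left alone; whether a lookup alone counts as using
the host is an assumption here, not something verified.

```shell
#!/bin/sh
# Sketch of the flush / wait / retry workaround. The flush needs root.
# WAIT_SECS is an illustrative knob; 600 s matches the usual 10 min
# inetpeer time to live mentioned in this message.
WAIT_SECS="${WAIT_SECS:-600}"

# check_redirect_gone HOST
# Returns 0 if the route to HOST is no longer <redirected>,
# 1 if it still is, 2 if the flush itself failed.
check_redirect_gone() {
    # Step 1: flush the routing cache.
    ip route flush cache || return 2

    # Step 2: wait without touching the host so the inetpeer can expire.
    sleep "$WAIT_SECS"

    # Step 3: look the route up again and check for the flag.
    if ip route get "$1" | grep -q '<redirected>'; then
        echo "still redirected: $1"
        return 1
    fi
    echo "redirect cleared: $1"
    return 0
}
```

After sourcing the file as root, "check_redirect_gone 8.8.4.4" would
run the three steps for the host from this thread.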
If you know how to compile a kernel, try lowering the thresholds
below so the entries expire sooner. You will then have less time to
wait, and will not need to reboot.
net/ipv4/inetpeer.c:

int inet_peer_minttl __read_mostly = 120 * HZ;	/* TTL under high load: 120 sec */
int inet_peer_maxttl __read_mostly = 10 * 60 * HZ;	/* usual time to live: 10 min */
The above is just a workaround, of course.
I am going to be on vacation for the next couple of weeks, so I won't
be able to help fix this any time soon. I am pretty sure someone else
will help, though :)
fbl