Message-ID: <56964A14.7080008@list.ru>
Date: Wed, 13 Jan 2016 15:59:00 +0300
From: Stas Sergeev <stsp@...t.ru>
To: Hannes Frederic Sowa <hannes@...essinduktion.org>
Cc: netdev <netdev@...r.kernel.org>
Subject: Re: Q: bad routing table cache entries
13.01.2016 02:07, Hannes Frederic Sowa wrote:
> Hi,
>
> On 12.01.2016 23:57, Stas Sergeev wrote:
>> 13.01.2016 01:26, Hannes Frederic Sowa wrote:
>>> I didn't check a full featured setup but just did some dirty testing
>>> with namespaces and I had correct arp request for the now to be
>>> assumed on-link router on the external veth.
>> I haven't checked anything with arp.
>> I set up tcpdump to only capture icmp.
>> What would you like me to check, could you please
>> give the detailed instructions?
>
> Check simply for arp traffic on the interface. arp requests should leave your client and ask directly for the new router you got as next-hop. If it does not answer, there is the problem.
It does not answer:
tcpdump -vn -i eth0 arp host 192.168.10.202
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
15:38:23.334783 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
15:38:24.329949 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
15:38:25.329946 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
15:38:26.338987 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
This is what happens when I try to ping via the redirected route.
I wonder why it should answer. Suppose it has shared_media disabled:
should it still answer an ARP request coming from a different subnet?
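
The "different subnet" point can be illustrated with a minimal sketch
using Python's ipaddress module. The two addresses are taken from the
capture above; the /24 prefix length is an assumption made only for
illustration, since the actual netmask is not shown in this message.

```python
import ipaddress

def is_on_link(local_ip: str, prefix_len: int, target_ip: str) -> bool:
    """Return True if target_ip lies inside the subnet of local_ip/prefix_len."""
    # strict=False lets us pass a host address and have the host bits masked off.
    network = ipaddress.ip_network(f"{local_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(target_ip) in network

if __name__ == "__main__":
    # Addresses from the tcpdump capture; the /24 prefix is an assumed value.
    print(is_on_link("192.168.10.202", 24, "192.168.0.1"))
```

Under that assumed /24 the requested address is outside the sender's
subnet, so the ARP request only makes sense if the redirect target is
being treated as on-link regardless of the netmask.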
>>>> If it is not - how can I even see that it exist? How to
>>>> list these redirect routes?
>>>
>>> Yeah, that might be a minor issue. The rt_cache procfs files are empty
>>> since the deletion of the cache and we probably don't have an
>>> interface for next hop exceptions, I consider this todo. :) ip route
>>> get is your only hope right now.
>>>
>>> Anyway, seems like there are problems with redirect timeout somehow. I
>>> am investigating this.
>>>
>>>> I'd like to do some investigations, but this looks no
>>>> more than a black magic without a proper support
>>>> from tools, proper documentation, etc.
>>>
>>> Hmm, so far I think shared_media is behaving like it should,
>> No, unless you correct the documentation:
>> https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html
>>
>> It says not what you say.
>> So this feature is essentially poorly (or wrongly) documented.
>
> I am sorry, but I have no access to this website. I just grepped around in the Documentation/ directory of the kernel:
> <https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/ip-sysctl.txt?id=refs/tags/v4.4#n1014>
>
> It is correctly documented
With just a single-line description like this:
"Send(router) or accept(host) RFC1620 shared media redirects."
Isn't this too little for a feature that completely changes the
meaning of something as fundamental as the netmask? IMHO books
and articles should have been written before making it the default. :)
And how about /proc/sys/net/ipv4/route/gc_interval, redirect_load,
redirect_number, redirect_silence? Are they documented at all?
I am trying to make the problematic event trigger faster for
debugging, or to make the cache expire faster, but all of this looks
completely undocumented, and intuitive guesses do not work.
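
For anyone repeating the experiment, a small sketch like the following
can dump the current values of the tunables named above before changing
them. The helper name read_route_sysctls is hypothetical; the directory
and file names are the ones mentioned in this message, and whether each
file exists depends on the running kernel, so missing entries are
reported as None rather than treated as errors.

```python
from pathlib import Path

# Directory and names as mentioned in the message; presence varies by kernel.
ROUTE_SYSCTL_DIR = Path("/proc/sys/net/ipv4/route")
NAMES = ("gc_interval", "redirect_load", "redirect_number", "redirect_silence")

def read_route_sysctls(base=ROUTE_SYSCTL_DIR, names=NAMES):
    """Return {name: raw value string, or None if the file cannot be read}."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # File is absent or unreadable on this system.
            values[name] = None
    return values

if __name__ == "__main__":
    for name, value in read_route_sysctls().items():
        print(f"{name} = {value}")
```

The sketch only reads the values; it makes no claim about their units or
about how the kernel uses them.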
>>> besides maybe it shouldn't be the default setting. Maybe someone who
>>> can remember why it is default could chime in?
>>>
>>>> And I suspect that shared_media is disabled on a 0.1
>>>> router, so I wonder if this can work at all, even if the node
>>>> is cured to do the right thing with those redirects.
>>>> In a nearby message David Miller says:
>>>
>>> Default is that shared_media is enabled,
>> On what OS, and since what version?
>>
>>> so the chances are relatively high that it is enabled if it is not
>>> turned off.
>> I don't even know what is there in a 0.1 router - maybe windows95,
>> who knows. You can't assume the latest linux kernel is everywhere.
>
> Looking into the kernel cvs history the change was done in 2000(!). Would be pretty strange to find such an old kernel there.
The router at 192.168.0.1 may run some other OS, maybe FreeBSD.
You can't assume Linux is everywhere.