Message-ID: <544875a1-68e6-a8f3-4eb7-44f053605c3e@spamtrap.tnetconsulting.net>
Date: Thu, 21 Jun 2018 15:08:03 -0600
From: Grant Taylor <gtaylor@...tconsulting.net>
To: Julian Anastasov <ja@....bg>
Cc: Akshat Kakkar <akshat.1984@...il.com>,
netdev <netdev@...r.kernel.org>,
cronolog+lartc <cronolog+lartc@...glemail.com>,
lartc <lartc@...r.kernel.org>,
Erik Auerswald <auerswal@...x-ag.uni-kl.de>
Subject: Re: Route fallback issue
On 06/21/2018 01:57 PM, Julian Anastasov wrote:
> Hello,
Hi.
> I think so
Okay.
I'll do some more digging.
> You can search on net. I have some old docs on these issues, they should
> be actual:
>
> http://ja.ssi.bg/dgd-usage.txt
"DGD" or "Dead Gateway Detection" sounds very familiar. I referenced it
in an earlier reply.
I distinctly remember DGD not behaving satisfactorily years ago, where
"unsatisfactorily" meant something like 90 seconds (or more) to recover.
That actually matches what I was getting without the
ignore_routes_with_linkdown=1 setting that David A. mentioned.
With ignore_routes_with_linkdown=1 things behaved much better.
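For reference, a minimal sketch of how that setting could be turned on. The function name and the SYSCTL override are my own illustration; I'm assuming the "all" key covers existing interfaces and "default" covers interfaces created later.

```shell
# Sketch: enable ignore_routes_with_linkdown for IPv4.
# SYSCTL can be overridden (e.g. SYSCTL=echo) to preview the commands
# without root; the function name is hypothetical.
enable_ignore_linkdown() {
    "${SYSCTL:-sysctl}" -w net.ipv4.conf.all.ignore_routes_with_linkdown=1
    "${SYSCTL:-sysctl}" -w net.ipv4.conf.default.ignore_routes_with_linkdown=1
}
```

Running "SYSCTL=echo enable_ignore_linkdown" just prints the two settings; running "enable_ignore_linkdown" as root inside the namespace applies them.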
> Not true. net/ipv4/fib_semantics.c:fib_select_path() calls
> fib_select_default() only when prefixlen = 0 (default route).
Okay.... My testing last night disagrees with you. Specifically, I was
able to add alternate routes to the same prefix, 192.0.2.128/26.
There was not any default gateway configured on any of the NetNSs. So
everything was using routes for locally attached networks or the two
added via "ip route append".
What am I misinterpreting? Or where are we otherwise talking past each
other?
> Otherwise, only the first route will be considered.
"only the first route" almost sounds like something akin to Equal Cost
Multi Path.
I was not expecting "alternative routes" to use more than one route at a
time, equally or otherwise. I was wanting for the kernel to fall back
to an alternate route / gateway / path in the event that the one that
was being used became unusable / unreachable.
So what should "Alternative Routes" do? How does this compare /
contrast to E.C.M.P. or D.G.D.?
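To make the comparison concrete, here is a sketch of what I understand an E.C.M.P. route for the same prefix to look like: one entry carrying both nexthops, as opposed to the two separate entries that show up in the ns2 "ip route show" output in this message. The addresses come from that output; the function name, the equal weights, and the IP override are my own assumptions.

```shell
# Sketch: a single multipath (ECMP) entry with two equal-weight nexthops.
# IP can be overridden (e.g. IP=echo) to preview the command without
# touching the routing table; the function name is hypothetical.
ecmp_route() {
    "${IP:-ip}" route add 192.0.2.128/26 \
        nexthop via 192.0.2.65 dev ns1b weight 1 \
        nexthop via 192.0.2.1 dev ns1a weight 1
}
```

With a multipath entry the kernel may spread flows across both nexthops, which is not what I was after.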
> fib_select_default() is the function that decides which nexthop
> is reachable and whether to contact it. It uses the ARP state via
> fib_detect_death(). That is all code that is behind this feature called
> "alternative routes": the kernel selects one based on nexthop's ARP
> state.
Please confirm that you aren't entering / referring to E.C.M.P.
territory when you say "nexthop". I think that you are not, but I want
to ask and be sure, particularly seeing as how things are very closely
related.
It sounds like you're referring to literally the router that is the next
hop in the path. I.e. the device on the other end of the wire.
I'll have to find, read, and try to grok the code to have a better idea.
That being said, it looks (based on the name) like
fib_select_default() deals with the default route. The testing I did
last night, and its positive results, indicate that the kernel did what
I wanted it to do. (See above about D.G.D. vs E.C.M.P.)
So, it seems as if something about alternative routes worked using
non-default routes. I have no way of knowing if it was the code that
we're talking about, or something else that produced the results. The
test as I ran it (specific, non-default prefixes, with routes appended
and no other routes) worked the way that I would have expected a
feature that uses alternative routes (or historically D.G.D.) to work.
The following ping works just fine as I bounce interfaces on NS1.
ns2# ping -I 192.0.2.254 192.0.2.129
I can confirm that traffic is moving back and forth between the vEth
links between the NetNSs. Granted, the traffic sticks to one vEth
interface until it goes away.
I can shut down ns2a on NS1 so that ns1a sees loss of link but stays
up on NS2, and traffic moves to vEth-B.
I can then open up ns2a on NS1 so that ns1a sees link on NS2, and
re-append the route on NS1.
I can then shut down ns2b on NS1 so that ns1b sees loss of link but
stays up on NS2, and traffic moves to vEth-A.
I can then open up ns2b on NS1 so that ns1b sees link on NS2, and
re-append the route on NS1.
NS2 behaves exactly as I would hope. Traffic will move from the down
interface to the remaining up interface. Back and forth, no problem.
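The bounce steps above can be sketched roughly as follows, run inside NS1. The function names and the IP override are hypothetical; the device names, prefix, and gateways are the ones from the ns1 routing table in this message.

```shell
# Sketch of the interface bounce on NS1. IP can be overridden
# (e.g. IP=echo) to preview the commands; function names are hypothetical.

# Take one vEth end down on NS1 so that the peer on NS2 sees loss of link.
link_down() {   # usage: link_down DEV
    "${IP:-ip}" link set dev "$1" down
}

# Bring the device back up and re-append the route that went away with it.
link_up_and_reappend() {   # usage: link_up_and_reappend DEV PREFIX GATEWAY
    "${IP:-ip}" link set dev "$1" up
    "${IP:-ip}" route append "$2" via "$3" dev "$1"
}
```

One round would be "link_down ns2a", then "link_up_and_reappend ns2a 192.0.2.192/26 192.0.2.62", followed by the same pair for ns2b with gateway 192.0.2.126.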
I don't know where the disconnect is, but I feel like there is one.
> Routes with different metric are considered only when the routes with
> lower metric are removed.
I agree with the statement. What I question is where metric came into
play here. All of the routes had the same (default) metric. None of
the routes I tested had different metrics.
ns1# ip route show
192.0.2.0/26 dev ns2a proto kernel scope link src 192.0.2.1
192.0.2.64/26 dev ns2b proto kernel scope link src 192.0.2.65
192.0.2.128/26 dev dummy0 proto kernel scope link src 192.0.2.129
192.0.2.192/26 via 192.0.2.62 dev ns2a
192.0.2.192/26 via 192.0.2.126 dev ns2b
ns2# ip route show
192.0.2.0/26 dev ns1a proto kernel scope link src 192.0.2.62
192.0.2.64/26 dev ns1b proto kernel scope link src 192.0.2.126
192.0.2.128/26 via 192.0.2.65 dev ns1b
192.0.2.128/26 via 192.0.2.1 dev ns1a
192.0.2.192/26 dev dummy0 proto kernel scope link src 192.0.2.254
> IIRC, this flag invalidates nexthops depending on the link state. If
> your link is always UP it does not help much.
That's what I gathered. So things like DSL & cable modems or other L2
bridging devices might not drop the link when their circuit drops.
This is also why I asked the follow up questions to David's email.
I want to do some testing to see if fib_multipath_use_neigh alters this
behavior at all. I'm hoping that it will invalidate an alternate route
if the MAC is not resolvable even if the physical link stays up.
Sure, the ARP cache may have a 30 ~ 120 second timeout before triggering
this behavior. But having that timeout and starting to use an
alternative route is considerably better than not using an alternative
route.
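A sketch of what I plan to try, assuming the sysctl lives at net.ipv4.fib_multipath_use_neigh. Whether it has any effect on alternative (non-multipath) routes is exactly the open question. The function names and the SYSCTL / IP overrides are hypothetical.

```shell
# Sketch: turn on fib_multipath_use_neigh and look at neighbour state.
# SYSCTL and IP can be overridden (e.g. with echo) to preview the
# commands; function names are hypothetical.
enable_use_neigh() {
    "${SYSCTL:-sysctl}" -w net.ipv4.fib_multipath_use_neigh=1
}

# List the neighbour (ARP) entries learned on one device.
show_neigh() {   # usage: show_neigh DEV
    "${IP:-ip}" neigh show dev "$1"
}
```

After "enable_use_neigh", something like "show_neigh ns2a" should let me watch the gateway's entry age out while the link itself stays up.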
> If you rely on user space tool, you can check the state of the desired
> hops: device link state, your gateway to ISP, one or more gateways in the
> ISP network which you consider permanent part of the path via this ISP.
This is what I have thought about doing previously.
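A minimal sketch of that kind of user space check, covering only the "is my gateway answering" part. The function name and the PING / IP overrides are hypothetical, and the one-probe, one-second timeout is an arbitrary choice on my part.

```shell
# Sketch: probe one gateway and keep its alternative route in sync.
# PING and IP can be overridden for a dry run; names are hypothetical.
check_gateway() {   # usage: check_gateway PREFIX GATEWAY DEV
    prefix=$1 gw=$2 dev=$3
    if "${PING:-ping}" -c 1 -W 1 -I "$dev" "$gw" >/dev/null 2>&1; then
        # Gateway answers: make sure the alternative is present. An
        # identical existing entry makes this fail, which is ignored.
        "${IP:-ip}" route append "$prefix" via "$gw" dev "$dev" 2>/dev/null || true
    else
        # Gateway is silent: drop the route so another one gets used.
        "${IP:-ip}" route del "$prefix" via "$gw" dev "$dev" 2>/dev/null || true
    fi
}
```

Called periodically for each gateway, e.g. "check_gateway 192.0.2.192/26 192.0.2.62 ns2a". Note that a route that is deleted and later re-appended ends up last among the alternatives. Checking hops further into the ISP would need more than this.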
> First route can be created with 'add' but all next alternative routes
> can be added only with "append". If you successfully add them with
> "add" it means they are not alternatives to the first one, they are not
> considered at all.
ACK
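To spell out my understanding of that rule: the first route to a prefix goes in with "add" and every later alternative with "append". The helper name and the IP override below are hypothetical.

```shell
# Sketch: install alternative routes to one prefix, first with "add",
# the rest with "append". IP can be overridden (e.g. IP=echo) to
# preview the commands; the function name is hypothetical.
add_alternatives() {   # usage: add_alternatives PREFIX GW1 DEV1 [GW2 DEV2 ...]
    prefix=$1; shift
    verb=add                 # first route uses "add"
    while [ "$#" -ge 2 ]; do
        "${IP:-ip}" route "$verb" "$prefix" via "$1" dev "$2"
        verb=append          # all later alternatives use "append"
        shift 2
    done
}
```

For the NS1 side of my test that would be "add_alternatives 192.0.2.192/26 192.0.2.62 ns2a 192.0.2.126 ns2b".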
--
Grant. . . .
unix || die