Message-ID: <6de1bf9e-f492-8dec-22c3-d1a1c6940006@spamtrap.tnetconsulting.net>
Date: Mon, 25 Jun 2018 14:07:58 -0600
From: Grant Taylor <gtaylor@...tconsulting.net>
To: Julian Anastasov <ja@....bg>
Cc: Akshat Kakkar <akshat.1984@...il.com>,
netdev <netdev@...r.kernel.org>,
cronolog+lartc <cronolog+lartc@...glemail.com>,
lartc <lartc@...r.kernel.org>,
Erik Auerswald <auerswal@...x-ag.uni-kl.de>
Subject: Re: Route fallback issue
On 06/25/2018 12:50 PM, Julian Anastasov wrote:
> Hello,
Hi Julian,
> Yes, ARP state for unreachable GWs may be updated slowly, there is
> in-time feedback only for reachable state.
Fair.
Most of the installations where I needed D.G.D. (dead gateway detection)
to work would be okay with a < 5 minute timeout. Obviously they would
like it faster, but automation is a LOT better than waiting on manual
intervention.
IMHO < 30 seconds is great. < 90 seconds is acceptable. < 300 seconds
leaves some room for improvement.
> You can create the two routes, of course. But only the default routes
> are alternative.
Are you saying that the functionality I'm describing only works for
default gateways or that the term "alternative route" only applies to
default gateways?
The testing that I did indicated that alternative routes worked for
specific prefixes too.
I tested multiple NetNSs with only directly attached routes and appended
routes to a destination prefix, no default gateway / route of last resort.
The behavior seemed to be different when ignore_routes_with_linkdown was
set versus unset. Specifically, ignore_routes_with_linkdown seemed to
help considerably.
Hence I question the requirement for the "default" route versus a
route to a specific prefix.
Can you explain why I saw the behavior difference with
ignore_routes_with_linkdown if it only applies to the default route?
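For reference, a minimal sketch of the kind of test setup described above: alternative routes appended to a specific prefix, with ignore_routes_with_linkdown toggled. The exact commands used are not given in this message, so the namespace name, the prefix, and the gateway addresses (documentation ranges) are all hypothetical. The helper only builds the command lines; running them requires root.

```python
def setup_commands(prefix, gateways, netns=None, ignore_linkdown=True):
    """Build argv lists that append one alternative route per gateway.

    All arguments are illustrative placeholders, not values from the thread.
    """
    # Optionally run everything inside a network namespace.
    wrap = ["ip", "netns", "exec", netns] if netns else []
    value = "1" if ignore_linkdown else "0"
    cmds = [wrap + ["sysctl", "-w",
                    "net.ipv4.conf.all.ignore_routes_with_linkdown=" + value]]
    for gw in gateways:
        # "append" adds a further route for the same prefix instead of
        # failing because one already exists.
        cmds.append(wrap + ["ip", "route", "append", prefix, "via", gw])
    return cmds


for argv in setup_commands("198.51.100.0/24", ["192.0.2.1", "192.0.2.2"],
                           netns="ns1"):
    print(" ".join(argv))
```

Passing each list to subprocess.run() as root would apply the configuration; printing first makes it easy to review what would be changed.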
> The alternative routes work in this way:
>
> - on lookup, routes are walked in order - as listed in table
>
> - as long as route contains reachable gateway (ARP state), only this
> route is used
>
> - if some gateway becomes unreachable (ARP state), next alternative
> routes are tried
>
> - if ARP entry is expired (missing), this gateway can be probed if the
> route is before the currently used route. This is what happens initially
> when no ARP state is present for the GWs. It is bad luck if the probed
> GW is actually unreachable.
>
> - active probing by user space (ping GWs) can only help to keep the ARP
> state present for the used gateways. By this way, if ARP entry for GW
> is missing, the kernel will not risk to select unavailable route with
> the goal to probe the GW.
This all makes sense.
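To check my understanding, here is a toy model of the lookup order as described in the list above. It is a sketch of that description only, not of the kernel code: the three ARP states, the function name, and the addresses are assumptions made for illustration, and the case where every gateway has failed is not covered by the description, so the sketch simply returns None there.

```python
from enum import Enum


class ArpState(Enum):
    REACHABLE = "reachable"  # neighbour entry is confirmed
    FAILED = "failed"        # gateway is known to be unreachable
    MISSING = "missing"      # entry expired or was never created


def select_route(gateways, arp_table):
    """Walk the gateways in table order and pick one.

    Returns (gateway, "use") for a reachable gateway, (gateway, "probe")
    when a gateway with no ARP entry is met before any reachable one,
    or None when every gateway has failed (behaviour assumed here).
    """
    for gw in gateways:
        state = arp_table.get(gw, ArpState.MISSING)
        if state is ArpState.REACHABLE:
            return gw, "use"
        if state is ArpState.MISSING:
            # No ARP state: the route is tried in order to probe the GW,
            # which is bad luck if the GW is actually down.
            return gw, "probe"
        # FAILED: skip to the next alternative route.
    return None


routes = ["192.0.2.1", "192.0.2.2"]
print(select_route(routes, {"192.0.2.1": ArpState.FAILED,
                            "192.0.2.2": ArpState.REACHABLE}))
# prints ('192.0.2.2', 'use')
```

In this model, keeping an ARP entry present for every gateway means the "probe" branch is never taken, which matches the point about user-space pings.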
Please confirm whether "gateway" in this context means the "/default/
gateway" or not. I ask because "gateway" can arguably describe the next
hop for a route to any prefix, while the "/default/ (gateway,router)" is
specifically the gateway or route of last resort. To me that means
"gateway" can be the next hop of any route in this context.
> nexthop is the GW in the route
Thank you for confirming.
> Yes, the kernel avoids alternative routes with unreachable GWs
Fair enough.
> The multipath route uses all its alive nexthops at the same time... But
> you may need in the same way active probing by user space, otherwise
> unavailable GW can be selected.
I assume that the dead ECMP NEXTHOP is also subject to similar timeouts
as alternative routes. Correct?
> Yes, if you prefer, you may run PING every second to avoid such delays...
Agreed.
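A minimal sketch of such a user-space prober, assuming the iputils ping found on most Linux systems (-c for the packet count, -W for the reply timeout in seconds). The function names and the gateway addresses are hypothetical, and the one-second interval is simply the figure suggested above.

```python
import subprocess
import time


def ping_command(gateway, timeout_s=1):
    """Build a one-shot ping command for a single gateway."""
    return ["ping", "-c", "1", "-W", str(timeout_s), gateway]


def probe_once(gateways, run=subprocess.run):
    """Ping each gateway once; return {gateway: True if it answered}."""
    results = {}
    for gw in gateways:
        proc = run(ping_command(gw),
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        results[gw] = proc.returncode == 0
    return results


def probe_forever(gateways, interval_s=1.0):
    """Keep probing so the ARP entries for the gateways stay present."""
    while True:
        probe_once(gateways)
        time.sleep(interval_s)
```

Calling probe_forever(["192.0.2.1", "192.0.2.2"]) would run until interrupted. The return value of probe_once() is not needed for the ARP side effect, but it is handy for logging.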
I'm trying to make sure I understand basic functionality before I do
things to modify it.
--
Grant. . . .
unix || die