Message-ID: <alpine.LFD.2.20.1806212218070.2159@ja.home.ssi.bg>
Date: Thu, 21 Jun 2018 22:57:14 +0300 (EEST)
From: Julian Anastasov <ja@....bg>
To: Grant Taylor <gtaylor@...tconsulting.net>
cc: Akshat Kakkar <akshat.1984@...il.com>,
netdev <netdev@...r.kernel.org>,
cronolog+lartc <cronolog+lartc@...glemail.com>,
lartc <lartc@...r.kernel.org>,
Erik Auerswald <auerswal@...x-ag.uni-kl.de>
Subject: Re: Route fallback issue
Hello,
On Wed, 20 Jun 2018, Grant Taylor wrote:
> On 06/20/2018 01:00 PM, Julian Anastasov wrote:
> > You can also try alternative routes.
>
> "Alternative routes"? I can't say as I've heard that description as a
> specific technique / feature / capability before.
>
> Is that its official name?
I think so
> Where can I find out more about it?
You can search the net. I have some old docs on
these issues; they should still be current:
http://ja.ssi.bg/dgd-usage.txt
> > But as the kernel supports only default alternative routes, you can put them
> > in their own table:
>
> I don't know that that is the case any more.
>
> I was able to issue the following commands without a problem:
>
> # ip route append 192.0.2.128/26 via 192.0.2.62
> # ip route append 192.0.2.128/26 via 192.0.2.126
>
> I created two network namespaces and had a pair of vEths between them
> (192.0.2.0/26 and 192.0.2.64/26). I added a dummy network to each NetNS
> (192.0.2.128/26 and 192.0.2.192/26).
>
> I ran the following commands while a persistent ping was running from one
> NetNS to the IP on the other's dummy0 interface:
>
> # ip link set ns2b up && ip route append 192.0.2.192/26 via 192.0.2.126 && ip link set ns2a down
> (pause and watch things)
> # ip link set ns2a up && ip route append 192.0.2.192/26 via 192.0.2.62 && ip link set ns2b down
> (pause and watch things)
>
> I could iterate between the two above commands and pings continued to work.
>
> So, I think that it's now possible to use "alternate routes" (new to me) on
> specific prefixes in addition to the default. Thus there is no longer any
> need for a separate table and the associated IP rule.
Not true. net/ipv4/fib_semantics.c:fib_select_path()
calls fib_select_default() only when prefixlen = 0 (default route).
Otherwise, only the first route will be considered.
fib_select_default() is the function that decides which
nexthop is reachable and whether to use it. It uses the ARP
state via fib_detect_death(). That is all the code behind the
feature called "alternative routes": the kernel selects one
route based on the nexthop's ARP state. Routes with a different
metric are considered only when the routes with a lower metric
are removed.
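To watch the inputs and the result of this selection from user space,
something like the following may help. It is only a sketch: the function
names are made up for illustration, and it assumes the usual iproute2
output where the neighbour state is the last field of "ip neigh show"
and the chosen gateway follows the word "via" in "ip route get".

```shell
# Hypothetical helpers; the names are not part of iproute2.

# Print the kernel's neighbour (ARP) state for a gateway,
# e.g. REACHABLE, STALE, DELAY, PROBE or FAILED.
# Prints nothing if there is no neighbour entry.
neigh_state() {
    # $1 = gateway address, $2 = device
    ip neigh show to "$1" dev "$2" | awk '{ print $NF; exit }'
}

# Print the gateway the kernel would currently use for a destination.
# Prints nothing if the destination is reached without a gateway.
selected_gateway() {
    # $1 = destination address
    ip route get "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "via") { print $(i + 1); exit } }'
}
```

For example, "neigh_state 192.168.1.254 eno1" shows the state the kernel
holds for the first gateway, and "selected_gateway 172.16.0.1" shows
which alternative is picked at the moment.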
> I'm running kernel version 4.9.76.
>
> I did go ahead and set net.ipv4.conf.ns2b.ignore_routes_with_linkdown to 1.
>
> for i in /proc/sys/net/ipv4/conf/*/ignore_routes_with_linkdown; do echo 1 > $i; done
IIRC, this flag invalidates nexthops depending on
the link state. If your link is always UP it does not help
much. If you rely on a user space tool, it can check the state
of the desired hops: the device link state, your gateway to
the ISP, and one or more gateways in the ISP network which you
consider a permanent part of the path via this ISP.
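A minimal sketch of such a check, under some assumptions: the function
names and the SYSFS_NET variable are hypothetical, the link state is
read from the operstate file in sysfs, each hop is probed with a single
ping, and a route to every hop already exists through the given device.
What to do when the check fails (for example, removing the route) is
left to the caller.

```shell
# Hypothetical helpers. SYSFS_NET is overridable only to ease testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Succeeds if the device reports operstate "up".
# Some virtual devices report "unknown" instead and would fail here.
link_is_up() {
    # $1 = device
    [ "$(cat "$SYSFS_NET/$1/operstate" 2>/dev/null)" = "up" ]
}

# Succeeds if the hop answers one echo request within 1 second.
hop_alive() {
    # $1 = device, $2 = address
    ping -n -q -c 1 -W 1 -I "$1" "$2" >/dev/null 2>&1
}

# Succeeds only if the link is up and every listed hop answers.
path_ok() {
    # $1 = device, remaining args = hops to probe in order
    dev="$1"; shift
    link_is_up "$dev" || return 1
    for hop in "$@"; do
        hop_alive "$dev" "$hop" || return 1
    done
    return 0
}
```

Usage would look like "path_ok eno1 192.168.1.254 <ISP gateway>", where
the ISP gateway address is a placeholder for a hop you consider
permanent.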
> Doing that dropped the number of dropped pings from 60 ~ 90 (1 / second) to 0
> ~ 5 (1 / second). (Rarely, maybe 1 out of 20 flips, would it take upwards of
> 10 pings / seconds.)
>
> > # Alternative routes use same metric!!!
> > ip route append default via 192.168.1.254 dev eno1 table 100
> > ip route append default via 192.168.2.254 dev eno2 table 100
> > ip rule add prio 100 to 172.16.0.0/12 table 100
>
> I did have to "append" the route. I couldn't just "add" the route. When I
> tried to "add" the second route, I got an error about the route already
> existing. Using "append" instead of "add" with everything else the same
> worked just fine.
>
> Note: I did go ahead and remove the single route that was added via "add" and
> used "append" for both.
The first route can be created with "add", but all subsequent
alternative routes can be added only with "append". If you
successfully add them with "add", it means they are not
alternatives to the first one; they are not considered at all.
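As an illustration, the setup from the example above could be wrapped
like this. It is a sketch only: the function names and the
"gateway,device" argument format are invented here, and the count
assumes that "ip route show table N" prints one line starting with
"default" per alternative.

```shell
# Hypothetical helpers; the names are not part of iproute2.

# Install one default route per "gateway,device" pair in the given table.
# "append" is used for every entry so that each becomes an alternative.
add_alternatives() {
    # $1 = table, remaining args = "gateway,device" pairs
    table="$1"; shift
    for pair in "$@"; do
        gw="${pair%%,*}"
        dev="${pair#*,}"
        ip route append default via "$gw" dev "$dev" table "$table" || return 1
    done
}

# Print the number of default routes present in the given table.
count_defaults() {
    # $1 = table
    ip route show table "$1" | grep -c '^default '
}
```

With the addresses used earlier this would be
"add_alternatives 100 192.168.1.254,eno1 192.168.2.254,eno2", after
which "count_defaults 100" should report both entries.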
Regards
--
Julian Anastasov <ja@....bg>