Message-Id: <20241108022026.58907-1-Matt.Muggeridge@hpe.com>
Date: Thu, 7 Nov 2024 21:20:26 -0500
From: Matt Muggeridge <Matt.Muggeridge@....com>
To: idosch@...sch.org
Cc: Matt.Muggeridge@....com, davem@...emloft.net, dsahern@...nel.org,
edumazet@...gle.com, horms@...nel.org, kuba@...nel.org,
linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, pabeni@...hat.com, stable@...r.kernel.org
Subject: Re: [PATCH net 1/1] net/ipv6: Netlink flag for new IPv6 Default Routes
> > You probably already know how to reproduce it, but in case it helps, I still
> > have the packet captures and can share them with you. Let me know if you'd
> > like me to share them (and how to share them).
>
> It would be best if you could provide a reproducer using iproute2:
> Configure a dummy device using ip-link, install the multipath route
> using ip-route, configure the neighbour table using ip-neigh and then
> perform route queries using "ip route get ..." showing the problem. We
> can then use it as the basis for a new test case in
> tools/testing/selftests/net/fib_tests.sh
I'll try to do that next week.
> BTW, do you have CONFIG_IPV6_ROUTER_PREF=y in your config?
Yes.
$ gunzip -c /proc/config.gz | grep ROUTER_PREF
CONFIG_IPV6_ROUTER_PREF=y
> >
> > As such, it still seems appropriate (to me) that this be implemented in the
> > legacy API as well as ensuring it works with the NH API.
>
> As I understand it you currently get different results because the
> kernel installs two default routes whereas user space can only create
> one default multipath route.
Yes, that's the end result of an underlying problem.
Perhaps more to the point: a coalesced multipath route whose next hop is
INCOMPLETE gets selected even when a REACHABLE alternative exists, and that is
what prevents us from using coalesced multipath routes. This seems like a bug,
since it violates RFC 4861, section 6.3.6, bullet 1.
Imagine adding a second router to an IPv6 network for added resiliency, but
when one router becomes unreachable, some network flows keep choosing the
unreachable router. This is what is happening with ECMP routes. It doesn't
happen with multiple default routes.
To reiterate my earlier comments: this doesn't happen all of the time.
It seems I have a 50/50 chance of the INCOMPLETE route being selected.
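To make the expectation concrete, here is a toy model in Python. It is only a
sketch and not the kernel's actual selection code: the CRC32 flow hash, the
function names and the fe80::1 / fe80::2 router addresses are all assumptions
made up for illustration. It contrasts picking a next hop by flow hash alone
with first applying the RFC 4861 6.3.6 bullet 1 preference for routers that
are not INCOMPLETE.

```python
import zlib
from enum import Enum


class Nud(Enum):
    """A few neighbour cache states from RFC 4861."""
    INCOMPLETE = "INCOMPLETE"
    REACHABLE = "REACHABLE"
    STALE = "STALE"


def hash_select(nexthops, flow_key):
    """Pick a next hop from the flow hash only, ignoring neighbour state.

    CRC32 is a stand-in hash chosen for this sketch, not the kernel's.
    """
    return nexthops[zlib.crc32(flow_key.encode()) % len(nexthops)]


def rfc4861_select(nexthops, nud, flow_key):
    """Prefer routers in any state other than INCOMPLETE (6.3.6, bullet 1).

    A router with no neighbour cache entry is treated like INCOMPLETE.
    If no router qualifies, fall back to the full list.
    """
    usable = [nh for nh in nexthops
              if nud.get(nh) not in (None, Nud.INCOMPLETE)]
    return hash_select(usable or nexthops, flow_key)


# Hypothetical two-router setup: one reachable, one that stopped answering.
routers = ["fe80::1", "fe80::2"]
neigh = {"fe80::1": Nud.REACHABLE, "fe80::2": Nud.INCOMPLETE}
flows = [f"2001:db8::{i:x}" for i in range(1, 101)]

hash_only = [hash_select(routers, f) for f in flows]
preferred = [rfc4861_select(routers, neigh, f) for f in flows]

print("hash only, flows sent to INCOMPLETE router:",
      hash_only.count("fe80::2"))
print("with 6.3.6 preference, flows sent to INCOMPLETE router:",
      preferred.count("fe80::2"))
```

With the preference applied, no flow in this model is sent to the INCOMPLETE
router while a REACHABLE one exists, which is the behaviour I would expect
from a coalesced multipath route as well.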
> Before adding a new uAPI I want to
> understand the source of the difference and see if we can improve / fix
> the current multipath code so that the two behave the same. If we can
> get them to behave the same then I don't think user space will care
> about two default routes versus one default multipath route.
Exactly, I totally support that approach.
Regards,
Matt.