Message-ID: <ZMHwROD1AJrd4pND@Laptop-X1>
Date: Thu, 27 Jul 2023 12:19:16 +0800
From: Hangbin Liu <liuhangbin@...il.com>
To: David Ahern <dsahern@...nel.org>
Cc: Stephen Hemminger <stephen@...workplumber.org>,
Ido Schimmel <idosch@...sch.org>, netdev@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Thomas Haller <thaller@...hat.com>
Subject: Re: [Questions] Some issues about IPv4/IPv6 nexthop route
On Wed, Jul 26, 2023 at 09:57:59AM -0600, David Ahern wrote:
> > So my questions are, should we show weight/scope for IPv4? How to deal the
> > type/proto info missing for IPv6? How to deal with the difference of merging
> > policy for IPv4/IPv6?
> > + ip route add 172.16.105.0/24 table 100 via 172.16.104.100 dev dummy1
> > + ip route append 172.16.105.0/24 table 100 via 172.16.104.100 dev dummy2
>
> > + ip route add 172.16.106.0/24 table 100 nexthop via 172.16.104.100 dev dummy1 weight 1
> > + ip route append 172.16.106.0/24 table 100 nexthop via 172.16.104.100 dev dummy1 weight 2
>
> Weight only has meaning with a multipath route. In both of these cases
> these are 2 separate entries in the FIB
Yes, we know these are 2 separate entries, and so do the NM developers. But
users don't know, and the routing daemon doesn't know either. If a user adds
these 2 entries and the kernel shows them as identical, the routing daemon
stores them as a single entry. When the user then deletes the entry, only one
is actually deleted and the other is left in the kernel. This confuses both
the routing daemon and the user.
So my question is: should we export the weight/scope? Or stop users from
adding the second entry? Or just leave it as is and ask routing daemons/users
to try the new nexthop API?
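
To make the mismatch described above concrete, here is a minimal toy model in
Python. It is only a sketch under stated assumptions: the names (Route,
Kernel, DaemonCache, visible_key) are hypothetical, it does not use netlink,
and it is not how the kernel or NetworkManager is implemented. It assumes the
dump hides the weight, so both entries look identical to the daemon.

```python
from collections import namedtuple

# Toy model only: hypothetical names, no netlink, not real kernel/daemon code.
Route = namedtuple("Route", "dst gateway dev weight")


def visible_key(route):
    """Attributes assumed to be visible in a dump; weight is NOT exported."""
    return (route.dst, route.gateway, route.dev)


class Kernel:
    """Keeps every appended route as a separate FIB entry."""

    def __init__(self):
        self.fib = []

    def append(self, route):
        self.fib.append(route)

    def delete(self, key):
        # Removes only the first entry matching the visible attributes.
        for i, route in enumerate(self.fib):
            if visible_key(route) == key:
                del self.fib[i]
                return True
        return False

    def dump(self):
        return [visible_key(route) for route in self.fib]


class DaemonCache:
    """Caches routes by what it can see, so identical dumps collapse to one."""

    def __init__(self):
        self.routes = set()

    def sync(self, dump):
        self.routes = set(dump)

    def on_delete(self, key):
        self.routes.discard(key)


def simulate():
    kernel, daemon = Kernel(), DaemonCache()
    # The two entries from the example: same prefix/gateway/dev, weights 1 and 2.
    kernel.append(Route("172.16.106.0/24", "172.16.104.100", "dummy1", 1))
    kernel.append(Route("172.16.106.0/24", "172.16.104.100", "dummy1", 2))
    daemon.sync(kernel.dump())
    before = (len(kernel.fib), len(daemon.routes))

    # The user deletes "the" route once; both sides process that one delete.
    key = visible_key(kernel.fib[0])
    kernel.delete(key)
    daemon.on_delete(key)
    after = (len(kernel.fib), len(daemon.routes))
    return before, after


if __name__ == "__main__":
    print(simulate())  # ((2, 1), (1, 0)): one entry remains that the cache no longer tracks
```

In this model, exporting the weight would amount to adding it to
visible_key(), which would keep the two entries distinct in the cache.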
> with the second one only hit under certain conditions.
Just curious, under what conditions would we hit the second one?
>
> > + ip route show table 200
> > default dev dummy1 scope link
> > local default dev dummy1 scope host
> > 172.16.107.0/24 via 172.16.104.100 dev dummy1
> > 172.16.107.0/24 via 172.16.104.100 dev dummy1
> >
> > + ip addr add 2001:db8:101::1/64 dev dummy1
> > + ip addr add 2001:db8:101::2/64 dev dummy2
> > + ip route add 2001:db8:102::/64 via 2001:db8:101::10 dev dummy1 table 100
> > + ip route prepend 2001:db8:102::/64 via 2001:db8:101::10 dev dummy2 table 100
> > + ip route add local 2001:db8:103::/64 via 2001:db8:101::10 dev dummy1 table 100
> > + ip route prepend unicast 2001:db8:103::/64 via 2001:db8:101::10 dev dummy2 table 1
> Unfortunately the original IPv6 multipath implementation did not follow
> the same semantics as IPv4. Each leg in a MP route is a separate entry
> and the append and prepend work differently for v6. :-(
>
> This difference is one of the many goals of the separate nexthop objects
> -- aligning ipv4 and ipv6 behavior which can only be done with a new
> API. There were many attempts to make the legacy route infrastructure
> more closely aligned between v4 and v6 and inevitably each was reverted
> because it broke some existing user.
Yes, I understand the difficulty and risk of aligning the v4/v6 behavior.
On the other hand, switching to the new nexthop API is also a lot of work for
the routing daemons. Here is a quote from the NM developers' reply to me:
"If the issues (this and others) of the netlink API for route objects can be
fixed, then there seems less reason to change NetworkManager to nexthop
objects. If it cannot (won't) be fixed, then would be another argument for using
nexthop objects..."
I will check whether all of these issues can be fixed with the new nexthop API.
Thanks
Hangbin