[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bf3bb290-25b7-e327-851a-d6a036daab03@6wind.com>
Date: Fri, 1 Sep 2023 11:36:51 +0200
From: Nicolas Dichtel <nicolas.dichtel@...nd.com>
To: David Ahern <dsahern@...nel.org>, Hangbin Liu <liuhangbin@...il.com>
Cc: netdev@...r.kernel.org, "David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Ido Schimmel <idosch@...sch.org>,
Thomas Haller <thaller@...hat.com>
Subject: Re: [PATCH net-next] ipv6: do not merge differe type and protocol
routes
Le 31/08/2023 à 20:27, David Ahern a écrit :
> On 8/31/23 5:58 AM, Nicolas Dichtel wrote:
>> Le 31/08/2023 à 12:14, Hangbin Liu a écrit :
>>> Hi Nicolas,
>>> On Thu, Aug 31, 2023 at 10:17:19AM +0200, Nicolas Dichtel wrote:
>>>>>>> So let's skip counting the different type and protocol routes as siblings.
>>>>>>> After update, the different type/protocol routes will not be merged.
>>>>>>>
>>>>>>> + ip -6 route show table 100
>>>>>>> local 2001:db8:103::/64 via 2001:db8:101::10 dev dummy1 metric 1024 pref medium
>>>>>>> 2001:db8:103::/64 via 2001:db8:101::10 dev dummy2 metric 1024 pref medium
>>>>>>>
>>>>>>> + ip -6 route show table 200
>>>>>>> 2001:db8:104::/64 via 2001:db8:101::10 dev dummy1 proto kernel metric 1024 pref medium
>>>>>>> 2001:db8:104::/64 via 2001:db8:101::10 dev dummy2 proto bgp metric 1024 pref medium
>>>>>>
>>>>>> This seems wrong. The goal of 'ip route append' is to add a next hop, not to
>>>>>> create a new route. Ok, it adds a new route if no route exists, but it seems
>>>>>> wrong to me to use it by default, instead of 'add', to make things work magically.
>>>>>
>>>>> Legacy API; nothing can be done about that (ie., that append makes a new
>>>>> route when none exists).
>>>>>
>>>>>>
>>>>>> It seems more correct to return an error in these cases, but this will change
>>>>>> the uapi and it may break existing setups.
>>>>>>
>>>>>> Before this patch, both next hops could be used by the kernel. After it, one
>>>>>> route will be ignored (the former or the last one?). This is confusing and also
>>>>>> seems wrong.
>>>>>
>>>>> Append should match all details of a route to add to an existing entry
>>>>> and make it multipath. If there is a difference (especially the type -
>>>>> protocol difference is arguable) in attributes, then they are different
>>>>> routes.
>>>>>
>>>>
>>>> As you said, the protocol difference is arguable. It's not a property of the
>>>> route, just a hint.
>>>> I think the 'append' should match a route whatever the protocol is.
>>>> 'ip route change' for example does not use the protocol to find the existing
>>>> route, it will update it:
>>>>
>>>> $ ip -6 route add 2003:1:2:3::/64 via 2001::2 dev eth1
>>>> $ ip -6 route
>>>> 2003:1:2:3::/64 via 2001::2 dev eth1 metric 1024 pref medium
>>>> $ ip -6 route change 2003:1:2:3::/64 via 2001::2 dev eth1 protocol bgp
>>>> $ ip -6 route
>>>> 2003:1:2:3::/64 via 2001::2 dev eth1 proto bgp metric 1024 pref medium
>>>> $ ip -6 route change 2003:1:2:3::/64 via 2001::2 dev eth1 protocol kernel
>>>> $ ip -6 route
>>>> 2003:1:2:3::/64 via 2001::2 dev eth1 proto kernel metric 1024 pref medium
>>>
>>> Not sure if I understand correctly, `ip route replace` should able to
>>> replace all other field other than dest and dev. It's for changing the route,
>>> not only nexthop.
>>>>
>>>> Why would 'append' selects route differently?
>>>
>>> The append should also works for a single route, not only for append nexthop, no?
>> I don't think so. The 'append' should 'join', not add. Adding more cases where a
>> route is added instead of appended doesn't make the API clearer.
>>
>> With this patch, it will be possible to add a new route with the 'append'
>> command when the 'add' command fails:
>> $ ip -6 route add local 2003:1:2:3::/64 via 2001::2 dev eth1 table 200
>> $ ip -6 route add unicast 2003:1:2:3::/64 via 2001::2 dev eth1 table 200
>> RTNETLINK answers: File exists
>>
>> $ ip -6 route add 2003:1:2:3::/64 via 2001::2 dev eth1 protocol bgp table 200
>> $ ip -6 route add 2003:1:2:3::/64 via 2001::2 dev eth1 protocol kernel table 200
>> RTNETLINK answers: File exists
>>
>> This makes the API more confusing and complex. And I don't understand how it
>> will be used later. There will be 2 routes on the system, but only one will be
>> used, which one? This is confusing.
>>
>>>
>>>>
>>>> This patch breaks the legacy API.
>>>
>>> As the patch's description. Who would expect different type/protocol route
>>> should be merged as multipath route? I don't think the old API is correct.
>> The question is not 'who expect', but 'is there some systems somewhere that rely
>> on this (deliberately or not)'.
>> Frankly, the protocol is just informative, so I don't see why it is a problem to
>> ignore it with the 'append' command.
>> For the type, it is weird, for sure. Rejecting the command seems better than
>> duplicating routes. Which route is used by the stack?
>>
>>
>
> Part of my intent with fib_tests.sh was to document the legacy meaning
> of 'append, prepend, replace, and change' options while also providing a
> test script to detect changes that cause a regression.
Yes, it's a good idea for sure.
>
> I do agree now that protocol is informative (passthrough from the kernel
> perspective) so not really part of the route. That should be dropped
> from the patch leaving just a check on rt_type as to whether the routes
> are different. From there the append, prepend, replace and change
> semantics should decide what happens (ie., how the route is inserted).
Right. What can guide us is the meaning/concept/benefit of having this kind of
routing table:
local 2001:db8:103::/64 via 2001:db8:101::10 dev dummy1 metric 1024 pref medium
2001:db8:103::/64 via 2001:db8:101::10 dev dummy2 metric 1024 pref medium
I don't understand how this is used/useful. It's why I ask for the use case/goal
of this patch.
How does the user know which route is used?
Regards,
Nicolas
Powered by blists - more mailing lists