Message-ID: <eeb19959-26f4-e8c1-abde-726dbb2b828d@6wind.com>
Date: Wed, 30 Aug 2023 17:29:39 +0200
From: Nicolas Dichtel <nicolas.dichtel@...nd.com>
To: Hangbin Liu <liuhangbin@...il.com>, netdev@...r.kernel.org
Cc: "David S . Miller" <davem@...emloft.net>, David Ahern
<dsahern@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Ido Schimmel <idosch@...sch.org>, Thomas Haller <thaller@...hat.com>
Subject: Re: [PATCH net-next] ipv6: do not merge differe type and protocol
routes
On 30/08/2023 at 08:15, Hangbin Liu wrote:
> Unlike IPv4, IPv6 automatically merges routes with the same metric into
> multipath routes. But routes with a different type or protocol are also
> merged, which loses the user's configuration info, e.g.
>
> + ip route add local 2001:db8:103::/64 via 2001:db8:101::10 dev dummy1 table 100
> + ip route append unicast 2001:db8:103::/64 via 2001:db8:101::10 dev dummy2 table 100
> + ip -6 route show table 100
> local 2001:db8:103::/64 metric 1024 pref medium
> nexthop via 2001:db8:101::10 dev dummy1 weight 1
> nexthop via 2001:db8:101::10 dev dummy2 weight 1
>
> + ip route add 2001:db8:104::/64 via 2001:db8:101::10 dev dummy1 proto kernel table 200
> + ip route append 2001:db8:104::/64 via 2001:db8:101::10 dev dummy2 proto bgp table 200
> + ip -6 route show table 200
> 2001:db8:104::/64 proto kernel metric 1024 pref medium
> nexthop via 2001:db8:101::10 dev dummy1 weight 1
> nexthop via 2001:db8:101::10 dev dummy2 weight 1
>
> So let's skip counting routes with a different type or protocol as siblings.
> After this update, routes with a different type or protocol are not merged.
>
> + ip -6 route show table 100
> local 2001:db8:103::/64 via 2001:db8:101::10 dev dummy1 metric 1024 pref medium
> 2001:db8:103::/64 via 2001:db8:101::10 dev dummy2 metric 1024 pref medium
>
> + ip -6 route show table 200
> 2001:db8:104::/64 via 2001:db8:101::10 dev dummy1 proto kernel metric 1024 pref medium
> 2001:db8:104::/64 via 2001:db8:101::10 dev dummy2 proto bgp metric 1024 pref medium
This seems wrong. The goal of 'ip route append' is to add a next hop, not to
create a new route. OK, it adds a new route if no route exists, but it seems
wrong to me to use it by default, instead of 'add', to make things work magically.

It seems more correct to return an error in these cases, but this would change
the uapi and may break existing setups.

Before this patch, both next hops could be used by the kernel. After it, one
route will be ignored (the first or the last one?). This is confusing and also
seems wrong.
>
> Reported-by: Thomas Haller <thaller@...hat.com>
> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2161994
Please don't include a private link; the bug entry is not public.
Can you explain what the initial problem is?
> Signed-off-by: Hangbin Liu <liuhangbin@...il.com>
> ---
> All fib tests passed:
> Tests passed: 203
> Tests failed: 0
> ---
> net/ipv6/ip6_fib.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index 28b01a068412..f60f5d14f034 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -1133,6 +1133,11 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
>  							rt->fib6_pmtu);
>  				return -EEXIST;
>  			}
> +
> +			if (iter->fib6_type != rt->fib6_type ||
> +			    iter->fib6_protocol != rt->fib6_protocol)
> +				goto next_iter;
> +
>  			/* If we have the same destination and the same metric,
>  			 * but not the same gateway, then the route we try to
>  			 * add is sibling to this route, increment our counter
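
To make the behavior change in the hunk above concrete, here is a rough userspace sketch of the sibling decision before and after the proposed check. It is only a model under stated assumptions: struct model_route, same_nexthop(), is_sibling_before() and is_sibling_after() are hypothetical names, not kernel code, and the model ignores every other condition fib6_add_rt2node() applies (ECMP eligibility, NLM_F_* flags, expiry handling).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative constants for the model; they mirror the uapi values of
 * RTN_UNICAST/RTN_LOCAL and RTPROT_KERNEL/RTPROT_BGP. */
enum { MODEL_RTN_UNICAST = 1, MODEL_RTN_LOCAL = 2 };
enum { MODEL_RTPROT_KERNEL = 2, MODEL_RTPROT_BGP = 186 };

/* Hypothetical, simplified stand-in for the few struct fib6_info fields
 * discussed in this thread. This is NOT the kernel structure. */
struct model_route {
	uint32_t metric;
	uint8_t type;
	uint8_t protocol;
	const char *gateway;
	const char *dev;
};

/* Two routes share a nexthop if gateway and device both match. */
static bool same_nexthop(const struct model_route *a,
			 const struct model_route *b)
{
	return strcmp(a->gateway, b->gateway) == 0 &&
	       strcmp(a->dev, b->dev) == 0;
}

/* Model of the current behavior as described in the thread: same
 * destination (assumed) and same metric, but a different nexthop. */
static bool is_sibling_before(const struct model_route *iter,
			      const struct model_route *rt)
{
	return iter->metric == rt->metric && !same_nexthop(iter, rt);
}

/* Model of the proposed behavior: a type or protocol mismatch means the
 * route is never counted as a sibling, so it is not merged. */
static bool is_sibling_after(const struct model_route *iter,
			     const struct model_route *rt)
{
	if (iter->type != rt->type || iter->protocol != rt->protocol)
		return false;
	return is_sibling_before(iter, rt);
}
```

With the table 100 example from the commit message (a local route via dummy1, then a unicast route appended via dummy2, both at metric 1024), is_sibling_before() reports a sibling and is_sibling_after() does not. That is the case where the appended next hop stops being merged into the existing route.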