[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <willemdebruijn.kernel.5c4c191262c5@gmail.com>
Date: Sat, 13 Dec 2025 15:54:18 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Vadim Fedorenko <vadim.fedorenko@...ux.dev>,
"David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Willem de Bruijn <willemb@...gle.com>,
Jakub Kicinski <kuba@...nel.ord>
Cc: Shuah Khan <shuah@...nel.org>,
Ido Schimmel <idosch@...dia.com>,
netdev@...r.kernel.org,
Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Subject: Re: [PATCH net 1/2] net: fib: restore ECMP balance from loopback
Vadim Fedorenko wrote:
> Preference of nexthop with source address broke ECMP for packets with
> source address from loopback interface. Original behaviour was to
> balance over nexthops while now it uses the latest nexthop from the
> group.
How does the loopback device specifically come into this?
>
> For the case with 198.51.100.1/32 assigned to lo:
>
> before:
> done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c:
> 255 veth3
>
> after:
> done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c:
> 122 veth1
> 133 veth3
>
> Fixes: 32607a332cfe ("ipv4: prefer multipath nexthop that matches source address")
> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
> ---
> net/ipv4/fib_semantics.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index a5f3c8459758..c54b4ad9c280 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -2165,9 +2165,9 @@ static bool fib_good_nh(const struct fib_nh *nh)
> void fib_select_multipath(struct fib_result *res, int hash,
> const struct flowi4 *fl4)
> {
> + bool first = false, found = false;
> struct fib_info *fi = res->fi;
> struct net *net = fi->fib_net;
> - bool found = false;
> bool use_neigh;
> __be32 saddr;
>
> @@ -2190,23 +2190,24 @@ void fib_select_multipath(struct fib_result *res, int hash,
> (use_neigh && !fib_good_nh(nexthop_nh)))
> continue;
>
> - if (!found) {
> + if (saddr && nexthop_nh->nh_saddr == saddr) {
> res->nh_sel = nhsel;
> res->nhc = &nexthop_nh->nh_common;
> - found = !saddr || nexthop_nh->nh_saddr == saddr;
> + return;
This can return a match that exceeds the upper bound, while better
matches may exist.
Perhaps what we want is the following:
1. if there are matches that match saddr, prefer those above others
- take the first match, as with hash input that results in load
balancing across flows
2. else, take any match
- again, first fit
If no match below fib_nh_upper_bound is found, fall back to the first
fit above that exceeds nh_upper_bound. Again, prefer first fit of 1 if
it exists, else first fit of 2.
If so then we need up to two concurrent stored options,
first_match_saddr and first.
Or alternatively use a score similar to inet listener lookup.
Since a new variable is added, I would rename found with
first_match_saddr or similar to document the intent.
> }
>
> - if (hash > nh_upper_bound)
> - continue;
> -
> - if (!saddr || nexthop_nh->nh_saddr == saddr) {
> + if (!first) {
> res->nh_sel = nhsel;
> res->nhc = &nexthop_nh->nh_common;
> - return;
> + first = true;
> }
>
> - if (found)
> - return;
> + if (found || hash > nh_upper_bound)
> + continue;
> +
> + res->nh_sel = nhsel;
> + res->nhc = &nexthop_nh->nh_common;
> + found = true;
>
> } endfor_nexthops(fi);
> }
> --
> 2.47.3
>
Powered by blists - more mailing lists