Message-ID: <CAOiHx=kRew115+43rkPioe=wWNg1TNx5u9F3+frNkOK1M9PySw@mail.gmail.com>
Date: Thu, 24 Nov 2022 15:15:46 +0100
From: Jonas Gorski <jonas.gorski@...il.com>
To: Ido Schimmel <idosch@...sch.org>
Cc: Network Development <netdev@...r.kernel.org>,
David Ahern <dsahern@...nel.org>
Subject: Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of
routes in 6.1
Hi Ido,
On Thu, 24 Nov 2022 at 13:41, Ido Schimmel <idosch@...sch.org> wrote:
>
> On Thu, Nov 24, 2022 at 10:20:00AM +0100, Jonas Gorski wrote:
> > Hello,
> >
> > when an IPv4 route gets removed because its nexthop was deleted, the
> > kernel does not send an RTM_DELROUTE netlink notification anymore in
> > 6.1. A bisect led me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> > multipath route when fib_info contains an nh reference"), and
> > reverting it makes it work again.
> >
> > It can be reproduced by doing the following and listening to netlink
> > (e.g. via ip monitor)
> >
> > ip a a 172.16.1.1/24 dev veth1
> > ip nexthop add id 100 via 172.16.1.2 dev veth1
> > ip route add 172.16.101.0/24 nhid 100
> > ip nexthop del id 100
> >
> > where the nexthop del will trigger a RTM_DELNEXTHOP message, but no
> > RTM_DELROUTE, but the route is gone afterwards anyways.
>
> I tried the reproducer and I get the same notifications in ip monitor
> regardless of whether 61b91eb33a69 is reverted or not.
>
> Looking at the code and thinking about it, I don't think we ever
> generated RTM_DELROUTE notifications when IPv4 routes were flushed (to
> avoid a notification storm).
>
> Are you running an upstream kernel?
Okay, after having a second look, you are right. I got myself
confused by IPv6 generating RTM_DELROUTE notifications, but that is
beside the point.
What actually fails is this: FRR first tries to delete its route(s),
which fails with this commit applied (=> RTM_DELROUTE goes missing),
and then deletes the nexthop (RTM_DELNEXTHOP).
So while the kernel indeed does not generate RTM_DELROUTE when it
flushes the routes itself, the notification used to be generated
because FRR successfully deleted its routes beforehand.
I'm not sure whether this already qualifies as breaking userspace, but
it's definitely something that worked with 6.0 and earlier and no
longer works now.
The error in FRR log is:
[YXPF5-B2CE0] netlink_route_multipath_msg_encode: RTM_DELROUTE 10.0.1.0/24 vrf 0(254)
[HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=44, msg cnt=1
[XS99C-X3KS5] netlink-dp (NS 0): error: No such process type=RTM_DELROUTE(25), seq=22, pid=2419702167
With the revert applied, the deletion succeeds.
I'll see if I can get a better idea of the actual netlink message sent.
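For reference, "No such process" is the standard message for ESRCH, so
the kernel presumably answered the RTM_DELROUTE request with an
NLMSG_ERROR carrying -ESRCH. Below is a minimal sketch of how such a
reply could be decoded. It assumes the plain nlmsghdr + nlmsgerr layout
without extended ACK attributes; the helper names are hypothetical, and
the sample reply is synthetic, built only from the seq/pid values in
the log above, not captured from FRR.

```python
import errno
import os
import struct

NLMSG_ERROR = 2    # linux/netlink.h
RTM_DELROUTE = 25  # linux/rtnetlink.h, matches "RTM_DELROUTE(25)" in the log

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSGHDR = struct.Struct("=IHHII")
# First field of struct nlmsgerr: negative errno, or 0 for a plain ACK
NLMSGERR = struct.Struct("=i")


def build_error_reply(err, orig_type, seq, pid):
    """Build a synthetic NLMSG_ERROR reply (hypothetical helper, illustration only)."""
    # The error payload echoes the header of the request that failed.
    orig_hdr = NLMSGHDR.pack(NLMSGHDR.size, orig_type, 0, seq, pid)
    payload = NLMSGERR.pack(-err) + orig_hdr
    total_len = NLMSGHDR.size + len(payload)
    return NLMSGHDR.pack(total_len, NLMSG_ERROR, 0, seq, pid) + payload


def decode_error_reply(data):
    """Return (errno name, type of the failed request, its seq) for an NLMSG_ERROR."""
    length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(data, 0)
    if msg_type != NLMSG_ERROR or length > len(data):
        raise ValueError("not a complete NLMSG_ERROR message")
    (err,) = NLMSGERR.unpack_from(data, NLMSGHDR.size)
    _len, orig_type, _fl, orig_seq, _p = NLMSGHDR.unpack_from(
        data, NLMSGHDR.size + NLMSGERR.size
    )
    name = errno.errorcode.get(-err, "unknown") if err else "ACK"
    return name, orig_type, orig_seq


# Synthetic reply using seq and pid from the FRR log (assumed error: ESRCH).
reply = build_error_reply(errno.ESRCH, RTM_DELROUTE, seq=22, pid=2419702167)
name, req_type, seq = decode_error_reply(reply)
print(name, os.strerror(errno.ESRCH), req_type, seq)
```

Feeding the bytes of a real reply (e.g. from strace or a raw netlink
socket) to decode_error_reply() instead of the synthetic one should
show which request was rejected and with which errno.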
Regards
Jonas