Message-ID: <32d40b75d5589b73e17198eb7915c546ea3ff9b1.camel@redhat.com>
Date: Mon, 11 Sep 2023 11:50:23 +0200
From: Thomas Haller <thaller@...hat.com>
To: Benjamin Poirier <bpoirier@...dia.com>, David Ahern <dsahern@...nel.org>
Cc: nicolas.dichtel@...nd.com, Stephen Hemminger
<stephen@...workplumber.org>, Hangbin Liu <liuhangbin@...il.com>, Ido
Schimmel <idosch@...sch.org>, netdev@...r.kernel.org, "David S . Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net-next] ipv4/fib: send RTM_DELROUTE notify when flush fib

On Tue, 2023-08-08 at 14:59 -0400, Benjamin Poirier wrote:
> On 2023-08-07 19:44 -0600, David Ahern wrote:
> > On 8/2/23 3:10 AM, Thomas Haller wrote:
> > > On Fri, 2023-07-28 at 09:42 -0600, David Ahern wrote:
> > > > On 7/28/23 7:01 AM, Nicolas Dichtel wrote:
> > > >
> > > > > Managing a cache with this is not so obvious 😉
> > > >
> > > >
> > > > FRR works well with Linux at this point,
> > >
> > > Interesting. Do you have a bit more information?
> > >
> > > > and libnl's caching was updated
> > > > and fixed by folks from Cumulus Networks so it should be good
> > > > too.
> > >
> > >
> > > Which "libnl" do you mean?
> >
> > yes. https://github.com/thom311/libnl.git
> >
> > >
> > > Route caching in libnl3 upstream is very broken (which I am to blame
> > > for, as I am the maintainer).
> > >
> >
> > as someone who sent in patches it worked for all of Cumulus' use cases
> > around 2018-2019 time frame. Can't speak for the status today.
> >
>
> Nowadays Cumulus still relies on an OOT kernel patch almost identical to
> Hangbin's.
>
> Looking through an old ticket on the subject, I can see you had indeed
> prepared patches to make Cumulus' libnl-using application (switchd)
> delete route entries from the libnl cache based on link down events.
> Ultimately, those changes were left on the table for two reasons:
> 1) This would've been the first time for Cumulus that the libnl cache
> would be modified by the application instead of in response to netlink
> events. Roopa was concerned that there might be race conditions.
> 2) There was an expectation at the time that Cumulus would move to
> switchdev, which would've made switchd and libnl unnecessary.
>
> I brought up the removal of this OOT kernel patch again a few months ago
> but there was not enough interest internally. In fact, I was just asked
> to add *more* notifications for a similar case, sigh.

Hi,

Those patches were sent to me directly and never hit the mailing list
(due to technical problems with the list). More importantly, the
changes were complicated and came with no tests, so I hesitated to
merge them. Nowadays, the unit test setup for libnl3 has improved, and it
would be great to fix this.

This is all entirely my fault as maintainer, but two points:

- libnl3 upstream still does *not* handle route caching correctly (at
  all).
- the fact that it hasn't been fixed in more than a decade shows, IMO,
  that getting caching right for routes is very hard. Patches that
  improve the behavior should not be rejected with "look at libnl3 or
  FRR".

If FRR gets this right, it's honestly an impressive accomplishment. I'd
still be curious about the details.
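
To make the difficulty concrete, here is a minimal sketch of the kind of
userspace workaround discussed above: a route cache that cannot rely on
RTM_DELROUTE when the kernel flushes routes and therefore purges entries
itself on a link down event. It assumes single-nexthop routes keyed by
destination and output interface. Route, RouteCache and the on_* handler
names are hypothetical; this is not libnl3 or FRR code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified route: one nexthop, identified by its oif."""
    dst: str           # destination prefix, e.g. "192.0.2.0/24"
    oif: int           # output interface index
    table: int = 254   # 254 is the main routing table (RT_TABLE_MAIN)


@dataclass
class RouteCache:
    """Hypothetical userspace cache fed by netlink-style events."""
    routes: set = field(default_factory=set)

    def on_newroute(self, route: Route) -> None:
        # Corresponds to receiving an RTM_NEWROUTE notification.
        self.routes.add(route)

    def on_delroute(self, route: Route) -> None:
        # Corresponds to receiving an RTM_DELROUTE notification.
        self.routes.discard(route)

    def on_link_down(self, ifindex: int) -> set:
        # No per-route delete notification is assumed to arrive here,
        # so the cache guesses which routes the kernel flushed.
        stale = {r for r in self.routes if r.oif == ifindex}
        self.routes -= stale
        return stale


cache = RouteCache()
cache.on_newroute(Route("192.0.2.0/24", oif=2))
cache.on_newroute(Route("198.51.100.0/24", oif=3))
removed = cache.on_link_down(2)
```

Even this toy version has to guess what the kernel did; multipath routes
and events racing with the purge are ignored entirely, which is where the
real complexity lies.
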
Thomas