Message-ID: <20210130151143.GB3330615@shredder.lan>
Date: Sat, 30 Jan 2021 17:11:43 +0200
From: Ido Schimmel <idosch@...sch.org>
To: David Ahern <dsahern@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
davem@...emloft.net, amcohen@...dia.com, roopa@...dia.com,
sharpd@...dia.com, bpoirier@...dia.com, mlxsw@...dia.com,
Ido Schimmel <idosch@...dia.com>
Subject: Re: [PATCH net-next 05/10] net: ipv4: Emit notification when fib
hardware flags are changed
On Thu, Jan 28, 2021 at 08:33:22PM -0700, David Ahern wrote:
> On 1/28/21 8:04 PM, Jakub Kicinski wrote:
> > On Tue, 26 Jan 2021 15:23:06 +0200 Ido Schimmel wrote:
> >> Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags
> >> are changed. The aim is to provide an indication to user-space
> >> (e.g., routing daemons) about the state of the route in hardware.
> >
> > What does the daemon in the user space do with it?
>
> You don't want FRR for example to advertise a route to a peer until it
> is really programmed in h/w. This notification gives routing daemons
> that information.
Correct. It is in the cover letter:
"These flags are of interest to routing daemons since they would like to
delay advertisement of routes until they are installed in hardware."
Amit is working on a follow-up to emit notifications when route offload
fails. This request also comes from the FRR team. Currently we have a
policy inside mlxsw to abort route offload and install a default route
that sends all the traffic to the CPU. It obviously kills the box and
anyway the policy is something user space should decide, not the kernel.
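
To illustrate what a routing daemon might do with these notifications,
here is a minimal sketch that decodes the fixed headers of a single
rtnetlink message and reports the two hardware flags. The constant values
and the nlmsghdr/rtmsg layouts follow include/uapi/linux/rtnetlink.h; the
helper names (route_hw_state, make_msg) are hypothetical and only serve
the example. A real daemon would read the messages from an AF_NETLINK
socket subscribed to the route multicast groups and also parse the
attributes, which is omitted here.

```python
import struct

# Values from include/uapi/linux/rtnetlink.h
RTM_NEWROUTE = 24
RTM_F_OFFLOAD = 0x4000   # route is offloaded to hardware
RTM_F_TRAP = 0x8000      # route traps packets to the CPU

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSGHDR = struct.Struct("=IHHII")
# struct rtmsg: family, dst_len, src_len, tos, table, protocol, scope,
# type, flags
RTMSG = struct.Struct("=BBBBBBBBI")


def route_hw_state(msg):
    """Return (offload, trap) for an RTM_NEWROUTE message, else None."""
    if len(msg) < NLMSGHDR.size + RTMSG.size:
        return None
    nl_type = NLMSGHDR.unpack_from(msg, 0)[1]
    if nl_type != RTM_NEWROUTE:
        return None
    rtm_flags = RTMSG.unpack_from(msg, NLMSGHDR.size)[-1]
    return bool(rtm_flags & RTM_F_OFFLOAD), bool(rtm_flags & RTM_F_TRAP)


def make_msg(nl_type, rtm_flags):
    """Hypothetical helper: build a header-only message for the demo."""
    body = RTMSG.pack(2, 24, 0, 0, 254, 0, 0, 1, rtm_flags)  # AF_INET, /24
    return NLMSGHDR.pack(NLMSGHDR.size + len(body), nl_type, 0, 0, 0) + body


if __name__ == "__main__":
    print(route_hw_state(make_msg(RTM_NEWROUTE, RTM_F_OFFLOAD)))
```

Whether a route with only RTM_F_TRAP set should be advertised is left to
the daemon; the sketch just exposes both bits so that the policy stays in
user space.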
>
> >
> > The notification will only be generated for the _first_ ASIC which
> > offloaded the object. Which may be fine for you today but as an uAPI
> > it feels slightly lacking.
> >
> > If the user space just wants to make sure the devices are synced to
> > notifications from certain stage, wouldn't it be more idiomatic to
> > provide some "fence" operation?
> >
> > WDYT? David?
> >
>
> This feature was first discussed I think about 2 years ago - when I was
> still with Cumulus, so I already knew the intent and end goal.
>
> I think support for multiple ASICs / NICs doing this kind of offload
> will have a whole lot of challenges. I don't think this particular user
> notification is going to be a big problem - e.g., you could always delay
> the emit until all have indicated the offload.
I do not have experience with multi-ASIC systems, but my understanding
is that each ASIC has its own copy of the networking stack and the ASICs
are connected via front panel or backplane ports, like distinct
leaf/spine switches. In Linux, such a system can be supported by
registering a devlink instance for each ASIC and reloading each instance
to a separate namespace.
Thanks for reviewing, David