[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20181127100502.6992916a@cakuba.netronome.com>
Date: Tue, 27 Nov 2018 10:05:02 -0800
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Edward Cree <ecree@...arflare.com>
Cc: <davem@...emloft.net>, <oss-drivers@...ronome.com>,
<netdev@...r.kernel.org>, <jiri@...nulli.us>,
<xiyou.wangcong@...il.com>, <jhs@...atatu.com>,
<gerlitz.or@...il.com>, <ozsh@...lanox.com>, <vladbu@...lanox.com>,
John Hurley <john.hurley@...ronome.com>
Subject: Re: [PATCH net-next 4/6] nfp: flower: offload tunnel decap rules
via indirect TC blocks
On Tue, 27 Nov 2018 16:04:29 +0000, Edward Cree wrote:
> On 10/11/18 05:21, Jakub Kicinski wrote:
> > From: John Hurley <john.hurley@...ronome.com>
> >
> > Previously, TC block tunnel decap rules were only offloaded when a
> > callback was triggered through registration of the rules egress device.
> > This meant that the driver had no access to the ingress netdev and so
> > could not verify it was the same tunnel type that the rule implied.
> >
> > Register tunnel devices for indirect TC block offloads in NFP, giving
> > access to new rules based on the ingress device rather than egress. Use
> > this to verify the netdev type of VXLAN and Geneve based rules and offload
> > the rules to HW if applicable.
> >
> > Tunnel registration is done via a netdev notifier. On notifier
> > registration, this is triggered for already existing netdevs. This means
> > that NFP can register for offloads from devices that exist before it is
> > loaded (filter rules will be replayed from the TC core). Similarly, on
> > notifier unregister, a call is triggered for each currently active netdev.
> > This allows the driver to unregister any indirect block callbacks that may
> > still be active.
> >
> > Signed-off-by: John Hurley <john.hurley@...ronome.com>
> > Reviewed-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
> I've been experimenting with this to try to understand it better, and I'm
> struggling to understand how the various cb_ident parameters are meant to
> be used. In particular:
> > diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
> > index 2c32edfc1a9d..222e1a98cf16 100644
> > --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
> > +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
>
> [...]
>
> > +int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
> > + struct net_device *netdev,
> > + unsigned long event)
> > +{
> > + int err;
> > +
> > + if (!nfp_fl_is_netdev_to_offload(netdev))
> > + return NOTIFY_OK;
> > +
> > + if (event == NETDEV_REGISTER) {
> > + err = __tc_indr_block_cb_register(netdev, app,
> > + nfp_flower_indr_setup_tc_cb,
> > + netdev);
> Here 'netdev' is passed as cb_ident, meaning that if you have two of these
> NICs in a system you will (AIUI) get EEXIST when the second one tries to
> register its indr_block_cb, since both will have the same cb and cb_ident.
> This is because 'netdev' is the netdevice being watched, rather than the
> netdevice doing the watching.
> AFAICT the right thing to do here would be to pass in 'app' (more generally,
> the same thing as cb_priv) as that should uniquely identify the watcher
> (the watchee is already identified by the 'dev' parameter, which is used to
> obtain the indr_dev on which the cb_list resides).
>
> Curiously, though you also pass netdev as the cb_ident to
> tcf_block_cb_register() in nfp_flower_setup_indr_tc_block(), that does not
> appear to have the same result, as __tcf_block_cb_register() doesn't check
> for an already existing entry — with the result that
> tcf_block_cb_unregister() may remove the wrong entry from the list. If two
> callers register blocks with the same cb_ident, then one unregisters, the
> same block will be removed no matter which caller did the unregister.
Ack, thanks for pointing that out.
> It is also concerning that both __tc_indr_block_cb_unregister() and
> tcf_block_cb_unregister() return void and will silently do nothing if the
> passed cb_ident is not found. In my experiments this did not conduce to
> debuggability; I think they should either WARN_ON, or at least return
> (int) -ENOENT so that the caller has the opportunity to do so.
WARN_ON() seems reasonable, returning ENOENT would only push that
WARN_ON() to the drivers, I don't see what else the driver could do
with an error on unregister :)
Powered by blists - more mailing lists