Message-ID: <20160923135512.64e5d5e9@jkicinski-Precision-T1700>
Date:   Fri, 23 Sep 2016 13:55:12 +0100
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Jiri Benc <jbenc@...hat.com>, Jiri Pirko <jiri@...nulli.us>
Cc:     netdev@...r.kernel.org, Thomas Graf <tgraf@...g.ch>,
        Roopa Prabhu <roopa@...ulusnetworks.com>,
        ogerlitz@...lanox.com, John Fastabend <john.fastabend@...il.com>,
        sridhar.samudrala@...el.com, ast@...nel.org, daniel@...earbox.net,
        simon.horman@...ronome.com, Paolo Abeni <pabeni@...hat.com>,
        Pravin B Shelar <pshelar@...ira.com>,
        hannes@...essinduktion.org, kubakici@...pl
Subject: Re: [RFC] net: store port/representative id in metadata_dst
On Fri, 23 Sep 2016 11:06:09 +0200, Jiri Benc wrote:
> On Fri, 23 Sep 2016 08:34:29 +0200, Jiri Pirko wrote:
> > So if I understand that correctly, this would need some "shared netdev"
> > which would effectively serve only as a sink for all port netdevices to
> > tx packets to. On RX, this would be completely avoided. This lower
> > device looks like half zombie to me.  
> 
> Looks more like a quarter zombie. Even tx would not be allowed unless
> going through one of the ports, as all skbs without
> METADATA_HW_PORT_MUX metadata_dst would be dropped. But it would be
> possible to attach qdisc to the "lower" netdevice and it would actually
> have an effect. On rx this netdevice would be ignored completely. This
> is very weird behavior.
> 
> > I don't like it :( I wonder if the
> > solution would not be possible without this lower netdev.  
> 
> I agree. This approach doesn't sound correct. The skbs should not be
> requeued.

Thanks for the responses!

I think SR-IOV NICs are coming at this problem from a different angle:
we already have big, full-featured per-port netdevs and now we want to
spawn representatives for VFs to handle fallback traffic.  This patch
would help us mux VFR traffic on all the queues of the physical port
netdevs (the ones which were already present in legacy mode; that's the
lower device).

I read the mlxsw code when I was thinking about this and I wasn't
100% comfortable with returning NETDEV_TX_BUSY; I thought this
behaviour should generally be avoided.  (BTW a very lame question - does
mlxsw ever stop the queues?  AFAICS it only returns BUSY, isn't that
confusing to the stack?)

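To make the BUSY-vs-stop-queue concern concrete, here is a minimal
userspace sketch.  It is a toy model only, not mlxsw or kernel code:
every name in it (toy_txq, toy_xmit_busy_only, toy_xmit_stop_queue,
toy_run) is hypothetical, and the "stack" is just a loop that retries
on each tick.  It assumes one TX completion every `period` ticks
(`period` must be non-zero) and counts how often the driver is called
only to reject the packet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the driver return codes; not the kernel's values. */
enum toy_tx_ret { TOY_TX_OK, TOY_TX_BUSY };

struct toy_txq {
	size_t ring_size;           /* descriptors in the TX ring */
	size_t in_flight;           /* descriptors owned by "hardware" */
	bool stopped;               /* models a stopped TX queue */
	unsigned long busy_returns; /* xmit calls that were rejected */
};

typedef enum toy_tx_ret (*toy_xmit_fn)(struct toy_txq *);

/* Strategy A: never stop the queue, just report BUSY when the ring is full. */
static enum toy_tx_ret toy_xmit_busy_only(struct toy_txq *q)
{
	if (q->in_flight == q->ring_size) {
		q->busy_returns++;
		return TOY_TX_BUSY;
	}
	q->in_flight++;
	return TOY_TX_OK;
}

/* Strategy B: accept the packet, then stop the queue if the ring is now full. */
static enum toy_tx_ret toy_xmit_stop_queue(struct toy_txq *q)
{
	if (q->in_flight == q->ring_size) {
		/* Not reached as long as the caller honours q->stopped. */
		q->busy_returns++;
		return TOY_TX_BUSY;
	}
	q->in_flight++;
	if (q->in_flight == q->ring_size)
		q->stopped = true;
	return TOY_TX_OK;
}

/* TX completion: release one descriptor and wake the queue. */
static void toy_tx_complete(struct toy_txq *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
	q->stopped = false;
}

/*
 * Toy "stack": tries to send npkts packets, one attempt per tick, with one
 * completion every `period` ticks.  Returns how many attempts got BUSY.
 */
static unsigned long toy_run(toy_xmit_fn xmit, size_t ring_size,
			     size_t npkts, unsigned int period)
{
	struct toy_txq q = { .ring_size = ring_size };
	size_t sent = 0;
	unsigned long tick = 0;

	while (sent < npkts) {
		tick++;
		if (tick % period == 0)
			toy_tx_complete(&q);
		if (q.stopped)
			continue; /* a stopped queue is not offered packets */
		if (xmit(&q) == TOY_TX_OK)
			sent++;
	}
	return q.busy_returns;
}
```

In this model, toy_run(toy_xmit_busy_only, 4, 16, 3) reports a
non-zero number of rejected calls, while toy_run(toy_xmit_stop_queue,
4, 16, 3) reports none, because the loop stops offering packets once
the ring fills.  Whether the real stack behaves comparably with a
given driver is exactly the open question above.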
FWIW the switchdev SR-IOV model we have now seems to be to treat the
existing netdevs as "MAC ports" and spawn representatives for VFs but
not represent PFs in any way.  This makes it impossible to install
VF-PF flow rules.  I worry this can bite us later but that's a slightly
different discussion :)  For the purpose of this patch please assume
the lower dev is the MAC/physical/external port.