Message-ID: <20190318122947.66754e4b@cakuba.netronome.com>
Date: Mon, 18 Mar 2019 12:29:47 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Parav Pandit <parav@...lanox.com>
Cc: Jiri Pirko <jiri@...nulli.us>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports

On Mon, 18 Mar 2019 15:43:20 +0000, Parav Pandit wrote:
> > -----Original Message-----
> > From: Jakub Kicinski <jakub.kicinski@...ronome.com>
> > Sent: Friday, March 15, 2019 8:16 PM
> > To: Parav Pandit <parav@...lanox.com>
> > Cc: Jiri Pirko <jiri@...nulli.us>; Samudrala, Sridhar
> > <sridhar.samudrala@...el.com>; davem@...emloft.net;
> > netdev@...r.kernel.org; oss-drivers@...ronome.com
> > Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
> > ports
> >
> > On Fri, 15 Mar 2019 22:12:13 +0000, Parav Pandit wrote:
> > > > On Fri, 15 Mar 2019 21:08:14 +0100, Jiri Pirko wrote:
> > > > > >> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects
> > > > > >> for each port - host facing ports and switch facing ports. This
> > > > > >> is in addition to the netdevs that are created today.
> > > >
> > > > To be clear I'm not in favour of the dual-object proposal.
> > > >
> > > > > >I am not proposing any different.
> > > > > >I am proposing only two changes.
> > > > > >1. control hostport params via referring hostport (not via
> > > > > >indirect
> > > > > >peer)
> > > > >
> > > > > Not really possible. If you passthrough VF into VM, the hostport
> > > > > goes along with it.
> > > > >
> > > > > >2. flavour should not be vf/pf, flavour should be hostport, switchport.
> > > > > >Because switch is flat and agnostic of pf/vf/mdev.
> > > > >
> > > > > Not sure. It's good to have this kind of visibility.
> > > >
> > > > Yes, this subthread honestly makes me go from 60% sure to 95% sure
> > > > we shouldn't do the dual object thing :( Seems like Parav is
> > > > already confused by it and suggests host port can exist without
> > > > switch port :(
> > > >
> > > I am almost sure that I am not confused.
> > > I am clear that hostports should be configured by devlink instance
> > > which has the capability to program it.
> >
> > Right now a devlink port is something that the datapath of an ASIC can
> > address. All flavours we have presently are basically various MACs - physical
> > (front panel ports), DSA - for ASIC interconnects on a multi-ASIC board, CPU -
> > for connecting to a MAC of a NIC.
> >
> Devlink port implementation in commit doesn't say that it is for ASIC datapath or limited to ASIC datapath id.
> It is not right to say that 'whole datapath' object should be represented with just single object 'port'.
> The datapath involves various stages in the ASIC, each doing different processing.
> These datapath objects are interconnected, i.e. hostport is connected to switchport.
> Commit [1] says devlink port is physical port. However we already have 3 flavours of port.
>
> > Jiri's flavour proposal was strictly extending the same logic to SR-IOV. Each
> > object addressable within the datapath gets a port.
> > The datapath's ID can be used as port_index.
> >
> And as I said, it is already restrictive.
> Port is a port, it can be labeled for vf/pf, but flavour is not really vf/pf.
> Also label applies more on the hostport side vs switchport.
>
> > I just reimplemented his patches here and added the subports which I think
> > he wasn't aware of as they are a quirk of old NFP ASICs.
> >
> > Having 3 objects for the same datapath ID is a significant departure from the
> > existing devlink port semantics.
> >
> It is really not same datapath ID.
> Because if that is the case, we should be programming mac address on the rep-netdev itself.
> But we are not doing that because rep-netdev represents only 'eswitch port'.

Okay, I explained the history to you here, you can write your own if
you want.

> > > When hostport is in VF, that VF usually won't have privilege to
> > > program it and won't have visibility to eswitch either.
> >
> > If VM has no visibility into the eswitch and no permission to configure
> > things, what use does the object serve?
> >
> To view device properties, health, RO registers, more importantly its port details.

Device != port.

> Yonatan is working on grouping these devlink ports and those are controlled through devlink APIs.
> Jiri has been actively reviewing those patches internally for the last 3+ weeks, not finished yet.
> So this visibility is needed anyway.

No idea what "grouping devlink ports" may refer to, but I'd be
surprised if it's relevant to VMs.

> > > Why would you like to start with restrictive model of peer view only?
> >
> > "Restrictive model" is one way of putting it. I'd rather say that we are not
> > adding objects which:
> > (a) do not adhere to current semantics;
> > (b) have no distinct function.
> >
> hostport certainly has a distinct function from switchport,
> i.e. to program host side parameters (eth.mac, rdma.port_guid and more in future).

Yeah, Ethernet or IB address, and so many other things (we just can't
happen to think about any right now)...

> > We can make the "add MAC address" command not use the word peer:
> >
> > devlink port addr_pool add pci/0000:05:00.0/10003 type eth 00:11:22:33:44:55
> > devlink port addr_pool del pci/0000:05:00.0/10003 type eth 00:11:22:33:44:55
> >
> > if the "peer" doesn't sit right.
> >
> > > Hostports exist for infiniband HCA without switchport.
> > > We should be able to manage hostport objects without creating fake
> > > eswitch sw object.
> >
> > It sounds like the RDMA subsystem is lacking a model to represent all its
> > objects, but that's RDMA's problem to solve..
> >
> devlink framework is not limited to Ethernet, it operates on bus/device notion.
> So for Ethernet, vendors program the MAC address.
> For RDMA, vendors program port_guid (which is the equivalent of a MAC address).
>
> devlink also publishes rdma device info today.
> net/core/devlink.c has very well established IB device info exposed via devlink_nl_port_fill() for more than 3 years now in commit [2].
> It is not fair to say "go solve it somewhere else".
>
> > In netdev world we have netdevs for ports which are used for bulk of the
> > configuration, most importantly - forwarding.
> >
> > > Jakub,
> > > Can you please point to some example other than veth-pair where you
> > > configure host param (such as mac address) through a switch?
> >
> > Existing "legacy" SR-IOV NDOs.
> >
> That is a perfect example of programming hostport parameters without an eswitch.
> At a high level, I was looking for a case where you open a switch GUI/CLI or something equivalent that programs the host's MAC address.
> So far we don't have such an equivalent good example yet.
>
> > > An existing example will help me to map it to devlink eswitch proposal.
> > > If we go peer programming route, what are your thoughts on how should
> > > we program infiniband hostports which don't have peer ports?
> >
> > Again, you may be trying to fix RDMA's lack of control objects, which may be
> > better fixed elsewhere..
>
> devlink port is link agnostic control object.
>
> [1] bfcd3a46617209454cfc0947ab093e37fd1e84ef
> [2] commit id bfcd3a466