Message-ID: <VI1PR0501MB22712659E87910ECF0573CBFD1470@VI1PR0501MB2271.eurprd05.prod.outlook.com>
Date: Mon, 18 Mar 2019 15:43:20 +0000
From: Parav Pandit <parav@...lanox.com>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
CC: Jiri Pirko <jiri@...nulli.us>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: RE: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
> -----Original Message-----
> From: Jakub Kicinski <jakub.kicinski@...ronome.com>
> Sent: Friday, March 15, 2019 8:16 PM
> To: Parav Pandit <parav@...lanox.com>
> Cc: Jiri Pirko <jiri@...nulli.us>; Samudrala, Sridhar
> <sridhar.samudrala@...el.com>; davem@...emloft.net;
> netdev@...r.kernel.org; oss-drivers@...ronome.com
> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
> ports
>
> On Fri, 15 Mar 2019 22:12:13 +0000, Parav Pandit wrote:
> > > On Fri, 15 Mar 2019 21:08:14 +0100, Jiri Pirko wrote:
> > > > >> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects
> > > > >> for each port - host facing ports and switch facing ports. This
> > > > >> is in addition to the netdevs that are created today.
> > >
> > > To be clear I'm not in favour of the dual-object proposal.
> > >
> > > > >I am not proposing any different.
> > > > >I am proposing only two changes.
> > > > > >1. control hostport params via referring hostport (not via indirect peer)
> > > >
> > > > Not really possible. If you passthrough VF into VM, the hostport
> > > > goes along with it.
> > > >
> > > > >2. flavour should not be vf/pf, flavour should be hostport, switchport.
> > > > >Because switch is flat and agnostic of pf/vf/mdev.
> > > >
> > > > Not sure. It's good to have this kind of visibility.
> > >
> > > Yes, this subthread honestly makes me go from 60% sure to 95% sure
> > > we shouldn't do the dual object thing :( Seems like Parav is
> > > already confused by it and suggests host port can exist without
> > > switch port :(
> > >
> > I am almost sure that I am not confused.
> > I am clear that hostports should be configured by devlink instance
> > which has the capability to program it.
>
> Right now a devlink port is something that the datapath of an ASIC can
> address. All flavours we have presently are basically various MACs - physical
> (front panel ports), DSA - for ASIC interconnects on a multi-ASIC board, CPU -
> for connecting to a MAC of a NIC.
>
The commit that implemented devlink port does not say that it is for the ASIC datapath, or that it is limited to an ASIC datapath ID.
It is not right to say that the whole datapath should be represented by a single 'port' object.
The datapath involves various stages in the ASIC, each of which does different processing.
These datapath objects are interconnected, i.e. the hostport is connected to the switchport.
Commit [1] says a devlink port is a physical port. However, we already have 3 flavours of port.
> Jiri's flavour proposal was strictly extending the same logic to SR-IOV. Each
> object addressable within the datapath gets a port.
> The datapath's ID can be used as port_index.
>
And as I said, that is already restrictive.
A port is a port. It can be labeled as vf/pf, but the flavour is not really vf/pf.
Also, the label applies more to the hostport side than to the switchport.
> I just reimplemented his patches here and added the subports which I think
> he wasn't aware of as they are a quirk of old NFP ASICs.
>
> Having 3 objects for the same datapath ID is a significant departure from the
> existing devlink port semantics.
>
It is really not the same datapath ID.
If that were the case, we would be programming the MAC address on the rep-netdev itself.
But we are not doing that, because the rep-netdev represents only the 'eswitch port'.
> > When hostport is in VF, that VF usually won't have privilege to
> > program it and won't have visibility to eswitch either.
>
> If VM has no visibility into the eswitch and no permission to configure
> things, what use does the object serve?
>
To view device properties, health, RO registers and, more importantly, its port details.
Yonatan is working on grouping these devlink ports, and those are controlled through devlink APIs.
Jiri has been actively reviewing those patches internally for the last 3+ weeks; the review is not finished yet.
So this visibility is needed anyway.
> > Why would you like to start with restrictive model of peer view only?
>
> "Restrictive model" is one way of putting it. I'd rather say that we are not
> adding objects which:
> (a) do not adhere to current semantics;
> (b) have no distinct function.
>
The hostport certainly has a distinct function from the switchport,
i.e. to program host-side parameters (eth.mac, rdma.port_guid, and more in future).
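A toy sketch of that distinction, not kernel code. All names here (HostPort, SwitchPort, HOST_PARAMS, set_param) are made up for illustration and are not an existing devlink API; the GUID value is invented, while the MAC and port index are reused from the example command in this thread. It assumes a hostport carries the host-side parameters for its link type and only optionally points at a switchport:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical model only; none of these names exist in devlink.
# Host-side parameters allowed per link type, as named in this mail.
HOST_PARAMS = {"eth": {"mac"}, "ib": {"port_guid"}}


@dataclass
class SwitchPort:
    index: int  # eswitch-side port, i.e. what the rep-netdev represents


@dataclass
class HostPort:
    link_type: str                     # "eth" or "ib"
    peer: Optional[SwitchPort] = None  # None when no eswitch port exists
    params: Dict[str, str] = field(default_factory=dict)

    def set_param(self, name: str, value: str) -> None:
        # Reject parameters that do not belong to this link type.
        if name not in HOST_PARAMS[self.link_type]:
            raise ValueError(f"{name} is not a {self.link_type} hostport parameter")
        self.params[name] = value


# Ethernet VF hostport connected to a switchport.
vf = HostPort("eth", peer=SwitchPort(index=10003))
vf.set_param("mac", "00:11:22:33:44:55")

# InfiniBand hostport with no switchport behind it (GUID is made up).
hca = HostPort("ib")
hca.set_param("port_guid", "00:11:22:33:44:55:66:77")
```

In this sketch the parameter is set on the hostport directly, so the InfiniBand case needs no fake switch-side object.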
> We can make the "add MAC address" command not use the word peer:
>
> devlink port addr_pool add pci/0000:05:00.0/10003 type eth 00:11:22:33:44:55
> devlink port addr_pool del pci/0000:05:00.0/10003 type eth 00:11:22:33:44:55
>
> if the "peer" doesn't sit right.
>
> > Hostports exist for infiniband HCA without switchport.
> > We should be able to manage hostport objects without creating fake
> > eswitch sw object.
>
> It sounds like the RDMA subsystem is lacking a model to represent all its
> objects, but that's RDMA's problem to solve..
>
The devlink framework is not limited to Ethernet; it operates on a bus/device notion.
So for Ethernet, the vendor programs the MAC address.
For RDMA, the vendor programs the port_guid (the equivalent of a MAC address).
devlink also publishes RDMA device info today.
net/core/devlink.c has exposed IB device info via devlink_nl_port_fill() for more than 3 years now, since commit [2].
It is not fair to say 'create it, solve it somewhere else'.
> In netdev world we have netdevs for ports which are used for bulk of the
> configuration, most importantly - forwarding.
>
> > Jakub,
> > Can you please point to some example other than veth-pair where you
> > configure host param (such as mac address) through a switch?
>
> Existing "legacy" SR-IOV NDOs.
>
That is a perfect example of programming hostport parameters without an eswitch.
At a high level, I was looking for a case where you open a switch GUI/CLI or something equivalent that programs the host's MAC address.
So far we don't have a good equivalent example.
> > An existing example will help me to map it to devlink eswitch proposal.
> > If we go peer programming route, what are your thoughts on how should
> > we program infiniband hostports which doesn't have peer ports?
>
> Again, you may be trying to fix RDMA's lack of control objects, which may be
> better fixed elsewhere..
A devlink port is a link-agnostic control object.
[1] bfcd3a46617209454cfc0947ab093e37fd1e84ef
[2] commit id bfcd3a466