Message-ID: <VI1PR0501MB2271CF0653351D664C3C0622D15E0@VI1PR0501MB2271.eurprd05.prod.outlook.com>
Date: Mon, 25 Mar 2019 20:34:43 +0000
From: Parav Pandit <parav@...lanox.com>
To: Parav Pandit <parav@...lanox.com>, Jiri Pirko <jiri@...nulli.us>
CC: Jakub Kicinski <jakub.kicinski@...ronome.com>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: RE: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
> -----Original Message-----
> From: netdev-owner@...r.kernel.org <netdev-owner@...r.kernel.org> On
> Behalf Of Parav Pandit
> Sent: Friday, March 22, 2019 7:40 PM
> To: Jiri Pirko <jiri@...nulli.us>
> Cc: Jakub Kicinski <jakub.kicinski@...ronome.com>; Samudrala, Sridhar
> <sridhar.samudrala@...el.com>; davem@...emloft.net;
> netdev@...r.kernel.org; oss-drivers@...ronome.com
> Subject: RE: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
> ports
>
>
>
> > -----Original Message-----
> > From: Jiri Pirko <jiri@...nulli.us>
> > Sent: Friday, March 22, 2019 8:33 AM
> > To: Parav Pandit <parav@...lanox.com>
> > Cc: Jakub Kicinski <jakub.kicinski@...ronome.com>; Samudrala, Sridhar
> > <sridhar.samudrala@...el.com>; davem@...emloft.net;
> > netdev@...r.kernel.org; oss-drivers@...ronome.com
> > Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> > devlink PCI ports
> >
> > Thu, Mar 21, 2019 at 06:42:55PM CET, parav@...lanox.com wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Jiri Pirko <jiri@...nulli.us>
> > >> Sent: Thursday, March 21, 2019 12:24 PM
> > >> To: Parav Pandit <parav@...lanox.com>
> > >> Cc: Jakub Kicinski <jakub.kicinski@...ronome.com>; Samudrala,
> > >> Sridhar <sridhar.samudrala@...el.com>; davem@...emloft.net;
> > >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> > >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> > >> devlink PCI ports
> > >>
> > >> Thu, Mar 21, 2019 at 05:50:37PM CET, parav@...lanox.com wrote:
> > >> >
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: Jiri Pirko <jiri@...nulli.us>
> > >> >> Sent: Thursday, March 21, 2019 11:16 AM
> > >> >> To: Parav Pandit <parav@...lanox.com>
> > >> >> Cc: Jakub Kicinski <jakub.kicinski@...ronome.com>; Samudrala,
> > >> >> Sridhar <sridhar.samudrala@...el.com>; davem@...emloft.net;
> > >> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> > >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> > >> >> devlink PCI ports
> > >> >>
> > >> >> Thu, Mar 21, 2019 at 04:03:58PM CET, parav@...lanox.com wrote:
> > >> >> >Hi Jiri,
> > >> >> >
> > >> >> >> -----Original Message-----
> > >> >> >> From: Jiri Pirko <jiri@...nulli.us>
> > >> >> >> Sent: Thursday, March 21, 2019 4:08 AM
> > >> >> >> To: Jakub Kicinski <jakub.kicinski@...ronome.com>
> > >> >> >> Cc: Parav Pandit <parav@...lanox.com>; Samudrala, Sridhar
> > >> >> >> <sridhar.samudrala@...el.com>; davem@...emloft.net;
> > >> >> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> > >> >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports
> > >> >> >> on devlink PCI ports
> > >> >> >>
> > >> >> >> Wed, Mar 20, 2019 at 09:22:57PM CET,
> > >> >> >> jakub.kicinski@...ronome.com
> > >> >> >> wrote:
> > >> >> >> >On Wed, 20 Mar 2019 18:24:15 +0000, Parav Pandit wrote:
> > >> >> >> >> Hi Jiri, Jakub, Samudrala Sridhar,
> > >> >> >> >> > > > > > And physical port in
> > >> >> >> >> > > > > > include/uapi/linux/devlink.h also describe that.
> > >> >> >> >> > > > >
> > >> >> >> >> > > > > By "that" you must mean that the physical is a
> > >> >> >> >> > > > > user facing
> > >> port.
> > >> >> >> >> > > >
> > >> >> >> >> > > > Can you please describe the difference between 'PF port'
> > >> >> >> >> > > > and 'physical port of include/uapi/linux/devlink.h'?
> > >> >> >> >> > > > I must have missed this crisp definition in
> > >> >> >> >> > > > discussion between you and Jiri. I am in meantime
> > >> >> >> >> > > > checking the
> > thread.
> > >> >> >> >> > >
> > >> >> >> >> > > Perhaps start with the cover letter which includes an
> > >> >> >> >> > > ASCII
> > >> >> drawing?
> > >> >> >> >> > >
> > >> >> >> >> > > Using Mellanox nomenclature - PF port is a "representor"
> > >> >> >> >> > > for the PF which may be on another Host (SmartNIC or
> > >> multihost).
> > >> >> >> >> > > It's pretty much the same thing as a VF port/"representor".
> > >> >> >> >> > >
> > >> >> >> >> > Yes. We are aligned here. :-) I see your point, where in
> > >> >> >> >> > multi-host scenario, a physical port may be 1, but PF
> > >> >> >> >> > ports are 4, because of 4 PFs for 4 hosts.
> > >> >> >> >> > (just an example of 4 hosts with their own mac address
> > >> >> >> >> > sharing 1 physical port).
> > >> >> >> >> >
> > >> >> >> >> > When there is no multihost and one to one mapping
> > >> >> >> >> > between a PF and physical links, there is some overlap
> > >> >> >> >> > between PF port and physical port attributes.
> > >> >> >> >> > I believe, such overlap is fine as long as we have
> > >> >> >> >> > unique indices for the
> > >> >> >> ports.
> > >> >> >> >> >
> > >> >> >> >> > So I am ok to have flavours as
> > >> >> physical/cpu/dsa/pf/vf/mdev/switchport.
> > >> >> >> >> > (last 4 as new port flavours).
> > >> >> >> >> >
> > >> >> >> >> > > Physical port is the hole on the panel of the adapter
> > >> >> >> >> > > where cable
> > >> >> >> goes.
> > >> >> >> >>
> > >> >> >> >> So my take away from above discussion are:
> > >> >> >> >> 1. Following new port flavours should be added
> > >> >> >> pci_pf/pci_vf/mdev/switchport.
> > >> >> >> >> a. Switchport indicates port on the eswitch. Normally this
> > >> >> >> >> port has rep-
> > >> >> >> netdev attached to it.
> > >> >> >> >
> > >> >> >> >I don't understand the "switchport". Surely physical ports
> > >> >> >> >are also attached to the eswitch? And one of the main
> > >> >> >> >purpose of adding the pci_pf/pci_vf flavours was to generate
> > >> >> >> >phys_port_name for the port netdevs.
> > >> >> >> >
> > >> >> >> >Please don't use the term representor if possible.
> > >> >> >> >Representor for most developers describes the way the netdev
> > >> >> >> >is implemented in the driver, so for Mellanox and Netronome
> > >> >> >> >different ports will be representors and non-representors.
> > >> >> >> >That's why I prefer port netdev (attached to eswitch, has
> > >> >> >> >switch_id) and host netdev (PF/VF netdev, vNIC, VSI, etc).
> > >> >> >> >
> > >> >> >> >> b. host side port flavours are pci_pf/pci_vf/mdev which
> > >> >> >> >> may be connected to switchport
> > >> >> >> >
> > >> >> >> >See above, pci_pf/pci_vf are needed for phys_port_name
> > generation.
> > >> >> >>
> > >> >> >> Yep, that makes sense.
> > >> >> >>
> > >> >> >>
> > >> >> >> >
> > >> >> >> >> 2. host side port flavours are not limited to Ethernet, as
> > >> >> >> >> it is for devlink's
> > >> >> >> port instance.
> > >> >> >> >>
> > >> >> >> >> 3. Each port is continue to be accessed using unique port
> index.
> > >> >> >> >>
> > >> >> >> >> 4. host side ports and switchport are control objects.
> > >> >> >> >> a. switch side ports reside where current eswitch object
> > >> >> >> >> of devlink instance reside b. for a given VF/PF/mdev such
> > >> >> >> >> host side ports may be in hypervisor or VM or both
> > >> >> >> >> depending on the privilege
> > >> >> >> >>
> > >> >> >> >> 5. eth.mac_address, rdma.port_guid can be programmed at
> > >> >> >> >> host port flavours by extending as $ devlink port param set...
> > >> >> >> >> (similar to devlink dev param set)
> > >> >> >> >
> > >> >> >> >You can keep restating that's your position, but I have
> > >> >> >> >*not* conceded to that.
> > >> >> >>
> > >> >> >> I'm also not convinced that host dummy ports are good idea to
> > >> >> >> hold
> > >> >> these.
> > >> >> >>
> > >> >> >>
> > >> >> >I didn't understand what do you mean my dummy port.
> > >> >>
> > >> >> It's a port for a VF host port which is not actually in the host
> > >> >> but in the
> > >> vm.
> > >> >> Very confusing.
> > >> >>
> > >> >It is the vf_ctrl flavour. I don't see it any different than rep-netdev.
> > >> >rep-netdev is not that confusing to us that represent eswitch vport.
> > >> >Why vf_ctrl flavour port that represents otherside of the pipe as
> > >> >you have
> > >> shown in example?
> > >> >Why it that confusing?
> > >>
> > >> Because sometimes it is there only once (PF), sometimes twice (VF)
> > >> - and one of these is kind-of zombie.
> > >>
> > >I gave the example in email that contains description yesterday.
> > >You didn't respond to it.
> > >So repeating here.
> > >Can you please point what looks like zombie below?
> > >
> > >$ devlink port show
> > >pci/0000:05:00.0/0 eth netdev repndev_pf0_p0 flavour physical
> > >switch_id 00154d130d2f
> > >pci/0000:05:00.0/1 eth netdev repndev_pf0_p1 flavour physical
> > >switch_id 00154d130d2f
> > >pci/0000:05:00.0/10001 eth netdev repndev_pf0_vf_1 flavour switchport
> > >switch_id 00154d130d2f peer pci/0000:05:00.0/1
> > >pci/0000:05:00.0/10002 eth netdev repndev_pf0_p0_mdev_8000 flavour
> > >switchport switch_id 00154d130d2f peer mdev/uuidX/0
> > >
> > >pci/0000:05:00.0/1 eth netdev flavour vf_ctrl vf 1
> >
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this one.
> > You are missing an actual VF instance.
> >
> VF instance is in VM. It is not visible here in Hypervisor. But if you prefer to
> see it, It looks like below.
> pci/0000:05:01.0/0 eth netdev eth0 flavour vf
>
>
> > >mdev/uuidX/0 eth netdev flavour mdev_ctrl
> > Why "ctrl"?
> >
> I suffixed it with ctrl to indicate you that it is used for control functionality.
> Again, I described in previous email to Jakub' response in lot detail.
>
So we had an offline discussion. Jiri and Jakub prefer to program the host port's parameters via the 'peer' way.
This would require creating an unmanaged switch port for rdma.
We concluded to expose host side properties in the indirect way shown below, on the eswitch side port.
pci/0000:05:00.0/1 type eth netdev repndev_pf0_p1 flavour physical switch_id 00154d130d2f
pci/0000:05:00.0/2 type eth netdev repndev_pf0_vf_1 flavour eswitch switch_id 00154d130d2f vf 1 pf 0
pci/0000:05:00.0/4 type eth netdev repndev_pf0_sp_3 flavour eswitch switch_id 00154d130d2f mdev/uuidA/0
                                +---+      +---+
                              vf|   |      |   | mdev
                                +-+-+      +-+-+
physical link <---------+         |          |
                        |         |          |
                        |         |          |
                      +-+-+     +-+-+      +-+-+
                      | 1 |     | 2 |      | 3 |
                   +--+---+-----+---+------+---+--+
                   | physical     vf       pfsub  |
                   |  port       port      port   |
                   |                              |
                   |           eswitch            |
                   |                              |
                   +------------------------------+
Host port parameters such as ether.mac_addr, rdma.node_port_guid and more port internal parameters are to be programmed via peer mode.
These ports are created by the driver code and not by a user.
This is a very unusual way to program host params via the switch.
No solid example was provided to support the devlink model...
Anyway, I dislike it, but I agree to Jiri and Jakub's suggestion. :-)
Let's move forward this way. I will let the future speak for the design choices made...
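
To make the agreed listing a bit more concrete, here is a rough Python sketch that parses the three proposed lines above and reports which host side function each eswitch port stands for. This is only an illustration: the listing format is the proposal from this mail and not the output of any released devlink tool, and the names parse_port and host_function, as well as the rule that a bare trailing token (mdev/uuidA/0) names the host side function, are assumptions made for this example.

```python
# Illustrative only: parses the *proposed* listing from this mail,
# not the output of any released devlink tool.
PROPOSED = """\
pci/0000:05:00.0/1 type eth netdev repndev_pf0_p1 flavour physical switch_id 00154d130d2f
pci/0000:05:00.0/2 type eth netdev repndev_pf0_vf_1 flavour eswitch switch_id 00154d130d2f vf 1 pf 0
pci/0000:05:00.0/4 type eth netdev repndev_pf0_sp_3 flavour eswitch switch_id 00154d130d2f mdev/uuidA/0
"""

# Keywords that are followed by a value in the proposed listing.
KEYS = {"type", "netdev", "flavour", "switch_id", "vf", "pf"}


def parse_port(line):
    """Split one listing line into a dict (hypothetical helper)."""
    tokens = line.split()
    port = {"handle": tokens[0], "host": None}
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in KEYS and i + 1 < len(tokens):
            port[tok] = tokens[i + 1]
            i += 2
        else:
            # Assumption: a bare token such as mdev/uuidA/0 names the
            # host side function behind this eswitch port.
            port["host"] = tok
            i += 1
    return port


def host_function(port):
    """Describe the host side function an eswitch port stands for."""
    if port.get("flavour") != "eswitch":
        return None  # e.g. a physical port has no host side function here
    if "vf" in port:
        return "pf{} vf{}".format(port.get("pf", "?"), port["vf"])
    return port["host"]


ports = [parse_port(l) for l in PROPOSED.splitlines() if l.strip()]
for p in ports:
    print(p["handle"], p["flavour"], host_function(p))
```

The point of the sketch is only that, in this model, the host side function is derived from attributes carried on the eswitch side port instead of being a separate host port object.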
> > >
> > >> >
> > >> >
> > >> >> >Can you explain what is wrong in programming host port params
> > >> >> >using
> > >> >> host_port object?
> > >> >> >Few questions are unanswered in my past 2 or 3 emails.
> > >> >> >Can you please go through them?
> > >> >> >Can you point to some example switch API where you program host
> > >> >> >params
> > >> >> at switch?
> > >> >> >
> > >> >> >> >
> > >> >> >> >> 6. more host port params can be added in future when user
> > >> >> >> >> need arise
> > >> >> >> >>
> > >> >> >> >> 7. rep-netdev continue to be eswitch (switchport)
> > >> >> >> >> representor at the
> > >> >> >> switch side.
> > >> >> >> >> a. Hence rep-netdev cannot be used for programming host
> > >> >> >> >> port's
> > >> >> >> parameters.
> > >> >> >> >>
> > >> >> >> >> 8. eswitch devlink instance knows when VF/PF/mdev's
> > >> >> >> >> switchport are
> > >> >> >> created/removed.
> > >> >> >> >> Hence, those will be created/deleted by eswitch.
> > >> >> >> >> Similarly for host port flavours too.
> > >> >> >> >>
> > >> >> >> >> Does it look fine? Did I miss something?
> > >> >> >> >> We would like to progress on incremental patches for
> > >> >> >> >> item-4 and any prep work needed to reach to item-4.