Message-ID: <VI1PR0501MB227109474D520BDCE212A5B3D1440@VI1PR0501MB2271.eurprd05.prod.outlook.com>
Date:   Fri, 15 Mar 2019 21:59:33 +0000
From:   Parav Pandit <parav@...lanox.com>
To:     Jiri Pirko <jiri@...nulli.us>
CC:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: RE: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
 ports



> -----Original Message-----
> From: Jiri Pirko <jiri@...nulli.us>
> Sent: Friday, March 15, 2019 3:08 PM
> To: Parav Pandit <parav@...lanox.com>
> Cc: Samudrala, Sridhar <sridhar.samudrala@...el.com>; Jakub Kicinski
> <jakub.kicinski@...ronome.com>; davem@...emloft.net;
> netdev@...r.kernel.org; oss-drivers@...ronome.com
> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
> ports
> 
> Fri, Mar 15, 2019 at 04:32:24PM CET, parav@...lanox.com wrote:
> >
> >
> >> -----Original Message-----
> >> From: Samudrala, Sridhar <sridhar.samudrala@...el.com>
> >> Sent: Friday, March 15, 2019 12:58 AM
> >> To: Parav Pandit <parav@...lanox.com>; Jakub Kicinski
> >> <jakub.kicinski@...ronome.com>
> >> Cc: Jiri Pirko <jiri@...nulli.us>; davem@...emloft.net;
> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> devlink PCI ports
> >>
> >>
> >> On 3/14/2019 7:40 PM, Parav Pandit wrote:
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Samudrala, Sridhar <sridhar.samudrala@...el.com>
> >> >> Sent: Thursday, March 14, 2019 9:16 PM
> >> >> To: Parav Pandit <parav@...lanox.com>; Jakub Kicinski
> >> >> <jakub.kicinski@...ronome.com>
> >> >> Cc: Jiri Pirko <jiri@...nulli.us>; davem@...emloft.net;
> >> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> >> devlink PCI ports
> >> >>
> >> >>
> >> >>
> >> >> On 3/14/2019 6:28 PM, Parav Pandit wrote:
> >> >>>
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: Jakub Kicinski <jakub.kicinski@...ronome.com>
> >> >>>> Sent: Thursday, March 14, 2019 6:39 PM
> >> >>>> To: Parav Pandit <parav@...lanox.com>
> >> >>>> Cc: Jiri Pirko <jiri@...nulli.us>; davem@...emloft.net;
> >> >>>> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> >>>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> >>>> devlink PCI ports
> >> >>>>
> >> >>>> On Thu, 14 Mar 2019 22:35:36 +0000, Parav Pandit wrote:
> >> >>>>>>> Then instances of flavour pci_vf are going to appear in the
> >> >>>>>>> same devlink instance. Those are the switch ports:
> >> >>>>>>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
> >> >>>>>>>                           flavour pci_vf pf 0 vf 0
> >> >>>>>>>                           switch_id 00154d130d2f peer
> >> >>>>>>> pci/0000:05:10.1/0
> >> >>>>>>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
> >> >>>>>>>                           flavour pci_vf pf 0 vf 0 subport 1
> >> >>>>>>>                           switch_id 00154d130d2f peer
> >> >>>>>>> pci/0000:05:10.1/1
> >> >>>>>>>
> >> >>>>>>> With that, peers are going to appear too, and those are the
> >> >>>>>>> actual VF/VF
> >> >>>>>>> subport:
> >> >>>>>>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>>                       peer pci/0000:05:00.0/10002
> >> >>>>>>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>>                       peer pci/0000:05:00.0/10003
> >> >>>>>>>
> >> >>>>>>> Later you can push this VF along with all subports to VM. So
> >> >>>>>>> in VM, you are going to see the VF like this:
> >> >>>>>>> $ devlink dev
> >> >>>>>>> pci/0000:00:08.0
> >> >>>>>>> $ devlink port
> >> >>>>>>> pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>> pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>>
> >> >>>>>>> And back to your question of how are they connected in eswitch.
> >> >>>>>>> That is totally up to the original user John who did the creation.
> >> >>>>>>> He is in charge of the eswitch on baremetal, he would
> >> >>>>>>> configure the forwarding however he likes.
> >> >>>>>>
> >> >>>>>> Ack, so I think you're saying VM has to communicate to the
> >> >>>>>> cloud environment to have this provisioned using some service
> >> >>>>>> API, not a kernel API.  That's what I wanted to confirm.
> >> >>>>>>
> >> >>>>>> I don't see any benefit to having the "host ports" under
> >> >>>>>> devlink, as such I think it's a matter of preference.
> >> >>>>>
> >> >>>>> We need 'host ports' to configure parameters of this host port
> >> >>>>> which is not exposed by the rep-netdev.
> >> >>>>> Such as mac address.
> >> >>>>
> >> >>>> Please look at the quote of what Jiri wrote above - the host
> >> >>>> port gets passed to the VM, you can't use it as a handle to set the
> MAC.
> >> >>>>
> >> >>>> The way to set the MAC remains:
> >> >>>>
> >> >>>> # devlink port set pci/0000:05:00.0/10002 peer mac_addr
> >> >>>> 00:11:22:33:44:55
> >> >>>>
> >> >>> Even though it can be done, I think this is wrong model to
> >> >>> program
> >> >> hostport mac address using eswitch port.
> >> >>> All devlink objects are control objects, so what is passed to VM
> >> >>> is what is
> >> >> represented by devlink.
> >> >>> VF in the VM will anyway create its devlink object.
> >> >>> What is wrong in programming hostport?
> >> >>> It gives a very clear view to users of topology and objects.
> >> >>
> >> >> The VF or any subport MAC address should be configured by the
> >> >> orchestration layer that is running on the hypervisor and when a
> >> >> VF is assigned to a VF, the host port is not visible to the hypervisor.
> >> > What prevents  creation of hostport due to which is not visible?
> >> > Hostport is control port to program host side of parameters.
> >> > It should be created when user wants to program the parameters.
> >> >
> >> > Model is really straight forward.
> >> > Program host port params using hostport object.
> >> > Program switchport params using rep-netdev.
> >>
> >> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects for each
> >> port - host facing ports and switch facing ports. This is in addition
> >> to the netdevs that are created today.
> >>
> >I am not proposing any different.
> >I am proposing only two changes.
> >1. control hostport params via referring hostport (not via indirect
> >peer)
> 
> Not really possible. If you passthrough VF into VM, the hostport goes along
> with it.
> 
No.
I am sorry, the enumeration I showed earlier is the source of the confusion.

Below is the right enumeration.

When the VF is initially enumerated in the host, where the eswitch devlink instance is located, the following enumeration is seen.

The first two entries show the link between the hostport and the switchport.
$ devlink port show
pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1

pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002

pci/0000:05:10.1/0 eth netdev flavour hostport
This last entry won't be seen if VF auto-probing is disabled, because then the VF is not enumerated.

As a user, I will program the MAC address of the VF through its hostport:
pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
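
A minimal sketch of the hostport/switchport pairing in the listing above, assuming the proposed (not existing) output format with the 'flavour', 'switch_id' and 'peer' keys. The names Port, parse_port and hostport_for are hypothetical and only illustrate the model; they are not devlink APIs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Port:
    handle: str                      # e.g. "pci/0000:05:00.0/10002"
    flavour: str                     # "hostport" or "switchport" in the proposed model
    switch_id: Optional[str] = None  # absent where no eswitch is visible
    peer: Optional[str] = None       # handle of the linked port, if any


def parse_port(line: str) -> Port:
    """Parse one line of the proposed listing format (an assumption)."""
    tokens = line.split()
    attrs = {}
    # Pick up the key/value pairs used in the listing.
    for i, tok in enumerate(tokens[:-1]):
        if tok in ("flavour", "switch_id", "peer"):
            attrs[tok] = tokens[i + 1]
    return Port(tokens[0], attrs["flavour"], attrs.get("switch_id"), attrs.get("peer"))


LISTING = """\
pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
pci/0000:05:10.1/0 eth netdev flavour hostport
"""

ports: Dict[str, Port] = {p.handle: p for p in map(parse_port, LISTING.splitlines())}


def hostport_for(switchport_handle: str) -> Optional[Port]:
    """Return the hostport linked to a switchport, or None if there is no link."""
    sp = ports[switchport_handle]
    if sp.flavour != "switchport" or sp.peer is None:
        return None
    return ports.get(sp.peer)
```

In this model the hostport is addressed by its own handle, and the peer link is only needed to find which switchport it is attached to. A hostport with no peer and no switch_id is still a valid object.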


> 
> >2. flavour should not be vf/pf, flavour should be hostport, switchport.
> >Because switch is flat and agnostic of pf/vf/mdev.
> 
> Not sure. It's good to have this kind of visibility.
> 
A port can have a label/attribute indicating that it belongs to VF-1 or to an mdev, as long as you agree to have an mdev attribute on the host port
(and do not ask for it to be abstracted, because mdev is a well-defined kernel object).

> 
> >
> >> Are you suggesting that all the devlink objects should be visible
> >> only at the hypervisor layer?
> >>
> >Of course not.
> >
> >Ports and params controlled by hypervisor should be exposed at
> hypervisor/eswitch wherever its parent devlink instance exist.
> >Ports which should be visible inside a VM should be exposed inside a VM.
> >So for a given VF,
> >
> >If eswitch is at hypervisor level,
> >$ devlink port show
> >pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id
> >00154d130d2f peer pci/0000:05:10.1/0
> >pci/0000:05:10.1/0 eth netdev flavour hostport switch_id 00154d130d2f
> >peer pci/0000:05:00.0/10002
> >
> >where VF is enumerated,
> >$ devlink port show
> >pci/0000:05:10.1/0 eth netdev flavour hostport
> 
> So this is how it looks like in VM, right?
> 
Yep.
Once the VF is mapped to a VM, only two entries are seen, and the hostport can still be controlled.

$ devlink port show
pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1

pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002

This also addresses the InfiniBand case, where there is no eswitch but hostports exist and should be managed.
We shouldn't invent new devlink APIs or create a fake software eswitch object that doesn't exist in hardware.

> 
> >This is because unprivileged VF doesn't have visibility to eswitch and its
> links.
> >
> >> I think the terminology need to be defined clearly so that we are all
> >> on the same page.
> >>
> >> >
> >> >> Currently we have ndo_set_vf_mac_addr api that works with PF
> >> >> netdev, but i think we are trying to move away from that API and
> >> >> do all the configuration via the port representor netdevs.
> >> > This is fine rep-netdev represents eswitch port.
> >> > You normally don't go to switch to program host port params.
> >> >
> >> >> As the mac address cannot be configured using this netdev, i think
> >> >> Jakub is suggesting creating a devlink opject for each port
> >> >> representor and use that interface to set peer mac address.
> >> >
> >> > I understand but is convoluted interface.
> >> > When you program host NIC mac address you talk to iLo or BIOS.
> >> > When you program switch side mac address, you go
> switch/router/modem.
> >> >
> >> > Also programming host params on host side, also doesn't make
> >> assumption that its connected to eswitch.
> >> > It also doesn't assume that same connectivity for its life.
> >> >
> >> > If you model around how physical devices are configured, it will
> >> > almost
> >> never go wrong and still provides same level of flexibility.
> >> >
> >> >> We should be able use this to configure port vlan too.
> >> >>
> >> >> Also, instead of subport, can we call vport and support different
> >> >> types of vports - sr-iov, siov, vmdq etc.
> >> >>
> >> > At switch level there are just ports.
> >> > sriov, siov, mdev, vmdq are their couter part (peer) where it is
> connected.
> >> >
> >> >>>
> >> >>> Also eswitch is flat. There is no need of pf/vf flavour for port.
> >> >>> It doesn't make sense to define 'mdev' flavour which we are
> >> >>> already
> >> >> working.
> >> >>> At eswitch level it is just a port, it happen to be connected to
> >> >>> vf or pf or
> >> >> other objects, it doesn't matter.
> >> >>> Port should be flavoured as 'hostport' or 'switchport'.
> >> >>>
> >> >>>
> >> >>>> (using the port ids from above)
