Message-ID: <VI1PR0501MB2271CEB4C87036509F75D36AD1470@VI1PR0501MB2271.eurprd05.prod.outlook.com>
Date:   Mon, 18 Mar 2019 15:56:32 +0000
From:   Parav Pandit <parav@...lanox.com>
To:     Jiri Pirko <jiri@...nulli.us>
CC:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "oss-drivers@...ronome.com" <oss-drivers@...ronome.com>
Subject: RE: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
 ports



> -----Original Message-----
> From: Jiri Pirko <jiri@...nulli.us>
> Sent: Monday, March 18, 2019 7:21 AM
> To: Parav Pandit <parav@...lanox.com>
> Cc: Samudrala, Sridhar <sridhar.samudrala@...el.com>; Jakub Kicinski
> <jakub.kicinski@...ronome.com>; davem@...emloft.net;
> netdev@...r.kernel.org; oss-drivers@...ronome.com
> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
> ports
> 
> Fri, Mar 15, 2019 at 10:59:33PM CET, parav@...lanox.com wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jiri Pirko <jiri@...nulli.us>
> >> Sent: Friday, March 15, 2019 3:08 PM
> >> To: Parav Pandit <parav@...lanox.com>
> >> Cc: Samudrala, Sridhar <sridhar.samudrala@...el.com>; Jakub Kicinski
> >> <jakub.kicinski@...ronome.com>; davem@...emloft.net;
> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> devlink PCI ports
> >>
> >> Fri, Mar 15, 2019 at 04:32:24PM CET, parav@...lanox.com wrote:
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Samudrala, Sridhar <sridhar.samudrala@...el.com>
> >> >> Sent: Friday, March 15, 2019 12:58 AM
> >> >> To: Parav Pandit <parav@...lanox.com>; Jakub Kicinski
> >> >> <jakub.kicinski@...ronome.com>
> >> >> Cc: Jiri Pirko <jiri@...nulli.us>; davem@...emloft.net;
> >> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> >> devlink PCI ports
> >> >>
> >> >>
> >> >> On 3/14/2019 7:40 PM, Parav Pandit wrote:
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Samudrala, Sridhar <sridhar.samudrala@...el.com>
> >> >> >> Sent: Thursday, March 14, 2019 9:16 PM
> >> >> >> To: Parav Pandit <parav@...lanox.com>; Jakub Kicinski
> >> >> >> <jakub.kicinski@...ronome.com>
> >> >> >> Cc: Jiri Pirko <jiri@...nulli.us>; davem@...emloft.net;
> >> >> >> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> >> >> devlink PCI ports
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On 3/14/2019 6:28 PM, Parav Pandit wrote:
> >> >> >>>
> >> >> >>>
> >> >> >>>> -----Original Message-----
> >> >> >>>> From: Jakub Kicinski <jakub.kicinski@...ronome.com>
> >> >> >>>> Sent: Thursday, March 14, 2019 6:39 PM
> >> >> >>>> To: Parav Pandit <parav@...lanox.com>
> >> >> >>>> Cc: Jiri Pirko <jiri@...nulli.us>; davem@...emloft.net;
> >> >> >>>> netdev@...r.kernel.org; oss-drivers@...ronome.com
> >> >> >>>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports
> >> >> >>>> on devlink PCI ports
> >> >> >>>>
> >> >> >>>> On Thu, 14 Mar 2019 22:35:36 +0000, Parav Pandit wrote:
> >> >> >>>>>>> Then instances of flavour pci_vf are going to appear in
> >> >> >>>>>>> the same devlink instance. Those are the switch ports:
> >> >> >>>>>>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
> >> >> >>>>>>>                           flavour pci_vf pf 0 vf 0
> >> >> >>>>>>>                           switch_id 00154d130d2f peer
> >> >> >>>>>>> pci/0000:05:10.1/0
> >> >> >>>>>>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
> >> >> >>>>>>>                           flavour pci_vf pf 0 vf 0 subport 1
> >> >> >>>>>>>                           switch_id 00154d130d2f peer
> >> >> >>>>>>> pci/0000:05:10.1/1
> >> >> >>>>>>>
> >> >> >>>>>>> With that, peers are going to appear too, and those are
> >> >> >>>>>>> the actual VF/VF
> >> >> >>>>>>> subport:
> >> >> >>>>>>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
> >> >> >>>>>>>                       peer pci/0000:05:00.0/10002
> >> >> >>>>>>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
> >> >> >>>>>>>                       peer pci/0000:05:00.0/10003
> >> >> >>>>>>>
> >> >> >>>>>>> Later you can push this VF along with all subports to VM.
> >> >> >>>>>>> So in VM, you are going to see the VF like this:
> >> >> >>>>>>> $ devlink dev
> >> >> >>>>>>> pci/0000:00:08.0
> >> >> >>>>>>> $ devlink port
> >> >> >>>>>>> pci/0000:00:08.0/0: type eth netdev ??? flavour
> >> >> >>>>>>> pci_vf_host
> >> >> >>>>>>> pci/0000:00:08.0/1: type eth netdev ??? flavour
> >> >> >>>>>>> pci_vf_host
> >> >> >>>>>>>
> >> >> >>>>>>> And back to your question of how are they connected in
> eswitch.
> >> >> >>>>>>> That is totally up to the original user John who did the
> creation.
> >> >> >>>>>>> He is in charge of the eswitch on baremetal, he would
> >> >> >>>>>>> configure the forwarding however he likes.
> >> >> >>>>>>
> >> >> >>>>>> Ack, so I think you're saying VM has to communicate to the
> >> >> >>>>>> cloud environment to have this provisioned using some
> >> >> >>>>>> service API, not a kernel API.  That's what I wanted to confirm.
> >> >> >>>>>>
> >> >> >>>>>> I don't see any benefit to having the "host ports" under
> >> >> >>>>>> devlink, as such I think it's a matter of preference.
> >> >> >>>>>
> >> >> >>>>> We need 'host ports' to configure parameters of this host
> >> >> >>>>> port which is not exposed by the rep-netdev.
> >> >> >>>>> Such as mac address.
> >> >> >>>>
> >> >> >>>> Please look at the quote of what Jiri wrote above - the host
> >> >> >>>> port gets passed to the VM, you can't use it as a handle to
> >> >> >>>> set the
> >> MAC.
> >> >> >>>>
> >> >> >>>> The way to set the MAC remains:
> >> >> >>>>
> >> >> >>>> # devlink port set pci/0000:05:00.0/10002 peer mac_addr
> >> >> >>>> 00:11:22:33:44:55
> >> >> >>>>
> >> >> >>> Even though it can be done, I think this is wrong model to
> >> >> >>> program
> >> >> >> hostport mac address using eswitch port.
> >> >> >>> All devlink objects are control objects, so what is passed to
> >> >> >>> VM is what is
> >> >> >> represented by devlink.
> >> >> >>> VF in the VM will anyway create its devlink object.
> >> >> >>> What is wrong in programming hostport?
> >> >> >>> It gives a very clear view to users of topology and objects.
> >> >> >>
> >> >> >> The VF or any subport MAC address should be configured by the
> >> >> >> orchestration layer that is running on the hypervisor, and when
> >> >> >> a VF is assigned to a VM, the host port is not visible to the
> >> >> >> hypervisor.
> >> >> > What prevents  creation of hostport due to which is not visible?
> >> >> > Hostport is control port to program host side of parameters.
> >> >> > It should be created when user wants to program the parameters.
> >> >> >
> >> >> > Model is really straight forward.
> >> >> > Program host port params using hostport object.
> >> >> > Program switchport params using rep-netdev.
> >> >>
> >> >> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects for
> >> >> each port - host facing ports and switch facing ports. This is in
> >> >> addition to the netdevs that are created today.
> >> >>
> >> >I am not proposing any different.
> >> >I am proposing only two changes.
> >> >1. control hostport params via referring hostport (not via indirect
> >> >peer)
> >>
> >> Not really possible. If you passthrough VF into VM, the hostport goes
> >> along with it.
> >>
> >No.
> >I am sorry for showing an enumeration that became the source of confusion.
> >
> >Below is the right enumeration.
> >
> >When the VF is initially enumerated in the host, where the eswitch devlink
> >instance is located, the enumeration below is seen.
> >
> >The first two entries show the link between hostport and switchport.
> >$ devlink port show
> >pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id
> >00154d130d2f peer pci/0000:05:00.0/1
> >
> >pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f
> >peer pci/0000:05:00.0/10002
> 
> Hostport should not have switch_id.
> 
> >
> >pci/0000:05:10.1/0 eth netdev flavour hostport
> >This entry won't be seen if VF auto probing is disabled, because then the
> >VF is not enumerated.
> >
> >As a user, I will be programming the mac address of hostport for a VF.
> >pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f
> >peer pci/0000:05:00.0/10002
> 
> Hmm, so you are going to have 2 hostports for VF:
> 1) pci/0000:05:10.1/0
>    real one, that is going to go to VM - with a separate pci address
>    and devlink instance.

Yep. This is the one that Yonatan's port grouping APIs work on.

> 2) pci/0000:05:00.0/1
>    dummy one, which is not really a hostport, as there is no netdev
>    created for it. It only models the other side of cable, which is away
>    in VM.
> 
Right. This is the control object that the hypervisor typically programs.

> >
> >
> >>
> >> >2. flavour should not be vf/pf, flavour should be hostport, switchport.
> >> >Because switch is flat and agnostic of pf/vf/mdev.
> >>
> >> Not sure. It's good to have this kind of visibility.
> >>
> >A port can have a label/attribute indicating that it belongs to VF-1 or an
> >mdev, as long as you agree to have an mdev attribute on the host port
> >(and not ask for abstracting it, because mdev is a well-defined kernel object).
> 
> Why mdev cannot be another flavour?
> 

A hostport is of type pf/vf/mdev and is connected to some switchport.

So the proposal is to have:
port flavour = hostport/switchport
port type/label = pf/vf/mdev
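
To make the split concrete, here is a rough sketch of that model in Python. It is only an illustration, not devlink code: the class names, the describe() and peers_consistent() helpers and the "label" keyword in the output are all made up for this example. The switch_id is kept only on the switchport here, following Jiri's comment above; that is a choice made for the sketch, not something settled in this thread.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Flavour(Enum):
    # Proposed flavours: which side of the "cable" the port sits on.
    HOSTPORT = "hostport"
    SWITCHPORT = "switchport"


class Label(Enum):
    # Proposed type/label: what kind of function a hostport belongs to.
    PF = "pf"
    VF = "vf"
    MDEV = "mdev"


@dataclass
class Port:
    handle: str                      # e.g. "pci/0000:05:00.0/1"
    flavour: Flavour
    label: Optional[Label] = None    # hypothetical attribute, hostports only
    peer: Optional[str] = None       # handle of the port on the other side
    switch_id: Optional[str] = None  # assumption: set on switchports only


def describe(port: Port) -> str:
    """Render a port as one line; the format is invented for this sketch."""
    parts = [port.handle, "flavour", port.flavour.value]
    if port.label is not None:
        parts += ["label", port.label.value]
    if port.switch_id is not None:
        parts += ["switch_id", port.switch_id]
    if port.peer is not None:
        parts += ["peer", port.peer]
    return " ".join(parts)


def peers_consistent(ports: List[Port]) -> bool:
    """Check that every peer link points back and joins opposite flavours."""
    by_handle = {p.handle: p for p in ports}
    for p in ports:
        if p.peer is None:
            continue
        other = by_handle.get(p.peer)
        if other is None or other.peer != p.handle:
            return False
        if other.flavour == p.flavour:
            return False
    return True


# The two entries seen on the hypervisor once the VF is mapped to a VM.
switchport = Port("pci/0000:05:00.0/10002", Flavour.SWITCHPORT,
                  peer="pci/0000:05:00.0/1", switch_id="00154d130d2f")
hostport = Port("pci/0000:05:00.0/1", Flavour.HOSTPORT, label=Label.VF,
                peer="pci/0000:05:00.0/10002")

for port in (switchport, hostport):
    print(describe(port))
```

The point of the sketch is that the switchport carries no pf/vf/mdev information at all; only the hostport does, and the peer link is what ties the two together.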


> >
> >>
> >> >
> >> >> Are you suggesting that all the devlink objects should be visible
> >> >> only at the hypervisor layer?
> >> >>
> >> >Of course not.
> >> >
> >> >Ports and params controlled by hypervisor should be exposed at
> >> hypervisor/eswitch wherever its parent devlink instance exist.
> >> >Ports which should be visible inside a VM should be exposed inside a
> VM.
> >> >So for a given VF,
> >> >
> >> >If eswitch is at hypervisor level,
> >> >$ devlink port show
> >> >pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id
> >> >00154d130d2f peer pci/0000:05:10.1/0
> >> >pci/0000:05:10.1/0 eth netdev flavour hostport switch_id
> >> >00154d130d2f peer pci/0000:05:00.0/10002
> >> >
> >> >where VF is enumerated,
> >> >$ devlink port show
> >> >pci/0000:05:10.1/0 eth netdev flavour hostport
> >>
> >> So this is how it looks like in VM, right?
> >>
> >Yep.
> >Once VF is mapped to VM only two entries are seen and hostport can be
> still controlled.
> >
> >$ devlink port show
> >pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id
> >00154d130d2f peer pci/0000:05:00.0/1
> >
> >pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f
> >peer pci/0000:05:00.0/10002
> >
> >This addresses the case for Infiniband where there is no eswitch, but
> hostports exists and should be managed.
> >We shouldn't be inventing new devlink APIs or create a fake sw eswitch
> object which doesn't exist in hw.
> >
> >>
> >> >This is because unprivileged VF doesn't have visibility to eswitch
> >> >and its
> >> links.
> >> >
> >> >> I think the terminology need to be defined clearly so that we are
> >> >> all on the same page.
> >> >>
> >> >> >
> >> >> >> Currently we have ndo_set_vf_mac_addr api that works with PF
> >> >> >> netdev, but i think we are trying to move away from that API
> >> >> >> and do all the configuration via the port representor netdevs.
> >> >> > This is fine rep-netdev represents eswitch port.
> >> >> > You normally don't go to switch to program host port params.
> >> >> >
> >> >> >> As the mac address cannot be configured using this netdev, i
> >> >> >> think Jakub is suggesting creating a devlink object for each
> >> >> >> port representor and use that interface to set peer mac address.
> >> >> >
> >> >> > I understand, but it is a convoluted interface.
> >> >> > When you program the host NIC mac address, you talk to iLo or BIOS.
> >> >> > When you program the switch side mac address, you go to the
> >> >> > switch/router/modem.
> >> >> >
> >> >> > Also programming host params on host side, also doesn't make
> >> >> assumption that its connected to eswitch.
> >> >> > It also doesn't assume that same connectivity for its life.
> >> >> >
> >> >> > If you model around how physical devices are configured, it will
> >> >> > almost
> >> >> never go wrong and still provides same level of flexibility.
> >> >> >
> >> >> >> We should be able use this to configure port vlan too.
> >> >> >>
> >> >> >> Also, instead of subport, can we call vport and support
> >> >> >> different types of vports - sr-iov, siov, vmdq etc.
> >> >> >>
> >> >> > At switch level there are just ports.
> >> >> > sriov, siov, mdev, vmdq are their counterparts (peers) to which
> >> >> > they are connected.
> >> >> >
> >> >> >>>
> >> >> >>> Also eswitch is flat. There is no need of pf/vf flavour for port.
> >> >> >>> It doesn't make sense to define 'mdev' flavour which we are
> >> >> >>> already
> >> >> >> working.
> >> >> >>> At eswitch level it is just a port, it happen to be connected
> >> >> >>> to vf or pf or
> >> >> >> other objects, it doesn't matter.
> >> >> >>> Port should be flavoured as 'hostport' or 'switchport'.
> >> >> >>>
> >> >> >>>
> >> >> >>>> (using the port ids from above)
