Date:   Tue, 12 Mar 2019 15:02:39 +0100
From:   Jiri Pirko <jiri@...nulli.us>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     davem@...emloft.net, netdev@...r.kernel.org,
        oss-drivers@...ronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
 ports

Tue, Mar 12, 2019 at 03:10:54AM CET, jakub.kicinski@...ronome.com wrote:
>On Mon, 11 Mar 2019 09:52:04 +0100, Jiri Pirko wrote:
>> Fri, Mar 08, 2019 at 08:09:43PM CET, jakub.kicinski@...ronome.com wrote:
>> >If the switchport is in the hypervisor then only the hypervisor can
>> >control switching/forwarding, correct?  
>> 
>> Correct.
>> 
>> >The primary use case for partitioning within a VM (of a VF) would be
>> >containers (and DPDK)?  
>> 
>> Makes sense.
>> 
>> >SR-IOV makes things harder.  Splitting a PF is reasonably easy to grasp.
>> >What I'm trying to get a sense of is how we would control an SR-IOV
>> >environment as a whole.  
>> 
>> You mean orchestration? 
>
>Right, orchestration.
>
>To be clear on where I'm going with this - if we want to allow VFs 
>to partition themselves then they have to control what is effectively 
>a "nested" switch.  A per-VF set of rules which would then get

Wait. If you allow creating VF subports (I believe that is what you meant
by VFs partitioning themselves), that does not mean they will have a
separate nested switch. They would still belong under the same one.


>"flattened" into the main eswitch rule set.  If I was to choose I'd
>really rather have this "flattening" be done on the (Linux) hypervisor
>and not in the vendor driver and firmware.

Agreed. Driver should provide one big switch. User should configure it.


>
>I'd much rather have the VM make a "give me another NIC" orchestration
>call via some high level REST API than devlink.  This makes the
>configuration strictly high level to low level:
>
>  VM -> cloud net REST API -> cloud agent -> devlink/Linux -> FW -> HW
>
>Without round trips via firmware.  

Okay. So the "devlink/Linux -> FW" part is going to happen on baremetal.
Makes sense.


>
>This allows for easy policy enforcement, common code to be maintained
>in user space, in high level languages (no 0.5M LoC drivers and 10M LoC
>firmware for every driver).  It can also be used with software paths
>like VirtIO..

Agreed.


>
>Modelling and debugging a nested switch would be a nightmare.  What
>follows is that we probably shouldn't deal with partitioning of VFs,
>but rather only partition via the PF devlink instance, and reassign 
>the partitions to VMs.

Agreed. That must be a misunderstanding; I never suggested nested
switches.


>
>> I originally planned to implement sriov orchestration api in devlink too.
>
>Interesting, would you mind elaborating?

I have to think about it. But something like this:

After bootup, you see only the physical port, the PF switch port and the PF host leg.
$ devlink port show
pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
pci/0000:05:00.0/1: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10000
pci/0000:05:00.0/10000: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 0
                    switch_id 00154d130d2f peer pci/0000:05:00.0/1
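
For anyone who wants to script against listings of this shape, here is a minimal Python sketch that turns them into per-port attribute dictionaries. It is only an illustration: the listing format is the proposal from this mail and not the output of any released devlink tool, and the name parse_port_show is hypothetical. It assumes every attribute is a "key value" pair and that an indented line continues the previous port.

```python
from typing import Dict, Optional


def parse_port_show(text: str) -> Dict[str, Dict[str, str]]:
    """Parse the proposed 'devlink port show' listing into {handle: attrs}.

    Assumption: attributes are whitespace-separated "key value" pairs and
    indented lines continue the port started on the previous handle line.
    """
    ports: Dict[str, Dict[str, str]] = {}
    current: Optional[Dict[str, str]] = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if raw[0].isspace():
            # Continuation line, e.g. "    peer pci/0000:05:00.0/10000".
            if current is None:
                continue
            tokens = raw.split()
        else:
            # Handle line, e.g. "pci/0000:05:00.0/1: type eth ...".
            handle, _, rest = raw.partition(": ")
            current = ports.setdefault(handle, {})
            tokens = rest.split()
        # Pair up tokens: (key, value), (key, value), ...
        current.update(zip(tokens[0::2], tokens[1::2]))
    return ports


listing = """\
pci/0000:05:00.0/1: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10000
pci/0000:05:00.0/10000: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 0
                    switch_id 00154d130d2f peer pci/0000:05:00.0/1
"""
ports = parse_port_show(listing)
host_leg = ports["pci/0000:05:00.0/1"]
# Following "peer" from the host leg leads to the eswitch-side port.
eswitch_port = ports[host_leg["peer"]]
```

The "peer" attribute is what ties a host leg to its eswitch-side port, so following it in both directions should always land back on the starting handle.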

To create a new PF subport under PF 0:
$ devlink dev port add pci/0000:05:00.0 flavour pci_pf pf 0
$ devlink port show
pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
pci/0000:05:00.0/1: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10000
pci/0000:05:00.0/10000: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 0
                    switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/2: type eth netdev ??? flavour pci_pf_host                            <<<<<<<<<<<<<<<<<<
                    peer pci/0000:05:00.0/10001                                        <<<<<<<<<<<<<<<<<<
pci/0000:05:00.0/10001: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 1  <<<<<<<<<<<<<<<<<<
                    switch_id 00154d130d2f peer pci/0000:05:00.0/2                     <<<<<<<<<<<<<<<<<<

To create a new VF under PF 0:
$ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0
$ devlink port show
pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
pci/0000:05:00.0/1: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10000
pci/0000:05:00.0/10000: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 0
                    switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/2: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10001
pci/0000:05:00.0/10001: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 1
                    switch_id 00154d130d2f peer pci/0000:05:00.0/2
pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host                            <<<<<<<<<<<<<<<<<<
                    peer pci/0000:05:00.0/10002                                        <<<<<<<<<<<<<<<<<<
pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0 flavour pci_vf pf 0 vf 0       <<<<<<<<<<<<<<<<<<
                    switch_id 00154d130d2f peer pci/0000:05:10.1/0                     <<<<<<<<<<<<<<<<<<

So a new VF is created.


To delete, the user would need to use the port which is in the eswitch:
$ devlink port del pci/0000:05:00.0/2
devlink answers: Operation not permitted
$ devlink port del pci/0000:05:00.0/10001     <<<<<<<<<< this

$ devlink port del pci/0000:05:10.1/0
devlink answers: Operation not permitted
$ devlink port del pci/0000:05:00.0/10002     <<<<<<<<<< this

This actually removes the VF.
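
The delete rule above can be modelled in a few lines of Python. This is a toy sketch, not devlink or kernel code: the Port and PortTable names are hypothetical, and the only behaviour taken from this mail is that deleting a host leg fails with "Operation not permitted" (EPERM) while deleting the eswitch-side port removes the pair. Returning ENODEV for an unknown handle is an assumption.

```python
import errno
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Port:
    handle: str                 # e.g. "pci/0000:05:00.0/10001"
    flavour: str                # e.g. "pci_pf", "pci_pf_host", "pci_vf", "pci_vf_host"
    peer: Optional[str] = None  # handle of the other end of the pair


class PortTable:
    """Toy model of the proposed delete semantics (hypothetical names)."""

    def __init__(self) -> None:
        self.ports: Dict[str, Port] = {}

    def add_pair(self, eswitch: Port, host: Port) -> None:
        # Link the eswitch-side port and its host leg to each other.
        eswitch.peer, host.peer = host.handle, eswitch.handle
        self.ports[eswitch.handle] = eswitch
        self.ports[host.handle] = host

    def delete(self, handle: str) -> int:
        port = self.ports.get(handle)
        if port is None:
            return -errno.ENODEV   # assumption: error code for unknown handle
        if port.flavour.endswith("_host"):
            return -errno.EPERM    # host legs cannot be deleted directly
        # Deleting the eswitch-side port takes its host leg with it.
        if port.peer is not None:
            self.ports.pop(port.peer, None)
        del self.ports[handle]
        return 0


table = PortTable()
table.add_pair(Port("pci/0000:05:00.0/10002", "pci_vf"),
               Port("pci/0000:05:10.1/0", "pci_vf_host"))
rc_host = table.delete("pci/0000:05:10.1/0")         # refused
rc_eswitch = table.delete("pci/0000:05:00.0/10002")  # removes both ends
```

Keeping deletion on the eswitch side means the entity that owns the switch, and not whoever holds the host leg, decides when a function goes away.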


For VF subports this would work too; we just need a "subport"
attribute not only for PFs but also for VFs:

To create a new VF subport under PF 0 and VF 0:
$ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0 vf 0
$ devlink port show
pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
pci/0000:05:00.0/1: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10000
pci/0000:05:00.0/10000: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 0
                    switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/2: type eth netdev ??? flavour pci_pf_host
                    peer pci/0000:05:00.0/10001
pci/0000:05:00.0/10001: type eth netdev enp5s0npf0pf0s0 flavour pci_pf pf 0 subport 1
                    switch_id 00154d130d2f peer pci/0000:05:00.0/2
pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
                    peer pci/0000:05:00.0/10002
pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0 flavour pci_vf pf 0 vf 0 subport 0
                    switch_id 00154d130d2f peer pci/0000:05:10.1/0
pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host                                  <<<<<<<<<<<<<<<<<<
                    peer pci/0000:05:00.0/10003                                              <<<<<<<<<<<<<<<<<<
pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0 flavour pci_vf pf 0 vf 0 subport 1   <<<<<<<<<<<<<<<<<<
                    switch_id 00154d130d2f peer pci/0000:05:10.1/1                           <<<<<<<<<<<<<<<<<<
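
To make the pf / vf / subport numbering in the listings above concrete, here is a small Python sketch of an allocator that reproduces it. It is only a model under stated assumptions: the SubportAllocator name is hypothetical, indices are assumed to be handed out sequentially starting at 0, and naming an existing VF is assumed to be what selects "add a subport" rather than "add a VF". None of this is confirmed devlink behaviour.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


class SubportAllocator:
    """Toy allocator for the proposed pf / vf / subport attributes."""

    def __init__(self) -> None:
        self.pf_subports: Dict[int, int] = defaultdict(int)              # pf -> subports handed out
        self.vf_count: Dict[int, int] = defaultdict(int)                 # pf -> VFs handed out
        self.vf_subports: Dict[Tuple[int, int], int] = defaultdict(int)  # (pf, vf) -> subports

    def add(self, flavour: str, pf: int, vf: Optional[int] = None) -> Dict[str, object]:
        if flavour == "pci_pf":
            # New subport directly under the PF.
            subport = self.pf_subports[pf]
            self.pf_subports[pf] += 1
            return {"flavour": flavour, "pf": pf, "subport": subport}
        if flavour == "pci_vf":
            if vf is None:
                # No VF given: create a new VF; it starts with subport 0.
                vf = self.vf_count[pf]
                self.vf_count[pf] += 1
            elif not 0 <= vf < self.vf_count[pf]:
                raise ValueError(f"pf {pf} has no vf {vf}")
            # New subport under the (new or existing) VF.
            subport = self.vf_subports[(pf, vf)]
            self.vf_subports[(pf, vf)] += 1
            return {"flavour": flavour, "pf": pf, "vf": vf, "subport": subport}
        raise ValueError(f"unsupported flavour: {flavour}")


alloc = SubportAllocator()
boot_pf = alloc.add("pci_pf", 0)            # the PF port present after bootup
new_pf_subport = alloc.add("pci_pf", 0)     # "flavour pci_pf pf 0"
new_vf = alloc.add("pci_vf", 0)             # "flavour pci_vf pf 0"
new_vf_subport = alloc.add("pci_vf", 0, 0)  # "flavour pci_vf pf 0 vf 0"
```

All four ports still hang off the same switch_id in the listings; the extra indices only name the function and subport behind each eswitch port, which is why no nested switch is needed.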

