lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191112141954.371e8cd2@cakuba>
Date:   Tue, 12 Nov 2019 14:19:54 -0800
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Yuval Avnery <yuvalav@...lanox.com>
Cc:     Andy Gospodarek <andrew.gospodarek@...adcom.com>,
        Jiri Pirko <jiri@...lanox.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Saeed Mahameed <saeedm@...lanox.com>,
        "leon@...nel.org" <leon@...nel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "shuah@...nel.org" <shuah@...nel.org>,
        Daniel Jurgens <danielj@...lanox.com>,
        Parav Pandit <parav@...lanox.com>,
        "michael.chan@...adcom.com" <michael.chan@...adcom.com>
Subject: Re: [PATCH net-next v2 00/10] devlink subdev

On Tue, 12 Nov 2019 18:35:26 +0000, Yuval Avnery wrote:
> > It feels a bit 'unconstrained' to me as well.  As Jakub said you added some
> > documentation, but the direction of this long-term is not clear.
> > What seems to happen too often is that we skip creating better infra for
> > existing devices and create it only for the newest shiniest object.
> > These changes are what garners the most attention from those who grant
> > permission to push things upstream (*cough* managers *cough*), but it's
> > not the right choice in this case.  I'm not sure if that is part of what bothers
> > Jakub, but it is one thing that bothers me and why this feels incomplete.
> > 
> > The thing that has been bouncing around in my head about this (and until I
> > was in front of a good text-based MUA I didn't dare respond) is that we seem
> > to have an overlap in functionality between what you are proposing and
> > existing virtual device configuration, but you are completely ignoring
> > improving upon the existing creation methods.
> > Whether dealing with a SmartNIC (which I used to describe a NIC with
> > general purpose processors that could be running Linux and I think you do,
> > too) or a regular NIC it seems like we should use devlink to create and
> > configure a variety of devices these could be:
> > 
> > 1.  Number of PFs (there are cases where more than 1 PF per uplink port
> >     is required for command/control of the SmartNIC or where a single
> >     PCI b/d/f may have many uplink ports that need to be addressed
> >     separately)  
> 
> Yes, uplink ports can be addressed using "devlink port".
> 
> Multiple pfs are already supported.
> 
> $ devlink subdev show
> pci/0000:03:00.0/1: flavour pcipf pf 0
> pci/0000:03:00.0/1: flavour pcipf pf 1
> pci/0000:03:00.0/1: flavour pcivf pf 1 vf 0

I don't think that covers what Andy was taking about.

> > 2.  Device-specific SR-IOV VFs visible to server 3.  mdev devices that _may_
> > have representers on the embedded cores of
> >     the NIC  
> 
> Yes, Is also planned for current interface. (will need to add mdev flavor in the future)
> 
> > 4.  Hardware VirtIO-net devices supported 5.  Other non-network devices
> > (storage given as the first example) 6.  ...  
> 
> All is up to driver - what to expose and what flavor to grant the subdev.
> 
> > We cannot get rid of the methods for creating hardware VFs via sysfs, but
> > now that we are seeing lots of different flavors of devices that might be
> > created we should start by making sure at a minimum we can create existing
> > hardware devices (VFs) with this devlink interface and move from there.  Is
> > there a plan to use this interface to create subdevs that are VFs or just new
> > subdev flavors?  I could start to get behind an interface like this if the patches
> > showed that devlink would be the one place where device creation and
> > control could be done for all types of subdevs, but that isn't clear that it is
> > your direction based on the patches.  
> 
> I don’t see how the SmartNic can force the host PF to create new VFs.
> Isn’t that the host PF responsibility? Doesn't make sense to me.
> Do we want to have an interface that works only for NICs and not from
> the SmartNic embedded system?
>
> What we do here is expose all the PFS and potential VFs/mdevs that the host can enable.
> Unlike ip link which exposes only enabled VFs, here (in mlx5
> implementation) we expose all the VFs that the host PF _can_ enable.
> This allows, for example, to set the MAC of a VF before it is enabled.

So subdev will never be used to create interfaces? Just expose all
_possible_? Ugh. Wouldn't that be something to document?

How things are configured matters. This patch set throws some basics
together to get you the hw_addr from the SmartNIC side, and then you
start talking big plans like NVMe and PCIe control which you don't seem
to have thought through.

How does the PF in your diagram magically have different types of VFs?
PCIe SR-IOV cap has VF device vendor:product, you won't have NVMe VFs
hanging off the main PF like that in a normal world :/

Jiri pitched the subdev object as the "other side of the wire" but you
are back to arguing that you think it's actually the same as the port,
just more general. How does the hw_addr belongs to the ASIC side then?

And for PCIe you'd use this new interface for MSI-X or other resource
allocation. Say VFs. Who controls that in a SmartNIC scenario? Is the
host not allowed to move them around? This starts to touch multi-host
use case which Linux networking never begun to address.

You're creating a new object type with no clear semantics just to throw
in one feature - namely the hw_addr. If this is the quality of design
we can expect - just add the hw addr to devlink ports and let's move on.

> > So just to make sure this is clear, what I'm proposing that devlink
> > is used to create and configure all types of devices that it may be
> > possible to create,  
> 
> More flavors can be defiantly added in the future. About creation,
> see my comment above. Actually flavor is the only required attribute
> for registration. The rest is optional. 
> > configure, or address from a PF and the starting place should be
> > standard, hardware VFs.  If that can be done right and we can
> > address some of the issues with the current implementation (more
> > than one hw addr is a _great_ example) then adding new flavours for
> > the server devices and and adding new flavors for SmartNIC-based
> > devices should be easy and feel natural.  
> 
> So what you are saying that allowing multiple addresses per subdev
> will make a strong case for this patch set?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ