lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 12 Jan 2021 08:15:35 +0200
From:   Leon Romanovsky <leon@...nel.org>
To:     Don Dutile <ddutile@...hat.com>,
        Alexander Duyck <alexander.duyck@...il.com>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>,
        Saeed Mahameed <saeedm@...dia.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Jakub Kicinski <kuba@...nel.org>,
        linux-pci <linux-pci@...r.kernel.org>,
        linux-rdma@...r.kernel.org, Netdev <netdev@...r.kernel.org>,
        Alex Williamson <alex.williamson@...hat.com>
Subject: Re: [PATCH mlx5-next v1 1/5] PCI: Add sysfs callback to allow MSI-X
 table size change of SR-IOV VFs

On Mon, Jan 11, 2021 at 10:25:42PM -0500, Don Dutile wrote:
> On 1/11/21 2:30 PM, Alexander Duyck wrote:
> > On Sun, Jan 10, 2021 at 7:12 AM Leon Romanovsky <leon@...nel.org> wrote:
> > > From: Leon Romanovsky <leonro@...dia.com>
> > >
> > > Extend PCI sysfs interface with a new callback that allows configure
> > > the number of MSI-X vectors for specific SR-IO VF. This is needed
> > > to optimize the performance of newly bound devices by allocating
> > > the number of vectors based on the administrator knowledge of targeted VM.
> > >
> > > This function is applicable for SR-IOV VF because such devices allocate
> > > their MSI-X table before they will run on the VMs and HW can't guess the
> > > right number of vectors, so the HW allocates them statically and equally.
> > >
> > > The newly added /sys/bus/pci/devices/.../vf_msix_vec file will be seen
> > > for the VFs and it is writable as long as a driver is not bounded to the VF.
> > >
> > > The values accepted are:
> > >   * > 0 - this will be number reported by the VF's MSI-X capability
> > >   * < 0 - not valid
> > >   * = 0 - will reset to the device default value
> > >
> > > Signed-off-by: Leon Romanovsky <leonro@...dia.com>
> > > ---
> > >   Documentation/ABI/testing/sysfs-bus-pci | 20 ++++++++
> > >   drivers/pci/iov.c                       | 62 +++++++++++++++++++++++++
> > >   drivers/pci/msi.c                       | 29 ++++++++++++
> > >   drivers/pci/pci-sysfs.c                 |  1 +
> > >   drivers/pci/pci.h                       |  2 +
> > >   include/linux/pci.h                     |  8 +++-
> > >   6 files changed, 121 insertions(+), 1 deletion(-)

<...>

> > > +
> > This doesn't make sense to me. You are getting the vector count for
> > the PCI device and reporting that. Are you expecting to call this on
> > the PF or the VFs? It seems like this should be a PF attribute and not
> > be called on the individual VFs.
> >
> > If you are calling this on the VFs then it doesn't really make any
> > sense anyway since the VF is not a "VF PCI dev representor" and
> > shouldn't be treated as such. In my opinion if we are going to be
> > doing per-port resource limiting that is something that might make
> > more sense as a part of the devlink configuration for the VF since the
> > actual change won't be visible to an assigned device.
> if the op were just limited to nic ports, devlink may be used; but I believe Leon is trying to handle it from an sriov/vf perspective for other non-nic devices as well,
> e.g., ib ports, nvme vf's (which don't have a port concept at all).

Right, the SR-IOV VFs are common entities outside of nic/devlink world.
In addition to the netdev world, SR-IOV is used for crypto, storage, FPGA
and IB devices.

>From what I see, There are three possible ways to configure MSI-X vector count:
1. PCI device is on the same CPU - regular server/desktop as we know it.
2. PCI device is on remote CPU - SmartNIC use case, the CPUs are
connected through eswitch model.
3. Some direct interface - DEVX for the mlx5_ib.

This implementation handles item #1 for all devices without any exception.
>From my point of view, the majority of interested users of this feature
will use this option.

The second item should be solved differently (with devlink) when configuring
eswitch port, but it is orthogonal to the item #1 and I will do it after.

The third item is not managed by the kernel, so not relevant for our discussion.

Thanks

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ