[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UevrCLSQp=dNiHXWFu=10OiPb5PPgP1ZkPN1uKHfD=zBQ@mail.gmail.com>
Date: Wed, 10 Mar 2021 15:34:01 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Leon Romanovsky <leon@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leonro@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>,
Jakub Kicinski <kuba@...nel.org>,
linux-pci <linux-pci@...r.kernel.org>,
linux-rdma@...r.kernel.org, Netdev <netdev@...r.kernel.org>,
Don Dutile <ddutile@...hat.com>,
Alex Williamson <alex.williamson@...hat.com>,
"David S . Miller" <davem@...emloft.net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count
On Wed, Mar 10, 2021 at 11:09 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
>
> On Sun, Mar 07, 2021 at 10:55:24AM -0800, Alexander Duyck wrote:
> > On Sun, Feb 28, 2021 at 11:55 PM Leon Romanovsky <leon@...nel.org> wrote:
> > > From: Leon Romanovsky <leonro@...dia.com>
> > >
> > > @Alexander Duyck, please update me if I can add your ROB tag again
> > > to the series, because you liked v6 more.
> > >
> > > Thanks
> > >
> > > ---------------------------------------------------------------------------------
> > > Changelog
> > > v7:
> > > * Rebase on top v5.12-rc1
> > > * More english fixes
> > > * Returned to static sysfs creation model as was implemented in v0/v1.
> >
> > Yeah, so I am not a fan of the series. The problem is there is only
> > one driver that supports this, all VFs are going to expose this sysfs,
> > and I don't know how likely it is that any others are going to
> > implement this functionality. I feel like you threw out all the
> > progress from v2-v6.
>
> pci_enable_vfs_overlay() turned up in v4, so I think v0-v3 had static
> sysfs files regardless of whether the PF driver was bound.
>
> > I really feel like the big issue is that this model is broken as you
> > have the VFs exposing sysfs interfaces that make use of the PFs to
> > actually implement. Greg's complaint was the PF pushing sysfs onto the
> > VFs. My complaint is VFs sysfs files operating on the PF. The trick is
> > to find a way to address both issues.
> >
> > Maybe the compromise is to reach down into the IOV code and have it
> > register the sysfs interface at device creation time in something like
> > pci_iov_sysfs_link if the PF has the functionality present to support
> > it.
>
> IIUC there are two questions on the table:
>
> 1) Should the sysfs files be visible only when a PF driver that
> supports MSI-X vector assignment is bound?
>
> I think this is a cosmetic issue. The presence of the file is
> not a reliable signal to management software; it must always
> tolerate files that don't exist (e.g., on old kernels) or files
> that are visible but don't work (e.g., vectors may be exhausted).
>
> If we start with the files always being visible, we should be
> able to add smarts later to expose them only when the PF driver
> is bound.
>
> My concerns with pci_enable_vf_overlay() are that it uses a
> little more sysfs internals than I'd like (although there are
> many callers of sysfs_create_files()) and it uses
> pci_get_domain_bus_and_slot(), which is generally a hack and
> creates refcounting hassles. Speaking of which, isn't v6 missing
> a pci_dev_put() to match the pci_get_domain_bus_and_slot()?
I'm not so much worried about management software as the fact that
this is a vendor specific implementation detail that is shaping how
the kernel interfaces are meant to work. Other than the mlx5 I don't
know if there are any other vendors really onboard with this sort of
solution.
In addition it still feels rather hacky to be modifying read-only PCIe
configuration space on the fly via a backdoor provided by the PF. It
almost feels like this should be some sort of quirk rather than a
standard feature for an SR-IOV VF.
> 2) Should a VF sysfs file use the PF to implement this?
>
> Can you elaborate on your idea here? I guess
> pci_iov_sysfs_link() makes a "virtfnX" link from the PF to the
> VF, and you're thinking we could also make a "virtfnX_msix_count"
> in the PF directory? That's a really interesting idea.
I would honestly be more comfortable if the PF owned these files
instead of the VFs. One of the things I didn't like about this back
during the V1/2 days was the fact that it gave the impression that
MSI-X count was something that is meant to be edited. Since then I
think at least the naming was changed so that it implies that this is
only possible due to SR-IOV.
I also didn't like that it makes the VFs feel like they are port
representors rather than being actual PCIe devices. Having
functionality that only works when the VF driver is not loaded just
feels off. The VF sysfs directory feels like it is being used as a
subdirectory of the PF rather than being a device on its own.
> > Also we might want to double check that the PF cannot be unbound while
> > the VF is present. I know for a while there it was possible to remove
> > the PF driver while the VF was present. The Mellanox drivers may not
> > allow it but it might not hurt to look at taking a reference against
> > the PF driver if you are allocating the VF MSI-X configuration sysfs
> > file.
>
> Unbinding the PF driver will either remove the *_msix_count files or
> make them stop working. Is that a problem? I'm not sure we should
> add a magic link that prevents driver unbinding. Seems like it would
> be hard for users to figure out why the driver can't be removed.
I checked it again, it will make the *_msix_count files stop working.
In order to guarantee it cannot happen in the middle of things though
we are sitting on the device locks for both the PF and the VF. I'm not
a fan of having to hold 2 locks while we perform a firmware operation
for one device, but I couldn't find anything where we would deadlock
so it should be fine.
Powered by blists - more mailing lists