Message-ID: <Zczyhya/+454IaQM@ubuntu>
Date: Wed, 14 Feb 2024 17:04:08 +0000
From: Jim Harris <jim.harris@...sung.com>
To: Leon Romanovsky <leonro@...dia.com>
CC: Bjorn Helgaas <helgaas@...nel.org>, Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>, Bjorn Helgaas
<bhelgaas@...gle.com>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, Jason Gunthorpe <jgg@...dia.com>, "Alex
Williamson" <alex.williamson@...hat.com>, Pierre Crégut
<pierre.cregut@...nge.com>
Subject: Re: [PATCH v2 1/2] PCI/IOV: Revert "PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes"

On Wed, Feb 14, 2024 at 09:16:18AM +0200, Leon Romanovsky wrote:
> On Tue, Feb 13, 2024 at 01:45:56PM -0600, Bjorn Helgaas wrote:
> > On Tue, Feb 13, 2024 at 07:46:02PM +0200, Leon Romanovsky wrote:
> > > On Tue, Feb 13, 2024 at 09:59:54AM -0600, Bjorn Helgaas wrote:
> > > ...
> >
> > > > I guess that means that if we apply this revert, the problem Pierre
> > > > reported will return. Obviously the deadlock is more important than
> > > > the inconsistency Pierre observed, but from the user's point of view
> > > > this will look like a regression.
> > > >
> > > > Maybe listening to netlink and then looking at sysfs isn't the
> > > > "correct" way to do this, but I don't want to just casually break
> > > > existing user code. If we do contemplate doing the revert, at the
> > > > very least we should include specific details about what the user code
> > > > *should* do instead, at the level of the actual commands to use
> > > > instead of "ip monitor dev; cat ${path}/device/sriov_numvfs".
> > >
> > > udevadm monitor will do the trick.
> > >
> > > Another possible solution is to refactor the code to make sure that
> > > .probe on VFs happens only after sriov_numvfs is updated.
> >
> > I like the idea of refactoring it so as to preserve the existing
> > ordering while also fixing the deadlock.
>
> I think something like this will be enough (not tested). It will set the
> number of VFs before we make VFs visible to probe:

I'll push a v3, replacing the second patch with this one instead. Although
based on this discussion it seems we're moving towards squashing the revert
with Leon's suggested patch. Bjorn, I'll assume you're still OK with just
squashing these on your end.

I would like some input on how to actually test this though. Presumably we see
some event on device PF and we want to make sure if we read PF/device/sriov_numvfs
that we see the updated value. But the only type of event I think we can
expect is the PF's sriov_numvfs CHANGE event.

Is there any way for VFs to be created outside of writing to the
sriov_numvfs sysfs file? My understanding is some older devices/drivers will
auto-create VFs when the PF is initialized, but it wasn't clear from the bug
report whether that was part of the configuration here. Pierre, do you have
any recollection on this?

Or maybe testing for this case just means compile and verify with udevadm
monitor that we see the CHANGE event before any of the VFs are actually
created...

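
As a rough illustration of the kind of check discussed above, a userspace test
could read raw kernel uevents, wait for the PF's change event, and only then
read sriov_numvfs. The sketch below covers only the message-matching step. It
assumes the usual raw uevent layout (an "action@devpath" header followed by
NUL-separated KEY=value strings); the helper name uevent_is_change_for() is
hypothetical and not part of any existing tool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if a raw kernel uevent message
 * (NUL-separated strings) carries ACTION=change for the given DEVPATH.
 */
static bool uevent_is_change_for(const char *msg, size_t len,
				 const char *devpath)
{
	bool is_change = false, path_match = false;
	size_t off = 0;

	while (off < len) {
		const char *field = msg + off;
		const char *end = memchr(field, '\0', len - off);

		if (!end)
			break;	/* unterminated trailing field: ignore it */
		if (strcmp(field, "ACTION=change") == 0)
			is_change = true;
		else if (strncmp(field, "DEVPATH=", 8) == 0 &&
			 strcmp(field + 8, devpath) == 0)
			path_match = true;
		off += (size_t)(end - field) + 1;
	}
	return is_change && path_match;
}
```

A test built on this would call the helper on each received message and, on a
match, read the PF's sriov_numvfs and compare it against the value written.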
>
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index aaa33e8dc4c9..0cdfaae80594 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -679,12 +679,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>  	msleep(100);
>  	pci_cfg_access_unlock(dev);
>
> +	iov->num_VFs = nr_virtfn;
>  	rc = sriov_add_vfs(dev, initial);
> -	if (rc)
> +	if (rc) {
> +		iov->num_VFs = 0;
>  		goto err_pcibios;
> +	}
>
>  	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
> -	iov->num_VFs = nr_virtfn;
>
>  	return 0;
>
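
The ordering in the quoted patch can be modeled in plain userspace C to see
what a VF probe would observe. This is only a toy sketch, not kernel code:
struct model_iov, model_add_vfs() and model_sriov_enable() are hypothetical
stand-ins, and the "probe" is reduced to recording the num_VFs value it sees.

```c
#include <stdbool.h>

/* Toy stand-in for struct pci_sriov; only the field relevant here. */
struct model_iov {
	int num_VFs;
	int seen_by_probe;	/* num_VFs as observed while VFs are added */
};

/* Stand-in for sriov_add_vfs(): VF probe runs here and may read num_VFs. */
static int model_add_vfs(struct model_iov *iov, bool fail)
{
	iov->seen_by_probe = iov->num_VFs;
	return fail ? -1 : 0;
}

/* Ordering from the patch: set num_VFs first, reset it on failure. */
static int model_sriov_enable(struct model_iov *iov, int nr_virtfn, bool fail)
{
	int rc;

	iov->num_VFs = nr_virtfn;
	rc = model_add_vfs(iov, fail);
	if (rc) {
		iov->num_VFs = 0;
		return rc;
	}
	/* The KOBJ_CHANGE uevent would be sent at this point. */
	return 0;
}
```

With this ordering the modeled probe sees the new count rather than the stale
one, and a failed add leaves num_VFs at 0.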