lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5c76f6cfb535828f6586a67bd3409981663d14d8.camel@linux.ibm.com>
Date: Thu, 14 Aug 2025 15:12:35 +0200
From: Niklas Schnelle <schnelle@...ux.ibm.com>
To: Alex Williamson <alex.williamson@...hat.com>,
        Farhan Ali
	 <alifm@...ux.ibm.com>, Bjorn Helgaas <helgaas@...nel.org>
Cc: linux-s390@...r.kernel.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, mjrosato@...ux.ibm.com
Subject: Re: [PATCH v1 5/6] vfio-pci/zdev: Perform platform specific
 function reset for zPCI

On Wed, 2025-08-13 at 16:56 -0600, Alex Williamson wrote:
> On Wed, 13 Aug 2025 14:52:24 -0700
> Farhan Ali <alifm@...ux.ibm.com> wrote:
> 
> > On 8/13/2025 1:30 PM, Alex Williamson wrote:
> > > On Wed, 13 Aug 2025 10:08:19 -0700
> > > Farhan Ali <alifm@...ux.ibm.com> wrote:
> > >  
> > > > For zPCI devices we should drive a platform specific function reset
> > > > as part of VFIO_DEVICE_RESET. This reset is needed recover a zPCI device
> > > > in error state.
> > > > 
> > > > Signed-off-by: Farhan Ali <alifm@...ux.ibm.com>
> > > > ---
> > > >   arch/s390/pci/pci.c              |  1 +
> > > >   drivers/vfio/pci/vfio_pci_core.c |  4 ++++
> > > >   drivers/vfio/pci/vfio_pci_priv.h |  5 ++++
> > > >   drivers/vfio/pci/vfio_pci_zdev.c | 39 ++++++++++++++++++++++++++++++++
> > > >   4 files changed, 49 insertions(+)
--- snip ---
> > > >   
> > > > +int vfio_pci_zdev_reset(struct vfio_pci_core_device *vdev)
> > > > +{
> > > > +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> > > > +	int rc = -EIO;
> > > > +
> > > > +	if (!zdev)
> > > > +		return -ENODEV;
> > > > +
> > > > +	/*
> > > > +	 * If we can't get the zdev->state_lock the device state is
> > > > +	 * currently undergoing a transition and we bail out - just
> > > > +	 * the same as if the device's state is not configured at all.
> > > > +	 */
> > > > +	if (!mutex_trylock(&zdev->state_lock))
> > > > +		return rc;
> > > > +
> > > > +	/* We can reset only if the function is configured */
> > > > +	if (zdev->state != ZPCI_FN_STATE_CONFIGURED)
> > > > +		goto out;
> > > > +
> > > > +	rc = zpci_hot_reset_device(zdev);
> > > > +	if (rc != 0)
> > > > +		goto out;
> > > > +
> > > > +	if (!vdev->pci_saved_state) {
> > > > +		pci_err(vdev->pdev, "No saved available for the device");
> > > > +		rc = -EIO;
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	pci_dev_lock(vdev->pdev);
> > > > +	pci_load_saved_state(vdev->pdev, vdev->pci_saved_state);
> > > > +	pci_restore_state(vdev->pdev);
> > > > +	pci_dev_unlock(vdev->pdev);
> > > > +out:
> > > > +	mutex_unlock(&zdev->state_lock);
> > > > +	return rc;
> > > > +}  
> > > This looks like it should be a device or arch specific reset
> > > implemented in drivers/pci, not vfio.  Thanks,
> > > 
> > > Alex  
> > 
> > Are you suggesting to move this to an arch specific function? One thing 
> > we need to do after the zpci_hot_reset_device, is to correctly restore 
> > the config space of the device. And for vfio-pci bound devices we want 
> > to restore the state of the device to when it was initially opened.
> 
> We generally rely on the abstraction of pci_reset_function() to select
> the correct type of reset for a function scope reset.  We've gone to
> quite a bit of effort to implement all device specific resets and
> quirks in the PCI core to be re-used across the kernel.
> 
> Calling zpci_hot_reset_device() directly seems contradictory to those
> efforts.  Should pci_reset_function() call this universally on s390x
> rather than providing access to FLR/PM/SBR reset? 
> 

I agree with you Alex. Still trying to figure out what's needed for
this. We already do zpci_hot_reset_device() in reset_slot() from the
s390_pci_hpc.c hotplug slot driver and that does get called via
pci_reset_hotplug_slot() and pci_reset_function(). There are a few
problems though that meant it didn't work for Farhan but I agree maybe
we can fix them for the general case. For one pci_reset_function()
via DEVICE_RESET first tries FLR but that won't work with the device in
the error state and MMIO blocked. Sadly __pci_reset_function_locked()
then concludes that other resets also won't work. So that's something
we might want to improve in general, for example maybe we need
something more like pci_dev_acpi_reset() with higher priority than FLR.

Now for pci_reset_hotplug_slot() via VFIO_DEVICE_PCI_HOT_RESET I'm not
sure why that won't work as is. @Farhan do you know?

>  Why is it
> universally correct here given the ioctl previously made use of
> standard reset mechanisms?
> 
> The DEVICE_RESET ioctl is simply an in-place reset of the device,
> without restoring the original device state.  So we're also subtly
> changing that behavior here, presumably because we're targeting the
> specific error recovery case.  Have you considered how this might
> break non-error-recovery use cases?
> 
> I wonder if we want a different reset mechanism for this use case
> rather than these subtle semantic changes. 

I think an alternative to that, which Farhan actually had in the
previous internal version, is to implement
pci_error_handlers::reset_done() and do the pci_load_saved_state()
there. That would only affect the error recovery case leaving other
cases alone.

Thanks,
Niklas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ