Message-ID: <ZDlNeyv/HLG4SPwB@nvidia.com>
Date: Fri, 14 Apr 2023 09:56:27 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Brett Creeley <brett.creeley@....com>
Cc: kvm@...r.kernel.org, netdev@...r.kernel.org,
alex.williamson@...hat.com, yishaih@...dia.com,
shameerali.kolothum.thodi@...wei.com, kevin.tian@...el.com,
shannon.nelson@....com, drivers@...sando.io,
simon.horman@...igine.com
Subject: Re: [PATCH v8 vfio 6/7] vfio/pds: Add support for firmware recovery
On Tue, Apr 04, 2023 at 12:01:40PM -0700, Brett Creeley wrote:
> It's possible that the device firmware crashes and is able to recover
> due to some configuration and/or other issue. If a live migration
> is in progress while the firmware crashes, the live migration will
> fail. However, the VF PCI device should still be functional post
> crash recovery and subsequent migrations should go through as
> expected.
>
> When the pds_core device notices that firmware crashes it sends an
> event to all its client drivers. When the pds_vfio driver receives
> this event while migration is in progress it will request a deferred
> reset on the next migration state transition. This state transition
> will report failure as well as any subsequent state transition
> requests from the VMM/VFIO. Based on uapi/vfio.h the only way out of
> VFIO_DEVICE_STATE_ERROR is by issuing VFIO_DEVICE_RESET. Once this
> reset is done, the migration state will be reset to
> VFIO_DEVICE_STATE_RUNNING and migration can be performed.
>
> If the event is received while no migration is in progress (i.e.
> the VM is in normal operating mode), then no actions are taken
> and the migration state remains VFIO_DEVICE_STATE_RUNNING.
>
> Signed-off-by: Brett Creeley <brett.creeley@....com>
> Signed-off-by: Shannon Nelson <shannon.nelson@....com>
> ---
> drivers/vfio/pci/pds/pci_drv.c | 110 +++++++++++++++++++++++++++++++-
> drivers/vfio/pci/pds/vfio_dev.c | 34 +++++++++-
> drivers/vfio/pci/pds/vfio_dev.h | 6 +-
> 3 files changed, 146 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vfio/pci/pds/pci_drv.c b/drivers/vfio/pci/pds/pci_drv.c
> index b0781d9f4246..b155ac9b98ae 100644
> --- a/drivers/vfio/pci/pds/pci_drv.c
> +++ b/drivers/vfio/pci/pds/pci_drv.c
> @@ -20,6 +20,104 @@
> #define PDS_VFIO_DRV_DESCRIPTION "AMD/Pensando VFIO Device Driver"
> #define PCI_VENDOR_ID_PENSANDO 0x1dd8
>
> +static void
> +pds_vfio_recovery_work(struct work_struct *work)
> +{
> + struct pds_vfio_pci_device *pds_vfio =
> + container_of(work, struct pds_vfio_pci_device, work);
> + bool deferred_reset_needed = false;
> +
> + /* Documentation states that the kernel migration driver must not
> + * generate asynchronous device state transitions outside of
> + * manipulation by the user or the VFIO_DEVICE_RESET ioctl.
> + *
> + * Since recovery is an asynchronous event received from the device,
> + * initiate a deferred reset. Only issue the deferred reset if a
> + * migration is in progress, which will cause the next step of the
> + * migration to fail. Also, if the device is in a state that will
> + * be set to VFIO_DEVICE_STATE_RUNNING on the next action (i.e. VM is
> + * shutdown and device is in VFIO_DEVICE_STATE_STOP) as that will clear
> + * the VFIO_DEVICE_STATE_ERROR when the VM starts back up.
> + */
> + mutex_lock(&pds_vfio->state_mutex);
> + if ((pds_vfio->state != VFIO_DEVICE_STATE_RUNNING &&
> + pds_vfio->state != VFIO_DEVICE_STATE_ERROR) ||
> + (pds_vfio->state == VFIO_DEVICE_STATE_RUNNING &&
> + pds_vfio_dirty_is_enabled(pds_vfio)))
> + deferred_reset_needed = true;
> + mutex_unlock(&pds_vfio->state_mutex);
> +
> + /* On the next user initiated state transition, the device will
> + * transition to the VFIO_DEVICE_STATE_ERROR. At this point it's the user's
> + * responsibility to reset the device.
> + *
> + * If a VFIO_DEVICE_RESET is requested post recovery and before the next
> + * state transition, then the deferred reset state will be set to
> + * VFIO_DEVICE_STATE_RUNNING.
> + */
> + if (deferred_reset_needed)
> + pds_vfio_deferred_reset(pds_vfio, VFIO_DEVICE_STATE_ERROR);
> +}
Why is this a work? The event arrives on a blocking_notifier_chain, so
the handler can take the mutex directly, can't it?
Why is the locking like this? Can't you just call
pds_vfio_deferred_reset() under the mutex?
Jason
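
For illustration only, the second question can be sketched as a small userspace model in which the state check and the deferred-reset request happen in one critical section. This is not the pds_vfio driver code: struct model_dev, model_deferred_reset_locked() and model_recovery() are hypothetical names, the enum is a stand-in for the states in uapi/vfio.h, and a pthread mutex stands in for the kernel mutex. Whether the real pds_vfio_deferred_reset() can be called with state_mutex held depends on its own locking, which is not shown in the quoted hunk.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for the uapi/vfio.h migration states named in the patch. */
enum dev_state { STATE_ERROR, STATE_STOP, STATE_RUNNING, STATE_STOP_COPY };

/* Hypothetical userspace model of the per-device state; not the real struct. */
struct model_dev {
	pthread_mutex_t state_mutex;
	enum dev_state state;
	bool dirty_enabled;		/* models pds_vfio_dirty_is_enabled() */
	bool deferred_reset;		/* a deferred reset has been requested */
	enum dev_state deferred_reset_state;
};

/* Hypothetical "locked" variant: the caller must already hold state_mutex. */
static void model_deferred_reset_locked(struct model_dev *dev,
					enum dev_state reset_state)
{
	dev->deferred_reset = true;
	dev->deferred_reset_state = reset_state;
}

/*
 * Same condition as the quoted hunk, but the decision and the request are
 * made under a single lock hold, so the state cannot change in between.
 * Returns true if a deferred reset was requested.
 */
static bool model_recovery(struct model_dev *dev)
{
	bool needed;

	pthread_mutex_lock(&dev->state_mutex);
	needed = (dev->state != STATE_RUNNING && dev->state != STATE_ERROR) ||
		 (dev->state == STATE_RUNNING && dev->dirty_enabled);
	if (needed)
		model_deferred_reset_locked(dev, STATE_ERROR);
	pthread_mutex_unlock(&dev->state_mutex);

	return needed;
}
```

In this model, a device in STATE_STOP_COPY gets a deferred reset to STATE_ERROR, while a device in STATE_RUNNING with dirty tracking disabled is left alone, matching the behaviour the commit message describes.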