Message-ID: <20220302142732.GK219866@nvidia.com>
Date: Wed, 2 Mar 2022 10:27:32 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Cornelia Huck <cohuck@...hat.com>
Cc: Yishai Hadas <yishaih@...dia.com>, alex.williamson@...hat.com,
bhelgaas@...gle.com, saeedm@...dia.com, linux-pci@...r.kernel.org,
kvm@...r.kernel.org, netdev@...r.kernel.org, kuba@...nel.org,
leonro@...dia.com, kwankhede@...dia.com, mgurtovoy@...dia.com,
maorg@...dia.com, ashok.raj@...el.com, kevin.tian@...el.com,
shameerali.kolothum.thodi@...wei.com
Subject: Re: [PATCH V9 mlx5-next 09/15] vfio: Define device migration
protocol v2
On Wed, Mar 02, 2022 at 12:19:20PM +0100, Cornelia Huck wrote:
> > +/*
> > + * vfio_mig_get_next_state - Compute the next step in the FSM
> > + * @cur_fsm - The current state the device is in
> > + * @new_fsm - The target state to reach
> > + * @next_fsm - Pointer to the next step to get to new_fsm
> > + *
> > + * Return 0 upon success, otherwise -errno
> > + * Upon success the next step in the state progression between cur_fsm and
> > + * new_fsm will be set in next_fsm.
>
> What about non-success? Can the caller make any assumption about
> next_fsm in that case? Because...
I checked both mlx5 and acc; both properly ignore the next_fsm value
on error. This oddness arose when Alex asked to return an errno instead
of the state value.
> > + * any -> ERROR
> > + * ERROR cannot be specified as a device state, however any transition request
> > + * can be failed with an errno return and may then move the device_state into
> > + * ERROR. In this case the device was unable to execute the requested arc and
> > + * was also unable to restore the device to any valid device_state.
> > + * To recover from ERROR VFIO_DEVICE_RESET must be used to return the
> > + * device_state back to RUNNING.
>
> ...this seems to indicate that not moving into STATE_ERROR is an
> option anyway.
Yes, but it is never done by vfio_mig_get_next_state(); it is only
directly triggered inside the driver.
> Do we need any extra guidance in the description for
> vfio_mig_get_next_state()?
I think not. It is typical in Linux that a function failure means the
output arguments are not valid.
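
To illustrate the convention, here is a minimal userspace sketch, not the
actual kernel code. Every toy_* name, the three-state chain, and the choice of
-EINVAL are assumptions made up for illustration; toy_get_next_state() only
stands in for the idea of vfio_mig_get_next_state(). The point is that the
caller reads the output argument only after checking for a 0 return.

```c
#include <errno.h>

/* Hypothetical toy states laid out as a simple chain; not the VFIO enum. */
enum toy_state { TOY_RUNNING, TOY_STOP, TOY_STOP_COPY, TOY_NR_STATES };

/*
 * Toy stand-in for the helper under discussion: returns 0 and writes *next
 * on success; returns -EINVAL and does not write *next on failure.
 */
static int toy_get_next_state(enum toy_state cur, enum toy_state target,
			      enum toy_state *next)
{
	if ((unsigned int)cur >= TOY_NR_STATES ||
	    (unsigned int)target >= TOY_NR_STATES)
		return -EINVAL;

	if (cur < target)
		*next = (enum toy_state)(cur + 1);	/* one step forward */
	else if (cur > target)
		*next = (enum toy_state)(cur - 1);	/* one step back */
	else
		*next = cur;				/* already there */
	return 0;
}

/* Caller pattern: 'next' is consumed only when the return value is 0. */
static int toy_set_state(enum toy_state *cur, enum toy_state target)
{
	enum toy_state next;
	int ret;

	while (*cur != target) {
		ret = toy_get_next_state(*cur, target, &next);
		if (ret)
			return ret;	/* 'next' is never read on this path */
		*cur = next;
	}
	return 0;
}
```

With this shape the helper has no need to document what next holds on failure,
because no caller looks at it.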
Jason