Message-ID: <20220302083440.539a1f33.alex.williamson@redhat.com>
Date:   Wed, 2 Mar 2022 08:34:40 -0700
From:   Alex Williamson <alex.williamson@...hat.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Cornelia Huck <cohuck@...hat.com>,
        Yishai Hadas <yishaih@...dia.com>, bhelgaas@...gle.com,
        saeedm@...dia.com, linux-pci@...r.kernel.org, kvm@...r.kernel.org,
        netdev@...r.kernel.org, kuba@...nel.org, leonro@...dia.com,
        kwankhede@...dia.com, mgurtovoy@...dia.com, maorg@...dia.com,
        ashok.raj@...el.com, kevin.tian@...el.com,
        shameerali.kolothum.thodi@...wei.com
Subject: Re: [PATCH V9 mlx5-next 09/15] vfio: Define device migration
 protocol v2

On Wed, 2 Mar 2022 10:27:32 -0400
Jason Gunthorpe <jgg@...dia.com> wrote:

> On Wed, Mar 02, 2022 at 12:19:20PM +0100, Cornelia Huck wrote:
> > > +/*
> > > + * vfio_mig_get_next_state - Compute the next step in the FSM
> > > + * @cur_fsm - The current state the device is in
> > > + * @new_fsm - The target state to reach
> > > + * @next_fsm - Pointer to the next step to get to new_fsm
> > > + *
> > > + * Return 0 upon success, otherwise -errno
> > > + * Upon success the next step in the state progression between cur_fsm and
> > > + * new_fsm will be set in next_fsm.  
> > 
> > What about non-success? Can the caller make any assumption about
> > next_fsm in that case? Because...  
> 
> I checked both mlx5 and acc; both properly ignore the next_fsm value
> on error. This oddness arose when Alex asked to return an errno instead
> of the state value.

Right, my assertion was that only the driver itself should be able to
transition to the ERROR state.  vfio_mig_get_next_state() should never
advise the driver to go to the error state; it can only report that a
transition is invalid.  The driver may stay in the current state if an
error occurs here, which is why we added the ability to get the device
state.  Thanks,

Alex

> > > + * any -> ERROR
> > > + *   ERROR cannot be specified as a device state, however any transition request
> > > + *   can be failed with an errno return and may then move the device_state into
> > > + *   ERROR. In this case the device was unable to execute the requested arc and
> > > + *   was also unable to restore the device to any valid device_state.
> > > + *   To recover from ERROR VFIO_DEVICE_RESET must be used to return the
> > > + *   device_state back to RUNNING.  
> > 
> > ...this seems to indicate that not moving into STATE_ERROR is an
> > option anyway.   
> 
> Yes, but it is never done by vfio_mig_get_next_state(); it is only
> directly triggered inside the driver.
> 
> > Do we need any extra guidance in the description for
> > vfio_mig_get_next_state()?  
> 
> I think not; it is typical in Linux that function failure means the
> output arguments are not valid.
> 
> Jason
> 
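
The calling convention discussed above can be sketched in userspace C. This
is only an illustration under stated assumptions: the mock_* names, the
reduced state set, and the transition table are hypothetical stand-ins, not
the kernel's enum vfio_device_mig_state or the actual
vfio_mig_get_next_state() implementation. It shows a helper that returns 0 or
a negative errno, never proposes ERROR as a next step, and leaves the output
argument untouched on failure, plus a caller that does not read the output
after an error and simply stays in its current state.

```c
#include <errno.h>

/* Hypothetical, simplified states; not the kernel's enum. */
enum mock_state {
    MOCK_ERROR = 0,
    MOCK_STOP,
    MOCK_RUNNING,
    MOCK_STOP_COPY,
    MOCK_RESUMING,
    MOCK_NR_STATES
};

/*
 * Mock stand-in for the helper under discussion. Returns 0 and stores the
 * next step in *next on success, or -EINVAL for an invalid arc. On failure
 * *next is not written, so callers must not read it. MOCK_ERROR is never
 * returned as a next step.
 */
static int mock_get_next_state(enum mock_state cur, enum mock_state target,
                               enum mock_state *next)
{
    /* table[cur][target] is the next step; unset entries are 0 (invalid). */
    static const enum mock_state table[MOCK_NR_STATES][MOCK_NR_STATES] = {
        [MOCK_STOP] = {
            [MOCK_STOP] = MOCK_STOP,
            [MOCK_RUNNING] = MOCK_RUNNING,
            [MOCK_STOP_COPY] = MOCK_STOP_COPY,
            [MOCK_RESUMING] = MOCK_RESUMING,
        },
        [MOCK_RUNNING] = {
            [MOCK_STOP] = MOCK_STOP,
            [MOCK_RUNNING] = MOCK_RUNNING,
            [MOCK_STOP_COPY] = MOCK_STOP,
            [MOCK_RESUMING] = MOCK_STOP,
        },
        [MOCK_STOP_COPY] = {
            [MOCK_STOP] = MOCK_STOP,
            [MOCK_RUNNING] = MOCK_STOP,
            [MOCK_STOP_COPY] = MOCK_STOP_COPY,
            [MOCK_RESUMING] = MOCK_STOP,
        },
        [MOCK_RESUMING] = {
            [MOCK_STOP] = MOCK_STOP,
            [MOCK_RUNNING] = MOCK_STOP,
            [MOCK_STOP_COPY] = MOCK_STOP,
            [MOCK_RESUMING] = MOCK_RESUMING,
        },
    };

    if ((unsigned int)cur >= MOCK_NR_STATES ||
        (unsigned int)target >= MOCK_NR_STATES)
        return -EINVAL;
    if (table[cur][target] == MOCK_ERROR)
        return -EINVAL;

    *next = table[cur][target];
    return 0;
}

/* Hypothetical driver-side device; only the driver would ever set MOCK_ERROR. */
struct mock_device {
    enum mock_state state;
};

/*
 * Hypothetical driver loop: walk the arcs one step at a time. If the helper
 * fails, 'next' is ignored and the device stays in its current state.
 */
static int mock_set_state(struct mock_device *dev, enum mock_state target)
{
    enum mock_state next;
    int ret;

    while (dev->state != target) {
        ret = mock_get_next_state(dev->state, target, &next);
        if (ret)
            return ret; /* do not read 'next' here */
        dev->state = next;
    }
    return 0;
}
```

In this sketch a request for MOCK_ERROR as the target is rejected with
-EINVAL and the device state is unchanged, which mirrors the point that the
helper only reports an invalid transition and moving to ERROR is left to the
driver itself.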
