Message-ID: <20211015151243.3c5b0910.alex.williamson@redhat.com>
Date: Fri, 15 Oct 2021 15:12:43 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Yishai Hadas <yishaih@...dia.com>, bhelgaas@...gle.com,
saeedm@...dia.com, linux-pci@...r.kernel.org, kvm@...r.kernel.org,
netdev@...r.kernel.org, kuba@...nel.org, leonro@...dia.com,
kwankhede@...dia.com, mgurtovoy@...dia.com, maorg@...dia.com
Subject: Re: [PATCH V1 mlx5-next 12/13] vfio/pci: Add infrastructure to let
vfio_pci_core drivers trap device RESET

On Fri, 15 Oct 2021 17:03:28 -0300
Jason Gunthorpe <jgg@...dia.com> wrote:

> On Fri, Oct 15, 2021 at 01:52:37PM -0600, Alex Williamson wrote:
> > On Wed, 13 Oct 2021 12:47:06 +0300
> > Yishai Hadas <yishaih@...dia.com> wrote:
> >
> > > Add infrastructure to let vfio_pci_core drivers trap device RESET.
> > >
> > > The motivation for this is to let the underlay driver be aware that
> > > reset was done and set its internal state accordingly.
> >
> > I think the intention of the uAPI here is that the migration error
> > state is exited specifically via the reset ioctl. Maybe that should be
> > made more clear, but variant drivers can already wrap the core ioctl
> > for the purpose of determining that mechanism of reset has occurred.
>
> It is not just recovering the error state.
>
> Any transition to reset changes the firmware state. E.g., if userspace
> uses one of the other emulation paths to trigger the reset after
> putting the device off running then the driver state and FW state
> become desynchronized.
>
> So all the reset paths need to be synchronized somehow, either
> blocked while in non-running states or aligning the SW state with the
> new post-reset FW state.

This only catches the two flavors of FLR and the RESET ioctl itself, so
we've got gaps relative to "all the reset paths" anyway.  I'm also
concerned about adding an arbitrary callback for every case where it
gets too cumbersome to write a wrapper for the existing callbacks.

However, why is this a vfio thing when we already have the
pci_error_handlers.reset_done callback?  At best this ought to be
redundant to that.  Thanks,
Alex
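
For illustration only, here is a rough userspace sketch of the
reset_done idea: the variant driver realigns its software state from
the PCI error handler rather than from a new vfio-specific callback.
Everything below is a stand-in and not kernel code.  The structs mimic
only the fields used here (in the kernel, err_handler hangs off
struct pci_driver, not struct pci_dev), and example_vf, mig_state,
example_reset_done() and mock_pci_reset_function() are hypothetical
names.  It also assumes the post-reset device state is "running".

```c
#include <assert.h>
#include <stddef.h>

struct pci_dev;

/* Stand-in for the kernel's struct pci_error_handlers (subset). */
struct pci_error_handlers {
	void (*reset_prepare)(struct pci_dev *dev);
	void (*reset_done)(struct pci_dev *dev);
};

/* Stand-in for struct pci_dev; simplified for a userspace build. */
struct pci_dev {
	const struct pci_error_handlers *err_handler;
	void *driver_data;
};

/* Hypothetical software-side migration state of a variant driver. */
enum mig_state { MIG_RUNNING, MIG_STOPPED, MIG_ERROR };

struct example_vf {
	enum mig_state sw_state;
};

/*
 * Called after the function has been reset, whichever path triggered
 * it.  Realign SW state with the (assumed) post-reset FW state.
 */
static void example_reset_done(struct pci_dev *pdev)
{
	struct example_vf *vf = pdev->driver_data;

	assert(vf != NULL);
	vf->sw_state = MIG_RUNNING;
}

static const struct pci_error_handlers example_err_handlers = {
	.reset_done = example_reset_done,
};

/* Mock of a PCI core reset path that honors the handlers. */
static void mock_pci_reset_function(struct pci_dev *pdev)
{
	const struct pci_error_handlers *h = pdev->err_handler;

	if (h && h->reset_prepare)
		h->reset_prepare(pdev);
	/* The actual device reset would happen here. */
	if (h && h->reset_done)
		h->reset_done(pdev);
}
```

The point of the sketch is only that a reset path which already invokes
the error handlers needs no extra hook; paths that bypass them would
remain a gap either way.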