[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB5433435CAAAB23EAE085C3128C929@BN9PR11MB5433.namprd11.prod.outlook.com>
Date: Tue, 9 Nov 2021 00:58:26 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Alex Williamson <alex.williamson@...hat.com>,
Cornelia Huck <cohuck@...hat.com>,
Yishai Hadas <yishaih@...dia.com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"saeedm@...dia.com" <saeedm@...dia.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kuba@...nel.org" <kuba@...nel.org>,
"leonro@...dia.com" <leonro@...dia.com>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"mgurtovoy@...dia.com" <mgurtovoy@...dia.com>,
"maorg@...dia.com" <maorg@...dia.com>,
"Dr. David Alan Gilbert" <dgilbert@...hat.com>
Subject: RE: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver
for mlx5 devices
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Monday, November 8, 2021 8:36 PM
>
> On Mon, Nov 08, 2021 at 08:53:20AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@...dia.com>
> > > Sent: Tuesday, October 26, 2021 11:19 PM
> > >
> > > On Tue, Oct 26, 2021 at 08:42:12AM -0600, Alex Williamson wrote:
> > >
> > > > > This is also why I don't like it being so transparent as it is
> > > > > something userspace needs to care about - especially if the HW
> cannot
> > > > > support such a thing, if we intend to allow that.
> > > >
> > > > Userspace does need to care, but userspace's concern over this should
> > > > not be able to compromise the platform and therefore making VF
> > > > assignment more susceptible to fatal error conditions to comply with a
> > > > migration uAPI is troublesome for me.
> > >
> > > It is an interesting scenario.
> > >
> > > I think it points that we are not implementing this fully properly.
> > >
> > > The !RUNNING state should be like your reset efforts.
> > >
> > > All access to the MMIO memories from userspace should be revoked
> > > during !RUNNING
> >
> > This assumes that vCPUs must be stopped before !RUNNING is entered
> > in virtualization case. and it is true today.
> >
> > But it may not hold when talking about guest SVA and I/O page fault [1].
> > The problem is that the pending requests may trigger I/O page faults
> > on guest page tables. W/o running vCPUs to handle those faults, the
> > quiesce command cannot complete draining the pending requests
> > if the device doesn't support preempt-on-fault (at least it's the case for
> > some Intel and Huawei devices, possibly true for most initial SVA
> > implementations).
>
> It cannot be ordered any other way.
>
> vCPUs must be stopped first, then the PCI devices must be stopped
> after, otherwise the vCPU can touch a stopped a device while handling
> a fault which is unreasonable.
>
> However, migrating a pending IOMMU fault does seem unreasonable as well.
>
> The NDA state can potentially solve this:
>
> RUNNING | VCPU RUNNING - Normal
> NDMA | RUNNING | VCPU RUNNING - Halt and flush DMA, and thus all
> faults
> NDMA | RUNNING - Halt all MMIO access
should be two steps?
NDMA | RUNNING - vCPU stops access to the device
NDMA - halt all MMIO access by revoking mapping
> 0 - Halted everything
yes, adding a new state sounds better than reordering the vcpu/device
stop sequence.
>
> Though this may be more disruptive to the vCPUs as they could spin on
> DMA/interrupts that will not come.
it's inevitable regardless how we define the migration states. the
actual impact depends on how long 'Halt and flush DMA' will take.
Thanks
Kevin
Powered by blists - more mailing lists