Message-ID: <BN9PR11MB52762ECFCA869B97BDD2AA9D8C26A@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Mon, 26 Jun 2023 07:31:31 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Brett Creeley <brett.creeley@....com>, "kvm@...r.kernel.org"
<kvm@...r.kernel.org>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"yishaih@...dia.com" <yishaih@...dia.com>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>, "shannon.nelson@....com"
<shannon.nelson@....com>
Subject: RE: [PATCH v10 vfio 4/7] vfio/pds: Add VFIO live migration support
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Wednesday, June 21, 2023 9:27 PM
>
> On Wed, Jun 21, 2023 at 06:49:12AM +0000, Tian, Kevin wrote:
>
> > What is the criteria for 'reasonable'? How does CSPs judge that such
> > device can guarantee a *reliable* reasonable window so live migration
> > can be enabled in the production environment?
>
> The CSP needs to work with the device vendor to understand how it fits
> into their system, I don't see how we can externalize this kind of
> detail in a general way.
>
> > I'm afraid that we are hiding a non-deterministic factor in current protocol.
>
> Yes
>
> > But still I don't think it's a good situation where the user has ZERO
> > knowledge about the non-negligible time in the stopping path...
>
> In any sane device design this will be a small period of time. These
> timeouts should be to protect against a device that has gone wild.
>
Any example of how 'small' it will be (e.g. <1ms)?

Should we define a *reasonable* threshold in the VFIO community against
which any new variant driver must provide information to be judged?
If the worst-case stop time (assuming the device doesn't go wild) may
exceed the threshold, then it's time to consider whether a new interface
is required to communicate that constraint to userspace.

The reason I keep discussing this is that, IMHO, achieving a negligible
stop time is very challenging for many accelerators. E.g. IDXD
can be stopped only after completing all pending requests. While
it allows software to configure the max pending work size (and a
reasonable setting could meet both the migration SLA and the performance
SLA), the worst-case draining latency could be in the tens of milliseconds,
which cannot be ignored by the VMM.

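As a rough illustration of how such a worst-case figure arises, here is a
minimal back-of-envelope sketch. It assumes pending requests complete one
after another; the function names and all numbers are hypothetical and are
not taken from the IDXD or VFIO code.

```python
# Hypothetical back-of-envelope model of drain latency on the stop path.
# None of these names or numbers come from the IDXD or VFIO code.


def worst_case_drain_us(max_pending_descs: int, worst_desc_us: float) -> float:
    """Upper bound on drain time if pending descriptors complete one by one."""
    return max_pending_descs * worst_desc_us


def exceeds_stop_budget(max_pending_descs: int, worst_desc_us: float,
                        budget_us: float) -> bool:
    """True when the worst-case drain alone would exceed the stop-time budget."""
    return worst_case_drain_us(max_pending_descs, worst_desc_us) > budget_us


# Assumed numbers for illustration only: 128 pending descriptors,
# 250 us worst case each, compared against a 1 ms (1000 us) threshold.
drain = worst_case_drain_us(128, 250.0)
print(f"worst-case drain: {drain / 1000:.1f} ms")
print("exceeds 1 ms budget:", exceeds_stop_budget(128, 250.0, 1000.0))
```

With these assumed inputs the bound is 32 ms, well above a 1 ms threshold.
Lowering the max pending work size shrinks the bound but may cost
throughput, which is the migration SLA versus performance SLA trade-off
mentioned above.
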
Or do you think this is still better left to the CSP working with the
device vendor, even in this case, given that the worst-case latency can be
affected by many factors and hence is not something a kernel driver can
accurately estimate?

Thanks
Kevin