netdev - RE: [PATCH v10 vfio 4/7] vfio/pds: Add VFIO live migration support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <BN9PR11MB5276DD9E2B791EE2C06046348C5CA@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Tue, 20 Jun 2023 02:02:44 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Brett Creeley <brett.creeley@....com>, "kvm@...r.kernel.org"
	<kvm@...r.kernel.org>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
	"yishaih@...dia.com" <yishaih@...dia.com>,
	"shameerali.kolothum.thodi@...wei.com"
	<shameerali.kolothum.thodi@...wei.com>, "shannon.nelson@....com"
	<shannon.nelson@....com>
Subject: RE: [PATCH v10 vfio 4/7] vfio/pds: Add VFIO live migration support

> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Monday, June 19, 2023 8:47 PM
> 
> On Fri, Jun 16, 2023 at 08:06:21AM +0000, Tian, Kevin wrote:
> 
> > Ideally the VMM has an estimation how long a VM can be paused based on
> > SLA, to-be-migrated state size, available network bandwidth, etc. and that
> > hint should be passed to the kernel so any state transition which may
> violate
> > that expectation can fail quickly to break the migration process and put the
> > VM back to the running state.
> >
> > Jason/Shameer, is there similar concern in mlx/hisilicon drivers?
> 
> It is handled through the vfio_device_feature_mig_data_size mechanism..

that is only for estimation of copied data.

IMHO the stop time when the VM is paused includes both the time of
stopping the device and the time of migrating the VM state.

For a software-emulated device the time of stopping the device is negligible.

But certainly for assigned device the worst-case hard-coded 5s timeout as
done in this patch will kill whatever reasonable 'VM dead time' SLA (usually
in milliseconds) which CSPs try to meet purely based on the size of copied
data.

Wouldn't a user-specified stop-device timeout be required to at least allow
breaking migration early according to the desired SLA?

> 
> > > +	if (cur == VFIO_DEVICE_STATE_RUNNING_P2P && next ==
> > > VFIO_DEVICE_STATE_STOP)
> > > +		return NULL;
> >
> > I'm not sure whether P2P is actually supported here. By definition
> > P2P means the device is stopped but still responds to p2p request
> > from other devices. If you look at mlx example it uses different
> > cmds between RUNNING->RUNNING_P2P and RUNNING_P2P->STOP.
> >
> > But in your case seems you simply move what is required in STOP
> > into P2P. Probably you can just remove the support of P2P like
> > hisilicon does.
> 
> We want new devices to get their architecture right, they need to
> support P2P. Didn't we talk about this already and Brett was going to
> fix it?
> 

Looks it's not fixed since RUNNING_P2P->STOP is a nop in this patch.