Message-ID: <565BF285.4040507@intel.com>
Date: Mon, 30 Nov 2015 14:53:57 +0800
From: "Lan, Tianyu" <tianyu.lan@...el.com>
To: Alexander Duyck <alexander.duyck@...il.com>,
"Dong, Eddie" <eddie.dong@...el.com>
Cc: "a.motakis@...tualopensystems.com" <a.motakis@...tualopensystems.com>,
Alex Williamson <alex.williamson@...hat.com>,
"b.reynal@...tualopensystems.com" <b.reynal@...tualopensystems.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Wyborny, Carolyn" <carolyn.wyborny@...el.com>,
"Skidmore, Donald C" <donald.c.skidmore@...el.com>,
"Jani, Nrupal" <nrupal.jani@...el.com>,
Alexander Graf <agraf@...e.de>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
"qemu-devel@...gnu.org" <qemu-devel@...gnu.org>,
"Tantilov, Emil S" <emil.s.tantilov@...el.com>,
Or Gerlitz <gerlitz.or@...il.com>,
"Rustad, Mark D" <mark.d.rustad@...el.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Eric Auger <eric.auger@...aro.org>,
intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
"Ronciak, John" <john.ronciak@...el.com>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Williams, Mitch A" <mitch.a.williams@...el.com>,
Netdev <netdev@...r.kernel.org>,
"Nelson, Shannon" <shannon.nelson@...el.com>,
Wei Yang <weiyang@...ux.vnet.ibm.com>,
"zajec5@...il.com" <zajec5@...il.com>
Subject: Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for
SRIOV NIC
On 11/26/2015 11:56 AM, Alexander Duyck wrote:
> I am not saying you cannot modify the drivers, however what you are
> doing is far too invasive. Do you seriously plan on modifying all of
> the PCI device drivers out there in order to allow any device that
> might be direct assigned to a port to support migration? I certainly
> hope not. That is why I have said that this solution will not scale.
Current drivers are not migration friendly. If a driver is to support
migration, it has to be changed.
RFC PATCH V1 presented our ideas about how to deal with MMIO, ring and
DMA tracking during migration. These problems are common to most drivers.
Our handling of them in the previous version may be problematic, but it
can be corrected later.
Using suspend() and resume() may make migration easier, but some devices
require low service downtime. This is especially true for network
devices: I have heard that some cloud companies promise less than 500ms
of network service downtime. So I think the performance impact should
also be taken into account when we design the framework.
>
> What I am counter proposing seems like a very simple proposition. It
> can be implemented in two steps.
>
> 1. Look at modifying dma_mark_clean(). It is a function called in
> the sync and unmap paths of the lib/swiotlb.c. If you could somehow
> modify it to take care of marking the pages you unmap for Rx as being
> dirty it will get you a good way towards your goal as it will allow
> you to continue to do DMA while you are migrating the VM.
>
> 2. Look at making use of the existing PCI suspend/resume calls that
> are there to support PCI power management. They have everything
> needed to allow you to pause and resume DMA for the device before and
> after the migration while retaining the driver state. If you can
> implement something that allows you to trigger these calls from the
> PCI subsystem such as hot-plug then you would have a generic solution
> that can be easily reproduced for multiple drivers beyond those
> supported by ixgbevf.
I glanced at the PCI hotplug code. Hotplug events are triggered by the
PCI hotplug controller, and these events are defined in the controller
spec, so it is hard to add new ones. Beyond that, we would also need to
add some specific code to the PCI hotplug core, since it only adds and
removes PCI devices when it gets events. Modifying the Windows hotplug
code is also a challenge. So we may need to find another way.
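For reference, the dirty-page marking idea in step 1 of the quoted
proposal can be modelled in user space roughly as below. This is only a
sketch under stated assumptions: it is not kernel code and does not use
the real dma_mark_clean() or swiotlb interfaces. All sketch_* names, the
4 KiB page size and the toy 1024-page guest are hypothetical, chosen
purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12u    /* assumption: 4 KiB pages */
#define SKETCH_NUM_PAGES  1024u  /* assumption: toy guest with 4 MiB of RAM */

/* One bit per guest page; a set bit means "migration must re-send it". */
static uint8_t sketch_dirty_bitmap[SKETCH_NUM_PAGES / 8];

/* Hypothetical stand-in for a hook on the Rx unmap/sync path: mark every
 * page overlapped by the DMA buffer [addr, addr + size) as dirty. */
static void sketch_mark_dirty(uint64_t addr, size_t size)
{
    if (size == 0)
        return;                          /* nothing was written */
    assert(addr + size > addr);          /* range must not wrap around */

    uint64_t first = addr >> SKETCH_PAGE_SHIFT;
    uint64_t last = (addr + size - 1) >> SKETCH_PAGE_SHIFT;

    /* Pages beyond the toy guest's memory are silently ignored. */
    for (uint64_t pfn = first; pfn <= last && pfn < SKETCH_NUM_PAGES; pfn++)
        sketch_dirty_bitmap[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/* Returns 1 if the page frame is marked dirty, 0 otherwise. */
static int sketch_page_is_dirty(uint64_t pfn)
{
    if (pfn >= SKETCH_NUM_PAGES)
        return 0;
    return (sketch_dirty_bitmap[pfn / 8] >> (pfn % 8)) & 1;
}

/* Number of pages the next migration pass would have to copy. */
static unsigned sketch_count_dirty(void)
{
    unsigned count = 0;
    for (uint64_t pfn = 0; pfn < SKETCH_NUM_PAGES; pfn++)
        count += (unsigned)sketch_page_is_dirty(pfn);
    return count;
}

/* A migration pass would clear the log after copying the dirty pages. */
static void sketch_clear_dirty(void)
{
    memset(sketch_dirty_bitmap, 0, sizeof(sketch_dirty_bitmap));
}
```

In this model, a 32-byte Rx buffer at address 0x1ff0 straddles a page
boundary, so sketch_mark_dirty(0x1ff0, 0x20) marks both page 1 and
page 2. The point of the sketch is only that the marking cost is paid
per unmapped buffer, so DMA can keep running while the VM is migrating.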
>
> Thanks.
>
> - Alex