Message-ID: <20210930170143.GB69218@ziepe.ca>
Date: Thu, 30 Sep 2021 14:01:43 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Max Gurtovoy <mgurtovoy@...dia.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
Leon Romanovsky <leon@...nel.org>,
Doug Ledford <dledford@...hat.com>,
Yishai Hadas <yishaih@...dia.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Kirti Wankhede <kwankhede@...dia.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
linux-rdma@...r.kernel.org, netdev@...r.kernel.org,
Saeed Mahameed <saeedm@...dia.com>,
Cornelia Huck <cohuck@...hat.com>
Subject: Re: [PATCH mlx5-next 2/7] vfio: Add an API to check migration state
transition validity
On Thu, Sep 30, 2021 at 07:51:22PM +0300, Max Gurtovoy wrote:
>
> On 9/30/2021 7:24 PM, Jason Gunthorpe wrote:
> > On Thu, Sep 30, 2021 at 06:32:07PM +0300, Max Gurtovoy wrote:
> > > > Just prior to open device the vfio pci layer will generate a FLR to
> > > > the function so we expect that post open_device has a fresh from reset
> > > > fully running device state.
> > > running also mean that the device doesn't have a clue on its internal state
> > > ? or running means unfreezed and unquiesced ?
> > The device just got FLR'd and it should be in a clean state and
> > operating. Think the VM is booting for the first time.
>
> During the resume phase in the dst, the VM is paused and not booting.
> Migration SW is waiting to get memory and state from SRC. The device will
> start from the exact point that was in the src.
>
> it's exactly "000b => Device Stopped, not saving or resuming"
For this case QEMU should open the VFIO device and immediately issue a
command to go to resuming. The kernel cannot know at open_device time
which case userspace is trying to do. Due to backwards compat we
assume userspace is going to boot a fresh VM.
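A minimal sketch of that destination-side sequence, as a toy model rather than the real VFIO uAPI: the `toy_*` names and the struct are hypothetical, and only the bit values (000b, 001b, 010b, 100b) come from this thread. It assumes open leaves the device running after the FLR and that userspace then requests resuming explicitly.

```c
#include <stdint.h>

/* Bit values as quoted in this thread; the TOY_ names are hypothetical. */
enum {
	TOY_STATE_STOP     = 0x0, /* 000b */
	TOY_STATE_RUNNING  = 0x1, /* 001b */
	TOY_STATE_SAVING   = 0x2, /* 010b */
	TOY_STATE_RESUMING = 0x4, /* 100b */
};

/* Hypothetical stand-in for a device's migration state. */
struct toy_vfio_dev {
	uint32_t device_state;
};

/* Models open_device: the function was just FLR'd, so it is clean and running. */
static void toy_open_device(struct toy_vfio_dev *dev)
{
	dev->device_state = TOY_STATE_RUNNING;
}

/*
 * Models userspace writing a new device_state.
 * This toy only rejects unknown bits; a real driver validates far more.
 */
static int toy_set_device_state(struct toy_vfio_dev *dev, uint32_t new_state)
{
	const uint32_t known = TOY_STATE_RUNNING | TOY_STATE_SAVING |
			       TOY_STATE_RESUMING;

	if (new_state & ~known)
		return -1;
	dev->device_state = new_state;
	return 0;
}

/* Destination side: open, then immediately ask for resuming. */
static uint32_t toy_open_for_incoming_migration(void)
{
	struct toy_vfio_dev dev;

	toy_open_device(&dev);
	(void)toy_set_device_state(&dev, TOY_STATE_RESUMING);
	return dev.device_state;
}
```

The point of the sketch is only the ordering: the kernel side picks the running default at open, and the choice to resume is a separate, explicit userspace request.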
> Well, this is your design for the driver implementation. Nobody is
> preventing other drivers to start deserializing device state into the device
> during RESUMING bit on.
It is a logical model. Devices can stream the migration data directly
into the internal state if they like. It just creates more conditions
where they have to report an error state.
> So if we moved from 100b to 010b somehow, one should deserialized its buffer
> to the device, and then serialize it to migration region again ?
Yes.
> I guess its doable since the device is freeze and quiesced. But moving from
> 100b to 011b is not possible, right ?
Why not?
100b to 011b is no different than going indirectly 100b -> 001b -> 011b.
The time spent in 001b is just negligible.
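As a rough illustration of that equivalence, here is a sketch that expands a direct 100b -> 011b request into the indirect path. The `mig_*` names and the helper are hypothetical, not an existing kernel API; only the bit values are taken from this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values as quoted in this thread; the MIG_ names are hypothetical. */
enum {
	MIG_STOP     = 0x0, /* 000b */
	MIG_RUNNING  = 0x1, /* 001b */
	MIG_SAVING   = 0x2, /* 010b */
	MIG_RESUMING = 0x4, /* 100b */
};

/*
 * Expand a requested transition into the sequence of states the device
 * logically passes through. Only the one case discussed here is expanded;
 * every other request is treated as a single step.
 * Returns the number of entries written to path (at most 3).
 */
static size_t mig_expand_transition(uint32_t from, uint32_t to,
				    uint32_t path[3])
{
	size_t n = 0;

	path[n++] = from;
	/* 100b -> 011b behaves like 100b -> 001b -> 011b. */
	if (from == MIG_RESUMING && to == (MIG_RUNNING | MIG_SAVING))
		path[n++] = MIG_RUNNING;
	if (to != from)
		path[n++] = to;
	return n;
}
```

For `from = 0x4` and `to = 0x3` the helper fills `path` with 0x4, 0x1, 0x3, so a driver that handles each single step can serve the direct request by walking that path.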
Jason