[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220222165300.4a8dd044.alex.williamson@redhat.com>
Date: Tue, 22 Feb 2022 16:53:00 -0700
From: Alex Williamson <alex.williamson@...hat.com>
To: Yishai Hadas <yishaih@...dia.com>
Cc: <bhelgaas@...gle.com>, <jgg@...dia.com>, <saeedm@...dia.com>,
<linux-pci@...r.kernel.org>, <kvm@...r.kernel.org>,
<netdev@...r.kernel.org>, <kuba@...nel.org>, <leonro@...dia.com>,
<kwankhede@...dia.com>, <mgurtovoy@...dia.com>, <maorg@...dia.com>,
<cohuck@...hat.com>, <ashok.raj@...el.com>, <kevin.tian@...el.com>,
<shameerali.kolothum.thodi@...wei.com>
Subject: Re: [PATCH V8 mlx5-next 09/15] vfio: Define device migration
protocol v2
On Sun, 20 Feb 2022 11:57:10 +0200
Yishai Hadas <yishaih@...dia.com> wrote:
> From: Jason Gunthorpe <jgg@...dia.com>
>
> Replace the existing region based migration protocol with an ioctl based
> protocol. The two protocols have the same general semantic behaviors, but
> the way the data is transported is changed.
>
> This is the STOP_COPY portion of the new protocol, it defines the 5 states
> for basic stop and copy migration and the protocol to move the migration
> data in/out of the kernel.
>
> Compared to the clarification of the v1 protocol Alex proposed:
>
> https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen
>
> This has a few deliberate functional differences:
>
> - ERROR arcs allow the device function to remain unchanged.
>
> - The protocol is not required to return to the original state on
> transition failure. Instead userspace can execute an unwind back to
> the original state, reset, or do something else without needing kernel
> support. This simplifies the kernel design and should userspace choose
> a policy like always reset, avoids doing useless work in the kernel
> on error handling paths.
>
> - PRE_COPY is made optional, userspace must discover it before using it.
> This reflects the fact that the majority of drivers we are aware of
> right now will not implement PRE_COPY.
>
> - segmentation is not part of the data stream protocol, the receiver
> does not have to reproduce the framing boundaries.
I'm not sure how to reconcile the statement above with:
"The user must consider the migration data segments carried
over the FD to be opaque and non-fungible. During RESUMING, the
data segments must be written in the same order they came out
of the saving side FD."
This is subtly conflicting that it's not segmented, but segments must
be written in order. We'll naturally have some segmentation due to
buffering in kernel and userspace, but I think referring to it as a
stream suggests that the user can cut and join segments arbitrarily so
long as byte order is preserved, right? I suspect the commit log
comment is referring to the driver imposed segmentation and framing
relative to region offsets.
Maybe something like:
"The user must consider the migration data stream carried over
the FD to be opaque and must preserve the byte order of the
stream. The user is not required to preserve buffer
segmentation when writing the data stream during the RESUMING
operation."
This statement also gives me pause relative to Jason's comments
regarding async support:
> + * The kernel migration driver must fully transition the device to the new state
> + * value before the operation returns to the user.
The above statement certainly doesn't preclude asynchronous
availability of data on the stream FD, but it does demand that the
device state transition itself is synchronous and can cannot be
shortcut. If the state transition itself exceeds migration SLAs, we're
in a pickle. Thanks,
Alex
Powered by blists - more mailing lists