lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 22 Feb 2022 16:53:00 -0700
From:   Alex Williamson <alex.williamson@...hat.com>
To:     Yishai Hadas <yishaih@...dia.com>
Cc:     <bhelgaas@...gle.com>, <jgg@...dia.com>, <saeedm@...dia.com>,
        <linux-pci@...r.kernel.org>, <kvm@...r.kernel.org>,
        <netdev@...r.kernel.org>, <kuba@...nel.org>, <leonro@...dia.com>,
        <kwankhede@...dia.com>, <mgurtovoy@...dia.com>, <maorg@...dia.com>,
        <cohuck@...hat.com>, <ashok.raj@...el.com>, <kevin.tian@...el.com>,
        <shameerali.kolothum.thodi@...wei.com>
Subject: Re: [PATCH V8 mlx5-next 09/15] vfio: Define device migration
 protocol v2

On Sun, 20 Feb 2022 11:57:10 +0200
Yishai Hadas <yishaih@...dia.com> wrote:

> From: Jason Gunthorpe <jgg@...dia.com>
> 
> Replace the existing region based migration protocol with an ioctl based
> protocol. The two protocols have the same general semantic behaviors, but
> the way the data is transported is changed.
> 
> This is the STOP_COPY portion of the new protocol, it defines the 5 states
> for basic stop and copy migration and the protocol to move the migration
> data in/out of the kernel.
> 
> Compared to the clarification of the v1 protocol Alex proposed:
> 
> https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen
> 
> This has a few deliberate functional differences:
> 
>  - ERROR arcs allow the device function to remain unchanged.
> 
>  - The protocol is not required to return to the original state on
>    transition failure. Instead userspace can execute an unwind back to
>    the original state, reset, or do something else without needing kernel
>    support. This simplifies the kernel design and should userspace choose
>    a policy like always reset, avoids doing useless work in the kernel
>    on error handling paths.
> 
>  - PRE_COPY is made optional, userspace must discover it before using it.
>    This reflects the fact that the majority of drivers we are aware of
>    right now will not implement PRE_COPY.
> 
>  - segmentation is not part of the data stream protocol, the receiver
>    does not have to reproduce the framing boundaries.

I'm not sure how to reconcile the statement above with:

	"The user must consider the migration data segments carried
	 over the FD to be opaque and non-fungible. During RESUMING, the
	 data segments must be written in the same order they came out
	 of the saving side FD."

This is subtly conflicting that it's not segmented, but segments must
be written in order.  We'll naturally have some segmentation due to
buffering in kernel and userspace, but I think referring to it as a
stream suggests that the user can cut and join segments arbitrarily so
long as byte order is preserved, right?  I suspect the commit log
comment is referring to the driver imposed segmentation and framing
relative to region offsets.

Maybe something like:

	"The user must consider the migration data stream carried over
	 the FD to be opaque and must preserve the byte order of the
	 stream.  The user is not required to preserve buffer
	 segmentation when writing the data stream during the RESUMING
	 operation."

This statement also gives me pause relative to Jason's comments
regarding async support:

> + * The kernel migration driver must fully transition the device to the new state
> + * value before the operation returns to the user.

The above statement certainly doesn't preclude asynchronous
availability of data on the stream FD, but it does demand that the
device state transition itself is synchronous and can cannot be
shortcut.  If the state transition itself exceeds migration SLAs, we're
in a pickle.  Thanks,

Alex

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ