[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB5276C05DBC8C5E79154891B08C3B9@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Tue, 22 Feb 2022 01:43:13 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Yishai Hadas <yishaih@...dia.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"saeedm@...dia.com" <saeedm@...dia.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kuba@...nel.org" <kuba@...nel.org>,
"leonro@...dia.com" <leonro@...dia.com>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"mgurtovoy@...dia.com" <mgurtovoy@...dia.com>,
"maorg@...dia.com" <maorg@...dia.com>,
"Raj, Ashok" <ashok.raj@...el.com>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>
Subject: RE: [PATCH V7 mlx5-next 15/15] vfio: Extend the device migration
protocol with PRE_COPY
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Friday, February 18, 2022 10:06 PM
>
>
> > > and to be defined in future. While making the whole PRE_COPY feature
> > > optional eliminates the concern from mlx5, this is still a complicated arc
> > > to implement and seems prudent to leave it closed until a proper use
> case
> >
> > Can you shed some light on the complexity here?
>
> It is with the data_fd, once a driver enters STOP_COPY it should stuff
> its final state into the data_fd. If this is aborted back to PRE_COPY
> then the data_fd needs to return to streaming changes. Managing this
> transition is not trivial - it is something that has to be signaled to
> the receiver.
>
> There is also something of a race here where the data_fd can reach
> end-of-stream and then the user can do STOP_COPY->PRE_COPY and
> continue stuffing data. This makes the construction of the data stream
> framing "interesting" as there is no longer a possible in-band end of
> stream marker. See the other discussion about async operation why this
> is not ideal.
>
> Basically, it is behavior current qemu doesn't trigger that requires
> significant complexity and testing in any driver to support
> properly. No driver proposed
Make sense.
>
> > > @@ -959,6 +1007,8 @@ struct vfio_device_feature_mig_state {
> > > * above FSM arcs. As there are multiple paths through the FSM arcs the
> > > path
> > > * should be selected based on the following rules:
> > > * - Select the shortest path.
> > > + * - The path cannot have saving group states as interior arcs, only
> > > + * starting/end states.
> >
> > what about PRECOPY->PRECOPY_P2P->STOP_COPY? In this case
> > PRECOPY_P2P is used as interior arc.
>
> It isn't an interior arc because there are only two arcs :) But yes,
> it is bit unclear.
>
> > and if we disallow a non-saving-group state as interior arc when both
> > start and end states are saving-group states (e.g.
> > STOP_COPY->STOP->RUNNING_P2P->PRE_COPY_P2P as I asked in
> > the start) then it might be another rule to be specified...
>
> This isn't a shortest path.
it is the shortest path when STOP_COPY->PRE_COPY_P2P base arc is
not supported. I guess your earlier explanation about data_fd
should be the 3rd rule for why that combination arc is not allowed
in FSM.
>
> > > @@ -972,6 +1022,9 @@ struct vfio_device_feature_mig_state {
> > > * support them. The user can disocver if these states are supported by
> using
> > > * VFIO_DEVICE_FEATURE_MIGRATION. By using combination transitions
> the
> > > user can
> > > * avoid knowing about these optional states if the kernel driver supports
> > > them.
> > > + *
> > > + * Arcs touching PRE_COPY and PRE_COPY_P2P are removed if support
> for
> > > PRE_COPY
> > > + * is not present.
> >
> > why adding this sentence particularly for PRE_COPY? Isn't it already
> > explained by last paragraph for optional states?
>
> Well, I thought it was clarifying about how the optionality is
> constructed.
The last paragraph already says:
+ * The optional states cannot be used with SET_STATE if the device does not
+ * support them. The user can disocver if these states are supported by using
+ * VFIO_DEVICE_FEATURE_MIGRATION. By using combination transitions the user can
+ * avoid knowing about these optional states if the kernel driver supports them.
>
> > > + * Drivers should attempt to return estimates so that initial_bytes +
> > > + * dirty_bytes matches the amount of data an immediate transition to
> > > STOP_COPY
> > > + * will require to be streamed.
> >
> > I didn't understand this requirement. In an immediate transition to
> > STOP_COPY I expect the amount of data covers the entire device
> > state, i.e. initial_bytes. dirty_bytes are dynamic and iteratively returned
> > then why we need set some expectation on the sum of
> > initial+round1_dity+round2_dirty+...
>
> "will require to be streamed" means additional data from this point
> forward, not including anything already sent.
>
> It turns into the estimate of how long STOP_COPY will take.
>
I still didn't get the 'match' part. Why should the amount of data which
has already been sent match the additional data to be sent in STOP_COPY?
Thanks
Keivn
Powered by blists - more mailing lists