Message-ID: <BN9PR11MB527602684F6583FDAC82A2AE8C8AA@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Fri, 8 Dec 2023 03:42:57 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Alex Williamson <alex.williamson@...hat.com>, "Cao, Yahui"
<yahui.cao@...el.com>
CC: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "Liu, Lingyu" <lingyu.liu@...el.com>, "Chittim,
Madhu" <madhu.chittim@...el.com>, "Samudrala, Sridhar"
<sridhar.samudrala@...el.com>, "jgg@...dia.com" <jgg@...dia.com>,
"yishaih@...dia.com" <yishaih@...dia.com>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>, "brett.creeley@....com"
<brett.creeley@....com>, "davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>, "kuba@...nel.org"
<kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>
Subject: RE: [PATCH iwl-next v4 12/12] vfio/ice: Implement vfio_pci driver for
E800 devices
> From: Tian, Kevin
> Sent: Friday, December 8, 2023 11:42 AM
>
> > From: Alex Williamson <alex.williamson@...hat.com>
> > Sent: Friday, December 8, 2023 6:43 AM
> > > +
> > > + if (cur == VFIO_DEVICE_STATE_RUNNING &&
> > > + new == VFIO_DEVICE_STATE_RUNNING_P2P) {
> > > + ice_migration_suspend_dev(ice_vdev->pf, ice_vdev->vf_id);
> > > + return NULL;
> > > + }
> > > +
> > > + if (cur == VFIO_DEVICE_STATE_RUNNING_P2P &&
> > > + new == VFIO_DEVICE_STATE_STOP)
> > > + return NULL;
> >
> > This looks suspicious, are we actually able to freeze the internal
> > device state? It should happen here.
> >
> > * RUNNING_P2P -> STOP
> > * STOP_COPY -> STOP
> > * While in STOP the device must stop the operation of the device. The
> > * device must not generate interrupts, DMA, or any other change to
> > * external state. It must not change its internal state. When stopped
> > * the device and kernel migration driver must accept and respond to
> > * interaction to support external subsystems in the STOP state, for
> > * example PCI MSI-X and PCI config space. Failure by the user to
> > * restrict device access while in STOP must not result in error
> > * conditions outside the user context (ex. host system faults).
> > *
> > * The STOP_COPY arc will terminate a data transfer session.
> >
>
> It was discussed in v3 [1].
>
> This device only provides a way to drain/stop outgoing traffic (for
> RUNNING->RUNNING_P2P). It has no interface for stopping incoming
> requests.
>
> Jason explained that the RUNNING_P2P->STOP transition can be a 'nop' as
> long as there is a guarantee that the device state is frozen at this point.
>
> By definition the user should request this transition only after all devices
> are put in RUNNING_P2P. At that point no one is sending P2P requests to
> further affect the internal state of this device. So an explicit "stop
> responder" action is not strictly required, and a 'nop' can still meet
> the above definition.
[1] https://lore.kernel.org/intel-wired-lan/20231013140744.GT3952@nvidia.com/
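
To make the ordering argument concrete, here is a minimal user-space sketch in C. It is not the ice driver code: the mock_* names, the tx_drained flag and the mock_stop_all() helper are all hypothetical, and the states are stand-ins for the VFIO_DEVICE_STATE_* values quoted above. It only illustrates why the RUNNING_P2P->STOP arc can be a 'nop' once every device has already been moved to RUNNING_P2P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VFIO migration states named in the thread. */
enum mock_state { MOCK_RUNNING, MOCK_RUNNING_P2P, MOCK_STOP };

/* Hypothetical device model: tx_drained is set once outgoing traffic stops. */
struct mock_dev {
	enum mock_state state;
	bool tx_drained;
};

/*
 * Walk one arc of the state machine.
 * Returns 0 on success, -1 for an arc this sketch does not model.
 */
static int mock_step(struct mock_dev *dev, enum mock_state new_state)
{
	assert(dev != NULL);

	if (dev->state == MOCK_RUNNING && new_state == MOCK_RUNNING_P2P) {
		/* The only real work: drain/stop outgoing traffic. */
		dev->tx_drained = true;
		dev->state = new_state;
		return 0;
	}

	if (dev->state == MOCK_RUNNING_P2P && new_state == MOCK_STOP) {
		/*
		 * 'nop': assumes every peer is already in RUNNING_P2P, so
		 * nobody can send P2P requests that would change this
		 * device's internal state.
		 */
		dev->state = new_state;
		return 0;
	}

	return -1;
}

/*
 * User-side ordering assumed by the argument: first move every device to
 * RUNNING_P2P, and only then move each one to STOP.
 */
static int mock_stop_all(struct mock_dev *devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (mock_step(&devs[i], MOCK_RUNNING_P2P))
			return -1;

	for (i = 0; i < n; i++)
		if (mock_step(&devs[i], MOCK_STOP))
			return -1;

	return 0;
}
```

In this sketch the second loop does no device work at all. The frozen-state guarantee comes entirely from the first loop having completed for all devices, which is the user's responsibility as described above.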