[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YXgV6ehhsSlydiEl@work-vm>
Date: Tue, 26 Oct 2021 15:51:21 +0100
From: "Dr. David Alan Gilbert" <dgilbert@...hat.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: Yishai Hadas <yishaih@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>, bhelgaas@...gle.com,
saeedm@...dia.com, linux-pci@...r.kernel.org, kvm@...r.kernel.org,
netdev@...r.kernel.org, kuba@...nel.org, leonro@...dia.com,
kwankhede@...dia.com, mgurtovoy@...dia.com, maorg@...dia.com,
Cornelia Huck <cohuck@...hat.com>
Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver
for mlx5 devices
* Alex Williamson (alex.williamson@...hat.com) wrote:
> On Mon, 25 Oct 2021 19:47:29 +0100
> "Dr. David Alan Gilbert" <dgilbert@...hat.com> wrote:
>
> > * Alex Williamson (alex.williamson@...hat.com) wrote:
> > > On Mon, 25 Oct 2021 17:34:01 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@...hat.com> wrote:
> > >
> > > > * Alex Williamson (alex.williamson@...hat.com) wrote:
> > > > > [Cc +dgilbert, +cohuck]
> > > > >
> > > > > On Wed, 20 Oct 2021 11:28:04 +0300
> > > > > Yishai Hadas <yishaih@...dia.com> wrote:
> > > > >
<snip>
> > > In a way. We're essentially recognizing that we cannot stop a single
> > > device in isolation of others that might participate in peer-to-peer
> > > DMA with that device, so we need to make a pass to quiesce each device
> > > before we can ask the device to fully stop. This new device state bit
> > > is meant to be that quiescent point, devices can accept incoming DMA
> > > but should cease to generate any. Once all device are quiesced then we
> > > can safely stop them.
> >
> > It may need some further refinement; for example in that quiesed state
> > do counters still tick? will a NIC still respond to packets that don't
> > get forwarded to the host?
>
> I'd think no, but I imagine it's largely device specific to what extent
> a device can be fully halted yet minimally handle incoming DMA.
That's what worries me; we're adding a new state here as we understand
more about trying to implement a device; but it seems that we need to
nail something down as to what the state means.
> > Note I still think you need a way to know when you have actually reached
> > these states; setting a bit in a register is asking nicely for a device
> > to go into a state - has it got there?
>
> It's more than asking nicely, we define the device_state bits as
> synchronous, the device needs to enter the state before returning from
> the write operation or return an errno.
I don't see how it can be synchronous in practice; can it really wait to
complete if it has to take many cycles to finish off an inflight DMA
before it transitions?
> > > > Now, you could be a *little* more sloppy; you could allow a device carry
> > > > on doing stuff purely with it's own internal state up until the point
> > > > it needs to serialise; but that would have to be strictly internal state
> > > > only - if it can change any other devices state (or issue an interrupt,
> > > > change RAM etc) then you get into ordering issues on the serialisation
> > > > of multiple devices.
> > >
> > > Yep, that's the proposal that doesn't require a uAPI change, we loosen
> > > the definition of stopped to mean the device can no longer generate DMA
> > > or interrupts and all internal processing outside or responding to
> > > incoming DMA should halt (essentially the same as the new quiescent
> > > state above). Once all devices are in this state, there should be no
> > > incoming DMA and we can safely collect per device migration data. If
> > > state changes occur beyond the point in time where userspace has
> > > initiated the collection of migration data, drivers have options for
> > > generating errors when userspace consumes that data.
> >
> > How do you know that last device has actually gone into that state?
>
> Each device cannot, the burden is on the user to make sure all devices
> are stopped before proceeding to read migration data.
Yeh this really ties to the previous question; if it's synchronous
you're OK.
> > Also be careful; it feels much more delicate where something might
> > accidentally start a transaction.
>
> This sounds like a discussion of theoretically broken drivers. Like
> the above device_state, drivers still have a synchronization point when
> the user reads the pending_bytes field to initiate retrieving the
> device state. If the implementation requires the device to be fully
> stopped to snapshot the device state to provide to the user, this is
> where that would happen. Thanks,
Yes, but I worry that some ways of definining it are harder to get right
in drivers, so less likely to be theoretical.
Dave
> Alex
>
--
Dr. David Alan Gilbert / dgilbert@...hat.com / Manchester, UK
Powered by blists - more mailing lists