Message-ID: <20211026085210.000dc19b.alex.williamson@redhat.com>
Date: Tue, 26 Oct 2021 08:52:10 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@...hat.com>,
Yishai Hadas <yishaih@...dia.com>, bhelgaas@...gle.com,
saeedm@...dia.com, linux-pci@...r.kernel.org, kvm@...r.kernel.org,
netdev@...r.kernel.org, kuba@...nel.org, leonro@...dia.com,
kwankhede@...dia.com, mgurtovoy@...dia.com, maorg@...dia.com,
Cornelia Huck <cohuck@...hat.com>
Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices
On Tue, 26 Oct 2021 09:13:53 -0300
Jason Gunthorpe <jgg@...dia.com> wrote:
> On Tue, Oct 26, 2021 at 09:40:34AM +0100, Dr. David Alan Gilbert wrote:
> > * Jason Gunthorpe (jgg@...dia.com) wrote:
> > > On Mon, Oct 25, 2021 at 07:47:29PM +0100, Dr. David Alan Gilbert wrote:
> > >
> > > > It may need some further refinement; for example in that quiesed state
> > > > do counters still tick? will a NIC still respond to packets that don't
> > > > get forwarded to the host?
> > >
> > > At least for the mlx5 NIC the two states are 'able to issue outbound
> > > DMA' and 'all internal memories and state are frozen and unchanging'.
> >
> > Yeh, so my point was just that if you're adding a new state to this
> > process, you need to define the details like that.
>
> We are not planning to propose any patches/uAPI specification for this
> problem until after the mlx5 vfio driver is merged..
I'm not super comfortable with that. If we're expecting to add a new
bit to define a quiescent state prior to clearing the running flag, and
this is an optional device feature that userspace migration needs to be
aware of, and it's really not clear to a hypervisor when p2p DMA might
be in use, then I think that leaves userspace in a pickle as to how and
when it would impose restrictions on assignment with multiple assigned
devices. It's likely that the majority of initial use cases wouldn't
need this feature, which would make it difficult to arbitrarily impose
later.
OTOH, if we define !_RUNNING as quiescent and userspace reading
pending_bytes as the point by which the user is responsible for
quiescing all devices and the device state becomes stable (or drivers
can generate errors during collection of device state if that proves
otherwise), then I think existing userspace doesn't care about this
issue. Thanks,
Alex
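
A rough way to picture the second option (!_RUNNING means quiescent, and
the pending_bytes read is the point by which all devices must be
quiesced) is the toy model below. It is only a sketch under assumptions:
it is plain Python, not kernel code and not the VFIO uAPI, and every name
in it (RUNNING, ModelDevice, StateNotStable, stop_and_collect, the
generation counter) is hypothetical and exists purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional

RUNNING = 1 << 0  # hypothetical stand-in for the _RUNNING device_state bit


class StateNotStable(Exception):
    """Model driver error: device state was not stable during collection."""


@dataclass
class ModelDevice:
    name: str
    state: int = RUNNING
    generation: int = 0              # bumped whenever internal state changes
    frozen_at: Optional[int] = None  # generation seen at the pending_bytes read

    def clear_running(self) -> None:
        # !_RUNNING is treated as quiescent: no new outbound DMA from here on.
        self.state &= ~RUNNING

    def p2p_write_to(self, peer: "ModelDevice") -> None:
        # Only a device that is still running can issue p2p DMA to a peer.
        if self.state & RUNNING:
            peer.generation += 1

    def read_pending_bytes(self) -> int:
        # The pending_bytes read is the point by which state must be stable.
        if self.state & RUNNING:
            raise StateNotStable(f"{self.name}: still running")
        self.frozen_at = self.generation
        return 8  # size of the model state blob returned by read_data()

    def read_data(self) -> bytes:
        if self.frozen_at is None:
            raise StateNotStable(f"{self.name}: pending_bytes was never read")
        # The driver reports an error if state moved after the stable point.
        if self.generation != self.frozen_at:
            raise StateNotStable(f"{self.name}: state changed during collection")
        return self.generation.to_bytes(8, "little")


def stop_and_collect(devices):
    # Step 1: quiesce every device before reading pending_bytes on any of them.
    for dev in devices:
        dev.clear_running()
    # Step 2: with all peers quiescent, each device's state can be collected.
    sizes = {dev.name: dev.read_pending_bytes() for dev in devices}
    return {dev.name: dev.read_data()[: sizes[dev.name]] for dev in devices}
```

In this model, a user that quiesces all devices first never sees an
error, while a user that reads pending_bytes on one device while a peer
is still running gets StateNotStable from read_data() if that peer wrote
to it in between. That is the "drivers can generate errors during
collection" fallback, expressed in the simplest form I could think of.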