Message-ID: <20211105132404.GB2744544@nvidia.com>
Date: Fri, 5 Nov 2021 10:24:04 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@...wei.com>,
Cornelia Huck <cohuck@...hat.com>,
Yishai Hadas <yishaih@...dia.com>, bhelgaas@...gle.com,
saeedm@...dia.com, linux-pci@...r.kernel.org, kvm@...r.kernel.org,
netdev@...r.kernel.org, kuba@...nel.org, leonro@...dia.com,
kwankhede@...dia.com, mgurtovoy@...dia.com, maorg@...dia.com,
"Dr. David Alan Gilbert" <dgilbert@...hat.com>
Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver
for mlx5 devices
On Wed, Nov 03, 2021 at 12:04:11PM -0600, Alex Williamson wrote:
> We agreed that it's easier to add a feature than a restriction in a
> uAPI, so how do we resolve that some future device may require a new
> state in order to apply the SET_IRQS configuration?
I would say don't support those devices. If there is even a hint that
they could maybe exist then we should fix it now. Once the uapi is set
and documented we should expect device makers to consider it when
building their devices.
As for SET_IRQs, I have been looking at making documentation and I
don't like the way the documentation has to be written because of
this.
What I see as an understandable, clear, documentation is:
 - SAVING set - no device touches allowed beyond migration operations
   and reset via XX
   Must be set with !RUNNING
 - RESUMING set - same as SAVING
 - RUNNING cleared - limited device touches in this list: SET_IRQs, XX
   config, XX.
   Device may assume no touches outside the above. (ie no MMIO)
   Implies NDMA
 - NDMA set - full device touches
   Device may not issue DMA or interrupts (??)
   Device may not dirty pages
 - RUNNING set - full functionality
 * In no state may a device generate an error TLP, device
   hang/integrity failure or kernel integrity failure, no matter
   what userspace does.
   The device is permitted to corrupt the migration/VM or SEGV
   userspace if userspace doesn't follow the rules.
(we are trying to figure out what the XX's are right now, would
appreciate any help)
This is something I think we could expect a HW engineering team to
follow and implement in devices. It doesn't complicate things.
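To make the list above concrete, here is a rough C sketch of how the
rules could be encoded as checks on a state bitmask. All names here
(ST_*, touch_level, the three helpers) are made up for illustration
and are not taken from the vfio uAPI. NDMA in particular is only a
proposed bit, and treating RESUMING as "must be set with !RUNNING" is
an assumption read out of "same as SAVING".

```c
#include <stdbool.h>

/* Hypothetical bit names, for illustration only. NDMA is a proposal. */
#define ST_RUNNING  (1u << 0)
#define ST_SAVING   (1u << 1)
#define ST_RESUMING (1u << 2)
#define ST_NDMA     (1u << 3)

/* How much userspace may touch the device in a given state. */
enum touch_level {
	TOUCH_MIGRATION_ONLY,	/* migration operations and reset only */
	TOUCH_LIMITED,		/* short allow-list, e.g. SET_IRQs; no MMIO */
	TOUCH_FULL,		/* any device touch */
};

static enum touch_level allowed_touches(unsigned int state)
{
	/* SAVING or RESUMING set: nothing beyond migration operations */
	if (state & (ST_SAVING | ST_RESUMING))
		return TOUCH_MIGRATION_ONLY;
	/* RUNNING cleared: only the limited allow-list */
	if (!(state & ST_RUNNING))
		return TOUCH_LIMITED;
	/* RUNNING set, with or without NDMA: full device touches */
	return TOUCH_FULL;
}

/* RUNNING cleared implies NDMA, so DMA needs RUNNING set and NDMA clear. */
static bool device_may_dma(unsigned int state)
{
	return (state & ST_RUNNING) && !(state & ST_NDMA);
}

/*
 * SAVING "must be set with !RUNNING". Applying the same rule to
 * RESUMING is an assumption, not something the list states outright.
 */
static bool state_is_valid(unsigned int state)
{
	if ((state & (ST_SAVING | ST_RESUMING)) && (state & ST_RUNNING))
		return false;
	return true;
}
```

Under these assumptions, RUNNING|NDMA gives full touches with no
device-initiated DMA, and a state of 0 (RUNNING cleared) gives only
the limited allow-list with no DMA.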
Overall, at this moment, I would prioritize documentation clarity over
strict compatibility with qemu, because people have to follow this
documentation and make their devices long into the future. If the
documentation is convoluted for compatibility reasons HW people are
more likely to get it wrong. When HW people get it wrong they are more
likely to ask for "quirks" in the uAPI to fix their mistakes.
The pending_bytes P2P idea is also quite complicated to document as
now we have to describe an HW state not in terms of a NDMA control
bit, but in terms of a bunch of implicit operations in a protocol. Not
so nice.
So, here is what I propose. Let us work on some documentation and come
up with the sort of HW centric docs like above and we can then decide
if we want to make the qemu changes it will imply, or not. We'll
include the P2P stuff, as we see it, so it shows a whole picture.
I think that will help everyone participate fully in the discussion.
> If we're going to move forward with the existing uAPI, then we're going
> to need to start factoring compatibility into our discussions of
> missing states and protocols. For example, requiring that the device
> is "quiesced" when the _RUNNING bit is cleared and "frozen" when
> pending_bytes is read has certain compatibility advantages versus
> defining a new state bit.
Not entirely. To support P2P, going from RESUMING directly to RUNNING
is not possible. There must be an in-between state that all devices
reach before they go to RUNNING. It seems P2P cannot be bolted onto
the existing qemu flow with a kernel-only change?
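The ordering problem can be sketched as a two-phase resume over a set
of devices. This is only an illustration: the ST_* names and both
helpers are hypothetical, and using RUNNING|NDMA as the in-between
state is an assumption, not something settled in this thread. The
hazard being avoided is one device issuing DMA at a peer that is still
loading its state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit names, for illustration only. NDMA is a proposal. */
#define ST_RUNNING  (1u << 0)
#define ST_RESUMING (1u << 2)
#define ST_NDMA     (1u << 3)

/* True if some device could issue DMA while another is still RESUMING. */
static bool p2p_hazard(const unsigned int *states, size_t n)
{
	bool any_resuming = false, any_dma = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (states[i] & ST_RESUMING)
			any_resuming = true;
		if ((states[i] & ST_RUNNING) && !(states[i] & ST_NDMA))
			any_dma = true;
	}
	return any_resuming && any_dma;
}

/*
 * Two-phase resume: first move every device out of RESUMING into the
 * assumed in-between state (RUNNING|NDMA), and only then clear NDMA.
 * Returns false if the hazard is ever observed along the way.
 */
static bool resume_all(unsigned int *states, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		states[i] = ST_RUNNING | ST_NDMA;
		if (p2p_hazard(states, n))
			return false;
	}
	for (i = 0; i < n; i++) {
		states[i] = ST_RUNNING;
		if (p2p_hazard(states, n))
			return false;
	}
	return true;
}
```

Moving a single device straight from RESUMING to RUNNING while a peer
is still RESUMING trips p2p_hazard(), which is the case the in-between
state is meant to rule out.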
> clarifications were trying for within the existing uAPI rather than
> toss out new device states and protocols at every turn for the sake of
> API purity. The rate at which we're proposing new states and required
> transitions without a plan for the uAPI is not where I want to be for
> adding the driver that could lock us in to a supported uAPI. Thanks,
Well, to be fair, the other cases where I suggested new states were
when you asked about features we don't have at all today (like
post-copy). I think adding new states is a very reasonable way to
approach adding new features. As long as new features can be supported
with new states we have a forward compatibility story.
Thanks,
Jason