[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aebbd9c7-a239-0f89-972b-a9059e8b218b@arm.com>
Date: Wed, 22 Dec 2021 20:26:34 +0000
From: Robin Murphy <robin.murphy@....com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Lu Baolu <baolu.lu@...ux.intel.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Joerg Roedel <joro@...tes.org>,
Alex Williamson <alex.williamson@...hat.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Christoph Hellwig <hch@...radead.org>,
Kevin Tian <kevin.tian@...el.com>,
Ashok Raj <ashok.raj@...el.com>, Will Deacon <will@...nel.org>,
Dan Williams <dan.j.williams@...el.com>, rafael@...nel.org,
Diana Craciun <diana.craciun@....nxp.com>,
Cornelia Huck <cohuck@...hat.com>,
Eric Auger <eric.auger@...hat.com>,
Liu Yi L <yi.l.liu@...el.com>,
Jacob jun Pan <jacob.jun.pan@...el.com>,
Chaitanya Kulkarni <kch@...dia.com>,
Stuart Yoder <stuyoder@...il.com>,
Laurentiu Tudor <laurentiu.tudor@....com>,
Thierry Reding <thierry.reding@...il.com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Jonathan Hunter <jonathanh@...dia.com>,
Li Yang <leoyang.li@....com>,
Dmitry Osipenko <digetx@...il.com>,
iommu@...ts.linux-foundation.org, linux-pci@...r.kernel.org,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 07/13] iommu: Add iommu_at[de]tach_device_shared() for
multi-device groups
On 21/12/2021 6:46 pm, Jason Gunthorpe wrote:
> On Tue, Dec 21, 2021 at 04:50:56PM +0000, Robin Murphy wrote:
>
>> this proposal is the worst of both worlds, in that drivers still have to be
>> just as aware of groups in order to know whether to call the _shared
>> interface or not, except it's now entirely implicit and non-obvious.
>
> Drivers are not aware of groups, where did you see that?
`git grep iommu_attach_group -- :^drivers/iommu :^include`
Did I really have to explain that?
The drivers other than vfio_iommu_type1, however, do have a complete
failure to handle, or even consider, any group that does not fit the
particular set of assumptions they are making, but at least they only
work in a context where that should not occur.
> Drivers have to indicate their intention, based entirely on their own
> internal design. If groups are present, or not is irrelevant to the
> driver.
>
> If the driver uses a single struct device (which is most) then it uses
> iommu_attach_device().
>
> If the driver uses multiple struct devices and intends to connect them
> all to the same domain then it uses the _shared variant. The only
> difference between the two is the _shared varient lacks some of the
> protections against driver abuse of the API.
You've lost me again; how are those intentions any different? Attaching
one device to a private domain is a literal subset of attaching more
than one device to a private domain. There is no "abuse" of any API
anywhere; the singleton group restriction exists as a protective measure
because iommu_attach_device() was already in use before groups were
really a thing, in contexts where groups happened to be singleton
already, but anyone adding *new* uses in contexts where that assumption
might *not* hold would be in trouble. Thus it enforces DMA ownership by
the most trivial and heavy-handed means of simply preventing it ever
becoming shared in the first place.
Yes, I'm using the term "DMA ownership" in a slightly different context
to the one in which you originally proposed it. Please step out of the
userspace-device-assignment-focused bubble for a moment and stay with me...
So then we have the iommu_attach_group() interface for new code (and
still nobody has got round to updating the old code to it yet), for
which the basic use-case is still fundamentally "I want to attach my
thing to my domain", but at least now forcing explicit awareness that
"my thing" could possibly be inextricably intertwined with more than
just the one device they expect, so potential callers should have a good
think about that. Unfortunately this leaves the matter of who "owns" the
group entirely in the hands of those callers, which as we've now
concluded is not great.
One of the main reasons for non-singleton groups to occur is due to ID
aliasing or lack of isolation well beyond the scope and control of
endpoint devices themselves, so it's not really fair to expect every
IOMMU-aware driver to also be aware of that, have any idea of how to
actually handle it, or especially try to negotiate with random other
drivers as to whether it might be OK to take control of their DMA
address space too. The whole point is that *every* domain attach really
*has* to be considered "shared" because in general drivers can't know
otherwise. Hence the easy, if crude, fix for the original API.
> Nothing uses the group interface except for VFIO and stuff inside
> drivers/iommu. VFIO has a uAPI tied to the group interface and it
> is stuck with it.
Self-contradiction is getting stronger, careful...
>> Otherwise just add the housekeeping stuff to iommu_{attach,detach}_group() -
>> there's no way we want *three* attach/detach interfaces all with different
>> semantics.
>
> I'm not sure why you think 3 APIs is bad thing. Threes APIs, with
> clearly intended purposes is a lot better than one giant API with a
> bunch of parameters that tries to do everything.
Because there's only one problem to solve! We have the original API
which does happen to safely enforce ownership, but in an implicit way
that doesn't scale; then we have the second API which got past the
topology constraint but unfortunately turns out to just be unsafe in a
slightly different way, and was supposed to replace the first one but
hasn't, and is a bit clunky to boot; now you're proposing a third one
which can correctly enforce safe ownership for any group topology, which
is simply combining the good bits of the first two. It makes no sense to
maintain two bad versions of a thing alongside one which works better.
I don't see why anything would be a giant API with a bunch of parameters
- depending on how you look at it, this new proposal is basically either
iommu_attach_device() with the ability to scale up to non-trivial groups
properly, or iommu_attach_group() with a potentially better interface
and actual safety. The former is still more prevalent (and the interface
argument compelling), so if we put the new implementation behind that,
with the one tweak of having it set DMA_OWNER_PRIVATE_DOMAIN
automatically, kill off iommu_attach_group() by converting its couple of
users, and not only have we solved the VFIO problem but we've also
finally updated all the legacy code for free! Of course you can have a
separate version for VFIO to attach with DMA_OWNER_PRIVATE_DOMAIN_USER
if you like, although I still fail to understand the necessity of the
distinction.
> In this case, it is not simple to 'add the housekeeping' to
> iommu_attach_group() in a way that is useful to both tegra and
> VFIO. What tegra wants is what the _shared API implements, and that
> logic should not be open coded in drivers.
>
> VFIO does not want exactly that, it has its own logic to deal directly
> with groups tied to its uAPI. Due to the uAPI it doesn't even have a
> struct device, unfortunately.
Nope. VFIO has its own logic to deal with groups because it's the only
thing that's ever actually tried dealing with groups correctly
(unsurprisingly, given that it's where they came from), and every other
private IOMMU domain user is just crippled or broken to some degree. All
that proves is that we really should be policing groups better in the
IOMMU core, per this series, because actually fixing all the other users
to properly validate their device's group would be a ridiculous mess.
What VFIO wants is (conceptually[1]) "attach this device to my domain,
provided it and any other devices in its group are managed by a driver I
approve of." Surprise surprise, that's what any other driver wants as
well! For iommu_attach_device() it was originally implicit, and is now
further enforced by the singleton group restriction. For Tegra/host1x
it's implicit in the complete obliviousness to the possibility of that
not being the case.
Of course VFIO has a struct device if it needs one; it's trivial to
resolve the member(s) of a group (and even more so once we can assume
that a group may only ever contain mutually-compatible devices in the
first place). How do you think vfio_bus_type() works?
VFIO will also need a struct device anyway, because once I get back from
my holiday in the new year I need to start working with Simon on
evolving the rest of the API away from bus->iommu_ops to dev->iommu so
we can finally support IOMMU drivers coexisting[2].
> The reason there are three APIs is because there are three different
> use-cases. It is not bad thing to have APIs designed for the use cases
> they serve.
Indeed I agree with that second point, I'm just increasingly baffled how
it's not clear to you that there is only one fundamental use-case here.
Perhaps I'm too familiar with the history to objectively see how unclear
the current state of things might be :/
>> It's worth taking a step back and realising that overall, this is really
>> just a more generalised and finer-grained extension of what 426a273834ea
>> already did for non-group-aware code, so it makes little sense *not* to
>> integrate it into the existing interfaces.
>
> This is taking 426a to it's logical conclusion and *removing* the
> group API from the drivers entirely. This is desirable because drivers
> cannot do anything sane with the group.
I am in complete agreement with that (to the point of also not liking
patch #6).
> The drivers have struct devices, and so we provide APIs that work in
> terms of struct devices to cover both driver use cases today, and do
> so more safely than what is already implemented.
I am in complete agreement with that (given "both" of the supposed 3
use-cases all being the same).
> Do not mix up VFIO with the driver interface, these are different
> things. It is better VFIO stay on its own and not complicate the
> driver world.
Nope, vfio_iommu_type1 is just a driver, calling the IOMMU API just like
any other driver. I like the little bit where it passes itself to
vfio_register_iommu_driver(), which I feel gets this across far more
poetically than I can manage.
Thanks,
Robin.
[1] Yes, due to the UAPI it actually starts with the whole group rather
than any particular device within it. Don't nitpick.
[2]
https://lore.kernel.org/linux-iommu/2021052710373173260118@rock-chips.com/
Powered by blists - more mailing lists