[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210604122830.GK1002214@nvidia.com>
Date: Fri, 4 Jun 2021 09:28:30 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
"Jiang, Dave" <dave.jiang@...el.com>,
"Raj, Ashok" <ashok.raj@...el.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Jonathan Corbet <corbet@....net>,
Robin Murphy <robin.murphy@....com>,
LKML <linux-kernel@...r.kernel.org>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
David Gibson <david@...son.dropbear.id.au>,
Kirti Wankhede <kwankhede@...dia.com>,
David Woodhouse <dwmw2@...radead.org>,
Jason Wang <jasowang@...hat.com>
Subject: Re: [RFC] /dev/ioasid uAPI proposal
On Fri, Jun 04, 2021 at 08:38:26AM +0000, Tian, Kevin wrote:
> > I think more to drive the replacement design; if we can't figure out
> > how to do something other than backwards compatibility trickery in the
> > kernel, it's probably going to bite us. Thanks,
>
> I'm a bit lost on the desired flow in your minds. Here is one flow based
> on my understanding of this discussion. Please comment whether it
> matches your thinking:
>
> 0) ioasid_fd is created and registered to KVM via KVM_ADD_IOASID_FD;
>
> 1) Qemu binds dev1 to ioasid_fd;
>
> 2) Qemu calls IOASID_GET_DEV_INFO for dev1. This will carry IOMMU_
> CACHE info i.e. whether underlying IOMMU can enforce snoop;
>
> 3) Qemu plans to create a gpa_ioasid, and attach dev1 to it. Here Qemu
> needs to figure out whether dev1 wants to do no-snoop. This might
> be based a fixed vendor/class list or specified by user;
>
> 4) gpa_ioasid = ioctl(ioasid_fd, IOASID_ALLOC); At this point a 'snoop'
> flag is specified to decide the page table format, which is supposed
> to match dev1;
> 5) Qemu attaches dev1 to gpa_ioasid via VFIO_ATTACH_IOASID. At this
> point, specify snoop/no-snoop again. If not supported by related
> iommu or different from what gpa_ioasid has, attach fails.
Why do we need to specify it again?
If the IOASID was created with the "block no-snoop" flag then it is
blocked in that IOASID, and that blocking sets the page table format.
The only question is if we can successfully attach a device to the
page table, or not.
The KVM interface is a bit tricky because Alex said this is partially
security, wbinvd is only enabled if someone has a FD to a device that
can support no-snoop.
Personally I think this got way too complicated, the KVM interface
should simply be
ioctl(KVM_ALLOW_INCOHERENT_DMA, ioasidfd, device_label)
ioctl(KVM_DISALLOW_INCOHERENT_DMA, ioasidfd, device_label)
and let qemu sort it out based on command flags, detection, whatever.
'ioasidfd, device_label' is the security proof that Alex asked
for. This needs to be some device in the ioasidfd that declares it is
capabale of no-snoop. Eg vfio_pci would always declare it is capable
of no-snoop.
No kernel call backs, no kernel auto-sync/etc. If qemu mismatches the
IOASID block no-snoop flag with the KVM_x_INCOHERENT_DMA state then it
is just a kernel-harmless uerspace bug.
Then user space can decide which of the various axis's it wants to
optimize for.
Jason
Powered by blists - more mailing lists