lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220210152305.GG4160@nvidia.com>
Date:   Thu, 10 Feb 2022 11:23:05 -0400
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Niklas Schnelle <schnelle@...ux.ibm.com>
Cc:     Matthew Rosato <mjrosato@...ux.ibm.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        linux-s390@...r.kernel.org, cohuck@...hat.com,
        farman@...ux.ibm.com, pmorel@...ux.ibm.com,
        borntraeger@...ux.ibm.com, hca@...ux.ibm.com, gor@...ux.ibm.com,
        gerald.schaefer@...ux.ibm.com, agordeev@...ux.ibm.com,
        frankja@...ux.ibm.com, david@...hat.com, imbrenda@...ux.ibm.com,
        vneethv@...ux.ibm.com, oberpar@...ux.ibm.com, freude@...ux.ibm.com,
        thuth@...hat.com, pasic@...ux.ibm.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 24/30] vfio-pci/zdev: wire up group notifier

On Thu, Feb 10, 2022 at 03:06:35PM +0100, Niklas Schnelle wrote:

> > How does the page pinning work?
> 
> The pinning is done directly in the RPCIT interception handler pinning
> both the IOMMU tables and the guest pages mapped for DMA.

And if pinning fails?
 
> > Then the
> > magic kernel code you describe can operate on its own domain without
> > becoming confused with a normal map/unmap domain.
> 
> This sounds like an interesting idea. Looking at
> drivers/iommu/s390_iommu.c most of that is pretty trivial domain
> handling. I wonder if we could share this by marking the existing
> s390_iommu_domain type with kind of a "lent out to KVM" flag. 

Lu has posted a series here:

https://lore.kernel.org/linux-iommu/20220208012559.1121729-1-baolu.lu@linux.intel.com

Which allows the iommu driver to create a domain with unique ops, so
you'd just fork the entire thing, have your own struct
s390_kvm_iommu_domain and related ops.

When the special creation flow is triggered you'd just create one of
these with the proper ops already setup.

We are imagining a special ioctl to create these things and each IOMMU
HW driver can supply a unique implementation suited to their HW
design.

> KVM RPCIT intercept and vice versa. I.e. while the domain is under
> control of KVM's RPCIT handling we make all IOMMU map/unmap fail.

It is not "under the control of" the domain would be created as linked
to kvm and would never, ever, be anything else.
 
> To me this more direct involvement of IOMMU and KVM on s390x is also a
> direct consequence of it using special instructions. Naturally those
> instructions can be intercepted or run under hardware accelerated
> virtualization.

Well, no, you've just created a kernel-side SW emulated nested
translation scheme. Other CPUs have talked about doing this too, but
nobody has attempted it.

You can make the same argument for any CPU's scheme, a trapped mmio
store is not fundamentally any different from a special instruction
that traps, other than how the information is transferred.

> Yes very good analogy. Has any of that nested IOMMU translations work
> been merged yet? 

No. We are making quiet progress, slowly though. I'll add your
interest to my list

> too.  Basically we would then execute RPCIT without leaving the
> hardware virtualization mode (SIE). We believe that that would
> require pinning all of guest memory though because HW can't really
> pin pages.

Right, this is what other iommu HW will have to do.

Jason 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ