Date:   Thu, 10 Feb 2022 13:59:35 -0500
From:   Matthew Rosato <mjrosato@...ux.ibm.com>
To:     Jason Gunthorpe <jgg@...dia.com>,
        Niklas Schnelle <schnelle@...ux.ibm.com>
Cc:     Alex Williamson <alex.williamson@...hat.com>,
        linux-s390@...r.kernel.org, cohuck@...hat.com,
        farman@...ux.ibm.com, pmorel@...ux.ibm.com,
        borntraeger@...ux.ibm.com, hca@...ux.ibm.com, gor@...ux.ibm.com,
        gerald.schaefer@...ux.ibm.com, agordeev@...ux.ibm.com,
        frankja@...ux.ibm.com, david@...hat.com, imbrenda@...ux.ibm.com,
        vneethv@...ux.ibm.com, oberpar@...ux.ibm.com, freude@...ux.ibm.com,
        thuth@...hat.com, pasic@...ux.ibm.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 24/30] vfio-pci/zdev: wire up group notifier

On 2/10/22 10:23 AM, Jason Gunthorpe wrote:
> On Thu, Feb 10, 2022 at 03:06:35PM +0100, Niklas Schnelle wrote:
> 
>>> How does the page pinning work?
>>
>> The pinning is done directly in the RPCIT interception handler pinning
>> both the IOMMU tables and the guest pages mapped for DMA.
> 
> And if pinning fails?

The RPCIT instruction goes back to the guest with an indication that 
informs it the operation failed, giving it the impetus to kick off a guest 
DMA refresh and clear up space (unpin).

>   
>>> Then the
>>> magic kernel code you describe can operate on its own domain without
>>> becoming confused with a normal map/unmap domain.
>>
>> This sounds like an interesting idea. Looking at
>> drivers/iommu/s390_iommu.c most of that is pretty trivial domain
>> handling. I wonder if we could share this by marking the existing
>> s390_iommu_domain type with kind of a "lent out to KVM" flag.
> 
> Lu has posted a series here:
> 
> https://lore.kernel.org/linux-iommu/20220208012559.1121729-1-baolu.lu@linux.intel.com
> 
> Which allows the iommu driver to create a domain with unique ops, so
> you'd just fork the entire thing, have your own struct
> s390_kvm_iommu_domain and related ops.
> 

OK, looking into this, thanks for the pointer...  Sounds to me like we 
then want to make the determination upfront and ensure the right 
iommu domain ops are registered for the device before creation, 
based upon the use case -- general userspace: s390_iommu_ops (existing); 
kvm: s390_kvm_iommu_domain (new).

> When the special creation flow is triggered you'd just create one of
> these with the proper ops already setup.
>
> We are imagining a special ioctl to create these things and each IOMMU
> HW driver can supply a unique implementation suited to their HW
> design.

But I haven't connected the dots on this part -- at the end of the day, 
for this 'special creation flow' I need the kvm, the starting point of 
the guest table, and the format before we let the new 
s390_kvm_iommu_domain start doing automatic map/unmap during RPCIT 
intercept.  This initial setup has to come from a special ioctl as you 
say, but where do you see it living?  I could certainly roll my own via 
a KVM ioctl or whatever, but it sounds like you're also referring to a 
general-purpose ioctl encompassing each of the different unique 
implementations, with this s390 kvm approach being one.
