Message-ID: <5962d403-80c4-0ac4-4f37-96b055a2b4d0@huawei.com>
Date: Thu, 15 Jul 2021 16:14:02 +0800
From: Shenming Lu <lushenming@...wei.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
CC: Jason Gunthorpe <jgg@...dia.com>,
"Alex Williamson (alex.williamson@...hat.com)"
<alex.williamson@...hat.com>,
"Jean-Philippe Brucker" <jean-philippe@...aro.org>,
David Gibson <david@...son.dropbear.id.au>,
Jason Wang <jasowang@...hat.com>,
"parav@...lanox.com" <parav@...lanox.com>,
"Enrico Weigelt, metux IT consult" <lkml@...ux.net>,
Paolo Bonzini <pbonzini@...hat.com>,
Joerg Roedel <joro@...tes.org>,
Eric Auger <eric.auger@...hat.com>,
Jonathan Corbet <corbet@....net>,
"Raj, Ashok" <ashok.raj@...el.com>,
"Liu, Yi L" <yi.l.liu@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
"Jiang, Dave" <dave.jiang@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
"Kirti Wankhede" <kwankhede@...dia.com>,
Robin Murphy <robin.murphy@....com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"David Woodhouse" <dwmw2@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
"Lu Baolu" <baolu.lu@...ux.intel.com>,
"wanghaibin.wang@...wei.com" <wanghaibin.wang@...wei.com>
Subject: Re: [RFC v2] /dev/iommu uAPI proposal
On 2021/7/15 14:49, Tian, Kevin wrote:
>> From: Shenming Lu <lushenming@...wei.com>
>> Sent: Thursday, July 15, 2021 2:29 PM
>>
>> On 2021/7/15 11:55, Tian, Kevin wrote:
>>>> From: Shenming Lu <lushenming@...wei.com>
>>>> Sent: Thursday, July 15, 2021 11:21 AM
>>>>
>>>> On 2021/7/9 15:48, Tian, Kevin wrote:
>>>>> 4.6. I/O page fault
>>>>> +++++++++++++++++++
>>>>>
>>>>> uAPI is TBD. Here is just the high-level flow from the host IOMMU
>>>>> driver to the guest IOMMU driver and back. This flow assumes that
>>>>> I/O page faults are reported via IOMMU interrupts. Some devices
>>>>> report faults in a device-specific way instead of going through the
>>>>> IOMMU; that usage is not covered here:
>>>>>
>>>>> - Host IOMMU driver receives an I/O page fault with raw fault_data {rid,
>>>>> pasid, addr};
>>>>>
>>>>> - Host IOMMU driver identifies the faulting I/O page table according to
>>>>> {rid, pasid} and calls the corresponding fault handler with an opaque
>>>>> object (registered by the handler) and raw fault_data (rid, pasid, addr);
>>>>>
>>>>> - IOASID fault handler identifies the corresponding ioasid and device
>>>>> cookie according to the opaque object, generates a user fault_data
>>>>> (ioasid, cookie, addr) in the fault region, and triggers eventfd to
>>>>> userspace;
>>>>>
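Just to make the quoted flow concrete for myself, here is a minimal sketch of
the translation step. All names in it (iopf_raw_data, iopf_user_data, iopf_ctx,
iopf_report) are made up for illustration and are not part of the proposed
uAPI; a real interface would presumably use a ring of fault records rather
than a single slot.

#include <stdint.h>
#include <unistd.h>

/* Raw fault data from the host IOMMU driver (illustrative layout). */
struct iopf_raw_data {
	uint32_t rid;	/* requester ID of the faulting device */
	uint32_t pasid;	/* PASID carried by the page request */
	uint64_t addr;	/* faulting I/O virtual address */
};

/* User-visible record written into the fault region (illustrative layout). */
struct iopf_user_data {
	uint32_t ioasid;	/* IOASID of the faulting I/O page table */
	uint64_t cookie;	/* device cookie registered by userspace */
	uint64_t addr;		/* faulting address, forwarded unchanged */
};

/* Opaque object registered together with the fault handler. */
struct iopf_ctx {
	uint32_t ioasid;
	uint64_t cookie;
	struct iopf_user_data *fault_region;	/* shared with userspace */
	int eventfd;				/* kicks the userspace handler */
};

/* The "IOASID fault handler" step: raw {rid, pasid, addr} in,
 * (ioasid, cookie, addr) out, then notify userspace via the eventfd. */
static void iopf_report(struct iopf_ctx *ctx, const struct iopf_raw_data *raw)
{
	uint64_t one = 1;
	ssize_t ret;

	ctx->fault_region->ioasid = ctx->ioasid;
	ctx->fault_region->cookie = ctx->cookie;
	ctx->fault_region->addr   = raw->addr;	/* rid/pasid already resolved */

	ret = write(ctx->eventfd, &one, sizeof(one));	/* wake up e.g. QEMU */
	(void)ret;	/* a real driver would handle the error */
}
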
>>>>
>>>> Hi, I have some doubts here:
>>>>
>>>> For mdev, it seems that the rid in the raw fault_data is the parent
>>>> device's, so in the vSVA scenario, how can we get to know the mdev
>>>> (cookie) from the rid and pasid?
>>>>
>>>> And from this point of view, would it be better to register the mdev
>>>> (iommu_register_device()) with the parent device info?
>>>>
>>>
>>> This is what is proposed in this RFC. A successful binding generates a new
>>> iommu_dev object for each vfio device. For mdev this object includes
>>> its parent device, the defPASID marking this mdev, and the cookie
>>> representing it in userspace. Later, this iommu_dev is what gets recorded
>>> in the attaching_data when the mdev is attached to an IOASID:
>>>
>>> struct iommu_attach_data *__iommu_device_attach(
>>> struct iommu_dev *dev, u32 ioasid, u32 pasid, int flags);
>>>
>>> Then when a fault is reported, the fault handler just needs to figure out
>>> the iommu_dev according to {rid, pasid} in the raw fault data.
>>>
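If I model this roughly (again just a sketch; the names iommu_dev_obj,
attach_record and attach_lookup are mine, not the RFC's), the attach-time
record and the fault-time lookup would be something like:

#include <stdint.h>
#include <stddef.h>

/* Per-device object created by a successful binding (made-up shape). */
struct iommu_dev_obj {
	uint32_t parent_rid;	/* physical RID behind this (m)dev */
	uint32_t def_pasid;	/* defPASID marking the mdev */
	uint64_t cookie;	/* userspace cookie from bind */
};

/* One record per __iommu_device_attach() call. */
struct attach_record {
	uint32_t rid;			/* RID the hardware will report */
	uint32_t pasid;			/* PASID routed by this attach */
	struct iommu_dev_obj *idev;	/* iommu_dev recorded at attach time */
	struct attach_record *next;
};

/* Fault time: resolve the iommu_dev from the raw {rid, pasid}. */
static struct iommu_dev_obj *
attach_lookup(struct attach_record *head, uint32_t rid, uint32_t pasid)
{
	struct attach_record *r;

	for (r = head; r; r = r->next)
		if (r->rid == rid && r->pasid == pasid)
			return r->idev;
	return NULL;
}
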
>>
>> Yeah, we have the defPASID that marks the mdev and refers to the default
>> I/O address space, but how about the non-default I/O address spaces?
>> Is there a case where two different mdevs (on the same parent device)
>> are used by the same process in the guest, and thus have the same pasid
>> route in the physical IOMMU? It seems that we can't figure out the mdev
>> from the rid and pasid in this case...
>>
>> Did I misunderstand something?... :-)
>>
>
> No. You are right on this case. I don't think there is a way to
> differentiate one mdev from the other if they come from the
> same parent and are attached by the same guest process. In this
> case the fault could be reported on either mdev (e.g. the first
> matching one) to get it fixed in the guest.
>
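To spell that corner case out in miniature (a standalone sketch with made-up
values): two mdevs on the same parent attached with the same guest PASID end
up with identical {rid, pasid} keys, so a first-match policy simply reports
the fault on the first one.

#include <stdio.h>
#include <stdint.h>

struct attach_rec { uint32_t rid, pasid; const char *mdev; };

int main(void)
{
	struct attach_rec recs[] = {
		{ 0x10, 5, "mdev A" },
		{ 0x10, 5, "mdev B" },	/* same parent RID, same guest PASID */
	};
	uint32_t fault_rid = 0x10, fault_pasid = 5;
	size_t i;

	for (i = 0; i < sizeof(recs) / sizeof(recs[0]); i++) {
		if (recs[i].rid == fault_rid && recs[i].pasid == fault_pasid) {
			printf("fault reported on %s\n", recs[i].mdev);
			break;	/* first matching mdev gets the fault */
		}
	}
	return 0;
}
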
OK. Thanks,
Shenming