lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 11 Dec 2023 22:34:09 +0700
From:   "Suthikulpanit, Suravee" <suravee.suthikulpanit@....com>
To:     Jason Gunthorpe <jgg@...dia.com>, Yi Liu <yi.l.liu@...el.com>
Cc:     "Tian, Kevin" <kevin.tian@...el.com>,
        "Giani, Dhaval" <Dhaval.Giani@....com>,
        Vasant Hegde <vasant.hegde@....com>,
        "joro@...tes.org" <joro@...tes.org>,
        "alex.williamson@...hat.com" <alex.williamson@...hat.com>,
        "robin.murphy@....com" <robin.murphy@....com>,
        "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
        "cohuck@...hat.com" <cohuck@...hat.com>,
        "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "nicolinc@...dia.com" <nicolinc@...dia.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
        "chao.p.peng@...ux.intel.com" <chao.p.peng@...ux.intel.com>,
        "yi.y.sun@...ux.intel.com" <yi.y.sun@...ux.intel.com>,
        "peterx@...hat.com" <peterx@...hat.com>,
        "jasowang@...hat.com" <jasowang@...hat.com>,
        "shameerali.kolothum.thodi@...wei.com" 
        <shameerali.kolothum.thodi@...wei.com>,
        "lulu@...hat.com" <lulu@...hat.com>,
        "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
        "Duan, Zhenzhong" <zhenzhong.duan@...el.com>,
        "joao.m.martins@...cle.com" <joao.m.martins@...cle.com>,
        "Zeng, Xin" <xin.zeng@...el.com>,
        "Zhao, Yan Y" <yan.y.zhao@...el.com>
Subject: Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)



On 12/11/2023 8:05 PM, Jason Gunthorpe wrote:
> On Mon, Dec 11, 2023 at 08:36:46PM +0800, Yi Liu wrote:
>> On 2023/12/11 10:29, Tian, Kevin wrote:
>>>> From: Jason Gunthorpe <jgg@...dia.com>
>>>> Sent: Saturday, December 9, 2023 9:47 AM
>>>>
>>>> What is in a Nested domain:
>>>>    Intel: A single IO page table refereed to by a PASID entry
>>>>           Each vDomain-ID,PASID allocates a unique nesting domain
>>>>    AMD: A GCR3 table pointer
>>>>         Nesting domains are created for every unique GCR3 pointer.
>>>>         vDomain-ID can possibly refer to multiple Nesting domains :(
>>>>    ARM: A CD table pointer
>>>>         Nesting domains are created for every unique CD table top pointer.
>>>
>>> this AMD/ARM difference is not very clear to me.
>>>
>>> How could a vDomain-ID refer to multiple GCR3 pointers? Wouldn't it
>>> lead to cache tag conflict when a same PASID entry in multiple GCR3 tables
>>> points to different I/O page tables?
>>
>> Perhaps due to only one DomainID in the DTE table indexed by BDF? Actually,
>> the vDomainID will not be used to tag cache, the host DomainId would be
>> used instead. @Jason?
> 
> The DomainID comes from the DTE table which is indexed by the RID, and
> the DTE entry points to the GCR3 table. So the VM certainly can setup
> a DTE table with multiple entires having the same vDomainID but
> pointing to different GCR3's. So the VMM has to do *something* with
> this.
> 
> Most likely this is not a useful thing to do. However what should the
> VMM do when it sees this? Block a random DTE or push the duplication
> down to real HW would be my options. I'd probably try to do the latter
> just on the basis of better emulation.
> 
> Jason

For AMD, the hardware uses host DomainID (hDomainId) and PASID to tag 
the IOMMU TLB.

The VM can setup vDomainID independently from device (RID) and 
hDomainID. The vDomainId->hDomainId mapping would be managed by the host 
IOMMU driver (since this is also needed by the HW when enabling the 
HW-vIOMMU support a.k.a virtual function).

Currently, the AMD IOMMU driver allocates a DomainId per IOMMU group.
One issue with this is when we have nested translation where we could 
end up with multiple devices (RIDs) sharing same PASID and the same 
hDomainID.

For example:

   - Host view
     Device1 (RID 1) w/ hDomainId 1
     Device2 (RID 2) w/ hDomainId 1
   - Guest view
     Pass-through Device1 (vRID 3) w/ vDomainID A + PASID 0
     Pass-through Device2 (vRID 4) w/ vDomainID B + PASID 0

We should be able to workaround this by changing the way we assign 
hDomainId to be per-device for VFIO pass-through devices although 
sharing the same v1 (stage-2) page table. This would look like.

   - Host view
     Device1 (RID 1) w/ hDomainId 1
     Device2 (RID 2) w/ hDomainId 2
   - Guest view
     Pass-through Device1 (vRID 3) w/ vDomainID A + PASID 0
     Pass-through Device2 (vRID 4) w/ vDomainID B + PASID 0

This should avoid the IOMMU TLB conflict. However, the invalidation 
would need to be done for both DomainId 1 and 2 when updating the v1 
(stage-2) page table.

Thanks,
Suravee

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ