[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220405135036.4812c51e.alex.williamson@redhat.com>
Date: Tue, 5 Apr 2022 13:50:36 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Lu Baolu <baolu.lu@...ux.intel.com>,
Christian Benvenuti <benve@...co.com>,
Cornelia Huck <cohuck@...hat.com>,
David Woodhouse <dwmw2@...radead.org>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
iommu@...ts.linux-foundation.org, Jason Wang <jasowang@...hat.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-arm-msm@...r.kernel.org, linux-rdma@...r.kernel.org,
linux-s390@...r.kernel.org,
Matthew Rosato <mjrosato@...ux.ibm.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Nelson Escobar <neescoba@...co.com>, netdev@...r.kernel.org,
Rob Clark <robdclark@...il.com>,
Robin Murphy <robin.murphy@....com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
virtualization@...ts.linux-foundation.org,
Will Deacon <will@...nel.org>, Christoph Hellwig <hch@....de>,
"Tian, Kevin" <kevin.tian@...el.com>
Subject: Re: [PATCH 3/5] iommu: Introduce the domain op
enforce_cache_coherency()
On Tue, 5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg@...dia.com> wrote:
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
>
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
>
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
>
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
>
> Only VFIO will call this new API.
>
> Signed-off-by: Jason Gunthorpe <jgg@...dia.com>
> ---
> drivers/iommu/amd/iommu.c | 7 +++++++
> drivers/iommu/intel/iommu.c | 14 +++++++++++++-
> include/linux/intel-iommu.h | 1 +
> include/linux/iommu.h | 4 ++++
> 4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
> return 0;
> }
>
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> + /* IOMMU_PTE_FC is always set */
> + return true;
> +}
> +
> const struct iommu_ops amd_iommu_ops = {
> .capable = amd_iommu_capable,
> .domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
> .flush_iotlb_all = amd_iommu_flush_iotlb_all,
> .iotlb_sync = amd_iommu_iotlb_sync,
> .free = amd_iommu_domain_free,
> + .enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
> }
> };
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
> prot |= DMA_PTE_READ;
> if (iommu_prot & IOMMU_WRITE)
> prot |= DMA_PTE_WRITE;
> - if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> + if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> + dmar_domain->enforce_no_snoop)
> prot |= DMA_PTE_SNP;
>
> max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
> return phys;
> }
>
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> + if (!dmar_domain->iommu_snooping)
> + return false;
> + dmar_domain->enforce_no_snoop = true;
> + return true;
> +}
Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults? It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.
There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced". For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.
If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?
Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes? Thanks,
Alex
> +
> static bool intel_iommu_capable(enum iommu_cap cap)
> {
> if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
> .iotlb_sync = intel_iommu_tlb_sync,
> .iova_to_phys = intel_iommu_iova_to_phys,
> .free = intel_iommu_domain_free,
> + .enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
> }
> };
>
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
> u8 has_iotlb_device: 1;
> u8 iommu_coherency: 1; /* indicate coherency of iommu access */
> u8 iommu_snooping: 1; /* indicate snooping control feature */
> + u8 enforce_no_snoop : 1; /* Create IOPTEs with snoop control */
>
> struct list_head devices; /* all devices' list */
> struct iova_domain iovad; /* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
> * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> * queue
> * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + * including no-snoop TLPs on PCIe or other platform
> + * specific mechanisms.
> * @enable_nesting: Enable nesting
> * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
> * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
> phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
> dma_addr_t iova);
>
> + bool (*enforce_cache_coherency)(struct iommu_domain *domain);
> int (*enable_nesting)(struct iommu_domain *domain);
> int (*set_pgtable_quirks)(struct iommu_domain *domain,
> unsigned long quirks);
Powered by blists - more mailing lists