Message-ID: <692186fd-42b8-4054-ead2-f6c6b1bf5b2d@linux.intel.com>
Date: Wed, 17 Mar 2021 13:16:58 +0800
From: Lu Baolu <baolu.lu@...ux.intel.com>
To: "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)"
<longpeng2@...wei.com>, dwmw2@...radead.org, joro@...tes.org,
will@...nel.org, alex.williamson@...hat.com
Cc: baolu.lu@...ux.intel.com, iommu@...ts.linux-foundation.org,
LKML <linux-kernel@...r.kernel.org>,
"Gonglei (Arei)" <arei.gonglei@...wei.com>, chenjiashang@...wei.com
Subject: Re: A problem of Intel IOMMU hardware ?
Hi Longpeng,
On 3/17/21 11:16 AM, Longpeng (Mike, Cloud Infrastructure Service
Product Dept.) wrote:
> Hi guys,
>
> We found that the Intel IOMMU cache (i.e. the IOTLB) may work incorrectly in a
> special situation, which causes DMA failures or reads of wrong data.
>
> The reproducer (based on Alex's vfio test suite[1]) is attached; it can
> reproduce the problem with high probability (~50%).
>
> The machine we used is:
> processor : 47
> vendor_id : GenuineIntel
> cpu family : 6
> model : 85
> model name : Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
> stepping : 4
> microcode : 0x2000069
>
> And the iommu capability reported is:
> ver 1:0 cap 8d2078c106f0466 ecap f020df
> (caching mode = 0 , page-selective invalidation = 1)
>
> (The problem also occurs on 'Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz' and
> 'Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz')
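For reference, those two bits can be read directly off the printed cap
value. A minimal userspace sketch, assuming the same bit positions as the
kernel's cap_caching_mode() (bit 7) and cap_pgsel_inv() (bit 39) macros:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t cap = 0x8d2078c106f0466ULL;	/* cap value quoted above */

	/* caching mode: VT-d capability register bit 7 */
	printf("caching mode                = %u\n",
	       (unsigned int)((cap >> 7) & 1));
	/* page-selective invalidation: capability register bit 39 */
	printf("page-selective invalidation = %u\n",
	       (unsigned int)((cap >> 39) & 1));
	return 0;
}

This prints 0 and 1, matching the summary in parentheses above.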
>
> We ran the reproducer on Linux 4.18 and it works as follows:
>
> Step 1. Allocate 4G of *2M-hugetlb* memory (N.B. no problem with 4K-page mappings)
I don't understand what 2M-hugetlb means here exactly. The IOMMU hardware
supports both 2M and 1G super pages. The mapped physical memory is 4G.
Why couldn't it use 1G super pages?
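For my understanding: I assume Step 1 does something along the lines of
the sketch below, i.e. an anonymous MAP_HUGETLB mapping explicitly backed
by 2M pages. Please correct me if the reproducer allocates the memory
differently:

#include <sys/mman.h>

#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB	(21 << 26)	/* 2MB = 2^21, MAP_HUGE_SHIFT == 26 */
#endif

	/* 4G of anonymous memory backed by 2M hugetlb pages (sketch only) */
	void *buf = mmap(NULL, 4UL << 30, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_2MB,
			 -1, 0);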
> Step 2. DMA Map 4G memory
> Step 3.
> while (1) {
> {UNMAP, 0x0, 0xa0000}, ------------------------------------ (a)
> {UNMAP, 0xc0000, 0xbff40000},
Have these two ranges been mapped before? Does the IOMMU driver
complain when you try to unmap a range which has never been
mapped? The IOMMU driver implicitly assumes that mapping and
unmapping are paired.
> {MAP, 0x0, 0xc0000000}, --------------------------------- (b)
> use GDB to pause here, and then do a DMA read at IOVA=0,
IOVA 0 seems to be a special one. Have you verified with addresses
other than IOVA 0?
> sometimes the DMA succeeds (as expected),
> but sometimes the DMA fails (it reports not-present).
> {UNMAP, 0x0, 0xc0000000}, --------------------------------- (c)
> {MAP, 0x0, 0xa0000},
> {MAP, 0xc0000, 0xbff40000},
> }
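Just to make sure I read the sequence correctly: I assume each
{MAP/UNMAP, iova, size} entry above corresponds to a VFIO type1 ioctl on
the container fd, roughly like the sketch below (helper names are mine and
error handling is omitted; the real reproducer in the attachment may
differ):

#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Map size bytes of the buffer at vaddr to the given IOVA. */
static void dma_map(int container, void *vaddr, __u64 iova, __u64 size)
{
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (__u64)(unsigned long)vaddr,
		.iova  = iova,
		.size  = size,
	};

	ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
}

/* Unmap size bytes of IOVA space starting at iova. */
static void dma_unmap(int container, __u64 iova, __u64 size)
{
	struct vfio_iommu_type1_dma_unmap unmap = {
		.argsz = sizeof(unmap),
		.iova  = iova,
		.size  = size,
	};

	ioctl(container, VFIO_IOMMU_UNMAP_DMA, &unmap);
}

so that (a) is dma_unmap(container, 0x0, 0xa0000), (b) is
dma_map(container, buf, 0x0, 0xc0000000), and so on.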
>
> The DMA read operations should succeed between (b) and (c); at the very least
> they should NOT report not-present!
>
> After analyzing the problem, we think it may be caused by the Intel IOMMU IOTLB.
> It seems the DMA remapping hardware still uses the IOTLB or other caches from (a).
>
> When doing the DMA unmap at (a), the IOTLB will be flushed:
> intel_iommu_unmap
> domain_unmap
> iommu_flush_iotlb_psi
>
> When doing the DMA map at (b), there is no need to flush the IOTLB according to
> the capability of this IOMMU:
> intel_iommu_map
> domain_pfn_mapping
> domain_mapping
> __mapping_notify_one
> if (cap_caching_mode(iommu->cap)) // FALSE
> iommu_flush_iotlb_psi
That's true. IOTLB flushing is not needed when a PTE is changed
from non-present to present, unless caching mode is enabled.
> But the problem disappears if we FORCE a flush here, so we suspect the IOMMU
> hardware.
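To make sure I understand the experiment: I read "FORCE flush here" as
making __mapping_notify_one() invalidate the IOTLB even when caching mode
is off, roughly like the sketch below (based on my reading of the 4.18
code; the exact iommu_flush_iotlb_psi() arguments may differ, and this is
not a proposed fix). Please correct me if you forced the flush somewhere
else:

static void __mapping_notify_one(struct intel_iommu *iommu,
				 struct dmar_domain *domain,
				 unsigned long pfn, unsigned long pages)
{
	/*
	 * Upstream only flushes the IOTLB here under caching mode; the
	 * experiment flushes unconditionally to rule out stale entries.
	 */
	iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
}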
>
> Do you have any suggestion ?
Best regards,
baolu