linux-kernel - Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush


Message-ID: <aff42b8f-b757-4422-9ebe-741a4b894b6c@linux.intel.com>
Date: Tue, 9 Apr 2024 11:12:20 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Jacob Pan <jacob.jun.pan@...ux.intel.com>
Cc: baolu.lu@...ux.intel.com, iommu@...ts.linux.dev,
 Kevin Tian <kevin.tian@...el.com>, Yi Liu <yi.l.liu@...el.com>,
 Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
 Robin Murphy <robin.murphy@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb
 flush

On 4/9/24 5:03 AM, Jacob Pan wrote:
> Hi Lu,

Hi Jacob,

> 
> On Sun, 7 Apr 2024 22:42:32 +0800, Lu Baolu <baolu.lu@...ux.intel.com>
> wrote:
> 
>> The Caching Mode (CM) of the Intel IOMMU indicates whether the hardware
>> implementation caches not-present or erroneous translation-structure
>> entries, except for first-stage translation. The caching mode is
>> unrelated to the device TLB; therefore, there is no need to check
>> it before a device TLB invalidation operation.
>>
>> Before scalable mode was introduced, caching mode was treated as
>> an indication that the driver is running in a VM guest. This was just
>> a software contract, as a shadow page table was the only way to implement
>> a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
>> scalable mode was introduced, this no longer holds, as
>> caching mode is not relevant for first-stage translation. A virtual
>> IOMMU implementation is free to support first-stage translation only,
>> with caching mode cleared.
>>
>> Remove the caching mode check before device TLB invalidation to ensure
>> compatibility with the scalable mode use cases.
>>
> I agree with the changes below. What about this CM check:
> 
> /* Notification for newly created mappings */
> static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
> 				 unsigned long pfn, unsigned int pages)
> {
> 	/*
> 	 * It's a non-present to present mapping. Only flush if caching mode
> 	 * and second level.
> 	 */
> 	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
> 		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
> 
> We are still tying devTLB flush to CM=1, no?

__mapping_notify_one() is called in the path where some PTEs are changed
from non-present to present.

In this scenario:

- If CM is set and first-stage translation is not used, the IOTLB caches
   must be explicitly flushed.
- Else, if the hardware requires write-buffer flushing, do it.
- Otherwise, it's a no-op.
- devTLB invalidation is irrelevant to this path.

The code after the fix appears to do the right thing. devTLB is not
invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).
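
To make the cases above concrete, here is a minimal standalone sketch of the
decision. It is a model for illustration only, not the driver code: the
struct, the enum, and both function names are hypothetical, and the rwbf
field simply stands in for "hardware requires write buffer flushing".

```c
#include <stdbool.h>

/* Hypothetical model of the inputs; not the kernel's data structures. */
struct iommu_model {
	bool caching_mode;	/* CM capability bit is set */
	bool rwbf;		/* hardware requires write-buffer flushing */
	bool first_stage;	/* domain uses first-stage translation */
};

/* Hypothetical outcome of a non-present to present notification. */
enum notify_action {
	NOTIFY_FLUSH_IOTLB,
	NOTIFY_FLUSH_WRITE_BUFFER,
	NOTIFY_NOP,
};

/* Mirrors the three cases listed above, in the same order. */
static enum notify_action mapping_notify_action(const struct iommu_model *m)
{
	if (m->caching_mode && !m->first_stage)
		return NOTIFY_FLUSH_IOTLB;
	if (m->rwbf)
		return NOTIFY_FLUSH_WRITE_BUFFER;
	return NOTIFY_NOP;
}

/*
 * Assumed post-patch behaviour of the PSI flush with respect to devTLB:
 * it depends only on whether this is a map (map == 1) or an unmap,
 * never on caching mode.
 */
static bool psi_flush_touches_devtlb(bool map)
{
	return !map;
}
```

With CM set and second-stage translation, mapping_notify_action() reports an
IOTLB flush, and psi_flush_touches_devtlb(true) is false, so the devTLB is
left alone on the map path.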

Or did I overlook something?

> 
> If we are running in the guest with second level page table (shadowed), can
> we decide if devTLB flush is needed based on ATS enable just as the rest of
> the cases?

I think the ATS check should be consistent. It's generic no matter how
the IOMMU is implemented (in hardware or emulated in software).
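
As a rough sketch of what "consistent" means here (again a standalone model
with hypothetical names, not the driver's implementation): whether a device
needs a devTLB invalidation depends only on its ATS state, and caching mode
is deliberately ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not the kernel's device info structure. */
struct dev_model {
	bool ats_enabled;
};

/* The caching_mode argument is accepted only to show that it is unused. */
static bool devtlb_flush_needed(const struct dev_model *dev, bool caching_mode)
{
	(void)caching_mode;
	return dev->ats_enabled;
}

/* Count how many devices in a domain would receive a devTLB invalidation. */
static size_t count_devtlb_flushes(const struct dev_model *devs, size_t n,
				   bool caching_mode)
{
	size_t flushed = 0;

	for (size_t i = 0; i < n; i++)
		if (devtlb_flush_needed(&devs[i], caching_mode))
			flushed++;
	return flushed;
}
```

The count is the same whether the (possibly emulated) IOMMU reports CM=1 or
CM=0.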

Best regards,
baolu