linux-kernel - Re: [PATCH 1/1] iommu/vt-d: Fix incorrect cache invalidation for mm notification

Message-ID: <877cmdcewc.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date:   Mon, 20 Nov 2023 10:55:15 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     Alistair Popple <apopple@...dia.com>
Cc:     Lu Baolu <baolu.lu@...ux.intel.com>,
        Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        "Robin Murphy" <robin.murphy@....com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Kevin Tian <kevin.tian@...el.com>, <iommu@...ts.linux.dev>,
        <linux-kernel@...r.kernel.org>, <stable@...r.kernel.org>,
        Luo Yuzhang <yuzhang.luo@...el.com>,
        Tony Zhu <tony.zhu@...el.com>, Nadav Amit <namit@...are.com>
Subject: Re: [PATCH 1/1] iommu/vt-d: Fix incorrect cache invalidation for mm
 notification

Alistair Popple <apopple@...dia.com> writes:

> Lu Baolu <baolu.lu@...ux.intel.com> writes:
>
>> Commit 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when
>> invalidating TLBs") moved the secondary TLB invalidations into the TLB
>> invalidation functions to ensure that all secondary TLB invalidations
>> happen at the same time as the CPU invalidation and added a flush-all
>> type of secondary TLB invalidation for the batched mode, where a range
>> of [0, -1UL) is used to indicate that the range extends to the end of
>> the address space.
>>
>> However, using an end address of -1UL caused an overflow in the Intel
>> IOMMU driver, where the end address was rounded up to the next page.
>> As a result, both the IOTLB and device ATC were not invalidated correctly.
>
> Thanks for catching. This fix looks good so:
>
> Reviewed-by: Alistair Popple <apopple@...dia.com>
>
> However examining the fixes patch again I note that we are calling
> mmu_notifier_invalidate_range(mm, 0, -1UL) from
> arch_tlbbatch_add_pending() in arch/x86/include/asm/tlbflush.h.
>
> That seems suboptimal because we would be doing an invalidate all for
> every page unmap,

Yes.  This can be a performance regression for IOMMU TLB flushing.  For
the CPU, it's "flush smaller ranges with more IPIs" vs. "flush the whole
range with fewer IPIs", and in general the latter wins because of the
high overhead of IPIs.  But, IIUC, for the IOMMU TLB, it becomes "flush
smaller ranges" vs. "flush the whole range", and there the whole-range
flush is generally the worse choice.  It may be better to restore the
original behavior.  Can we just pass the size of the TLB flush in
set_tlb_ubc_flush_pending()->arch_tlbbatch_add_pending(), and flush the
IOMMU TLB for that range only?
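
To make the contrast concrete, here is a rough userspace model of the two
behaviors.  It is only a sketch: struct flush_log and all three functions
are hypothetical stand-ins invented for illustration, not kernel APIs, and
the secondary-TLB callback just records the range it was asked to flush.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical recorder for the last secondary (IOMMU) TLB flush request. */
struct flush_log {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
	unsigned int calls;	/* number of flush requests seen */
};

/* Stand-in for the secondary TLB invalidation callback. */
static void secondary_tlb_invalidate(struct flush_log *log,
				     uint64_t start, uint64_t end)
{
	log->start = start;
	log->end = end;
	log->calls++;
}

/* Behavior under discussion: every pending unmap asks for a flush-all. */
static void add_pending_flush_all(struct flush_log *log, uint64_t uaddr)
{
	(void)uaddr;	/* the address is ignored, everything is flushed */
	secondary_tlb_invalidate(log, 0, UINT64_MAX);
}

/* Suggested alternative: pass the size and flush only that range. */
static void add_pending_ranged(struct flush_log *log,
			       uint64_t uaddr, uint64_t size)
{
	assert(size > 0 && uaddr + size > uaddr);	/* no wraparound here */
	secondary_tlb_invalidate(log, uaddr, uaddr + size);
}
```

With a single 4 KiB page at 0x7000, the first variant requests
[0, UINT64_MAX) while the second requests only [0x7000, 0x8000).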

> and as of db6c1f6f236d ("mm/tlbbatch: introduce
> arch_flush_tlb_batched_pending()") arch_flush_tlb_batched_pending()
> calls flush_tlb_mm() anyway. So I think we can probably drop the
> explicit notifier call from arch_flush_tlb_batched_pending().

arch_flush_tlb_batched_pending() is used when we need to change the page
table (e.g., munmap()) in parallel with batched TLB flushing (e.g.,
try_to_unmap()).  The actual TLB flushing part for
set_tlb_ubc_flush_pending()->arch_tlbbatch_add_pending() is
try_to_unmap_flush()->arch_tlbbatch_flush().

> Will put together a patch for that.
>
>  - Alistair
>
>> Add a flush all helper function and call it when the invalidation range
>> is from 0 to -1UL, ensuring that the entire caches are invalidated
>> correctly.
>>

[snip]
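
For reference, the overflow described in the quoted commit message can be
reproduced with plain unsigned arithmetic.  This is a minimal userspace
sketch, assuming 64-bit addresses and 4 KiB pages; the helper names are
made up for illustration and this is not the driver code itself.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)

/* Round addr up to the next page boundary; wraps to 0 near the top. */
static uint64_t page_align_up(uint64_t addr)
{
	return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Page count a naive implementation would compute for [start, end). */
static uint64_t naive_nr_pages(uint64_t start, uint64_t end)
{
	assert(start <= end);
	return (page_align_up(end) - start) >> PAGE_SHIFT;
}

/* Hypothetical guard: treat [0, UINT64_MAX) as "flush everything". */
static int is_flush_all(uint64_t start, uint64_t end)
{
	return start == 0 && end == UINT64_MAX;
}
```

Here page_align_up(UINT64_MAX) wraps to 0, so naive_nr_pages(0, UINT64_MAX)
yields 0 pages and nothing would be invalidated, whereas checking
is_flush_all() first lets the caller take a dedicated flush-all path.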

--
Best Regards,
Huang, Ying