Message-ID: <20250919054007.472493-1-baolu.lu@linux.intel.com>
Date: Fri, 19 Sep 2025 13:39:58 +0800
From: Lu Baolu <baolu.lu@...ux.intel.com>
To: Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>,
Robin Murphy <robin.murphy@....com>,
Kevin Tian <kevin.tian@...el.com>,
Jason Gunthorpe <jgg@...dia.com>,
Jann Horn <jannh@...gle.com>,
Vasant Hegde <vasant.hegde@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...el.com>,
Alistair Popple <apopple@...dia.com>,
Peter Zijlstra <peterz@...radead.org>,
Uladzislau Rezki <urezki@...il.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Andy Lutomirski <luto@...nel.org>,
Yi Lai <yi1.lai@...el.com>
Cc: iommu@...ts.linux.dev,
security@...nel.org,
x86@...nel.org,
linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Lu Baolu <baolu.lu@...ux.intel.com>
Subject: [PATCH v5 0/8] Fix stale IOTLB entries for kernel address space
This proposes a fix for a security vulnerability related to IOMMU Shared
Virtual Addressing (SVA). In an SVA context, an IOMMU can cache kernel
page table entries. When a kernel page table page is freed and
reallocated for another purpose, the IOMMU might still hold stale,
incorrect entries. This can be exploited to cause a use-after-free or
write-after-free condition, potentially leading to privilege escalation
or data corruption.
This solution introduces a deferred freeing mechanism for kernel page
table pages, which provides a safe window to notify the IOMMU to
invalidate its caches before the page is reused.
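
The ordering described above can be sketched in user space as follows. This
is only an illustrative simulation, not the code from the series: the
struct and every sim_* name are hypothetical stand-ins, the list is a plain
singly linked list, and the locking and workqueue scheduling that the kernel
would need are left out. It shows only the key invariant: a queued page is
not released until the simulated IOTLB invalidation has run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel page-table page. */
struct sim_pt_page {
	struct sim_pt_page *next;
};

static struct sim_pt_page *sim_pending; /* pages queued for deferred free */
static int sim_invalidations;           /* simulated IOTLB flushes so far */
static int sim_freed;                   /* pages actually released so far */

/* Allocate a simulated page-table page (assumption: malloc succeeds). */
static struct sim_pt_page *sim_pt_alloc(void)
{
	struct sim_pt_page *p = malloc(sizeof(*p));

	assert(p != NULL);
	p->next = NULL;
	return p;
}

/* Stand-in for notifying the IOMMU to drop its cached entries. */
static void sim_iotlb_invalidate(void)
{
	sim_invalidations++;
}

/* Queue the page instead of freeing it right away. */
static void sim_pagetable_free_kernel(struct sim_pt_page *p)
{
	p->next = sim_pending;
	sim_pending = p;
}

/* Deferred worker: detach the list, invalidate once, then free. */
static void sim_free_worker(void)
{
	struct sim_pt_page *list = sim_pending;

	sim_pending = NULL;
	if (list == NULL)
		return;

	/* Invalidate before any page can be reused. */
	sim_iotlb_invalidate();

	while (list != NULL) {
		struct sim_pt_page *next = list->next;

		free(list);
		sim_freed++;
		list = next;
	}
}
```

Calling sim_pagetable_free_kernel() on two pages leaves both counters at
zero; a later sim_free_worker() call performs one invalidation and then
releases both pages, so a batch of frees shares a single flush.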
Change log:
v5:
- Renamed pagetable_free_async() to pagetable_free_kernel() to avoid
confusion.
- Removed list_del() when the list is on the stack, as it will be freed
when the function returns.
- Discussed a corner case related to memory unplug of memory that was
present as reserved memory at boot. Given that it's extremely rare
and cannot be triggered by unprivileged users, we decided to focus
our efforts on the common vfree() case and noted that corner case in
the commit message.
- Some cleanups.
v4:
- https://lore.kernel.org/linux-iommu/20250905055103.3821518-1-baolu.lu@linux.intel.com/
- Introduce a mechanism to defer the freeing of page-table pages for
KVA mappings. Call iommu_sva_invalidate_kva_range() in the deferred
work thread before freeing the pages.
v3:
- https://lore.kernel.org/linux-iommu/20250806052505.3113108-1-baolu.lu@linux.intel.com/
- iommu_sva_mms is an unbounded list; iterating it in an atomic context
could introduce significant latency issues. Schedule it in a kernel
thread and replace the spinlock with a mutex.
- Replace the static key with a normal bool; it can be brought back if
data shows the benefit.
- Invalidate KVA range in the flush_tlb_all() paths.
- All previous reviewed-bys are preserved. Please let me know if there
are any objections.
v2:
- https://lore.kernel.org/linux-iommu/20250709062800.651521-1-baolu.lu@linux.intel.com/
- Remove EXPORT_SYMBOL_GPL(iommu_sva_invalidate_kva_range);
- Replace the mutex with a spinlock to make the interface usable in
critical regions.
v1: https://lore.kernel.org/linux-iommu/20250704133056.4023816-1-baolu.lu@linux.intel.com/
Dave Hansen (6):
mm: Add a ptdesc flag to mark kernel page tables
mm: Actually mark kernel page table pages
x86/mm: Use 'ptdesc' when freeing PMD pages
mm: Introduce pure page table freeing function
mm: Introduce deferred freeing for kernel page tables
mm: Hook up Kconfig options for async page table freeing
Lu Baolu (2):
x86/mm: Use pagetable_free()
iommu/sva: Invalidate stale IOTLB entries for kernel address space
arch/x86/Kconfig | 1 +
arch/x86/mm/init_64.c | 2 +-
arch/x86/mm/pat/set_memory.c | 2 +-
arch/x86/mm/pgtable.c | 12 ++++-----
drivers/iommu/iommu-sva.c | 29 +++++++++++++++++++++-
include/asm-generic/pgalloc.h | 18 ++++++++++++++
include/linux/iommu.h | 4 +++
include/linux/mm.h | 24 +++++++++++++++---
include/linux/page-flags.h | 46 +++++++++++++++++++++++++++++++++++
mm/Kconfig | 3 +++
mm/pgtable-generic.c | 39 +++++++++++++++++++++++++++++
11 files changed, 168 insertions(+), 12 deletions(-)
--
2.43.0