Message-ID: <aV56y9KcAS8mC7Uk@google.com>
Date: Wed, 7 Jan 2026 15:24:59 +0000
From: Pranjal Shrivastava <praan@...gle.com>
To: Mostafa Saleh <smostafa@...gle.com>
Cc: linux-mm@...ck.org, iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, corbet@....net, joro@...tes.org,
will@...nel.org, robin.murphy@....com, akpm@...ux-foundation.org,
vbabka@...e.cz, surenb@...gle.com, mhocko@...e.com,
jackmanb@...gle.com, hannes@...xchg.org, ziy@...dia.com,
david@...hat.com, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, rppt@...nel.org, xiaqinxin@...wei.com,
baolu.lu@...ux.intel.com, rdunlap@...radead.org
Subject: Re: [PATCH v5 0/4] iommu: Add IOMMU_DEBUG_PAGEALLOC sanitizer
On Tue, Jan 06, 2026 at 04:21:56PM +0000, Mostafa Saleh wrote:
> Overview
> --------
> This patch series introduces a new debugging feature,
> IOMMU_DEBUG_PAGEALLOC, designed to catch DMA use-after-free bugs
> and IOMMU mapping leaks from buggy drivers.
>
> The kernel has powerful sanitizers like KASAN and DEBUG_PAGEALLOC
> for catching CPU-side memory corruption. However, there is limited
> runtime sanitization for DMA mappings managed by the IOMMU. A buggy
> driver can free a page while it is still mapped for DMA, leading to
> memory corruption or use-after-free vulnerabilities when that page is
> reallocated and used for a different purpose.
>

Thanks for this series! This is really helpful!

> Inspired by DEBUG_PAGEALLOC, this sanitizer tracks IOMMU mappings on a
> per-page basis. It is not practical to unmap the pages on free, because
> that would require locking and walking all domains on every kernel free.
> Instead, we rely on page_ext to add an IOMMU-specific mapping reference
> count for each page.
> On each page allocated/freed by the kernel, we simply check the count
> and WARN if it is not zero, dumping page owner information if enabled.
>
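
The mechanism described above can be sketched in user space as follows.
This is only a model of the idea, not the code from the patches: the
array stands in for the per-page data that page_ext would carry, and all
names (iommu_ref, model_iommu_map, model_iommu_unmap,
model_page_still_mapped) are hypothetical.

```c
/* User-space model of a per-page IOMMU mapping count (hypothetical names). */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

/* Stand-in for the per-page counter that page_ext would hold in the kernel. */
static atomic_uint iommu_ref[NR_PAGES];

/* A page was mapped into some IOMMU domain: take a reference. */
static void model_iommu_map(size_t pfn)
{
	assert(pfn < NR_PAGES);
	atomic_fetch_add_explicit(&iommu_ref[pfn], 1, memory_order_relaxed);
}

/* A page was unmapped: drop the reference and catch underflow. */
static void model_iommu_unmap(size_t pfn)
{
	unsigned int old;

	assert(pfn < NR_PAGES);
	old = atomic_fetch_sub_explicit(&iommu_ref[pfn], 1, memory_order_relaxed);
	assert(old > 0);
}

/*
 * Check done when the kernel allocates or frees a page. A true result is
 * the case where the sanitizer would WARN: the page is still mapped.
 */
static bool model_page_still_mapped(size_t pfn)
{
	assert(pfn < NR_PAGES);
	return atomic_load_explicit(&iommu_ref[pfn], memory_order_relaxed) != 0;
}
```

Freeing a page between model_iommu_map() and model_iommu_unmap() is the
buggy-driver case: model_page_still_mapped() returns true there, and
false once the mapping has been torn down.
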
> Concurrency
> -----------
> By design this check is racy: one caller can map pages just after
> the check, which can lead to false negatives.
> In my opinion this is acceptable for sanitizers (for example, KCSAN has
> that property).
> Otherwise we would have to implement locks in iommu_map/unmap for all
> domains, which is not favourable even for a debug feature.
> The sanitizer only guarantees that the refcount itself doesn’t get
> corrupted, by using atomics. There are no false positives.
>
> CPU vs IOMMU Page Size
> ----------------------
> IOMMUs can use page sizes that differ from the CPU page size, and these
> can be non-homogeneous: not all IOMMUs necessarily use the same page size.
>
> To solve this, the refcount is always incremented and decremented in
> units of the smallest page size supported by the IOMMU domain. This
> ensures the accounting remains consistent regardless of the size of
> the map or unmap operation, otherwise double counting can happen.
>
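
The accounting rule above can be illustrated with a small user-space
sketch. It is an assumption-laden model rather than the patch code: it
assumes 16K CPU pages and a 4K smallest IOMMU page size, and the names
(ref, account_range) are hypothetical. The point it shows is that
counting once per smallest-page-size chunk keeps the totals balanced even
when the map and unmap calls use different sizes.

```c
/* Model of refcounting in units of the smallest IOMMU page size. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPU_PAGE_SHIFT 14	/* assumed 16K CPU pages */
#define NR_CPU_PAGES   256

/* One counter per CPU page (plain integers; atomicity is not modelled). */
static unsigned int ref[NR_CPU_PAGES];

/*
 * Walk [phys, phys + size) in steps of min_pgsize and adjust the counter
 * of the CPU page that contains each step. The step is always
 * min_pgsize, regardless of how large the map or unmap call was.
 */
static void account_range(uint64_t phys, uint64_t size,
			  uint64_t min_pgsize, bool map)
{
	uint64_t off;

	assert(min_pgsize != 0 && size % min_pgsize == 0);

	for (off = 0; off < size; off += min_pgsize) {
		size_t pfn = (size_t)((phys + off) >> CPU_PAGE_SHIFT);

		assert(pfn < NR_CPU_PAGES);
		if (map) {
			ref[pfn]++;
		} else {
			assert(ref[pfn] > 0);	/* unbalanced unmap */
			ref[pfn]--;
		}
	}
}
```

With these assumptions, mapping 64K in one call and unmapping it in two
32K calls leaves every counter at zero, because both directions count the
same 4K chunks.
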
> Testing & Performance
> ---------------------
> This was tested on Morello with Arm64 + SMMUv3.
> Did some testing on Lenovo IdeaCentre X Gen 10 Snapdragon.
> Did some testing on Qemu, including different SMMUv3/CPU page sizes (arm64).
>
> I also ran dma_map_benchmark on Morello:
>
> echo dma_map_benchmark > /sys/bus/pci/devices/0000\:06\:00.0/driver_override
> echo 0000:06:00.0 > /sys/bus/pci/devices/0000\:06\:00.0/driver/unbind
> echo 0000:06:00.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
> ./dma_map_benchmark -t $threads -g $nr_pages
>
> CONFIG refers to "CONFIG_IOMMU_DEBUG_PAGEALLOC"
> cmdline refers to "iommu.debug_pagealloc"
> Numbers are (map latency)/(unmap latency), lower is better.
>
>                   CONFIG=n    CONFIG=y    CONFIG=y
>                               cmdline=0   cmdline=1
> 4K - 1 thread     0.1/0.6     0.1/0.6     0.1/0.7
> 4K - 4 threads    0.1/1.1     0.1/1.0     0.2/1.1
> 1M - 1 thread     0.8/21.2    0.7/21.2    5.4/42.3
> 1M - 4 threads    1.1/45.9    1.1/46.0    5.9/45.1
>

Just curious: have we also measured the latency for larger mappings,
e.g. a 1G mapping backed by `n` 4K mappings?

> Changes in v5:
> v4: https://lore.kernel.org/all/20251211125928.3258905-1-smostafa@google.com/
> - Fix typo in comment
> - Collect Baolu R-bs
>
> Main changes in v4:
> v3: https://lore.kernel.org/all/20251124200811.2942432-1-smostafa@google.com/
> - Update the kernel parameter format in docs based on Randy feedback
> - Update commit subjects
> - Add IOMMU only functions in iommu-priv.h based on Baolu feedback
>
> Main changes in v3: (Most of them addressing Will comments)
> v2: https://lore.kernel.org/linux-iommu/20251106163953.1971067-1-smostafa@google.com/
> - Reword the Kconfig help
> - Use unmap_begin/end instead of unmap/remap
> - Use relaxed accessors when refcounting
> - Fix a bug with checking the returned address from iova_to_phys
> - Add more hardening checks (overflow)
> - Add more debug info on assertions (dump_page_owner())
> - Handle cases where unmap returns a larger size, as the core code seems
> to tolerate that.
> - Drop Tested-by tags from Qinxin as the code logic changed
>
> Main changes in v2:
> v1: https://lore.kernel.org/linux-iommu/20251003173229.1533640-1-smostafa@google.com/
> - Address Jörg comments about #ifdefs and static keys
> - Reword the Kconfig help
> - Drop RFC
> - Collect t-b from Qinxin
> - Minor cleanups
>
> Mostafa Saleh (4):
> iommu: Add page_ext for IOMMU_DEBUG_PAGEALLOC
> iommu: Add calls for IOMMU_DEBUG_PAGEALLOC
> iommu: debug-pagealloc: Track IOMMU pages
> iommu: debug-pagealloc: Check mapped/unmapped kernel memory
>
> .../admin-guide/kernel-parameters.txt | 9 +
> drivers/iommu/Kconfig | 19 ++
> drivers/iommu/Makefile | 1 +
> drivers/iommu/iommu-debug-pagealloc.c | 174 ++++++++++++++++++
> drivers/iommu/iommu-priv.h | 58 ++++++
> drivers/iommu/iommu.c | 11 +-
> include/linux/iommu-debug-pagealloc.h | 32 ++++
> include/linux/mm.h | 5 +
> mm/page_ext.c | 4 +
> 9 files changed, 311 insertions(+), 2 deletions(-)
> create mode 100644 drivers/iommu/iommu-debug-pagealloc.c
> create mode 100644 include/linux/iommu-debug-pagealloc.h
>
> --
> 2.52.0.351.gbe84eed79e-goog
>
>