lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250710132234.GL1599700@nvidia.com>
Date: Thu, 10 Jul 2025 10:22:34 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Baolu Lu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>,
	Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
	Kevin Tian <kevin.tian@...el.com>, Jann Horn <jannh@...gle.com>,
	Vasant Hegde <vasant.hegde@....com>,
	Alistair Popple <apopple@...dia.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Uladzislau Rezki <urezki@...il.com>,
	Jean-Philippe Brucker <jean-philippe@...aro.org>,
	Andy Lutomirski <luto@...nel.org>,
	"Tested-by : Yi Lai" <yi1.lai@...el.com>, iommu@...ts.linux.dev,
	security@...nel.org, linux-kernel@...r.kernel.org,
	stable@...r.kernel.org
Subject: Re: [PATCH v2 1/1] iommu/sva: Invalidate KVA range on kernel TLB
 flush

On Thu, Jul 10, 2025 at 05:53:16AM -0700, Dave Hansen wrote:
> On 7/9/25 19:14, Baolu Lu wrote:
> >> If the problem is truly limited to freeing page tables, it needs to be
> >> commented appropriately.
> > 
> > Yeah, good comments. It should not be limited to freeing page tables;
> > freeing page tables is just a real case that we can see in the vmalloc/
> > vfree paths. Theoretically, whenever a kernel page table update is done
> > and the CPU TLB needs to be flushed, the secondary TLB (i.e., the caches
> > on the IOMMU) should be flushed accordingly. It's assumed that this
> > happens in flush_tlb_kernel_range().
> 
> Could you elaborate on this a bit further?
> 
> I thought that the IOMMU was only ever doing "user" permission walks and
> never "supervisor". That's why we had the whole discussion about whether
> the IOMMU would stop if it saw an upper-level entry with _PAGE_USER clear.

Yes

> The reason this matters is that leaf kernel page table entries always
> have _PAGE_USER clear. That means an IOMMU doing "user" permission walks
> should never be able to do anything with them. Even if an IOTLB entry
> was created* for a _PAGE_USER=0 PTE, the "user" permission walk will
> stop when it finds that entry.

Yes, if the IOTLB caches a supervisor entry it doesn't matter since if
the cache hits the S/U check will still fail and it will never get
used. It is just wasting IOTLB cache space. No need to invalidate
IOTLB.
 
> Why does this matter? We flush the CPU TLB in a bunch of different ways,
> _especially_ when it's being done for kernel mappings. For example,
> __flush_tlb_all() is a non-ranged kernel flush which has a completely
> parallel implementation with flush_tlb_kernel_range(). Call sites that
> use _it_ are unaffected by the patch here.
> 
> Basically, if we're only worried about vmalloc/vfree freeing page
> tables, then this patch is OK. If the problem is bigger than that, then
> we need a more comprehensive patch.

I think we are worried about any place that frees page tables.

>  * I'm not sure if the IOMMU will even create an IOTLB entry for
>    a supervisor-permission mapping while doing a user-permission walk.

It doesn't matter if it does or doesn't, it doesn't change the
argument that we don't have to invalidate on PTEs being changed.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ