[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250710132234.GL1599700@nvidia.com>
Date: Thu, 10 Jul 2025 10:22:34 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Baolu Lu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Kevin Tian <kevin.tian@...el.com>, Jann Horn <jannh@...gle.com>,
Vasant Hegde <vasant.hegde@....com>,
Alistair Popple <apopple@...dia.com>,
Peter Zijlstra <peterz@...radead.org>,
Uladzislau Rezki <urezki@...il.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Andy Lutomirski <luto@...nel.org>,
"Tested-by : Yi Lai" <yi1.lai@...el.com>, iommu@...ts.linux.dev,
security@...nel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2 1/1] iommu/sva: Invalidate KVA range on kernel TLB
flush
On Thu, Jul 10, 2025 at 05:53:16AM -0700, Dave Hansen wrote:
> On 7/9/25 19:14, Baolu Lu wrote:
> >> If the problem is truly limited to freeing page tables, it needs to be
> >> commented appropriately.
> >
> > Yeah, good comments. It should not be limited to freeing page tables;
> > freeing page tables is just a real case that we can see in the vmalloc/
> > vfree paths. Theoretically, whenever a kernel page table update is done
> > and the CPU TLB needs to be flushed, the secondary TLB (i.e., the caches
> > on the IOMMU) should be flushed accordingly. It's assumed that this
> > happens in flush_tlb_kernel_range().
>
> Could you elaborate on this a bit further?
>
> I thought that the IOMMU was only ever doing "user" permission walks and
> never "supervisor". That's why we had the whole discussion about whether
> the IOMMU would stop if it saw an upper-level entry with _PAGE_USER clear.
Yes
> The reason this matters is that leaf kernel page table entries always
> have _PAGE_USER clear. That means an IOMMU doing "user" permission walks
> should never be able to do anything with them. Even if an IOTLB entry
> was created* for a _PAGE_USER=0 PTE, the "user" permission walk will
> stop when it finds that entry.
Yes, if the IOTLB caches a supervisor entry it doesn't matter since if
the cache hits the S/U check will still fail and it will never get
used. It is just wasting IOTLB cache space. No need to invalidate
IOTLB.
> Why does this matter? We flush the CPU TLB in a bunch of different ways,
> _especially_ when it's being done for kernel mappings. For example,
> __flush_tlb_all() is a non-ranged kernel flush which has a completely
> parallel implementation with flush_tlb_kernel_range(). Call sites that
> use _it_ are unaffected by the patch here.
>
> Basically, if we're only worried about vmalloc/vfree freeing page
> tables, then this patch is OK. If the problem is bigger than that, then
> we need a more comprehensive patch.
I think we are worried about any place that frees page tables.
> * I'm not sure if the IOMMU will even create an IOTLB entry for
> a supervisor-permission mapping while doing a user-permission walk.
It doesn't matter if it does or doesn't, it doesn't change the
argument that we don't have to invalidate on PTEs being changed.
Jason
Powered by blists - more mailing lists