Message-ID: <BN9PR11MB52764F098F7123D95233200D8C57A@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Tue, 15 Jul 2025 00:05:37 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Mike Rapoport <rppt@...nel.org>, Uladzislau Rezki <urezki@...il.com>
CC: David Laight <david.laight.linux@...il.com>, "Hansen, Dave"
<dave.hansen@...el.com>, "jacob.pan@...ux.microsoft.com"
<jacob.pan@...ux.microsoft.com>, Jason Gunthorpe <jgg@...dia.com>, Lu Baolu
<baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>, Will Deacon
<will@...nel.org>, Robin Murphy <robin.murphy@....com>, Jann Horn
<jannh@...gle.com>, Vasant Hegde <vasant.hegde@....com>, Alistair Popple
<apopple@...dia.com>, Peter Zijlstra <peterz@...radead.org>, "Jean-Philippe
Brucker" <jean-philippe@...aro.org>, Andy Lutomirski <luto@...nel.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>, "security@...nel.org"
<security@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
<stable@...r.kernel.org>
Subject: RE: [PATCH 1/1] iommu/sva: Invalidate KVA range on kernel TLB flush
> From: Mike Rapoport <rppt@...nel.org>
> Sent: Monday, July 14, 2025 10:50 PM
>
> On Mon, Jul 14, 2025 at 03:19:17PM +0200, Uladzislau Rezki wrote:
> > On Mon, Jul 14, 2025 at 01:39:20PM +0100, David Laight wrote:
> > > On Wed, 9 Jul 2025 11:22:34 -0700
> > > Dave Hansen <dave.hansen@...el.com> wrote:
> > >
> > > > On 7/9/25 11:15, Jacob Pan wrote:
> > > > > >>> Is there a use case where a SVA user can access kernel memory in the
> > > > > >>> first place?
> > > > >> No. It should be fully blocked.
> > > > >>
> > > > > Then I don't understand what is the "vulnerability condition" being
> > > > > addressed here. We are talking about KVA range here.
> > > >
> > > > SVA users can't access kernel memory, but they can compel walks of
> > > > kernel page tables, which the IOMMU caches. The trouble starts if the
> > > > > kernel happens to free that page table page and the IOMMU is using the
> > > > > cache after the page is freed.
> > > >
> > > > That was covered in the changelog, but I guess it could be made a bit
> > > > more succinct.
>
> But does this really mean that every flush_tlb_kernel_range() should flush
> the IOMMU page tables as well? AFAIU, set_memory flushes the TLB even when
> only bits in a pte change, and that seems like overkill...
>
> > > Is it worth just never freeing the page tables used for vmalloc() memory?
> > > After all they are likely to be reallocated again.
> > >
> > >
> > Do we free? Maybe on some arches? According to tests (AMD x86-64) I ran
> > once upon a time, the PTE entries were not freed after vfree(). It could be
> > expensive if we did it, due to a global "page_table_lock" lock.
> >
> > I see one place though, in vmap_try_huge_pud():
> >
> > 	if (pud_present(*pud) && !pud_free_pmd_page(pud, addr))
> > 		return 0;
> >
> > That is when a pud is replaced by a huge page.
>
> There's also a place that replaces a pmd by a smaller huge page, but other
> than that vmalloc does not free page tables.
>
Dave spotted two other places where page tables might be freed:
https://lore.kernel.org/all/62580eab-3e68-4132-981a-84167d130d9f@intel.com/
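
For anyone skimming the thread, the hazard Dave describes above can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions, not kernel code: every name in it (walk_cache, iommu_walk, iommu_invalidate, free_pt_page, cache_is_stale) is made up for illustration and does not correspond to any real IOMMU or mm API. It shows only the ordering problem: a cached walk that outlives the page-table page it refers to.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

#define NR_PT_PAGES 4

/* Whether each simulated page-table page is currently allocated. */
static bool pt_page_in_use[NR_PT_PAGES];

/* Stand-in for an IOMMU paging-structure cache holding one entry. */
struct walk_cache {
	bool valid;
	size_t pt_page;		/* index of the page-table page it cached */
};

/* A device access makes the IOMMU walk the tables and cache what it saw. */
static void iommu_walk(struct walk_cache *wc, size_t pt_page)
{
	wc->valid = true;
	wc->pt_page = pt_page;
}

/* Drop the cached walk. */
static void iommu_invalidate(struct walk_cache *wc)
{
	wc->valid = false;
}

/* Free a page-table page, optionally invalidating the cache first. */
static void free_pt_page(struct walk_cache *wc, size_t pt_page, bool invalidate)
{
	if (invalidate)
		iommu_invalidate(wc);
	pt_page_in_use[pt_page] = false;
}

/* True when the cache still refers to a page that has been freed. */
static bool cache_is_stale(const struct walk_cache *wc)
{
	return wc->valid && !pt_page_in_use[wc->pt_page];
}
```

In this model, freeing a page with invalidate == false leaves cache_is_stale() true, while freeing with invalidate == true does not. The model says nothing about which kernel paths actually free page tables, which is the open question in this thread.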