Message-ID: <aJoEiajJwlWuXyax@pc636>
Date: Mon, 11 Aug 2025 16:56:09 +0200
From: Uladzislau Rezki <urezki@...il.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Uladzislau Rezki <urezki@...il.com>, Ethan Zhao <etzhao1900@...il.com>,
Baolu Lu <baolu.lu@...ux.intel.com>,
Jason Gunthorpe <jgg@...dia.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Kevin Tian <kevin.tian@...el.com>, Jann Horn <jannh@...gle.com>,
Vasant Hegde <vasant.hegde@....com>,
Alistair Popple <apopple@...dia.com>,
Peter Zijlstra <peterz@...radead.org>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Andy Lutomirski <luto@...nel.org>, Yi Lai <yi1.lai@...el.com>,
iommu@...ts.linux.dev, security@...nel.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH v3 1/1] iommu/sva: Invalidate KVA range on kernel TLB
flush
On Mon, Aug 11, 2025 at 06:55:52AM -0700, Dave Hansen wrote:
> On 8/11/25 02:15, Uladzislau Rezki wrote:
> >> kernel_pte_work.list is a global shared variable; it would make the
> >> producer pte_free_kernel() and the consumer kernel_pte_work_func()
> >> operate in serialized fashion. On a large system, I don't think you
> >> designed this deliberately 🙂
> >>
> > Sorry for jumping in.
> >
> > Agreed, unless this is never considered a hot path or something that
> > can be really contended. It looks like you could use just a per-cpu
> > llist to drain things.
>
> Remember, the code that has to run just before all this sent an IPI to
> every single CPU on the system to have them do a (on x86 at least)
> pretty expensive TLB flush.
>
> If this is a hot path, we have bigger problems on our hands: the full
> TLB flush on every CPU.
>
> So, sure, there are a million ways to make this deferred freeing more
> scalable. But the code that's here is dirt simple and self-contained. If
> someone has some ideas for something that's simpler and more scalable,
> then I'm totally open to it.
>
You could also look at removing the &kernel_pte_work.lock. Replace it
with llist_add() on the adding side and
llist_for_each_safe(n, t, llist_del_all(&list)) on the removing side,
so you do not need the guard(spinlock) stuff. Unless I am missing
something.
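
Something like the below, as an untested sketch. struct pte_free_item
and pte_free_kernel_defer() are made-up names for illustration; only
kernel_pte_work, pte_free_kernel() and kernel_pte_work_func() come from
the patch under discussion:

#include <linux/llist.h>
#include <linux/workqueue.h>
#include <linux/slab.h>
#include <linux/gfp.h>

/* Hypothetical wrapper so each deferred page carries an llist_node. */
struct pte_free_item {
	struct llist_node node;
	struct page *page;
};

static LLIST_HEAD(kernel_pte_free_list);

static void kernel_pte_work_func(struct work_struct *work);
static DECLARE_WORK(kernel_pte_work, kernel_pte_work_func);

/* Adding side (called from pte_free_kernel()): a lock-free push,
 * no spinlock needed. */
static void pte_free_kernel_defer(struct pte_free_item *item)
{
	llist_add(&item->node, &kernel_pte_free_list);
	schedule_work(&kernel_pte_work);
}

/*
 * Removing side: llist_del_all() detaches the whole list atomically,
 * so the walk below runs with no lock held.
 */
static void kernel_pte_work_func(struct work_struct *work)
{
	struct llist_node *n, *t;

	llist_for_each_safe(n, t, llist_del_all(&kernel_pte_free_list)) {
		struct pte_free_item *item =
			llist_entry(n, struct pte_free_item, node);

		__free_page(item->page);
		kfree(item);
	}
}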
>
> But this is _not_ the place to add complexity to get scalability.
>
OK.
--
Uladzislau Rezki