Message-ID: <563bd363-d806-4ee5-bcfe-05725055598d@gmail.com>
Date: Tue, 12 Aug 2025 09:17:01 +0800
From: Ethan Zhao <etzhao1900@...il.com>
To: Dave Hansen <dave.hansen@...el.com>, Uladzislau Rezki <urezki@...il.com>,
Baolu Lu <baolu.lu@...ux.intel.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Kevin Tian <kevin.tian@...el.com>, Jann Horn <jannh@...gle.com>,
Vasant Hegde <vasant.hegde@....com>, Alistair Popple <apopple@...dia.com>,
Peter Zijlstra <peterz@...radead.org>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Andy Lutomirski <luto@...nel.org>, Yi Lai <yi1.lai@...el.com>,
iommu@...ts.linux.dev, security@...nel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v3 1/1] iommu/sva: Invalidate KVA range on kernel TLB
flush
On 8/11/2025 9:55 PM, Dave Hansen wrote:
> On 8/11/25 02:15, Uladzislau Rezki wrote:
>>> kernel_pte_work.list is a globally shared variable; it would force the
>>> producer pte_free_kernel() and the consumer kernel_pte_work_func() to
>>> operate in serialized timing. On a large system, I don't think you
>>> designed this deliberately 🙂
>>>
>> Sorry for jumping in.
>>
>> Agreed, unless this is never considered a hot path or something that can
>> be really contended. It looks like you could just use a per-CPU llist to
>> drain things.
>
> Remember, the code that has to run just before all this sent an IPI to
> every single CPU on the system to have them do a (on x86 at least)
> pretty expensive TLB flush.
>
It can easily be identified as a bottleneck by multi-CPU stress tests that
involve frequent process creation and destruction, similar to the workload
of a heavily loaded multi-process Apache web server.
Hot or cold path?
> If this is a hot path, we have bigger problems on our hands: the full
> TLB flush on every CPU.
Perhaps not "we": an IPI-driven TLB flush does not seem to be a mechanism
shared by all CPU architectures, at least not on ARM as far as I know.
>
> So, sure, there are a million ways to make this deferred freeing more
> scalable. But the code that's here is dirt simple and self contained. If
> someone has some ideas for something that's simpler and more scalable,
> then I'm totally open to it.
>
> But this is _not_ the place to add complexity to get scalability.
At least, please don't add a bottleneck; how complex would it be to avoid
one?
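
Something along these lines is what I have in mind, just a rough, untested
sketch rather than the posted patch. Only the kernel_pte_work_func() name
echoes this thread; the per-CPU structure, the helper names and the init
path are made up for illustration. A per-CPU llist keeps producers on
different CPUs from contending on a single list head:

#include <linux/init.h>
#include <linux/llist.h>
#include <linux/mm.h>
#include <linux/percpu.h>
#include <linux/workqueue.h>

struct pte_free_batch {
	struct llist_head head;		/* lock-free producer side */
	struct work_struct work;	/* drains this CPU's batch */
};

static DEFINE_PER_CPU(struct pte_free_batch, pte_free_batch);

/* Consumer: runs per CPU and touches only its own list. */
static void kernel_pte_work_func(struct work_struct *work)
{
	struct pte_free_batch *b =
		container_of(work, struct pte_free_batch, work);
	struct llist_node *node, *next;

	llist_for_each_safe(node, next, llist_del_all(&b->head))
		__free_page(virt_to_page(node));
}

/*
 * Producer: queue a no-longer-used kernel PTE page for deferred
 * freeing.  The llist node is stored in the page itself, which
 * assumes the page contents may be scribbled on once it is queued;
 * if they must stay intact until the flush, embed the node elsewhere.
 */
static void kernel_pte_defer_free(struct page *page)
{
	struct pte_free_batch *b = get_cpu_ptr(&pte_free_batch);

	if (llist_add(page_address(page), &b->head))
		schedule_work(&b->work);
	put_cpu_ptr(&pte_free_batch);
}

static int __init pte_free_batch_init(void)
{
	int cpu;

	for_each_possible_cpu(cpu)
		INIT_WORK(&per_cpu(pte_free_batch, cpu).work,
			  kernel_pte_work_func);
	return 0;
}
early_initcall(pte_free_batch_init);

Whether the extra per-CPU machinery is worth it on this path is exactly the
trade-off you raised, of course.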
Thanks,
Ethan