Message-ID: <c8196481fdd338e9d066376b6e9bf1dfcd6ea462.camel@intel.com>
Date: Mon, 18 Aug 2025 17:01:07 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "kevin.brodsky@....com" <kevin.brodsky@....com>,
"linux-hardening@...r.kernel.org" <linux-hardening@...r.kernel.org>
CC: "x86@...nel.org" <x86@...nel.org>, "maz@...nel.org" <maz@...nel.org>,
"luto@...nel.org" <luto@...nel.org>, "mbland@...orola.com"
<mbland@...orola.com>, "willy@...radead.org" <willy@...radead.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"david@...hat.com" <david@...hat.com>, "rppt@...nel.org" <rppt@...nel.org>,
"joey.gouly@....com" <joey.gouly@....com>, "akpm@...ux-foundation.org"
<akpm@...ux-foundation.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "pierre.langlois@....com"
<pierre.langlois@....com>, "Weiny, Ira" <ira.weiny@...el.com>,
"vbabka@...e.cz" <vbabka@...e.cz>, "catalin.marinas@....com"
<catalin.marinas@....com>, "jeffxu@...omium.org" <jeffxu@...omium.org>,
"linus.walleij@...aro.org" <linus.walleij@...aro.org>,
"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>, "kees@...nel.org"
<kees@...nel.org>, "ryan.roberts@....com" <ryan.roberts@....com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "jannh@...gle.com"
<jannh@...gle.com>, "peterz@...radead.org" <peterz@...radead.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "will@...nel.org" <will@...nel.org>,
"qperret@...gle.com" <qperret@...gle.com>, "linux-mm@...ck.org"
<linux-mm@...ck.org>, "broonie@...nel.org" <broonie@...nel.org>
Subject: Re: [RFC PATCH v5 13/18] mm: Map page tables with privileged pkey
On Mon, 2025-08-18 at 18:02 +0200, Kevin Brodsky wrote:
> The benchmarking results (see cover letter) don't seem to point to a
> major performance hit from setting the pkey on arm64 (worth noting that
> the linear mapping is PTE-mapped on arm64 today so no splitting should
> occur when setting the pkey). The overhead may well be substantially
> higher on x86.
That surprises me. The batching seems to be about switching the pkey register,
not about converting the direct map. And with batching, the fork benchmark you
measured actually sped up slightly. Shouldn't fork involve a pile of page table
allocations, and so extra direct map work?
Is it possible that the mock implementation somehow skipped some of the
set_memory() work?
>
> I agree this is worth looking into, though. I will check the overhead
> added by set_memory_pkey() specifically (ignoring pkey register
> switches), and maybe try to allocate page tables with a dedicated
> kmem_cache instead, reusing this patch [1] from my other kpkeys series.
> A kmem_cache won't be as optimal as a dedicated allocator, but batching
> the page freeing may already improve things substantially.
I never actually got to the stage of benchmarking on real HW either, but I'd be
surprised if this approach had acceptable performance on x86. Linux has so many
optimizations around minimizing TLB flushes. Dunno. Maybe my arm knowledge is
lacking.
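To make the TLB flush concern concrete, here is a toy userspace model in
Python. It is not kernel code: DirectMapModel, set_pkey(), unbatched() and
batched() are hypothetical names, and the model assumes that every pkey change
on the direct map costs one TLB flush per call, however many pages the call
covers. It only counts calls and says nothing about real arm64 or x86 costs.

```python
# Toy userspace model, not kernel code. All names here are hypothetical.


class DirectMapModel:
    """Counts simulated direct-map pkey changes and TLB flushes."""

    def __init__(self):
        self.pte_updates = 0
        self.tlb_flushes = 0

    def set_pkey(self, npages):
        # Assumption: one flush per call, regardless of the page count.
        self.pte_updates += npages
        self.tlb_flushes += 1


def unbatched(dmap, nr_tables):
    """Tag each page table page when allocated, untag it when freed."""
    for _ in range(nr_tables):
        dmap.set_pkey(1)  # allocation: switch the page to the privileged pkey
    for _ in range(nr_tables):
        dmap.set_pkey(1)  # free: switch the page back to the default pkey


def batched(dmap, nr_tables, batch):
    """Tag and untag pages in groups of at most `batch` pages."""
    full, rest = divmod(nr_tables, batch)
    sizes = [batch] * full + ([rest] if rest else [])
    for n in sizes:
        dmap.set_pkey(n)  # tag one group with a single call
    for n in sizes:
        dmap.set_pkey(n)  # untag one group with a single call
```

With 100 page table pages and a batch size of 32, the unbatched path counts 200
flushes and the batched path counts 8, for the same 200 PTE updates. Under this
assumption, batching cuts the number of flushes but not the amount of direct map
conversion work.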