Message-ID: <20260120184541.0a463cd1@pumpkin>
Date: Tue, 20 Jan 2026 18:45:41 +0000
From: David Laight <david.laight.linux@...il.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Ryan Roberts <ryan.roberts@....com>, Kees Cook <kees@...nel.org>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Huacai Chen <chenhuacai@...nel.org>, Madhavan Srinivasan
<maddy@...ux.ibm.com>, Michael Ellerman <mpe@...erman.id.au>, Paul Walmsley
<pjw@...nel.org>, Palmer Dabbelt <palmer@...belt.com>, Albert Ou
<aou@...s.berkeley.edu>, Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik
<gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>, Thomas
Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav
Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, "Gustavo
A. R. Silva" <gustavoars@...nel.org>, Arnd Bergmann <arnd@...db.de>, Mark
Rutland <mark.rutland@....com>, "Jason A. Donenfeld" <Jason@...c4.com>, Ard
Biesheuvel <ardb@...nel.org>, Jeremy Linton <jeremy.linton@....com>,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
loongarch@...ts.linux.dev, linuxppc-dev@...ts.ozlabs.org,
linux-riscv@...ts.infradead.org, linux-s390@...r.kernel.org,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset
randomisation
On Tue, 20 Jan 2026 08:37:43 -0800
Dave Hansen <dave.hansen@...el.com> wrote:
> On 1/20/26 08:32, Ryan Roberts wrote:
> > I don't think this question was really addressed to me, but I'll give my opinion
> > anyway; I agree it's pretty binary - it will either work or it will explode.
> > I've tested on arm64 and x86_64 so I have high confidence that it works. If you
> > get it into -next ASAP it has 3 weeks to soak before the merge window opens
> > right? (Linus said he would do an -rc8 this cycle). That feels like enough time
> > to me. But it's your tree 😉
>
> First of all, thank you for testing it on x86! Having that one data
> point where it helped performance is super valuable.
>
> I'm more worried that it's going to regress performance somewhere and
> then it's going to be a pain to back out. I'm not super worried about
> functional regressions.
Unlikely. On x86, 'rdtsc' takes ~20 clocks on Intel CPUs and is even slower
on AMD (according to Agner Fog).
(That figure is for one rdtsc serialised against another rdtsc, not against other instructions.)
The four TAUSWORTHE() updates, by contrast, are independent, so they can execute in parallel.
IIRC each is a memory read and 5 ALU instructions - not much at all.
The slow bit will be the cache miss on the per-cpu data.
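
To make the "independent updates" point concrete, here is a minimal sketch of a four-component Tausworthe step. It is modelled on the classic LFSR113 recurrence, not copied from the patch: the struct and function names are hypothetical, the shift and mask constants are the ones I recall for that generator and should be treated as assumptions, and the components are combined with XOR as that generator defines it.

```c
#include <stdint.h>

/* One component update: a few shifts, one AND and two XORs on a single word. */
#define TAUSWORTHE(s, a, b, c, d) \
	((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Hypothetical stand-in for the per-cpu generator state. */
struct taus_state_sketch {
	uint32_t s1, s2, s3, s4;
};

static uint32_t taus_next_sketch(struct taus_state_sketch *st)
{
	/*
	 * Each line reads and writes only its own word, so none of the
	 * four updates depends on the result of another one.
	 */
	st->s1 = TAUSWORTHE(st->s1,  6U, 13U, 4294967294U, 18U);
	st->s2 = TAUSWORTHE(st->s2,  2U, 27U, 4294967288U,  2U);
	st->s3 = TAUSWORTHE(st->s3, 13U, 21U, 4294967280U,  7U);
	st->s4 = TAUSWORTHE(st->s4,  3U, 12U, 4294967168U, 13U);

	/* Only this final combine ties the four results together. */
	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}
```

The only serial part is the final combine; everything before it is four short, unrelated dependency chains that an out-of-order core can overlap.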
You lose a clock at the end because gcc will compile a | b | c | d
as (((a | b) | c) | d), not ((a | b) | (c | d)).
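
The two groupings can be written out explicitly. This is only an illustrative sketch with hypothetical function names; whether a given compiler and optimisation level rebalances the chained form on its own is not something the sketch demonstrates.

```c
#include <stdint.h>

/* Left-to-right grouping: three operations, each waiting on the previous one. */
static uint32_t combine_chain(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
	return ((a | b) | c) | d;
}

/* Balanced grouping: (a | b) and (c | d) can issue together, then one final OR. */
static uint32_t combine_tree(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
	return (a | b) | (c | d);
}
```

Both return the same value; the difference is a dependency depth of three operations versus two, which is the one clock mentioned above. The same argument holds for any associative operator, such as `^`.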
I think someone reported the 'new' version being faster on x86;
that might be why.
David