linux-kernel - Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset randomisation


Message-ID: <20260120184541.0a463cd1@pumpkin>
Date: Tue, 20 Jan 2026 18:45:41 +0000
From: David Laight <david.laight.linux@...il.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Ryan Roberts <ryan.roberts@....com>, Kees Cook <kees@...nel.org>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Huacai Chen <chenhuacai@...nel.org>, Madhavan Srinivasan
 <maddy@...ux.ibm.com>, Michael Ellerman <mpe@...erman.id.au>, Paul Walmsley
 <pjw@...nel.org>, Palmer Dabbelt <palmer@...belt.com>, Albert Ou
 <aou@...s.berkeley.edu>, Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik
 <gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>, Thomas
 Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav
 Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, "Gustavo
 A. R. Silva" <gustavoars@...nel.org>, Arnd Bergmann <arnd@...db.de>, Mark
 Rutland <mark.rutland@....com>, "Jason A. Donenfeld" <Jason@...c4.com>, Ard
 Biesheuvel <ardb@...nel.org>, Jeremy Linton <jeremy.linton@....com>,
 linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 loongarch@...ts.linux.dev, linuxppc-dev@...ts.ozlabs.org,
 linux-riscv@...ts.infradead.org, linux-s390@...r.kernel.org,
 linux-hardening@...r.kernel.org
Subject: Re: [PATCH v4 0/3] Fix bugs and performance of kstack offset
 randomisation

On Tue, 20 Jan 2026 08:37:43 -0800
Dave Hansen <dave.hansen@...el.com> wrote:

> On 1/20/26 08:32, Ryan Roberts wrote:
> > I don't think this question was really addressed to me, but I'll give my opinion
> > anyway; I agree it's pretty binary - it will either work or it will explode.
> > I've tested on arm64 and x86_64 so I have high confidence that it works. If you
> > get it into -next ASAP it has 3 weeks to soak before the merge window opens
> > right? (Linus said he would do an -rc8 this cycle). That feels like enough time
> > to me. But it's your tree 😉  
> 
> First of all, thank you for testing it on x86! Having that one data
> point where it helped performance is super valuable.
> 
> I'm more worried that it's going to regress performance somewhere and
> then it's going to be a pain to back out. I'm not super worried about
> functional regressions.

Unlikely. On x86 'rdtsc' takes ~20 clocks on Intel CPUs and is even slower
on AMD (according to Agner).
(That is serialised against another rdtsc rather than other instructions.)
The four TAUSWORTHE() steps, by contrast, are independent, so they can
execute in parallel.
IIRC each is a memory read and 5 ALU instructions - not much at all.
The slow bit will be the cache miss on the per-cpu data.
You lose a clock at the end because gcc will compile a | b | c | d
as (((a | b) | c) | d), not ((a | b) | (c | d)).
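
A minimal userspace sketch of the point above, not the kernel code itself.
The struct and function names are made up for illustration, and the shift
and mask constants are recalled from lib/random32.c, so check them against
the tree. It also assumes the four words are combined with XOR; the
dependency-chain argument is the same for '|'. Each TAUSWORTHE() step reads
and writes only its own state word, so the four steps have no data
dependency on one another. The two combine functions differ only in
parenthesisation: the chained form needs three dependent operations, the
balanced form two levels.

```c
#include <assert.h>
#include <stdint.h>

/* Four-word generator state; field names are illustrative. */
struct taus_state {
	uint32_t s1, s2, s3, s4;
};

/* One component step: a mask, three shifts and two XORs on a single word. */
#define TAUSWORTHE(s, a, b, c, d) \
	((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Advance all four components; no step uses another step's result. */
static void taus_step(struct taus_state *st)
{
	st->s1 = TAUSWORTHE(st->s1,  6U, 13U, 4294967294U, 18U);
	st->s2 = TAUSWORTHE(st->s2,  2U, 27U, 4294967288U,  2U);
	st->s3 = TAUSWORTHE(st->s3, 13U, 21U, 4294967280U,  7U);
	st->s4 = TAUSWORTHE(st->s4,  3U, 12U, 4294967168U, 13U);
}

/* Left-to-right combine: three operations, each waiting on the previous. */
static uint32_t taus_next_chain(struct taus_state *st)
{
	taus_step(st);
	return ((st->s1 ^ st->s2) ^ st->s3) ^ st->s4;
}

/* Balanced combine: two independent XORs, then one XOR joining them. */
static uint32_t taus_next_tree(struct taus_state *st)
{
	taus_step(st);
	return (st->s1 ^ st->s2) ^ (st->s3 ^ st->s4);
}

/* Same state in, same value out, whichever grouping is used. */
static void taus_check_orders(void)
{
	struct taus_state a = { 12345U, 67890U, 13579U, 24680U };
	struct taus_state b = a;

	for (int i = 0; i < 1000; i++)
		assert(taus_next_chain(&a) == taus_next_tree(&b));
}
```

The two forms return identical values; the sketch does not show which
instruction sequence a particular compiler emits for either.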

I think someone reported the 'new' version being faster on x86;
that might be why.

	David