Open Source and information security mailing list archives
 
Message-ID: <0302239b-e787-43e1-accd-e9904de56782@citrix.com>
Date: Wed, 19 Feb 2025 17:15:10 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Peter Zijlstra <peterz@...radead.org>, x86@...nel.org
Cc: linux-kernel@...r.kernel.org, alyssa.milburn@...el.com,
 scott.d.constable@...el.com, joao@...rdrivepizza.com, jpoimboe@...nel.org,
 jose.marchesi@...cle.com, hjl.tools@...il.com, ndesaulniers@...gle.com,
 samitolvanen@...gle.com, nathan@...nel.org, ojeda@...nel.org,
 kees@...nel.org, alexei.starovoitov@...il.com, mhiramat@...nel.org,
 jmill@....edu
Subject: Re: [PATCH v3 05/10] x86/ibt: Optimize FineIBT sequence

On 19/02/2025 4:21 pm, Peter Zijlstra wrote:
> Scott notes that non-taken branches are faster. Abuse overlapping code
> that traps instead of explicit UD2 instructions.
>
> And LEA does not modify flags and will have less dependencies.
>
> Suggested-by: Scott Constable <scott.d.constable@...el.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>

Can we get a bit more info on this "non-taken branches are faster"?

For modern cores which have branch prediction pre-decode, a branch
unknown to the predictor will behave as non-taken until the Jcc executes[1].

Something the size of Linux is surely going to exceed the branch predictor
capacity, so it's perhaps fair to say that there's a reasonable chance
to miss in the predictor.

But, for a branch known to the predictor, taken branches ought to be
bubble-less these days.  At least, this is what the marketing material
claims.

And, this doesn't account for branches which alias in the predictor and
end up with a wrong prediction.

~Andrew

[1] Yes, I know RWC has the reintroduced 0xee prefix with the decode
resteer.
