Message-ID: <f0b54521-26cf-ed38-d805-3a8eef3b3103@citrix.com>
Date: Wed, 8 Feb 2023 19:52:04 +0000
From: Andrew.Cooper3@...rix.com
To: Peter Zijlstra <peterz@...radead.org>, x86@...nel.org
Cc: linux-kernel@...r.kernel.org, mhiramat@...nel.org,
kirill.shutemov@...ux.intel.com, jpoimboe@...hat.com
Subject: Re: [PATCH v3 3/4] x86/alternative: Rewrite optimize_nops() some
On 08/02/2023 5:10 pm, Peter Zijlstra wrote:
> This rewrite addresses two issues:
>
> - it no longer hard-requires single-byte NOP runs; it now accepts
>   any NOP and NOPL-encoded instruction (but not the more complicated
>   32-bit NOPs).
>
> - it writes a single 'instruction' replacement.
>
> Specifically, the ORC unwinder relies on the tail NOP of an alternative
> being a single instruction; in particular, it relies on the inner bytes
> not being executed.
>
> Once we reach the max supported NOP length (currently 8, could easily
> be extended to 11 on x86_64), it switches to JMP.d8 and INT3 padding to
> achieve the same result.
>
> The ORC unwinder uses this guarantee in the analysis of
> alternative/overlapping CFI state.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
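For reference, the padding policy the changelog describes boils down to
something like the sketch below.  This is purely illustrative, not the
actual optimize_nops() code; the pad_hole() name, the nops[] table and
MAX_NOP_LEN are mine, loosely modelled on asm/nops.h:

#include <stdint.h>
#include <string.h>

#define MAX_NOP_LEN	8	/* largest single NOP emitted, per the changelog */

/* x86 single-instruction NOPs, indexed by length (1..8). */
static const uint8_t nops[9][8] = {
	[1] = { 0x90 },
	[2] = { 0x66, 0x90 },
	[3] = { 0x0f, 0x1f, 0x00 },
	[4] = { 0x0f, 0x1f, 0x40, 0x00 },
	[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	[6] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	[7] = { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	[8] = { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/* Fill a @len byte hole at @buf so it executes as a single 'instruction'. */
static void pad_hole(uint8_t *buf, unsigned int len)
{
	if (!len)
		return;

	if (len <= MAX_NOP_LEN) {
		/* One NOP instruction covering the whole hole. */
		memcpy(buf, nops[len], len);
	} else {
		/* JMP.d8 over the rest; the skipped bytes become INT3.	*/
		/* (Assumes len - 2 fits in a rel8 displacement.)	*/
		buf[0] = 0xeb;			/* JMP rel8 */
		buf[1] = (uint8_t)(len - 2);	/* displacement past the pad */
		memset(buf + 2, 0xcc, len - 2);	/* INT3, never executed */
	}
}

Either way the hole starts with exactly one instruction, which is the
property the ORC unwinder depends on.
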
How lucky are you feeling for your game of performance roulette?
Unconditional jmps consume branch-prediction resources these days, and
won't be successfully predicted until they have been taken.
There is a point after which a jmp is more efficient than brute-forcing
through a line of nops.  Where that point lies is very uarch-specific,
but it's not at a single nop...
Whether you care or not is a different matter, but at least be aware
that doing a jmp like this instead of e.g. 2 or 3 nops is contrary to
the prior advice given by the architects.
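
To make the trade-off concrete, here are the two possible fills for an
11-byte hole (byte values only, purely illustrative); which of the two
is cheaper to execute through is exactly the uarch-specific question:

static const unsigned char fill_nops[11] = {
	/* NOP8 followed by NOP3: two NOP instructions, no branch */
	0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x0f, 0x1f, 0x00,
};

static const unsigned char fill_jmp[11] = {
	/* JMP.d8 +9, then 9 bytes of INT3 that are never executed */
	0xeb, 0x09,
	0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc, 0xcc,
};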
~Andrew