Message-ID: <fc0715e0-42f2-4b5d-be31-ac44657afc56@citrix.com>
Date: Fri, 15 Aug 2025 11:43:11 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Peter Zijlstra <peterz@...radead.org>, "H. Peter Anvin" <hpa@...or.com>
Cc: x86@...nel.org, kees@...nel.org, alyssa.milburn@...el.com,
scott.d.constable@...el.com, joao@...rdrivepizza.com,
samitolvanen@...gle.com, nathan@...nel.org, alexei.starovoitov@...il.com,
mhiramat@...nel.org, ojeda@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] x86,ibt: Use UDB instead of 0xEA

On 15/08/2025 11:30 am, Peter Zijlstra wrote:
> On Fri, Aug 15, 2025 at 12:28:39PM +0200, Peter Zijlstra wrote:
>> On Fri, Aug 15, 2025 at 09:49:39AM +0200, Peter Zijlstra wrote:
>>> On Thu, Aug 14, 2025 at 06:27:44PM -0700, H. Peter Anvin wrote:
>>>> On 2025-08-14 04:17, Peter Zijlstra wrote:
>>>>> Hi!
>>>>>
>>>>> A while ago FineIBT started using the instruction 0xEA to generate #UD.
>>>>> All existing parts will generate #UD in 64bit mode on that instruction.
>>>>>
>>>>> However, Intel/AMD have not blessed using this instruction; it is on
>>>>> their 'reserved' list for future use.
>>>>>
>>>>> Peter Anvin worked the committees and got use of 0xD6 blessed, and it
>>>>> will be called UDB (per the next SDM or so).
>>>>>
>>>>> Reworking the FineIBT code to use UDB wasn't entirely trivial, and I've
>>>>> had to switch the hash register to EAX in order to free up some bytes.
>>>>>
>>>>> Per the x86_64 ABI, EAX is used to pass the number of vector registers
>>>>> for varargs -- something that should not happen in the kernel. More so,
>>>>> we build with -mskip-rax-setup, which should leave EAX completely unused
>>>>> in the calling convention.
>>>>>
>>>>> The code boots and passes the LKDTM CFI_FORWARD_PROTO test for various
>>>>> combinations (non exhaustive so far).
>>>>>
>>>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>>>> Looks good to me (and using %eax will save one byte per call site as
>>>> well), but as per our IRC discussion, *my understanding* is that the
>>>> best possible performance (least branch predictor impact across
>>>> implementations) is to use a forward branch with a 2E prefix (jcc,pn in
>>>> GAS syntax) rather than a reverse branch, if space allows.
>>> Oh right. I did see that comment on IRC and then promptly forgot about
>>> it again :/ I'll have a poke. Scott, do you agree? You being responsible
>>> for the backward jump and such.
>> On top of the other, to show the delta.
>>
>> If we want a fwd branch, we can stick the D6 in the endbr poison nop.
>> That frees up more bytes again, but also that matches what I already did
>> for the bhi1 case, so less special cases is more better.
>>
>> I've had to use cs prefixed jcc.d32, because our older toolchains don't
>> like the ,pn notation.
> And then I forgot to move that cs prefix around in the bhi1 case...
> fixed that.

Dare I ask what ,pn notation is? It's not only the older toolchains
that don't know it :)

~Andrew