[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bf053035-6ead-4f3f-b53e-d265824199c3@citrix.com>
Date: Fri, 28 Feb 2025 00:36:40 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
Cc: x86@...nel.org, Josh Poimboeuf <jpoimboe@...nel.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/speculation: Simplify and make CALL_NOSPEC consistent
On 28/02/2025 12:31 am, Pawan Gupta wrote:
> On Thu, Feb 27, 2025 at 03:13:48PM -0800, Pawan Gupta wrote:
>> On Thu, Feb 27, 2025 at 06:57:37PM +0000, Andrew Cooper wrote:
>>> On 27/02/2025 6:41 pm, Pawan Gupta wrote:
>>>> On Thu, Feb 27, 2025 at 12:49:48AM +0000, Andrew Cooper wrote:
>>>>> On 26/02/2025 9:03 pm, Pawan Gupta wrote:
>>>>>> @@ -420,20 +420,28 @@ static inline void call_depth_return_thunk(void) {}
>>>>>>
>>>>>> #ifdef CONFIG_X86_64
>>>>>>
>>>>>> +/*
>>>>>> + * Equivalent to -mindirect-branch-cs-prefix; emit the 5 byte jmp/call
>>>>>> + * to the retpoline thunk with a CS prefix when the register requires
>>>>>> + * a REX prefix byte to encode. Also see apply_retpolines().
>>>>>> + */
>>>>> Technically, both comments aren't quite accurate. __CS_PREFIX() emits a
>>>>> conditional CS prefix in a manner compatible with
>>>>> -mindirect-branch-cs-prefix, not the full 5/6 byte jmp/call.
>>>> You are right, I will update the comment, and also the ASSEMBLY version
>>>> where this comment came from:
>>>>
>>>> /*
>>>> * Equivalent to -mindirect-branch-cs-prefix; emit the 5 byte jmp/call
>>>> * to the retpoline thunk with a CS prefix when the register requires
>>>> * a REX prefix byte to encode. Also see apply_retpolines().
>>>> */
>>>> .macro __CS_PREFIX reg:req
>>>> .irp rs,r8,r9,r10,r11,r12,r13,r14,r15
>>>> .ifc \reg,\rs
>>>> .byte 0x2e
>>>> .endif
>>>> .endr
>>>> .endm
>>>>
>>>>>> +#define __CS_PREFIX(reg) \
>>>>>> + .irp rs,r8,r9,r10,r11,r12,r13,r14,r15; \
>>>>>> + .ifc \\rs, \reg; \
>>>>> Why are these escaped differently? Given they're all \r of some form or
>>>>> another, I guess something is going wonky with __stringify(), but its
>>>>> still weird for them to be different.
>>>>>
>>>>> Do you have a fully pre-processed source to hand to see how CALL_NOSPEC
>>>>> ends up?
>>>> Below is the pre-processed source for test_cc() generated with "make arch/x86/kvm/emulate.i".
>>>>
>>>> - This is with double backslash in ".ifc \\rs, \reg":
>>>>
>>>> asm("push %[flags]; popf; " ".irp rs,r8,r9,r10,r11,r12,r13,r14,r15; .ifc \\rs, \%V[thunk_target]; .byte 0x2e; .endif; .endr;" "call __x86_indirect_thunk_%V[thunk_target]\n"
>>>> ^
>>>> This ends up emitting the CS prefix byte correctly:
>>>>
>>>> 2e e8 51 c9 32 01 cs call ffffffff824289e0
>>>>
>>>> - This is with single backslash in ".ifc \\rs, \reg":
>>>>
>>>> asm("push %[flags]; popf; " ".irp rs,r8,r9,r10,r11,r12,r13,r14,r15; .ifc \rs, \%V[thunk_target]; .byte 0x2e; .endif; .endr;" "c all __x86_indirect_thunk_%V[thunk_target]\n"
>>>> ^
>>>> This version does not emit the CS prefix byte:
>>>>
>>>> e8 52 c9 32 01 call ffffffff824289e0
>>>>
>>>> I tried looking in gcc inline assembly documentation but could not find
>>>> anything that would explain this. :(
>>> It's because it's about plain C strings.
>>>
>>> \r (from \rs) is Carriage Return (ASCII 0x0d).
>> Ah, right.
>>
>>> After AS's macro expansion, \reg becomes \% which is not a valid escape
>>> character, so the \ gets left intact.
>>>
>>> \reg should become \\reg or you'll probably get a compiler complaining
>>> eventually.
>> Using \\ for reg like this:
>>
>> .ifc \\rs, \\reg
>>
>> is not emitting the CS prefix. I am trying to wrap my head around the
>> magic.
> Getting rid of one layer of macro makes it less magical:
>
> #define __CS_PREFIX(reg) \
> ".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n" \
> ".ifc \\rs," reg "\n" \
> ".byte 0x2e\n" \
> ".endif\n" \
> ".endr\n"
>
> #define CALL_NOSPEC __CS_PREFIX("%V[thunk_target]") \
> "call __x86_indirect_thunk_%V[thunk_target]\n"
> #else
> #define CALL_NOSPEC "call *%[thunk_target]\n"
> #endif
>
> Preprocessor output is as expected, and this emits the CS prefix correctly:
>
> ".ifc \\rs, %V[thunk_target]"
>
> Full version:
>
> asm("push %[flags]; popf; " ".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n" ".ifc \\rs," "%V[thunk_target]" "\n" ".byte 0x2e\n" ".endif\n" ".endr\n" "call __x86_indirect_thunk_%V[thunk_target]\n"
> : "=a"(rc), "+r" (current_stack_pointer) : [thunk_target]"r"(fop), [flags]"r"(flags));
Yeah, I think that's a lot better.
I still cant spot precisely what's wrong with the prior \\reg.
~Andrew
Powered by blists - more mailing lists