Message-ID: <da6efbb5-2610-6721-77ca-9833d13b9398@oracle.com>
Date: Thu, 9 Apr 2020 10:18:56 +0200
From: Alexandre Chartre <alexandre.chartre@...cle.com>
To: Peter Zijlstra <peterz@...radead.org>,
Josh Poimboeuf <jpoimboe@...hat.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, jthierry@...hat.com,
tglx@...utronix.de
Subject: Re: [PATCH V2 9/9] x86/speculation: Remove all ANNOTATE_NOSPEC_ALTERNATIVE directives

On 4/8/20 11:35 PM, Peter Zijlstra wrote:
> On Tue, Apr 07, 2020 at 07:27:39PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 07, 2020 at 11:28:38AM -0500, Josh Poimboeuf wrote:
>>> Again, we should warn on stack changes inside alternatives, and then
>>> look at converting RSB and retpolines to use static branches so they
>>> have deterministic stacks.
>>
>> I don't think we need static brancher, we should just out-of-line the
>> whole thing.
>>
>> Let me sort this CFI error Thomas is getting and then I'll attempt a
>> patch along the lines I outlined in earlier emails.
>
> Something like so.. seems to build and boot.
>
> ---
> From: Peter Zijlstra (Intel) <peterz@...radead.org>
> Subject: x86: Out-of-line retpoline
>
> Since GCC generated code already uses out-of-line retpolines and objtool
> has trouble with retpolines in alternatives, out-of-line them entirely.
>
> This will enable objtool (once it's been taught a few more tricks) to
> generate valid ORC data for the out-of-line copies, which means we can
> correctly and reliably unwind through a retpoline.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> arch/x86/crypto/aesni-intel_asm.S | 4 +--
> arch/x86/crypto/camellia-aesni-avx-asm_64.S | 2 +-
> arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 2 +-
> arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 26 ++++++++---------
> arch/x86/entry/entry_32.S | 6 ++--
> arch/x86/entry/entry_64.S | 2 +-
> arch/x86/include/asm/asm-prototypes.h | 8 ++++--
> arch/x86/include/asm/nospec-branch.h | 42 ++++------------------------
> arch/x86/kernel/ftrace_32.S | 2 +-
> arch/x86/kernel/ftrace_64.S | 4 +--
> arch/x86/lib/checksum_32.S | 4 +--
> arch/x86/lib/retpoline.S | 27 +++++++++++++++---
> arch/x86/platform/efi/efi_stub_64.S | 2 +-
> 13 files changed, 62 insertions(+), 69 deletions(-)
>
...
> /*
> * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
> * indirect jmp/call which may be susceptible to the Spectre variant 2
> @@ -111,10 +83,9 @@
> */
> .macro JMP_NOSPEC reg:req
> #ifdef CONFIG_RETPOLINE
> - ANNOTATE_NOSPEC_ALTERNATIVE
> - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \
> - __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
> - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
> + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
> + __stringify(jmp __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE, \
> + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
> #else
> jmp *\reg
> #endif
> @@ -122,10 +93,9 @@
>
> .macro CALL_NOSPEC reg:req
> #ifdef CONFIG_RETPOLINE
> - ANNOTATE_NOSPEC_ALTERNATIVE
> - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \
> - __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
> - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
> + ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
> + __stringify(call __x86_retpoline_\()\reg), X86_FEATURE_RETPOLINE,\
> + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD

For X86_FEATURE_RETPOLINE_AMD, the call won't be aligned with the others:
it comes after the lfence instruction, so the ORC data won't be at the same
place. I am adding some code to objtool to check that alternatives don't
change the stack, but I should actually be checking that all alternatives
have the same unwind instructions at the same place.
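
To illustrate the kind of check I mean, here is a minimal sketch in C. It is
not objtool code: `struct unwind_point`, `struct alt` and
`alts_unwind_match()` are hypothetical names, and the model (a list of byte
offsets at which the stack offset changes) is a simplifying assumption. The
example data assumes a stack change at byte 1 in one alternative and the same
change pushed back by a 3-byte lfence in the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: the stack offset that takes effect at a given byte
 * offset inside one alternative. These are not objtool's data structures.
 */
struct unwind_point {
	unsigned int offset;	/* byte offset from the start of the alternative */
	int sp_offset;		/* stack offset in effect from that point on */
};

struct alt {
	const struct unwind_point *points;
	size_t nr;
};

/*
 * Return true only if every alternative has the same unwind points at the
 * same offsets as the first one, so that a single set of unwind data is
 * valid whichever alternative ends up patched in.
 */
static bool alts_unwind_match(const struct alt *alts, size_t nr_alts)
{
	assert(nr_alts > 0);

	const struct alt *ref = &alts[0];

	for (size_t i = 1; i < nr_alts; i++) {
		if (alts[i].nr != ref->nr)
			return false;

		for (size_t j = 0; j < ref->nr; j++) {
			if (alts[i].points[j].offset != ref->points[j].offset ||
			    alts[i].points[j].sp_offset != ref->points[j].sp_offset)
				return false;
		}
	}

	return true;
}

/* Assumed example: the stack offset changes after a 1-byte instruction... */
static const struct unwind_point at_start[] = { { 0, 0 }, { 1, 8 } };
/* ...versus the same change delayed by a preceding 3-byte lfence. */
static const struct unwind_point after_lfence[] = { { 0, 0 }, { 4, 8 } };
```

Comparing `at_start` with itself passes, while comparing `at_start` with
`after_lfence` fails even though both end with the same stack offset. That is
the difference between checking only the resulting stack and checking where
the unwind state changes.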

Other than that, my only question is about the performance impact.
The retpoline code was added with an effort to limit its performance impact.
Here, JMP_NOSPEC now has an additional (long) jump, and CALL_NOSPEC
does a long call instead of a near call. But I have no idea whether this
has a visible impact.

alex.