[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1516882849.30244.94.camel@infradead.org>
Date: Thu, 25 Jan 2018 12:20:49 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Borislav Petkov <bp@...en8.de>, tim.c.chen@...ux.intel.com,
pjt@...gle.com, jikos@...nel.org, gregkh@...ux-foundation.org,
dave.hansen@...el.com, mingo@...nel.org, riel@...hat.com,
luto@...capital.net, torvalds@...ux-foundation.org,
ak@...ux.intel.com, keescook@...gle.com, jpoimboe@...hat.com,
peterz@...radead.org, tglx@...utronix.de, hpa@...or.com,
linux-kernel@...r.kernel.org
Cc: linux-tip-commits@...r.kernel.org
Subject: Re: [tip:x86/pti] x86/retpoline: Fill return stack buffer on vmexit
On Thu, 2018-01-25 at 13:07 +0100, Borislav Petkov wrote:
> On Fri, Jan 12, 2018 at 03:37:49AM -0800, tip-bot for David Woodhouse wrote:
> >
> > +/*
> > + * On VMEXIT we must ensure that no RSB predictions learned in the guest
> > + * can be followed in the host, by overwriting the RSB completely. Both
> > + * retpoline and IBRS mitigations for Spectre v2 need this; only on future
> > + * CPUs with IBRS_ATT *might* it be avoided.
> > + */
> > +static inline void vmexit_fill_RSB(void)
> > +{
> > +#ifdef CONFIG_RETPOLINE
> > + unsigned long loops = RSB_CLEAR_LOOPS / 2;
> > +
> > + asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
> > + ALTERNATIVE("jmp 910f",
> > + __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
> > + X86_FEATURE_RETPOLINE)
> > + "910:"
> > + : "=&r" (loops), ASM_CALL_CONSTRAINT
> > + : "r" (loops) : "memory" );
> > +#endif
> > +}
> Btw,
>
> this thing is a real pain to backport to older kernels without breaking
> kABI (I know, I know, latter sucks but it is what it is)
I haven't had lunch yet, so I don't feel queasy and I'm vaguely
interested... *why* does it break kABI?
> so I'm thinking we could simplify that thing regardless.
>
> As it is, 41 bytes get replaced currently:
>
> [ 0.437005] apply_alternatives: feat: 7*32+12, old: ( (ptrval), len: 43), repl: ( (ptrval), len: 43), pad: 41
> [ 0.438885] (ptrval): old_insn: eb 29 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> [ 0.440001] (ptrval): rpl_insn: 48 c7 c0 10 00 00 00 e8 07 00 00 00 f3 90 0f ae e8 eb f9 e8 07 00 00 00 f3 90 0f ae e8 eb f9 48 ff c8 75 e3 48 81 c4 00 01 00 00
> [ 0.444002] (ptrval): final_insn: 48 c7 c0 10 00 00 00 e8 07 00 00 00 f3 90 0f ae e8 eb f9 e8 07 00 00 00 f3 90 0f ae e8 eb f9 48 ff c8 75 e3 48 81 c4 00 01 00 00
>
> Not that it doesn't work but the less bytes we replace, the better.
>
> So it could be made to simply call two functions. The replacing then
> turns into:
>
> [ 0.438154] apply_alternatives: feat: 7*32+12, old: (ffffffff81060b60 len: 5), repl: (ffffffff82434692, len: 5), pad: 0
> [ 0.440002] ffffffff81060b60: old_insn: e8 ab 73 01 00
> [ 0.441003] ffffffff82434692: rpl_insn: e8 89 38 c4 fe
> [ 0.441966] apply_alternatives: Fix CALL offset: 0x173bb, CALL 0xffffffff81077f20
> [ 0.444002] ffffffff81060b60: final_insn: e8 bb 73 01 00
>
> The "old_insn" above is:
> ffffffff81060b60: e8 ab 73 01
> 00 callq ffffffff81077f10 <__fill_rsb_nop>
>
> and final_insn is
> ffffffff82434692: e8 89 38 c4
> fe callq ffffffff81077f20 <__fill_rsb>
>
> so both CALLs with immediates for which there's no speculation going
> on.
>
> I had to add a new alternative_void_call() macro for that but that's
> straight-forward. Also, this causes objtool to segfault so Josh and I
> need to look at that first.
>
> Other than that, it gets a lot simpler this way:
>
> --
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index cf5961ca8677..a84863c1b7d3 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -210,6 +210,11 @@ static inline int alternatives_text_reserved(void *start, void *end)
> asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
> : output : [old] "i" (oldfunc), [new] "i" (newfunc), ## input)
>
> +/* Like alternative_io, but for replacing a direct call with another one. */
> +#define alternative_void_call(oldfunc, newfunc, feature, input...) \
> + asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
> + : : [old] "i" (oldfunc), [new] "i" (newfunc), ## input)
But you aren't doing the call at all in the other case, and
alternatives *always* handled the case where the first 'alternative'
instruction was a branch, didn't it?
So couldn't it just be alternative(nop, call __fill_rsb_func)?
But I still don't understand why it matters.
> +void __fill_rsb(void)
> +{
> + unsigned long loops;
> +
> + asm volatile (__stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1))
> + : "=r" (loops), ASM_CALL_CONSTRAINT
> + : : "memory" );
> +}
> +#endif
The out-of-line function should be __clear_rsb() if it's using
RSB_CLEAR_LOOPS, and __fill_rsb() if it's using RSB_FILL_LOOPS. I think
we're only using one right now but Andi at least is posting patches
which use the other, as part of the Skylake clusterfuck.
Download attachment "smime.p7s" of type "application/x-pkcs7-signature" (5213 bytes)
Powered by blists - more mailing lists