Message-ID: <20231129072053.GA30650@noisy.programming.kicks-ass.net>
Date: Wed, 29 Nov 2023 08:20:53 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [RFC] x86/kvm/emulate: Avoid RET for fastops
On Tue, Nov 28, 2023 at 05:37:52PM -0800, Sean Christopherson wrote:
> On Sun, Nov 12, 2023, Peter Zijlstra wrote:
> > Hi,
> >
> > Inspired by the likes of ba5ca5e5e6a1 ("x86/retpoline: Don't clobber
> > RFLAGS during srso_safe_ret()") I had it on my TODO to look at this,
> > because the call-depth-tracking rethunk definitely also clobbers flags
> > and that's a ton harder to fix.
> >
> > Looking at this recently I noticed that there's really only one callsite
> > (well, two -- the testcc thing is basically separate from the rest of the
> > fastop stuff), and thus CALL+RET is totally silly; we can JMP+JMP instead.
> >
> > The below implements this, and aside from objtool going apeshit (it
> > fails to recognise the fastop JMP_NOSPEC as a jump-table and instead
> > classifies it as a tail-call), it actually builds and the asm looks
> > sensible enough.
> >
> > I've not yet figured out how to test this stuff, but does something like
> > this look sane to you guys?
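As an aside, the CALL+RET vs JMP+JMP idea is easy to illustrate outside the
kernel. Below is a minimal userspace sketch (x86-64, GNU asm; the names
op_add, dispatch and dispatch_return are made up for illustration and are not
from the patch): the dispatcher does an indirect JMP into a tiny stub, and the
stub ends with a direct JMP back to a fixed label instead of executing a RET,
so no return thunk is ever involved on that path.

/*
 * jmpjmp.c - minimal userspace sketch of the JMP+JMP dispatch idea.
 * x86-64 only; all names here are illustrative, not taken from the patch.
 */
#include <stdio.h>

/*
 * Stand-in for a fastop stub: a couple of instructions that end in a
 * direct JMP back to the dispatcher instead of a RET.
 */
asm(
"	.text\n"
"	.type op_add, @function\n"
"op_add:\n"
"	endbr64\n"			/* harmless NOP where IBT is off */
"	add %rdx, %rax\n"
"	jmp dispatch_return\n"		/* JMP back; no RET, no rethunk */
"	.size op_add, .-op_add\n"
);

extern const char op_add[];

static __attribute__((noinline)) long dispatch(const void *op, long a, long b)
{
	long ret = a;

	/*
	 * Indirect JMP into the stub; the stub JMPs straight back to
	 * dispatch_return, so this round trip never executes a RET.
	 */
	asm volatile("jmp *%[op]\n"
		     "dispatch_return:\n"
		     : "+a" (ret)
		     : [op] "r" (op), "d" (b)
		     : "cc");
	return ret;
}

int main(void)
{
	printf("2 + 3 = %ld\n", dispatch(op_add, 2, 3));
	return 0;
}

Build with something like "gcc -O2 jmpjmp.c". The kernel's fastop stubs are
generated asm blobs rather than hand-written stubs like this, but the control
flow has the same shape.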
>
> Yes? The idea seems sound, but I haven't thought _that_ hard about whether or not
> there's any possible gotchas. I did a quick test and nothing exploded (and
> usually when this code breaks, it breaks spectacularly).
That's encouraging..
> > Given that rethunks are quite fat and slow, this could be sold as a
> > performance optimization I suppose.
> >
> > ---
> >
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index f93e9b96927a..2cd3b5a46e7a 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -412,6 +412,17 @@ static inline void call_depth_return_thunk(void) {}
> > "call *%[thunk_target]\n", \
> > X86_FEATURE_RETPOLINE_LFENCE)
> >
> > +# define JMP_NOSPEC						\
> > +	ALTERNATIVE_2(						\
> > +	ANNOTATE_RETPOLINE_SAFE					\
> > +	"jmp *%[thunk_target]\n",				\
> > +	"jmp __x86_indirect_thunk_%V[thunk_target]\n",		\
> > +	X86_FEATURE_RETPOLINE,					\
> > +	"lfence;\n"						\
> > +	ANNOTATE_RETPOLINE_SAFE					\
> > +	"jmp *%[thunk_target]\n",				\
> > +	X86_FEATURE_RETPOLINE_LFENCE)
>
> There needs to be a 32-bit version (eww) and a CONFIG_RETPOLINE=n version. :-/
I'll go make that happen. Thanks!
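For reference, the CONFIG_RETPOLINE=n side could presumably just mirror what
the header already does for CALL_NOSPEC in its #else branch, i.e. a plain
indirect jump. A minimal sketch (not part of the quoted patch; the placement
and the existing CALL_NOSPEC fallback line are assumptions from memory of
nospec-branch.h):

#else /* CONFIG_RETPOLINE */

# define CALL_NOSPEC	"call *%[thunk_target]\n"
# define JMP_NOSPEC	"jmp *%[thunk_target]\n"	/* hypothetical addition */

#endif /* CONFIG_RETPOLINE */

The 32-bit case is presumably the "(eww)" part: if memory serves, the header
open-codes an inline retpoline sequence for 32-bit CALL_NOSPEC instead of
jumping out to a thunk, so JMP_NOSPEC would need its own variant of that.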