Message-ID: <20231018181431.skre6i6vzrxsprck@treble>
Date: Wed, 18 Oct 2023 11:14:31 -0700
From: Josh Poimboeuf <jpoimboe@...nel.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
linux-tip-commits@...r.kernel.org,
David Kaplan <david.kaplan@....com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org,
David Howells <dhowells@...hat.com>
Subject: Re: [tip: x86/bugs] x86/retpoline: Ensure default return thunk isn't used at runtime
On Wed, Oct 18, 2023 at 07:55:31PM +0200, Borislav Petkov wrote:
> On Wed, Oct 18, 2023 at 08:54:33AM -0700, Josh Poimboeuf wrote:
> > On Wed, Oct 18, 2023 at 05:12:45PM +0200, Borislav Petkov wrote:
> > > On Wed, Oct 18, 2023 at 03:38:56PM +0200, Ingo Molnar wrote:
> > > > If then WARN_ONCE().
> > >
> > > WARN_ONCE() is not enough considering that if this fires, it means we're
> > > not really properly protected against one of those RET-speculation
> > > things.
> > >
> > > It needs to be warning constantly but then still allow booting. I.e.,
> > > a ratelimited warn of sorts but I don't think we have that... yet.
> >
> > I'm not sure a rate-limited WARN() would be a good thing. Either the
> > user is regularly checking dmesg (most likely in some automated fashion)
> > or they're not. If the latter, a rate-limited WARN() would wrap dmesg
> > pretty quickly.
>
> Well, freezing the box without any mention about why it happens is not
> viable either. So for lack of a better solution, overflowing dmesg is
> all we could do.
Why not just WARN_ONCE() then?
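
To make the trade-off concrete, here is a rough userspace sketch (not
kernel code) of the two policies being discussed: warn exactly once
versus warn at a bounded rate. Every name in it (warn_once_state,
warn_ratelimit_state, count_once, count_ratelimited) is made up for
illustration, and the "ticks" are an abstract clock, not jiffies. It
only models how many messages each policy would let through.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: state for a warn-once policy. */
struct warn_once_state {
	bool warned;
};

/* Hypothetical model: state for a rate-limited policy. */
struct warn_ratelimit_state {
	long interval;      /* window length, in abstract ticks */
	int burst;          /* max messages allowed per window */
	long window_start;  /* tick at which the current window began */
	int printed;        /* messages emitted in the current window */
};

/* Returns true only the first time it is called for a given state. */
static bool warn_once_should_print(struct warn_once_state *s)
{
	if (s->warned)
		return false;
	s->warned = true;
	return true;
}

/* Returns true while fewer than 'burst' messages were emitted this window. */
static bool warn_ratelimit_should_print(struct warn_ratelimit_state *s, long now)
{
	assert(s->interval > 0);

	/* Start a new window once the current one has expired. */
	if (now - s->window_start >= s->interval) {
		s->window_start = now;
		s->printed = 0;
	}
	if (s->printed >= s->burst)
		return false;
	s->printed++;
	return true;
}

/* Count messages emitted if the condition fires at every tick 0..n-1. */
static int count_once(long n)
{
	struct warn_once_state s = { .warned = false };
	int emitted = 0;

	for (long t = 0; t < n; t++)
		if (warn_once_should_print(&s))
			emitted++;
	return emitted;
}

static int count_ratelimited(long n, long interval, int burst)
{
	struct warn_ratelimit_state s = {
		.interval = interval, .burst = burst,
		.window_start = 0, .printed = 0,
	};
	int emitted = 0;

	for (long t = 0; t < n; t++)
		if (warn_ratelimit_should_print(&s, t))
			emitted++;
	return emitted;
}
```

In this model the once-only policy emits a single message no matter how
often the condition fires, while the rate-limited one keeps emitting
burst messages per window for as long as the condition persists, which
is the log growth being weighed above.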
> And, on a related note, I'm thinking I should revert:
>
> e92626af3234 ("x86/retpoline: Remove .text..__x86.return_thunk section")
>
> after all because I'm debugging another similar issue reported by
> dhowells.
>
> And I can reproduce it on linux-next with his config and gcc-13. The
> splat looks like this below - and mind you, that's in a VM. On baremetal
> you get to see only the first warning and output stops.
>
> And that happens because for whatever reason apply_returns() can't find
> that last jmp __x86_return_thunk for %r15 and it barfs.
>
> When I revert e92626af3234, it is fixed. It fixes dhowells' box too.
>
> Which means, IMHO, objtool is failing to add a return call site
> at the end of that __x86_indirect_thunk_r15.
>
> And considering how close we are to the merge window, I'd let that
> .text..__x86.return_thunk section exist so that objtool can find the
> return sites more reliably than what we currently have.
>
> We can always do e92626af3234 later, when it has seen more testing.
Ok. A revert is fine for now, but either way we do need to get to the
bottom of why objtool is messing up. Can you share the config?
--
Josh