Message-ID: <9feaddd6-f013-468a-b0eb-2dd2896fb88f@citrix.com>
Date: Fri, 8 Nov 2024 02:02:53 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Dave Hansen <dave.hansen@...el.com>, "Xin Li (Intel)" <xin@...or.com>,
linux-kernel@...r.kernel.org
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
peterz@...radead.org
Subject: Re: [PATCH v2 1/1] x86/fred: Clear WFE in missing-ENDBRANCH #CPs
On 07/11/2024 8:51 pm, Dave Hansen wrote:
> On 9/16/24 11:10, Xin Li (Intel) wrote:
>> The WFE, i.e., WAIT_FOR_ENDBRANCH, bit in the augmented CS of FRED
>> stack frame is set to 1 in missing-ENDBRANCH #CP exceptions.
>>
>> The CPU will generate another missing-ENDBRANCH #CP if the WFE bit
>> is left set, because the CPU IBT will be set in the WFE state upon
>> completion of the following ERETS instruction and then the CPU will
>> resume from the instruction that just caused the previous #CP.
>>
>> Clear WFE to avoid dead looping in missing-ENDBRANCH #CPs.
>>
>> Describe the IBT story in the comment of ibt_clear_fred_wfe() using
>> Andrew Cooper's write-up.
> I should have responded to this earlier. I do see why Andrew thought my
> earlier description was off base. Let me see if I can try for a better
> changelog:
>
> The kernel can enable Indirect Branch Tracking (IBT) for itself.
> Hardware generates a #CP exception if a kernel indirect branch lands
> somewhere other than an ENDBR instruction. The kernel #CP handler then
> decides if it should warn or do a fatal BUG().

Perhaps "an appropriate ENDBR"?

You also get #CP[endbr] for encountering the wrong ENDBR{32,64} instruction.

> The BUG() case works fine with or without FRED. But the warning mode is
> broken with FRED.
>
> In short, the pre-FRED architecture clobbers the kernel IBT state of an
> interrupted context. That includes clobbering the state of IBT when the
> #CP went off, suppressing future #CP's. This is bad architecture, but
> handy for a #CP handler that wants to suppress those future #CP's.
There isn't really a warning mode, so much as a singleton selftest to
check that #CP gets generated.
>
> FRED, on the other hand, provides space on the entry stack (in an
> expanded CS area) to save and restore IBT state. Since the hardware
> doesn't clobber the IBT state, software must do it instead.
>
> Aside:
> Why does the without-FRED case work? There is only one CET WFE bit
> per privilege level. The #CP handler itself has an ENDBR
> instruction. That ENDBR clears WFE on the way to handling the
> #CP. Consider what would happen if a kernel indirect call landed
> on an XOR instead of an ENDBR:
>
> CALL (*%rax) // sets WFE
> XOR %rax,%rax // uh oh, not an ENDBR
> #CP
> ENDBR // first instruction in CP handler, clears WFE
> ... handle #CP here
> IRET
> XOR %rax,%rax // No problem, WFE still clear!
>
> See? The handler clears WFE and lets the XOR run.
>
> Is that a more complete (and accurate) story for folks?

More complete, yes.

"The #CP handler itself has an ENDBR instruction." is a consequence, not
a cause.

CALL *ind does indeed set WFE, and WFE stays asserted across the
instruction boundary, but behaves somewhat like the Resume Flag (falls
to zero everywhere else). It's not "ENDBR clears WFE" because there's
the suppress state too generated by the NOTRK prefix, which causes WFE
to fall to 0 on all instructions.

When decode finds an instruction, and WFE is set, and the instruction is
not the right ENDBR, it raises a #CP fault.
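
As a rough illustration of the rule above, here is a toy Python model of
the WFE tracker. It is only a sketch: the Tracker class, the instruction
strings and the notrack flag are made up for the example, 64-bit mode is
assumed (so endbr64 is "the right ENDBR"), and NOTRK is reduced to "this
indirect branch does not arm WFE", which simplifies the suppress state.

```python
from dataclasses import dataclass


class ControlProtection(Exception):
    """Stands in for #CP[endbr] in this toy model."""


@dataclass
class Tracker:
    wfe: bool = False  # WAIT_FOR_ENDBRANCH

    def step(self, insn: str, notrack: bool = False) -> None:
        """Decode and retire one instruction, updating WFE."""
        # With WFE set, anything other than the right ENDBR faults.
        # WFE is left set, as the fault happens before the instruction runs.
        if self.wfe and insn != "endbr64":
            raise ControlProtection(insn)
        # WFE falls to zero after every instruction, unless this one is a
        # tracked indirect branch, which arms it for the next instruction.
        # Assumption: a NOTRK-prefixed branch simply does not arm WFE.
        self.wfe = insn in ("call *ind", "jmp *ind") and not notrack


t = Tracker()
t.step("call *ind")   # arms WFE
t.step("endbr64")     # right ENDBR: no fault, WFE falls back to 0
t.step("call *ind")   # arms WFE again
try:
    t.step("xor")     # not an ENDBR while WFE is set
except ControlProtection as exc:
    print("#CP on", exc)
```

In this model ENDBR is not special-cased as "the thing that clears WFE":
it is just the only instruction that does not fault while WFE is set, and
WFE then drops as it would after any other instruction.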

IDT event delivery sets WFE=1 because it delivered an event, because the
spec says so.

If the old context was CPL3, then the interrupted context's WFE is
stashed away in MSR_U_CET. But, if the old context was CPL<3, the WFE=1
both clobbers the interrupted context, and requires there to be an ENDBR
in the handler.

On the way out, the problem is that IRET doesn't set WFE=? in the
returned-to context.

What FRED does is stash the interrupted context's WFE on the stack, and
ERET{S,U} restores it on the way out.
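
To make the IDT/FRED difference concrete, here is a small Python sketch
of the supervisor (old CPL<3) case only. The function names, the
handler_clears_saved flag and the limit on iterations are all invented
for the example; handler_clears_saved stands in for what the patch's
ibt_clear_fred_wfe() is meant to do, and the limit exists only so the
model terminates where real hardware would loop.

```python
def wfe_after_return(interrupted_wfe: bool, fred: bool,
                     handler_clears_saved: bool = False) -> bool:
    """WFE seen by the interrupted supervisor context after the handler returns."""
    if not fred:
        # IDT, old CPL<3: delivery overwrites WFE with 1, the handler's
        # ENDBR lets it fall to 0, and IRET does not put the old value back.
        return False
    saved = interrupted_wfe        # FRED stashes WFE in the stack frame
    if handler_clears_saved:
        saved = False              # software edits the saved copy
    return saved                   # ERETS restores whatever is saved


def count_cp(fred: bool, handler_clears_saved: bool, limit: int = 5) -> int:
    """Count #CPs taken after an indirect call lands on a non-ENDBR instruction."""
    wfe = True                     # CALL *ind has just retired
    faults = 0
    while wfe and faults < limit:
        faults += 1                # non-ENDBR decoded with WFE=1 -> #CP
        wfe = wfe_after_return(wfe, fred, handler_clears_saved)
    return faults


print(count_cp(fred=False, handler_clears_saved=False))  # IDT
print(count_cp(fred=True, handler_clears_saved=False))   # FRED, WFE left set
print(count_cp(fred=True, handler_clears_saved=True))    # FRED, WFE cleared
```

Under these assumptions the IDT case takes one #CP and then carries on,
the FRED case with the saved WFE left alone keeps faulting until the
artificial limit is hit, and the FRED case with the saved WFE cleared
takes one #CP again.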

Is this any clearer?

~Andrew

P.S. for bonus points, consider what happens if a regular interrupt
occurs between CALL *(%rax) and XOR. If you can generate precise
interrupts, you can escape CET-IBT protections indefinitely.