[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5D679723-D84F-42F0-AD8A-8BD1A38FB6CD@zytor.com>
Date: Fri, 17 Mar 2023 14:32:28 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Lai Jiangshan <jiangshanlai@...il.com>, Xin Li <xin3.li@...el.com>
CC: linux-kernel@...r.kernel.org, x86@...nel.org, kvm@...r.kernel.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, peterz@...radead.org,
andrew.cooper3@...rix.com, seanjc@...gle.com, pbonzini@...hat.com,
ravi.v.shankar@...el.com
Subject: Re: [PATCH v5 22/34] x86/fred: FRED initialization code
On March 17, 2023 6:35:57 AM PDT, Lai Jiangshan <jiangshanlai@...il.com> wrote:
>Hello
>
>
>Comments in cpu_init_fred_exceptions() seem scarce for understanding.
>
>On Tue, Mar 7, 2023 at 11:07 AM Xin Li <xin3.li@...el.com> wrote:
>
>> +/*
>> + * Initialize FRED on this CPU. This cannot be __init as it is called
>> + * during CPU hotplug.
>> + */
>> +void cpu_init_fred_exceptions(void)
>> +{
>> + wrmsrl(MSR_IA32_FRED_CONFIG,
>> + FRED_CONFIG_ENTRYPOINT(fred_entrypoint_user) |
>> + FRED_CONFIG_REDZONE(8) | /* Reserve for CALL emulation */
>> + FRED_CONFIG_INT_STKLVL(0));
>
>What is it about "Reserve for CALL emulation"?
>
>I guess it relates to X86_TRAP_BP. In entry_64.S:
>
> .if \vector == X86_TRAP_BP
> /*
> * If coming from kernel space, create a 6-word gap to allow the
> * int3 handler to emulate a call instruction.
> */
>
>> +
>> + wrmsrl(MSR_IA32_FRED_STKLVLS,
>> + FRED_STKLVL(X86_TRAP_DB, 1) |
>> + FRED_STKLVL(X86_TRAP_NMI, 2) |
>> + FRED_STKLVL(X86_TRAP_MC, 2) |
>> + FRED_STKLVL(X86_TRAP_DF, 3));
>
>Why each exception here needs a stack level > 0?
>Especially for X86_TRAP_DB and X86_TRAP_NMI.
>
>Why does or why does not X86_TRAP_VE have a stack level > 0?
>
>X86_TRAP_DF is the highest stack level, is it accidental
>or deliberate?
>
>Thanks
>Lai
>
Yes, the extra redzone space is there to allow for the call emulation without having to adjust the stack frame "manually".
In theory we could enable it only while code patching is in progress, but that would probably just result in stack overflows becoming utterly impossible to debug as we have to consider the worst case.
The purpose of separate stacks for NMI, #DB and #MC *in the kernel* (remember that user space faults are always taken on stack level 0) is to avoid overflowing the kernel stack. #DB in the kernel would imply the use of a kernel debugger.
#DF is the highest level because a #DF means "something went wrong *while delivering an exception*." The number of cases for which that can happen with FRED is drastically reduced and basically amount to "the stack you pointed me to is broken."
Thus, you basically always want to change stacks on #DF, which means it should be at the highest level.
Powered by blists - more mailing lists