[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <91c7727f66afa7c1f424fb08958579dfa3dc708c.camel@redhat.com>
Date: Fri, 26 Jul 2024 11:08:36 -0400
From: mlevitsk@...hat.com
To: Chao Gao <chao.gao@...el.com>
Cc: kvm@...r.kernel.org, Dave Hansen <dave.hansen@...ux.intel.com>, Sean
Christopherson <seanjc@...gle.com>, Borislav Petkov <bp@...en8.de>, Thomas
Gleixner <tglx@...utronix.de>, x86@...nel.org,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>, Paolo
Bonzini <pbonzini@...hat.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH 1/2] KVM: x86: relax canonical checks for some x86
architectural msrs
У пт, 2024-07-26 у 14:50 +0800, Chao Gao пише:
> On Thu, Jul 25, 2024 at 11:01:09AM -0400, Maxim Levitsky wrote:
> > Several architectural msrs (e.g MSR_KERNEL_GS_BASE) must contain
> > a canonical address, and according to Intel PRM, this is enforced
> > by #GP on a MSR write.
> >
> > However with the introduction of the LA57 the definition of
> > what is a canonical address became blurred.
> >
> > Few tests done on Sapphire Rapids CPU and on Zen4 CPU,
> > reveal:
> >
> > 1. These CPUs do allow full 57-bit wide non canonical values
> > to be written to MSR_GS_BASE, MSR_FS_BASE, MSR_KERNEL_GS_BASE,
> > regardless of the state of CR4.LA57.
> > Zen4 in addition to that even allows such writes to
> > MSR_CSTAR and MSR_LSTAR.
>
> This actually is documented/implied at least in ISE [1]. In Chapter 6.4
> "CANONICALITY CHECKING FOR DATA ADDRESSES WRITTEN TO CONTROL REGISTERS AND
> MSRS"
>
> In Processors that support LAM continue to require the addresses written to
> control registers or MSRs to be 57-bit canonical if the processor _supports_
> 5-level paging or 48-bit canonical if it supports only 4-level paging
>
> [1]: https://cdrdv2.intel.com/v1/dl/getContent/671368
I haven't found this in the actual PRM, but mine is relatively old,
(from September 2023, I didn't bother to update it because 5 level paging is
quite an old feature)
>
> >
> > 2. These CPUs don't prevent the user from switching back to 4 level
> > paging with values that will be non canonical in 4 level paging,
> > and instead just allow the msrs to contain these values.
> >
> > Since these MSRS are all passed through to the guest, and microcode
> > allows the non canonical values to get into these msrs,
> > KVM has to tolerate such values and avoid crashing the guest.
> >
> > To do so, always allow the host initiated values regardless of
> > the state of CR4.LA57, instead only gate this by the actual hardware
> > support for 5 level paging.
> >
> > To be on the safe side leave the check for guest writes as is.
> >
> > Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> > ---
> > arch/x86/kvm/x86.c | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index a6968eadd418..c599deff916e 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1844,7 +1844,36 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
> > case MSR_KERNEL_GS_BASE:
> > case MSR_CSTAR:
> > case MSR_LSTAR:
> > - if (is_noncanonical_address(data, vcpu))
> > +
> > + /*
> > + * Both AMD and Intel cpus tend to allow values which
> > + * are canonical in the 5 level paging mode but are not
> > + * canonical in the 4 level paging mode to be written
> > + * to the above msrs, regardless of the state of the CR4.LA57.
> > + *
> > + * Intel CPUs do honour CR4.LA57 for the MSR_CSTAR/MSR_LSTAR,
> > + * AMD cpus don't even do that.
> > + *
> > + * Both CPUs also allow non canonical values to remain in
> > + * these MSRs if the CPU was in 5 level paging mode and was
> > + * switched back to 4 level paging, and tolerate these values
> > + * both in native MSRs and in vmcs/vmcb fields.
> > + *
> > + * To avoid crashing a guest, which manages using one of the above
> > + * tricks to get non canonical value to one of
> > + * these MSRs, and later migrates, allow the host initiated
> > + * writes regardless of the state of CR4.LA57.
> > + *
> > + * To be on the safe side, don't allow the guest initiated
> > + * writes to bypass the canonical check (e.g be more strict
> > + * than what the actual ucode usually does).
>
> I may think guest-initiated writes should be allowed as well because this is
> the architectural behavior.
Note though that for MSR_CSTAR/MSR_LSTAR I did set #GP, depending on CR.LA57.
Ah, I see it, KVM intercepts these msrs on VMX (but not on SVM) and I was under
the impression that it doesn't, that is why I get #GP depending on CR4.LA57....
I do wonder why we intercept these msrs on VMX and not on SVM.
It all makes sense now, thanks a lot for the explanation!
>
> > + */
> > +
> > + if (!host_initiated && is_noncanonical_address(data, vcpu))
> > + return 1;
> > +
> > + if (!__is_canonical_address(data,
> > + boot_cpu_has(X86_FEATURE_LA57) ? 57 : 48))
>
> boot_cpu_has(X86_FEATURE_LA57)=1 means LA57 is enabled. Right?
>
> With this change, host-initiated writes must be 48-bit canonical if LA57 isn't
> enabled on the host, even if it is enabled in the guest. (note that KVM can
> expose LA57 to guests even if LA57 is disabled on the host, see
> kvm_set_cpu_caps()).
Sorry about this - we indeed need to use kvm_cpu_cap_has(X86_FEATURE_LA57) because
it is forced based on host raw CPUID.
I remember I wanted to do exactly this but forgot somehow.
Also I need to update this for nested VMX - these msrs are also checked there based
on CR4.LA57.
Thanks for the clarification, and it all makes sense. I'll send v2 with all of this
when I get back from a vacation (next Wednesday).
Best regards,
Maxim Levitsky
>
> > return 1;
>
Powered by blists - more mailing lists