[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZrE78zQYU95o6QCq@google.com>
Date: Mon, 5 Aug 2024 13:54:11 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: mlevitsk@...hat.com
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>, Paolo Bonzini <pbonzini@...hat.com>,
Ingo Molnar <mingo@...hat.com>, x86@...nel.org, Thomas Gleixner <tglx@...utronix.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, Chao Gao <chao.gao@...el.com>
Subject: Re: [PATCH v2 1/2] KVM: x86: relax canonical check for some x86
architectural msrs
On Mon, Aug 05, 2024, mlevitsk@...hat.com wrote:
> У пн, 2024-08-05 у 09:39 -0700, Sean Christopherson пише:
> > On Mon, Aug 05, 2024, mlevitsk@...hat.com wrote:
> > > У пт, 2024-08-02 у 08:53 -0700, Sean Christopherson пише:
> > > > > > > Checking kvm_cpu_cap_has() is wrong. What the _host_ supports is irrelevant,
> > > > > > > what matters is what the guest CPU supports, i.e. this should check guest CPUID.
> > > > > > > Ah, but for safety, KVM also needs to check kvm_cpu_cap_has() to prevent faulting
> > > > > > > on a bad load into hardware. Which means adding a "governed" feature until my
> > > > > > > CPUID rework lands.
> > >
> > > Well the problem is that we passthrough these MSRs, and that means that the guest
> > > can modify them at will, and only ucode can prevent it from doing so.
> > >
> > > So even if the 5 level paging is disabled in the guest's CPUID, but host supports it,
> > > nothing will prevent the guest to write non canonical value to one of those MSRs,
> > > and later KVM during migration or just KVM_SET_SREGS will fail.
> >
> > Ahh, and now I recall the discussions around the virtualization holes with LA57.
> >
> > > Thus I used kvm_cpu_cap_has on purpose to make KVM follow the actual ucode
> > > behavior.
> >
> > I'm leaning towards having KVM do the right thing when emulation happens to be
> > triggered. If KVM checks kvm_cpu_cap_has() instead of guest_cpu_cap_has() (looking
> > at the future), then KVM will extend the virtualization hole to MSRs that are
> > never passed through, and also to the nested VMX checks. Or I suppose we could
> > add separate helpers for passthrough MSRs vs. non-passthrough, but that seems
> > like it'd add very little value and a lot of maintenance burden.
> >
> > Practically speaking, outside of tests, I can't imagine the guest will ever care
> > if there is inconsistent behavior with respect to loading non-canonical values
> > into MSRs.
> >
>
> Hi,
>
> If we weren't allowing the guest (and even nested guest assuming that L1
> hypervisor allows it) to write these MSRs directly, I would have agreed with
> you, but we do allow this.
>
> This means that for example a L2, even a malicious L2, can on purpose write
> non canonical value to one of these MSRs, and later on, KVM could kill the L0
L1?
> due to canonical check.
Ugh, right, if L1 manually saves/restores MSRs and happens to trigger emulation
on WRMSR at the 'wrong" time.
Host userspace save/restore would suffer the same problem. We could grant host
userspace accesses an exception, but that's rather pointless.
> Or L1 (not Linux, because it only lets canonical GS_BASE/FS_BASE), allow the
> untrusted userspace to write any value to say GS_BASE, thus allowing
> malicious L1 userspace to crash L1 (also a security violation).
FWIW, I don't think this is possible. WR{FS,GS}BASE and other instructions that
load FS/GS.base honor CR4.LA57, it's only WRMSR that does not.
> IMHO if we really want to do it right, we need to disable pass-though of
> these MSRs if ucode check is more lax than our check, that is if L1 is
> running without 5 level paging enabled but L0 does have it supported.
>
> I don't know if this configuration is common, and thus how much this will
> affect performance.
MSR_FS_BASE and SR_KERNEL_GS_BASE are hot spots when WR{FS,GS}BASE are unsupported,
or if the guest kernels doesn't utilize those instructions.
All in all, I agree it's not worth trying to plug the virtualization hole for MSRs,
especially since mimicking hardware yields much simpler code overall. E.g. add
a dedicated MSR helper, and have that one check kvm_cpu_cap_has(), including in
VM-Entry flows, but keep the existing is_noncanonical_address() for all non-WRMSR
path.
Something like this?
static inline bool is_noncanonical_msr_value(u64 la)
{
u8 virt_addr_bits = kvm_cpu_cap_has(X86_FEATURE_LA57) ? 57 : 48;
return !__is_canonical_address(la, virt_addr_bits);
}
Powered by blists - more mailing lists