linux-kernel - Re: [PATCH v2 1/2] KVM: x86: relax canonical check for some x86 architectural msrs


Message-ID: <ZrE78zQYU95o6QCq@google.com>
Date: Mon, 5 Aug 2024 13:54:11 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: mlevitsk@...hat.com
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>, Paolo Bonzini <pbonzini@...hat.com>, 
	Ingo Molnar <mingo@...hat.com>, x86@...nel.org, Thomas Gleixner <tglx@...utronix.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, Chao Gao <chao.gao@...el.com>
Subject: Re: [PATCH v2 1/2] KVM: x86: relax canonical check for some x86
 architectural msrs

On Mon, Aug 05, 2024, mlevitsk@...hat.com wrote:
> On Mon, 2024-08-05 at 09:39 -0700, Sean Christopherson wrote:
> > On Mon, Aug 05, 2024, mlevitsk@...hat.com wrote:
> > > On Fri, 2024-08-02 at 08:53 -0700, Sean Christopherson wrote:
> > > > > > > Checking kvm_cpu_cap_has() is wrong.  What the _host_ supports is irrelevant,
> > > > > > > what matters is what the guest CPU supports, i.e. this should check guest CPUID.
> > > > > > > Ah, but for safety, KVM also needs to check kvm_cpu_cap_has() to prevent faulting
> > > > > > > on a bad load into hardware.  Which means adding a "governed" feature until my
> > > > > > > CPUID rework lands.
> > > 
> > > Well the problem is that we passthrough these MSRs, and that means that the guest
> > > can modify them at will, and only ucode can prevent it from doing so.
> > > 
> > > So even if the 5 level paging is disabled in the guest's CPUID, but host supports it,
> > > nothing will prevent the guest to write non canonical value to one of those MSRs, 
> > > and later KVM during migration or just KVM_SET_SREGS will fail.
> >  
> > Ahh, and now I recall the discussions around the virtualization holes with LA57.
> > 
> > > Thus I used kvm_cpu_cap_has on purpose to make KVM follow the actual ucode
> > > behavior.
> > 
> > I'm leaning towards having KVM do the right thing when emulation happens to be
> > triggered.  If KVM checks kvm_cpu_cap_has() instead of guest_cpu_cap_has() (looking
> > at the future), then KVM will extend the virtualization hole to MSRs that are
> > never passed through, and also to the nested VMX checks.  Or I suppose we could
> > add separate helpers for passthrough MSRs vs. non-passthrough, but that seems
> > like it'd add very little value and a lot of maintenance burden.
> > 
> > Practically speaking, outside of tests, I can't imagine the guest will ever care
> > if there is inconsistent behavior with respect to loading non-canonical values
> > into MSRs.
> > 
> 
> Hi,
> 
> If we weren't allowing the guest (and even nested guest assuming that L1
> hypervisor allows it) to write these MSRs directly, I would have agreed with
> you, but we do allow this.
> 
> This means that for example a L2, even a malicious L2, can on purpose write
> non canonical value to one of these MSRs, and later on, KVM could kill the L0
                                                                             L1?
> due to canonical check.

Ugh, right, if L1 manually saves/restores MSRs and happens to trigger emulation
on WRMSR at the "wrong" time.

Host userspace save/restore would suffer the same problem.  We could grant host
userspace accesses an exception, but that's rather pointless.

> Or L1 (not Linux, because it only lets canonical GS_BASE/FS_BASE), allow the
> untrusted userspace to write any value to say GS_BASE, thus allowing
> malicious L1 userspace to crash L1 (also a security violation).

FWIW, I don't think this is possible.  WR{FS,GS}BASE and other instructions that
load FS/GS.base honor CR4.LA57, it's only WRMSR that does not.
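A minimal userspace sketch of the gap described above, assuming the usual
definition of a canonical address (the bits above the virtual address width
must all equal the top implemented bit).  is_canonical() and demo_la57_gap()
are hypothetical names used only for illustration, not kernel functions.  The
sketch shows a value that passes a 57-bit check but fails a 48-bit one, i.e.
a value that WRMSR could accept on LA57-capable hardware while a check based
on a guest running with 4-level paging would reject it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: an address is canonical for an N-bit virtual address
 * width if bits 63:N-1 are either all zeros or all ones.
 */
static bool is_canonical(uint64_t la, unsigned int virt_addr_bits)
{
	/* Bits 63..N-1 of the address, shifted down to bit 0. */
	uint64_t upper = la >> (virt_addr_bits - 1);
	/* The same bit range with every bit set. */
	uint64_t all_ones = UINT64_MAX >> (virt_addr_bits - 1);

	return upper == 0 || upper == all_ones;
}

/* Illustrative only: one value, two different verdicts. */
static void demo_la57_gap(void)
{
	uint64_t la = 0xff00000000000000ULL;

	assert(is_canonical(la, 57));	/* fine with 5-level paging widths */
	assert(!is_canonical(la, 48));	/* rejected with 4-level paging widths */
}
```
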

> IMHO if we really want to do it right, we need to disable pass-through of
> these MSRs if ucode check is more lax than our check, that is if L1 is
> running without 5 level paging enabled but L0 does have it supported.
>
> I don't know if this configuration is common, and thus how much this will
> affect performance.

MSR_FS_BASE and MSR_KERNEL_GS_BASE are hot spots when WR{FS,GS}BASE are unsupported,
or if the guest kernel doesn't utilize those instructions.

All in all, I agree it's not worth trying to plug the virtualization hole for MSRs,
especially since mimicking hardware yields much simpler code overall.  E.g. add
a dedicated MSR helper, and have that one check kvm_cpu_cap_has(), including in
VM-Entry flows, but keep the existing is_noncanonical_address() for all non-WRMSR
paths.

Something like this?

static inline bool is_noncanonical_msr_value(u64 la)
{
	u8 virt_addr_bits = kvm_cpu_cap_has(X86_FEATURE_LA57) ? 57 : 48;

	return !__is_canonical_address(la, virt_addr_bits);
}
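
A rough userspace model of how the two checks would diverge under this
proposal.  Everything prefixed with model_ is a hypothetical stand-in, not
real KVM code: model_host_has_la57 plays the role of
kvm_cpu_cap_has(X86_FEATURE_LA57), and the model assumes the existing
is_noncanonical_address() keys off the guest's CR4.LA57.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-vCPU state; not the real struct kvm_vcpu. */
struct model_vcpu {
	bool cr4_la57;	/* guest currently has CR4.LA57 set */
};

/* Hypothetical stand-in for kvm_cpu_cap_has(X86_FEATURE_LA57). */
static bool model_host_has_la57;

/* Canonical if bits 63:N-1 are all zeros or all ones. */
static bool model_is_canonical(uint64_t la, unsigned int bits)
{
	uint64_t upper = la >> (bits - 1);

	return upper == 0 || upper == (UINT64_MAX >> (bits - 1));
}

/* Non-WRMSR paths (assumed behavior): follow the guest's paging mode. */
static bool model_is_noncanonical_address(uint64_t la,
					  const struct model_vcpu *vcpu)
{
	return !model_is_canonical(la, vcpu->cr4_la57 ? 57 : 48);
}

/* MSR paths, per the proposal: follow what the host CPU supports. */
static bool model_is_noncanonical_msr_value(uint64_t la)
{
	return !model_is_canonical(la, model_host_has_la57 ? 57 : 48);
}

/* LA57-capable host, guest using 4-level paging. */
static void model_demo(void)
{
	struct model_vcpu vcpu = { .cr4_la57 = false };
	uint64_t la = 0xff00000000000000ULL;

	model_host_has_la57 = true;
	assert(model_is_noncanonical_address(la, &vcpu));	/* rejected */
	assert(!model_is_noncanonical_msr_value(la));		/* accepted */
}
```

In this model a value written directly by the guest through a passed-through
MSR would no longer be rejected later by an emulated WRMSR or a host
userspace restore, while non-WRMSR paths keep following the guest's paging
mode.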