linux-kernel - Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Z6N-kn1-p6nIWHeP@google.com>
Date: Wed, 5 Feb 2025 07:06:58 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Paul Durrant <paul@....org>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, 
	syzbot+cdeaeec70992eca2d920@...kaller.appspotmail.com, 
	Joao Martins <joao.m.martins@...cle.com>
Subject: Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial
 synthetic range

On Wed, Feb 05, 2025, David Woodhouse wrote:
> On Fri, 2025-01-31 at 17:13 -0800, Sean Christopherson wrote:
> > --- a/arch/x86/kvm/xen.c
> > +++ b/arch/x86/kvm/xen.c
> > @@ -1324,6 +1324,14 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
> >  	     xhc->blob_size_32 || xhc->blob_size_64))
> >  		return -EINVAL;
> >  
> > +	/*
> > +	 * Restrict the MSR to the range that is unofficially reserved for
> > +	 * synthetic, virtualization-defined MSRs, e.g. to prevent confusing
> > +	 * KVM by colliding with a real MSR that requires special handling.
> > +	 */
> > +	if (xhc->msr && (xhc->msr < 0x40000000 || xhc->msr > 0x4fffffff))
> > +		return -EINVAL;
> > +
> >  	mutex_lock(&kvm->arch.xen.xen_lock);
> >  
> >  	if (xhc->msr && !kvm->arch.xen_hvm_config.msr)
> 
> I'd prefer to see #defines for those magic values.

Can do.  Hmm, and since this would be visible to userspace, arguably the #defines
should go in arch/x86/include/uapi/asm/kvm.h

> Especially as there is a corresponding requirement that they never be set
> from host context (which is where the potential locking issues come in).
> Which train of thought leads me to ponder this as an alternative (or
> additional) solution:
> 
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3733,7 +3733,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         u32 msr = msr_info->index;
>         u64 data = msr_info->data;
>  
> -       if (msr && msr == vcpu->kvm->arch.xen_hvm_config.msr)
> +       /*
> +        * Do not allow host-initiated writes to trigger the Xen hypercall
> +        * page setup; it could incur locking paths which are not expected
> +        * if userspace sets the MSR in an unusual location.

That's just as likely to break userspace.  Doing a save/restore on the MSR doesn't
make a whole lot of sense since it's effectively a "command" MSR, but IMO it's not
any less likely than userspace putting the MSR index outside of the synthetic range.

Side topic, upstream QEMU doesn't even appear to put the MSR at the Hyper-V
address.  It tells the guest that's where the MSR is located, but the config
passed to KVM still uses the default.

        /* Hypercall MSR base address */
        if (hyperv_enabled(cpu)) {
            c->ebx = XEN_HYPERCALL_MSR_HYPERV;
            kvm_xen_init(cs->kvm_state, c->ebx);
        } else {
            c->ebx = XEN_HYPERCALL_MSR;
        }

...

        /* hyperv_enabled() doesn't work yet. */
        uint32_t msr = XEN_HYPERCALL_MSR;
        ret = kvm_xen_init(s, msr);
        if (ret < 0) {
            return ret;
        }

Userspace breakage aside, disallowng host writes would fix the immediate issue,
and I think would mitigate all concerns with putting the host at risk.  But it's
not enough to actually make an overlapping MSR index work.  E.g. if the MSR is
passed through to the guest, the write will go through to the hardware MSR, unless
the WRMSR happens to be emulated.

I really don't want to broadly support redirecting any MSR, because to truly go
down that path we'd need to deal with x2APIC, EFER, and other MSRs that have
special treatment and meaning.

While KVM's stance is usually that a misconfigured vCPU model is userspace's
problem, in this case I don't see any value in letting userspace be stupid.  It
can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's
a crazy use case I'm overlooking, there's no sane reason for userspace to put the
index in outside of the synthetic range (whereas defining seemingly nonsensical
CPUID feature bits is useful for testing purposes, implementing support in
userspace, etc).