linux-kernel - Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <48FAD370-09F1-47AA-8892-8BE29DC8A949@infradead.org>
Date: Wed, 05 Feb 2025 16:18:46 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Sean Christopherson <seanjc@...gle.com>
CC: Paolo Bonzini <pbonzini@...hat.com>, Paul Durrant <paul@....org>,
 kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 syzbot+cdeaeec70992eca2d920@...kaller.appspotmail.com,
 Joao Martins <joao.m.martins@...cle.com>
Subject: Re: [PATCH 1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range

On 5 February 2025 15:51:01 GMT, Sean Christopherson <seanjc@...gle.com> wrote:
>On Wed, Feb 05, 2025, David Woodhouse wrote:
>> 
>> Save/restore on the MSR makes no sense. It's a write-only MSR; writing
>> to it has no effect *other* than populating the target page. In KVM we
>> don't implement reading from it at all; I don't think Xen does either?
>
>Hah, that's another KVM bug, technically.  KVM relies on the MSR not being handled
>in order to generate the write-only semantics, but if the MSR index collides with
>an MSR that KVM emulates, then the MSR would be readable.  KVM supports Hyper-V's
>HV_X64_MSR_TSC_INVARIANT_CONTROL (0x40000118), so just a few hundred more MSRs
>until fireworks :-)

Nah, I don't see that as a bug. If there's a conflict then the Xen hypercall MSR "steals" writes from the one it conflicts with, sure. But since it's a write-only MSR the conflicting one still works fine for reads. Which means that Xen can "conflict" with a read-only MSR and nobody cares; arguably there's no bug at all in that case.

>> Those two happen in reverse chronological order, don't they? And in the
>> lower one the comment tells you that hyperv_enabled() doesn't work yet.
>> When the higher one is called later, it calls kvm_xen_init() *again* to
>> put the MSR in the right place.
>> 
>> It could be prettier, but I don't think it's broken, is it?
>
>Gah, -ENOCOFFEE.

I'll take the criticism though; that code is distinctly non-obvious, even with that hint in the comment about hyperv_enabled() not being usable yet.

ISTR we needed to do the Xen init early on, even before we knew precisely which MSR to use. And that's why we do it that way and then just call the function again later if we need to change the MSR. I'll see if that can be simplified, and at the very least I'll update the existing comment to explicitly state that the function will get called again later if needed.

You shouldn't *need* coffee to understand the code.

>> > Userspace breakage aside, disallowng host writes would fix the immediate issue,
>> > and I think would mitigate all concerns with putting the host at risk.  But it's
>> > not enough to actually make an overlapping MSR index work.  E.g. if the MSR is
>> > passed through to the guest, the write will go through to the hardware MSR, unless
>> > the WRMSR happens to be emulated.
>> > 
>> > I really don't want to broadly support redirecting any MSR, because to truly go
>> > down that path we'd need to deal with x2APIC, EFER, and other MSRs that have
>> > special treatment and meaning.
>> > 
>> > While KVM's stance is usually that a misconfigured vCPU model is userspace's
>> > problem, in this case I don't see any value in letting userspace be stupid.  It
>> > can't work generally, it creates unique ABI for KVM_SET_MSRS, and unless there's
>> > a crazy use case I'm overlooking, there's no sane reason for userspace to put the
>> > index in outside of the synthetic range (whereas defining seemingly nonsensical
>> > CPUID feature bits is useful for testing purposes, implementing support in
>> > userspace, etc).
>> 
>> Right, I think we should do *both*. Blocking host writes solves the
>> issue of locking problems with the hypercall page setup. All it would
>> take for that issue to recur is for us (or Microsoft) to invent a new
>> MSR in the synthetic range which is also written on vCPU init/reset.
>> And then the sanity check on where the VMM puts the Xen MSR doesn't
>> save us.
>
>Ugh, indeed.  MSRs are quite the conundrum.  Userspace MSR filters have a similar
>problem, where it's impossible to know the semantics of future hardware MSRs, and
>so it's impossible to document which MSRs userspace is allowed to intercept :-/
>
>Oh!  It doesn't help KVM avoid breaking userspace, but a way for QEMU to avoid a
>future collision would be to have QEMU start at 0x40000200 when Hyper-V is enabled,
>but then use KVM_GET_MSR_INDEX_LIST to detect a collision with KVM Hyper-V, e.g.
>increment the index until an available index is found (with sanity checks and whatnot).

Makes sense. I think that's a third separate patch, yes?

>> But yes, we should *also* do that sanity check.
>
>Ah, I'm a-ok with that.