linux-kernel - Re: [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails

Message-ID: <57313f38-5b2b-e352-7502-1a3a70fa4ef1@redhat.com>
Date:   Fri, 10 Dec 2021 23:41:14 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Hou Wenlong <houwenlong93@...ux.alibaba.com>
Subject: Re: [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if
 patching fails

On 12/10/21 23:29, Sean Christopherson wrote:
> Inject a #UD if patching in the correct hypercall fails, e.g. due to
> emulator_write_emulated() failing because RIP is mapped not-writable by
> the guest.  The guest is likely doomed in any case, but observing a #UD
> in the guest is far friendlier to debug/triage than a !WRITABLE #PF with
> CR2 pointing at the RIP of the faulting instruction.
> 
> Ideally, KVM wouldn't patch at all; it's the guest's responsibility to
> identify and use the correct hypercall instruction (VMCALL vs. VMMCALL).
> Sadly, older Linux kernels prior to commit c1118b3602c2 ("x86: kvm: use
> alternatives for VMCALL vs. VMMCALL if kernel text is read-only") do the
> wrong thing and blindly use VMCALL, i.e. removing the patching would
> break running VMs with older kernels.
> 
> One could argue that KVM should be "fixed" to ignore guest paging
> protections instead of injecting #UD, but patching in the first place was
> a mistake as it was a hack-a-fix for a guest bug.

Sort of.  I agree that patching is awful, but I'm not sure about 
injecting #UD vs. just doing the hypercall; the original reason for the 
patching was to allow Intel<->AMD cross-vendor migration to work somewhat.

That in turn promoted Linux's ill-conceived sloppiness of just using 
vmcall, which lasted until commit c1118b3602c2.

> There are myriad fatal
> issues with KVM's patching:
> 
>    1. Patches using an emulated guest write, which will fail if RIP is not
>       mapped writable.  This is the issue being mitigated.
> 
>    2. Doesn't ensure the write is "atomic", e.g. a hypercall that splits a
>       page boundary will be handled as two separate writes, which means
>       that a partial, corrupted instruction can be observed by a vCPU.

Only the third byte differs between VMCALL and VMMCALL, so that's not 
really a problem.  (Apparently what happened is that Microsoft asked 
Intel to use 0xc1 like AMD, and VMware asked AMD to use 0xd9 like Intel, 
or something like that; and they ended up swapping opcodes.  But this 
may be an urban legend, no matter how plausible).
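
To make the byte-level point concrete, here is a minimal sketch of the two encodings (VMCALL is 0f 01 c1, VMMCALL is 0f 01 d9). The array names and the count_differing_bytes() helper are made up for illustration and are not KVM code. Since the first two bytes are identical, rewriting one instruction into the other only changes the last byte, so a vCPU that fetches mid-update should see either the complete old or the complete new instruction, wherever the page boundary falls.

```c
#include <stddef.h>
#include <stdint.h>

/* Instruction encodings: Intel VMCALL and AMD VMMCALL (3 bytes each). */
static const uint8_t vmcall_insn[3]  = { 0x0f, 0x01, 0xc1 };
static const uint8_t vmmcall_insn[3] = { 0x0f, 0x01, 0xd9 };

/*
 * Hypothetical helper: count the byte positions at which two
 * equal-length encodings differ.
 */
static size_t count_differing_bytes(const uint8_t *a, const uint8_t *b,
				    size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (a[i] != b[i])
			n++;
	return n;
}
```

Calling count_differing_bytes(vmcall_insn, vmmcall_insn, 3) gives 1, and the single differing position is index 2.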

The big ones are 1 and 4.

Thanks,

Paolo

>    3. Doesn't serialize other CPU cores after updating the code stream.
> 
>    4. Completely fails to account for the case where KVM is emulating due
>       to invalid guest state with unrestricted_guest=0.  Patching and
>       retrying the instruction will result in vCPU getting stuck in an
>       infinite loop.
> 
> But, the "support" is _so_ awful, especially #1, that there's practically
> zero chance that a modern guest kernel can rely on KVM to patch the guest.
> So, rather than proliferate KVM's bad behavior any further than the
> absolute minimum needed for backwards compatibility, just try to make it
> suck a little less.
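
The behaviour the patch description proposes (try to patch, and inject #UD if the write fails) can be sketched as a toy model. Everything below is hypothetical and self-contained: toy_guest_page, toy_fix_hypercall() and the result enum are invented for illustration and do not correspond to KVM's actual emulator code or to emulator_write_emulated()'s real signature.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical outcomes of handling the "wrong" hypercall instruction. */
enum toy_result {
	TOY_RETRY_INSN,	/* patch succeeded, re-execute at the same RIP */
	TOY_INJECT_UD,	/* patch failed, surface a #UD to the guest */
};

/* Toy stand-in for the guest mapping that backs the faulting RIP. */
struct toy_guest_page {
	int writable;		/* nonzero if the guest maps RIP writable */
	uint8_t insn[3];	/* the 3-byte hypercall instruction at RIP */
};

/*
 * Toy model of the fallback: attempt the emulated write of the native
 * hypercall instruction; if the mapping is not writable, do not touch
 * guest memory and report that a #UD should be injected instead.
 */
static enum toy_result toy_fix_hypercall(struct toy_guest_page *pg,
					 const uint8_t native[3])
{
	if (!pg->writable)
		return TOY_INJECT_UD;

	memcpy(pg->insn, native, 3);
	return TOY_RETRY_INSN;
}
```

In this model a read-only mapping leaves the guest bytes untouched and yields TOY_INJECT_UD, which is the "friendlier to debug" outcome the description argues for, as opposed to a !WRITABLE #PF.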