linux-kernel - Re: [PATCH 7/9] KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn buffer

Message-ID: <ac496e47-c949-0e9d-4735-d51a7c9c0f62@oracle.com>
Date:   Thu, 20 Jan 2022 16:11:02 +0000
From:   Liam Merwick <liam.merwick@...cle.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Tom Lendacky <thomas.lendacky@....com>,
        Brijesh Singh <brijesh.singh@....com>,
        Liam Merwick <liam.merwick@...cle.com>
Subject: Re: [PATCH 7/9] KVM: SVM: Inject #UD on attempted emulation for SEV
 guest w/o insn buffer

On 20/01/2022 01:07, Sean Christopherson wrote:
> Inject #UD if KVM attempts emulation for an SEV guest without an insn
> buffer and instruction decoding is required.  The previous behavior of
> allowing emulation if there is no insn buffer is undesirable as doing so
> means KVM is reading guest private memory and thus decoding cyphertext,
> i.e. is emulating garbage.  The check was previously necessary as the
> emulation type was not provided, i.e. SVM needed to allow emulation to
> handle completion of emulation after exiting to userspace to handle I/O.
> 

A few "cyphertext" references (s/cyphertext/ciphertext/)...

> Signed-off-by: Sean Christopherson <seanjc@...gle.com>

Reviewed-by: Liam Merwick <liam.merwick@...cle.com>
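
For anyone skimming the diff below, here is a rough user-space model of the check order the patch introduces for a plain SEV (non-ES) guest. It is only a sketch: the `model_*` names and the flag value are hypothetical stand-ins rather than kernel definitions, and what KVM does after the erratum heuristic is outside the quoted hunk, so it is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; the kernel defines its own EMULTYPE_NO_DECODE value. */
#define MODEL_EMULTYPE_NO_DECODE (1u << 0)

enum model_verdict {
	MODEL_EMULATE,       /* emulation may proceed */
	MODEL_INJECT_UD,     /* refuse emulation and queue #UD */
	MODEL_CHECK_ERRATUM, /* empty buffer: fall through to the erratum heuristic */
};

/*
 * Mirrors the order of the new checks: an already-decoded instruction first,
 * then a missing buffer, then a non-empty buffer, and finally the empty-buffer
 * case that the erratum 1096 comment covers.
 */
static enum model_verdict model_sev_can_emulate(unsigned int emul_type,
						const void *insn, int insn_len)
{
	if (emul_type & MODEL_EMULTYPE_NO_DECODE)
		return MODEL_EMULATE;

	if (!insn)
		return MODEL_INJECT_UD;

	if (insn_len)
		return MODEL_EMULATE;

	return MODEL_CHECK_ERRATUM;
}

/*
 * The heuristic from the rewritten comment: the erratum can only have been
 * hit if CR4.SMAP=1 and either CR4.SMEP=0 or CPL=3.  With SMEP=1 and CPL!=3
 * the guest would have taken a SMEP violation #PF instead of a #NPF.
 */
static bool model_erratum_possible(bool smap, bool smep, bool cpl_is_3)
{
	return smap && (!smep || cpl_is_3);
}
```

With this model, a NULL buffer without the no-decode flag yields the #UD verdict, a buffer holding at least one byte allows emulation, and a present but empty buffer is the only case that reaches the SMAP/SMEP/CPL heuristic.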

> ---
>   arch/x86/kvm/svm/svm.c | 89 ++++++++++++++++++++++++++----------------
>   1 file changed, 55 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index ed2ca875b84b..d324183fc596 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4277,49 +4277,70 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
>   	if (sev_es_guest(vcpu->kvm))
>   		return false;
>   
> +	/*
> +	 * Emulation is possible if the instruction is already decoded, e.g.
> +	 * when completing I/O after returning from userspace.
> +	 */
> +	if (emul_type & EMULTYPE_NO_DECODE)
> +		return true;
> +
> +	/*
> +	 * Emulation is possible for SEV guests if and only if a prefilled
> +	 * buffer containing the bytes of the intercepted instruction is
> +	 * available. SEV guest memory is encrypted with a guest specific key
> +	 * and cannot be decrypted by KVM, i.e. KVM would read cyphertext and
> +	 * decode garbage.
> +	 *
> +	 * Inject #UD if KVM reached this point without an instruction buffer.
> +	 * In practice, this path should never be hit by a well-behaved guest,
> +	 * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path
> +	 * is still theoretically reachable, e.g. via unaccelerated fault-like
> +	 * AVIC access, and needs to be handled by KVM to avoid putting the
> +	 * guest into an infinite loop.  Injecting #UD is somewhat arbitrary,
> +	 * but it's the least awful option given lack of insight into the guest.
> +	 */
> +	if (unlikely(!insn)) {
> +		kvm_queue_exception(vcpu, UD_VECTOR);
> +		return false;
> +	}
> +
> +	/*
> +	 * Emulate for SEV guests if the insn buffer is not empty.  The buffer
> +	 * will be empty if the DecodeAssist microcode cannot fetch bytes for
> +	 * the faulting instruction because the code fetch itself faulted, e.g.
> +	 * the guest attempted to fetch from emulated MMIO or a guest page
> +	 * table used to translate CS:RIP resides in emulated MMIO.
> +	 */
> +	if (likely(insn_len))
> +		return true;
> +
>   	/*
>   	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
>   	 *
>   	 * Errata:
> -	 * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
> -	 * possible that CPU microcode implementing DecodeAssist will fail
> -	 * to read bytes of instruction which caused #NPF. In this case,
> -	 * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
> -	 * return 0 instead of the correct guest instruction bytes.
> -	 *
> -	 * This happens because CPU microcode reading instruction bytes
> -	 * uses a special opcode which attempts to read data using CPL=0
> -	 * privileges. The microcode reads CS:RIP and if it hits a SMAP
> -	 * fault, it gives up and returns no instruction bytes.
> +	 * When CPU raises #NPF on guest data access and vCPU CR4.SMAP=1, it is
> +	 * possible that CPU microcode implementing DecodeAssist will fail to
> +	 * read guest memory at CS:RIP and vmcb.GuestIntrBytes will incorrectly
> +	 * be '0'.  This happens because microcode reads CS:RIP using a _data_
> +	 * load uop with CPL=0 privileges.  If the load hits a SMAP #PF, ucode
> +	 * gives up and does not fill the instruction bytes buffer.
>   	 *
>   	 * Detection:
> -	 * We reach here in case CPU supports DecodeAssist, raised #NPF and
> -	 * returned 0 in GuestIntrBytes field of the VMCB.
> -	 * First, errata can only be triggered in case vCPU CR4.SMAP=1.
> -	 * Second, if vCPU CR4.SMEP=1, errata could only be triggered
> -	 * in case vCPU CPL==3 (Because otherwise guest would have triggered
> -	 * a SMEP fault instead of #NPF).
> -	 * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.
> -	 * As most guests enable SMAP if they have also enabled SMEP, use above
> -	 * logic in order to attempt minimize false-positive of detecting errata
> -	 * while still preserving all cases semantic correctness.
> +	 * KVM reaches this point if the VM is an SEV guest, the CPU supports
> +	 * DecodeAssist, a #NPF was raised, KVM's page fault handler triggered
> +	 * emulation (e.g. for MMIO), and the CPU returned 0 in GuestIntrBytes
> +	 * field of the VMCB.
>   	 *
> -	 * Workaround:
> -	 * To determine what instruction the guest was executing, the hypervisor
> -	 * will have to decode the instruction at the instruction pointer.
> +	 * This does _not_ mean that the erratum has been encountered, as the
> +	 * DecodeAssist will also fail if the load for CS:RIP hits a legitimate
> +	 * #PF, e.g. if the guest attempted to execute from emulated MMIO and
> +	 * encountered a reserved/not-present #PF.
>   	 *
> -	 * In non SEV guest, hypervisor will be able to read the guest
> -	 * memory to decode the instruction pointer when insn_len is zero
> -	 * so we return true to indicate that decoding is possible.
> -	 *
> -	 * But in the SEV guest, the guest memory is encrypted with the
> -	 * guest specific key and hypervisor will not be able to decode the
> -	 * instruction pointer so we will not able to workaround it. Lets
> -	 * print the error and request to kill the guest.
> +	 * To reduce the likelihood of false positives, take action if and only
> +	 * if CR4.SMAP=1 (obviously required to hit the erratum) and CR4.SMEP=0
> +	 * or CPL=3.  If SMEP=1 and CPL!=3, the erratum cannot have been hit as
> +	 * the guest would have encountered a SMEP violation #PF, not a #NPF.
>   	 */
> -	if (likely(!insn || insn_len))
> -		return true;
> -
>   	cr4 = kvm_read_cr4(vcpu);
>   	smep = cr4 & X86_CR4_SMEP;
>   	smap = cr4 & X86_CR4_SMAP;