lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c4f06a75-412c-d546-9ce7-4bf4cc49d102@redhat.com>
Date:   Thu, 1 Apr 2021 21:56:50 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>, kvm@...r.kernel.org
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Wanpeng Li <wanpengli@...cent.com>,
        Borislav Petkov <bp@...en8.de>,
        Jim Mattson <jmattson@...gle.com>,
        "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)" 
        <linux-kernel@...r.kernel.org>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Joerg Roedel <joro@...tes.org>,
        Ingo Molnar <mingo@...hat.com>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH 3/4] KVM: x86: correctly merge pending and injected
 exception

On 01/04/21 16:38, Maxim Levitsky wrote:
> +static int kvm_do_deliver_pending_exception(struct kvm_vcpu *vcpu)
> +{
> +	int class1, class2, ret;
> +
> +	/* try to deliver current pending exception as VM exit */
> +	if (is_guest_mode(vcpu)) {
> +		ret = kvm_x86_ops.nested_ops->deliver_exception_as_vmexit(vcpu);
> +		if (ret || !vcpu->arch.pending_exception.valid)
> +			return ret;
> +	}
> +
> +	/* No injected exception, so just deliver the payload and inject it */
> +	if (!vcpu->arch.injected_exception.valid) {
> +		trace_kvm_inj_exception(vcpu->arch.pending_exception.nr,
> +					vcpu->arch.pending_exception.has_error_code,
> +					vcpu->arch.pending_exception.error_code);
> +queue:

If you move the queue label to the top of the function, you can "goto queue" for #DF as well and you don't need to call kvm_do_deliver_pending_exception again.  In fact you can merge this function and kvm_deliver_pending_exception completely:


static int kvm_deliver_pending_exception_as_vmexit(struct kvm_vcpu *vcpu)
{
	WARN_ON(!vcpu->arch.pending_exception.valid);
	if (is_guest_mode(vcpu))
		return kvm_x86_ops.nested_ops->deliver_exception_as_vmexit(vcpu);
	else
		return 0;
}

static int kvm_merge_injected_exception(struct kvm_vcpu *vcpu)
{
	/*
	 * First check if the pending exception takes precedence
	 * over the injected one, which will be reported in the
	 * vmexit info.
	 */
	ret = kvm_deliver_pending_exception_as_vmexit(vcpu);
	if (ret || !vcpu->arch.pending_exception.valid)
		return ret;

	if (vcpu->arch.injected_exception.nr == DF_VECTOR) {
		...
		return 0;
	}
	...
	if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
	    || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
		...
	}
	vcpu->arch.injected_exception.valid = false;
}

static int kvm_deliver_pending_exception(struct kvm_vcpu *vcpu)
{
	if (!vcpu->arch.pending_exception.valid)
		return 0;

	if (vcpu->arch.injected_exception.valid)
		kvm_merge_injected_exception(vcpu);

	ret = kvm_deliver_pending_exception_as_vmexit(vcpu));
	if (ret || !vcpu->arch.pending_exception.valid)
		return ret;

	trace_kvm_inj_exception(vcpu->arch.pending_exception.nr,
				vcpu->arch.pending_exception.has_error_code,
				vcpu->arch.pending_exception.error_code);
	...
}

Note that if the pending exception is a page fault, its payload
must be delivered to CR2 before converting it to a double fault.

Going forward to vmx.c:

> 
>  	if (mtf_pending) {
>  		if (block_nested_events)
>  			return -EBUSY;
> +
>  		nested_vmx_update_pending_dbg(vcpu);

Should this instead "WARN_ON(vmx_pending_dbg_trap(vcpu));" since
the pending-#DB-plus-MTF combination is handled in
vmx_deliver_exception_as_vmexit?...

> 
> +
> +	if (vmx->nested.mtf_pending && vmx_pending_dbg_trap(vcpu)) {
> +		/*
> +		 * A pending monitor trap takes precedence over pending
> +		 * debug exception which is 'stashed' into
> +		 * 'GUEST_PENDING_DBG_EXCEPTIONS'
> +		 */
> +
> +		nested_vmx_update_pending_dbg(vcpu);
> +		vmx->nested.mtf_pending = false;
> +		nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
> +		return 0;
> +	}

... though this is quite ugly, even more so if you add the case of an
INIT with a pending #DB.  The problem is that INIT and MTF have higher
priority than debug exceptions.

The good thing is that an INIT or MTF plus #DB cannot happen with
nested_run_pending == 1, so it will always be inject right away.

There is precedent with KVM_GET_* modifying the CPU state; in
particular, KVM_GET_MPSTATE can modify CS and RIP and even cause a
vmexit via kvm_apic_accept_events.  And in fact, because
kvm_apic_accept_events calls kvm_check_nested_events, calling it
from KVM_GET_VCPU_EVENTS would fix the problem: the injected exception
would go into the IDT-vectored exit field, while the pending exception
would go into GUEST_PENDING_DBG_EXCEPTION and disappear.

However, you cannot do kvm_apic_accept_events twice because there would
be a race with INIT: a #DB exception could be first stored by
KVM_GET_VCPU_EVENTS, then disappear when kvm_apic_accept_events
KVM_GET_MPSTATE is called.

Fortunately, the correct order for KVM_GET_* events is first
KVM_GET_VCPU_EVENTS and then KVM_GET_MPSTATE.  So perhaps
instead of calling kvm_deliver_pending_exception on vmexit,
KVM_GET_VCPU_EVENTS could call kvm_apic_accept_events (and thus
kvm_check_nested_events) instead of KVM_GET_MPSTATE.  In addition:
nested_ops.check_events would be split in two, with high-priority
events (triple fault, now in kvm_check_nested_events; INIT; MTF)
remaining in the first and interrupts in the second, tentatively
named check_interrupts).

I'll take a look at cleaning up the current mess we have around
kvm_apic_accept_events, where most of the calls do not have to
handle guest mode at all.

Thanks for this work and for showing that it's possible to fix the
underlying mess with exception vmexit.  It may be a bit more
complicated than this, but it's a great start.

Paolo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ