linux-kernel - Re: [PATCH 3/5] KVM: nSVM: Don't forget about L1-injected events

Message-ID: <YkTlxCV9wmA3fTlN@google.com>
Date:   Wed, 30 Mar 2022 23:20:36 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     "Maciej S. Szmigiero" <mail@...iej.szmigiero.name>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Tom Lendacky <thomas.lendacky@....com>,
        Brijesh Singh <brijesh.singh@....com>,
        Jon Grimm <Jon.Grimm@....com>,
        David Kaplan <David.Kaplan@....com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Liam Merwick <liam.merwick@...cle.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/5] KVM: nSVM: Don't forget about L1-injected events

On Thu, Mar 31, 2022, Maciej S. Szmigiero wrote:
> On 30.03.2022 23:59, Sean Christopherson wrote:
> > On Thu, Mar 10, 2022, Maciej S. Szmigiero wrote:
> > > @@ -3627,6 +3632,14 @@ static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
> > >   	if (!(exitintinfo & SVM_EXITINTINFO_VALID))
> > >   		return;
> > > +	/* L1 -> L2 event re-injection needs a different handling */
> > > +	if (is_guest_mode(vcpu) &&
> > > +	    exit_during_event_injection(svm, svm->nested.ctl.event_inj,
> > > +					svm->nested.ctl.event_inj_err)) {
> > > +		nested_svm_maybe_reinject(vcpu);
> > 
> > Why is this manually re-injecting?  More specifically, why does the below (out of
> > sight in the diff) code that re-queues the exception/interrupt not work?  The
> > re-queued event should be picked up by nested_save_pending_event_to_vmcb12() and
> > propagated to vmcb12.
> 
> An L1 -> L2 injected event should either be re-injected until successfully
> injected into L2 or propagated to VMCB12 if there is a nested VMEXIT
> during its delivery.
> 
> svm_complete_interrupts() does not do such re-injection in some cases
> (soft interrupts, soft exceptions, #VC) - it is trying to resort to
> emulation instead, which is incorrect in this case.
> 
> I think it's better to split out this L1 -> L2 nested case to a
> separate function in nested.c rather than to fill
> svm_complete_interrupts() in already very large svm.c with "if" blocks
> here and there.

Ah, I see it now.  WTF.

Ugh, commit 66fd3f7f901f ("KVM: Do not re-execute INTn instruction.") fixed VMX,
but left SVM broken.

Re-executing the INTn is wrong because the instruction has already completed
decode and execution.  E.g. if there's a code breakpoint on the INTn, rewinding
will cause a spurious #DB.

KVM's INT3 shenanigans are bonkers, but I guess there's no better option given
that the APM says "Software interrupts cannot be properly injected if the processor
does not support the NextRIP field.".  What a mess.

Anyways, for the common nrips=true case, I strongly prefer that we properly fix
the non-nested case and re-inject software interrupts, which should in turn
naturally fix this nested case.  And for nrips=false, my vote is to either punt
and document it as a "KVM erratum", or straight up make nested require nrips.
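
To make the proposed policy concrete, here is a rough standalone sketch (not KVM
code) of the decision: decode an EXITINTINFO-style value, and refuse to re-inject
software events when NextRIP is unavailable.  The macro names, the enum, and
plan_reinjection() are all hypothetical; only the bit layout (V in bit 31, TYPE
in bits 10:8, vector in bits 7:0) follows the APM.  Treating #BP and #OF as
"soft" exceptions is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the APM's EXITINTINFO layout (names are hypothetical). */
#define INTINFO_VEC_MASK   0xffu
#define INTINFO_TYPE_SHIFT 8
#define INTINFO_TYPE_MASK  (7u << INTINFO_TYPE_SHIFT)
#define INTINFO_VALID      (1u << 31)

#define INTINFO_TYPE_INTR  0u	/* external interrupt */
#define INTINFO_TYPE_NMI   2u
#define INTINFO_TYPE_EXEPT 3u	/* exception */
#define INTINFO_TYPE_SOFT  4u	/* software interrupt (INTn) */

enum reinject_plan {
	PLAN_NONE,		/* nothing was being delivered */
	PLAN_REINJECT,		/* re-queue the event as-is */
	PLAN_UNSUPPORTED,	/* soft event without NextRIP: punt */
};

/* Hypothetical helper: what to do with an event interrupted by a VM-Exit. */
static enum reinject_plan plan_reinjection(uint32_t exitintinfo, bool nrips)
{
	uint32_t type, vector;
	bool soft;

	if (!(exitintinfo & INTINFO_VALID))
		return PLAN_NONE;

	type = (exitintinfo & INTINFO_TYPE_MASK) >> INTINFO_TYPE_SHIFT;
	vector = exitintinfo & INTINFO_VEC_MASK;

	/* Assumption: INTn, plus #BP (3) and #OF (4), count as software events. */
	soft = type == INTINFO_TYPE_SOFT ||
	       (type == INTINFO_TYPE_EXEPT && (vector == 3 || vector == 4));

	/* Software events need NextRIP to be re-injected properly. */
	if (soft && !nrips)
		return PLAN_UNSUPPORTED;

	return PLAN_REINJECT;
}
```

The point of the sketch is only that the nrips=false case is the odd one out;
everything else takes the same re-inject path, nested or not.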

Note that this also requires updating svm_queue_exception(), which assumes it
will only be handed hardware exceptions, i.e. hardcodes type EXEPT.  That's
blatantly wrong, e.g. if userspace injects a software exception via
KVM_SET_VCPU_EVENTS.
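
As a rough illustration of not hardcoding the type, the standalone sketch below
(again not KVM code) builds an EVENTINJ-style value with the event type passed in
by the caller, alongside a helper that flags software exceptions.  All names are
hypothetical; the bit layout (vector in 7:0, TYPE in 10:8, EV in bit 11, V in bit
31) follows the APM.  Which type a software exception should actually be injected
with is deliberately left to the caller, as that isn't settled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the APM's EVENTINJ layout (names are hypothetical). */
#define SK_EVTINJ_VEC_MASK   0xffu
#define SK_EVTINJ_TYPE_SHIFT 8
#define SK_EVTINJ_VALID_ERR  (1u << 11)	/* error code is valid */
#define SK_EVTINJ_VALID      (1u << 31)

enum sk_evt_type {
	SK_EVT_INTR  = 0,	/* external interrupt */
	SK_EVT_NMI   = 2,
	SK_EVT_EXEPT = 3,	/* exception */
	SK_EVT_SOFT  = 4,	/* software interrupt (INTn) */
};

/* Assumption: #BP (vector 3) and #OF (vector 4) are the software exceptions. */
static bool sk_exception_is_soft(uint8_t vector)
{
	return vector == 3 || vector == 4;
}

/*
 * Hypothetical helper: assemble the low 32 bits of an EVENTINJ value.  The
 * type is a parameter instead of being hardcoded, so the caller can tell
 * hardware and software events apart.
 */
static uint32_t sk_build_event_inj(uint8_t vector, enum sk_evt_type type,
				   bool has_error_code)
{
	uint32_t val = (uint32_t)vector & SK_EVTINJ_VEC_MASK;

	val |= (uint32_t)type << SK_EVTINJ_TYPE_SHIFT;
	if (has_error_code)
		val |= SK_EVTINJ_VALID_ERR;

	return val | SK_EVTINJ_VALID;
}
```

E.g. a #PF with an error code would be sk_build_event_inj(14, SK_EVT_EXEPT, true),
while a caller handed vector 3 can consult sk_exception_is_soft() before deciding
how to inject it.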