linux-kernel - Re: [PATCH v2 09/21] KVM: nVMX: Unconditionally clear mtf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4069fcb7e559b0732681ecc9234f9910b59df411.camel@redhat.com>
Date:   Wed, 06 Jul 2022 23:03:09 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Oliver Upton <oupton@...gle.com>,
        Peter Shier <pshier@...gle.com>
Subject: Re: [PATCH v2 09/21] KVM: nVMX: Unconditionally clear mtf_pending
 on nested VM-Exit

On Wed, 2022-07-06 at 16:45 +0000, Sean Christopherson wrote:
> On Wed, Jul 06, 2022, Maxim Levitsky wrote:
> > On Tue, 2022-06-14 at 20:47 +0000, Sean Christopherson wrote:
> > > Clear mtf_pending on nested VM-Exit instead of handling the clear on a
> > > case-by-case basis in vmx_check_nested_events().  The pending MTF should
> > > rever survive nested VM-Exit, as it is a property of KVM's run of the
> > ^^ typo: never
> > 
> > Also it is not clear what the 'case by case' means.
> > 
> > I see that the vmx_check_nested_events always clears it unless nested run is pending
> > or we re-inject an event.
> 
> Those two "unless ..." are the "cases".  The point I'm trying to make in the changelog
> is that there's no need for any conditional logic whatsoever.
> 
> > > @@ -3927,6 +3919,9 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> > >  		clear_bit(KVM_APIC_INIT, &apic->pending_events);
> > >  		if (vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED)
> > >  			nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0);
> > > +
> > > +		/* MTF is discarded if the vCPU is in WFS. */
> > > +		vmx->nested.mtf_pending = false;
> > >  		return 0;
> > 
> > I guess MTF should also be discarded if we enter SMM, and I see that
> > VMX also enter SMM with a pseudo VM exit (in vmx_enter_smm) which
> > will clear the MTF. Good.
> 
> No, a pending MTF should be preserved across SMI. 

Indeed, now I see it:

"If an MTF VM exit was pending at the time of the previous SMI, an MTF VM exit is pending on the instruction
boundary following execution of RSM. The following items detail the treatment of MTF VM exits that may be
pending following RSM:"

You might also want to add it as some comment in the source.



>  It's not a regression because
> KVM incorrectly prioritizes MTF (and trap-like #DBs) over SMI (and because if KVM
> did prioritize SMI, the existing code would also drop the pending MTF).  Note, this
> isn't the only flaw that needs to be addressed in order to correctly prioritize SMIs,
> e.g. KVM_{G,S}ET_NESTED_STATE would need to save/restore a pending MTF if the vCPU is
> in SMM after an SMI that arrived while L2 was active.

When we fix this, should we store it to SMRAM, or to some KVM internal state?
Or VMCS12, as noted in the other mail.


> 
> Tangentially related, KVM's pseudo VM-Exit on SMI emulation is completely wrong[*].
> 
> [*] https://lore.kernel.org/all/Yobt1XwOfb5M6Dfa@google.com
> 


I have seen a patch on the KVM mailing list recently exactly about preserving CET
state in the SMRAM, I'll need to take a look.



Best regards,
	Maxim Levitsky