Date:   Thu, 9 Sep 2021 18:59:05 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Xiaoyao Li <xiaoyao.li@...el.com>
Cc:     Chenyi Qiang <chenyi.qiang@...el.com>, pbonzini@...hat.com,
        vkuznets@...hat.com, wanpengli@...cent.com, jmattson@...gle.com,
        joro@...tes.org, tglx@...utronix.de, mingo@...hat.com,
        bp@...en8.de, hpa@...or.com, x86@...nel.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] KVM: VMX: Enable Notify VM exit

On Tue, Sep 07, 2021, Xiaoyao Li wrote:
> On 9/3/2021 12:36 AM, Sean Christopherson wrote:
> > On Thu, Sep 02, 2021, Sean Christopherson wrote:
> > > On Tue, Aug 03, 2021, Xiaoyao Li wrote:
> > > > On 8/2/2021 11:46 PM, Sean Christopherson wrote:
> > > > > > > > @@ -5642,6 +5653,31 @@ static int handle_bus_lock_vmexit(struct kvm_vcpu *vcpu)
> > > > > > > >     	return 0;
> > > > > > > >     }
> > > > > > > > +static int handle_notify(struct kvm_vcpu *vcpu)
> > > > > > > > +{
> > > > > > > > +	unsigned long exit_qual = vmx_get_exit_qual(vcpu);
> > > > > > > > +
> > > > > > > > +	if (!(exit_qual & NOTIFY_VM_CONTEXT_INVALID)) {
> > > > > > > 
> > > > > > > What does CONTEXT_INVALID mean?  The ISE doesn't provide any information whatsoever.
> > > > > > 
> > > > > > It means whether the VM context is corrupted and not valid in the VMCS.
> > > > > 
> > > > > Well that's a bit terrifying.  Under what conditions can the VM context become
> > > > > corrupted?  E.g. if the context can be corrupted by an inopportune NOTIFY exit,
> > > > > then KVM needs to be ultra conservative as a false positive could be fatal to a
> > > > > guest.
> > > > > 
> > > > 
> > > > Short answer is no case will set the VM_CONTEXT_INVALID bit.
> > > 
> > > But something must set it, otherwise it wouldn't exist.
> 
> For existing Intel silicon, no case will set it. Maybe in the future new
> case will set it.
> 
> > The condition(s) under
> > > which it can be set matters because it affects how KVM should respond.  E.g. if
> > > the guest can trigger VM_CONTEXT_INVALID at will, then we should probably treat
> > > it as a shutdown and reset the VMCS.
> > 
> > Oh, and "shutdown" would be relative to the VMCS, i.e. if L2 triggers a NOTIFY
> > exit with VM_CONTEXT_INVALID then KVM shouldn't kill the entire VM.  The least
> > awful option would probably be to synthesize a shutdown VM-Exit to L1.  That
> > won't communicate to L1 that vmcs12 state is stale/bogus, but I don't see any way
> > to handle that via an existing VM-Exit reason :-/
> > 
> > > But if VM_CONTEXT_INVALID can occur if and only if there's a hardware/ucode
> > > issue, then we can do:
> > > 
> > > 	if (KVM_BUG_ON(exit_qual & NOTIFY_VM_CONTEXT_INVALID, vcpu->kvm))
> > > 		return -EIO;
> > > 
> > > Either way, to enable this by default we need some form of documentation that
> > > describes what conditions lead to VM_CONTEXT_INVALID.
> 
> I still don't know why the conditions lead to it matters. I think the
> consensus is that once VM_CONTEXT_INVALID happens, the vcpu can no longer
> run.

Yes, and no longer being able to run the vCPU is precisely the problem.  The
condition(s) matter because if there's a possibility, however small, that enabling
NOTIFY_WINDOW can kill a well-behaved guest then it absolutely cannot be enabled by
default.

> Either KVM_BUG_ON() or a specific EXIT to userspace should be OK?

Not if the VM_CONTEXT_INVALID happens while L2 is running.  If software can trigger
VM_CONTEXT_INVALID at will, then killing the VM would open up the door to a
malicious L2 killing L1 (which would be rather ironic since this is an anti-DoS
feature).  IIUC, VM_CONTEXT_INVALID only means the current VMCS is garbage, thus
an occurrence while L2 is active means that vmcs02 is junk, but L1's state in vmcs01,
vmcs12, etc... is still valid.
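
To make the distinction concrete, here is a rough standalone sketch of the policy described above. It is not KVM code: the bit position of NOTIFY_VM_CONTEXT_INVALID, the enum, and classify_notify_exit() are all hypothetical names made up for illustration, and resuming the guest on a benign NOTIFY exit is an assumption, since the body of that branch is not shown in the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the thread does not give the real encoding. */
#define NOTIFY_VM_CONTEXT_INVALID (1ULL << 0)

/* Hypothetical outcomes, for illustration only. */
enum notify_action {
	NOTIFY_RESUME_GUEST,      /* context still valid (assumed: re-enter the guest) */
	NOTIFY_SHUTDOWN_TO_L1,    /* vmcs02 is junk: synthesize a shutdown VM-Exit to L1 */
	NOTIFY_EXIT_TO_USERSPACE, /* vmcs01 is junk: the vCPU can no longer run */
};

/*
 * Decide how to react to a NOTIFY exit.  Only the VMCS that was active when
 * the exit occurred is treated as invalid, so an exit taken while L2 is
 * running must not take down L1.
 */
static enum notify_action classify_notify_exit(uint64_t exit_qual, bool l2_active)
{
	if (!(exit_qual & NOTIFY_VM_CONTEXT_INVALID))
		return NOTIFY_RESUME_GUEST;

	return l2_active ? NOTIFY_SHUTDOWN_TO_L1 : NOTIFY_EXIT_TO_USERSPACE;
}
```

The point of keying on l2_active is that a malicious L2 which can trigger VM_CONTEXT_INVALID only ever loses its own context; whether a bare exit to userspace is acceptable for the L1 case still depends on what conditions can actually set the bit.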
