lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 05 May 2020 14:40:43 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
        Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org
Subject: Re: AVIC related warning in enable_irq_window

On Tue, 2020-05-05 at 14:55 +0700, Suravee Suthikulpanit wrote:
> Paolo / Maxim,
> 
> On 5/4/20 5:49 PM, Paolo Bonzini wrote:
> > On 04/05/20 12:37, Suravee Suthikulpanit wrote:
> > > On 5/4/20 4:25 PM, Paolo Bonzini wrote:
> > > > On 04/05/20 11:13, Maxim Levitsky wrote:
> > > > > On Mon, 2020-05-04 at 15:46 +0700, Suravee Suthikulpanit wrote:
> > > > > > On 5/2/20 11:42 PM, Paolo Bonzini wrote:
> > > > > > > On 02/05/20 15:58, Maxim Levitsky wrote:
> > > > > > > > The AVIC is disabled by svm_toggle_avic_for_irq_window, which calls
> > > > > > > > kvm_request_apicv_update, which broadcasts the
> > > > > > > > KVM_REQ_APICV_UPDATE vcpu request,
> > > > > > > > however it doesn't broadcast it to CPU on which now we are
> > > > > > > > running, which seems OK,
> > > > > > > > because the code that handles that broadcast runs on each VCPU
> > > > > > > > entry, thus when this CPU will enter guest mode it will notice 
> > > > > > > > and disable the AVIC. >>>>>>>
> > > > > > > > However later in svm_enable_vintr, there is test
> > > > > > > > 'WARN_ON(kvm_vcpu_apicv_active(&svm->vcpu));'
> > > > > > > > which is still true on current CPU because of the above.
> > > > > > > 
> > > > > > > Good point!  We can just remove the WARN_ON I think.  Can you send
> > > > > > > a patch?
> > > > > > > 
> > > > > > 
> > > > > > Instead, as an alternative to remove the WARN_ON(), would it be
> > > > > > better to just explicitly calling kvm_vcpu_update_apicv(vcpu) 
> > > > > > to update the apicv_active flag right after kvm_request_apicv_update()?
> > > > > > 
> > > > > 
> > > > > This should work IMHO, other that the fact kvm_vcpu_update_apicv will
> > > > > be called again, when this vcpu is entered since the KVM_REQ_APICV_UPDATE
> > > > > will still be pending on it.
> > > > > It shoudn't be a problem, and we can even add a check to do nothing
> > > > > when it is called while avic is already in target enable state.
> > > > 
> > > > I thought about that but I think it's a bit confusing.  If we want to
> > > > keep the WARN_ON, Maxim can add an equivalent one to svm_vcpu_run, which
> > > > is even better because the invariant is clearer.
> > > > 
> > > > WARN_ON((vmcb->control.int_ctl & (AVIC_ENABLE_MASK | V_IRQ_MASK))
> > > >      == (AVIC_ENABLE_MASK | V_IRQ_MASK));
> > > > 
> 
> Based on my experiment, it seems that the hardware sets the V_IRQ_MASK bit
> when #VMEXIT despite this bit being ignored when AVIC is enabled.
> (I'll double check w/ HW team on this.) In this case, I don't think we can
> use the WARN_ON() as suggested above.
> 
> I think we should keep the warning in the svm_set_vintr() since we want to know
> if the V_IRQ, V_INTR_PRIO, V_IGN_TPR, and V_INTR_VECTOR are ignored when calling
> svm_set_vintr().
> 
> Instead, I would consider explicitly call kvm_vcpu_update_apicv() since it would
> be benefit from not having to wait for the next vcpu_enter_guest for this vcpu to process
> the request. This is less confusing to me. In this case, we would need to
> kvm_clear_request(KVM_REQ_APICV_UPDATE) for this vcpu as well.
> 
> On the other hand, would be it useful to implement kvm_make_all_cpus_request_but_self(),
> which sends request to all other corpus excluding itself?
> 
> > By the way, there is another possible cleanup: the clearing
> > of V_IRQ_MASK can be removed from interrupt_window_interception since it
> > has already called svm_clear_vintr.
> 
> Maxim, I can help with the clean up patches if you would prefer.

I currently am waiting for the decision on how to we are going to fix this.
I don't have a strong opinion on how to fix this, but at least I think that we know
what is going on. 

Initially I was thinking that something was broken in AVIC, especially when I noticed
that guest would hang when I did LatencyMon benchmark in it.
Luckily the other fix that I tested and reviewed seems to fix those hangs.

In a few days I plan to do some nvme passthrough stress testing as I used to do when I was
developing my nvme-mdev driver with AVIC. I am very curios on how this will turn out.

Best regards,
	Maxim Levitsky

> 
> Thanks,
> Suravee
> 
> 
> > Paolo
> > 
> 
> 


Powered by blists - more mailing lists