Date:   Tue, 26 Sep 2023 22:04:01 -0700
From:   Xin Li <xin@...or.com>
To:     Mingwei Zhang <mizhang@...gle.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     "H. Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Jim Mattson <jmattson@...gle.com>,
        Like Xu <likexu@...cent.com>, Kan Liang <kan.liang@...el.com>,
        Dapeng1 Mi <dapeng1.mi@...el.com>
Subject: Re: [PATCH] KVM: x86: Move kvm_check_request(KVM_REQ_NMI) after
 kvm_check_request(KVM_REQ_NMI)

On 9/26/2023 9:15 PM, Mingwei Zhang wrote:
> ah, typo in the subject: The 2nd KVM_REQ_NMI should be KVM_REQ_PMI.
> Sorry about that.
> 
> On Tue, Sep 26, 2023 at 9:09 PM Mingwei Zhang <mizhang@...gle.com> wrote:
>>
>> Move kvm_check_request(KVM_REQ_NMI) after kvm_check_request(KVM_REQ_PMI).

Please remove it, no need to repeat the subject.

>> When vPMU is in active use, processing each KVM_REQ_PMI generates a
>> KVM_REQ_NMI. With the existing control flow, once KVM_REQ_PMI has been
>> processed the guest entry fails, KVM jumps to kvm_x86_cancel_injection()
>> and re-enters vcpu_enter_guest(). This wastes a lot of cycles and
>> increases the overhead of vPMU as well as of virtualization in general.

Optimization comes after correctness, so please first explain why this
change is correct!

>>
>> So move the kvm_check_request(KVM_REQ_NMI) check to make the KVM run
>> loop more efficient with vPMU.
>>
>> To evaluate the effectiveness of this change, we launch an 8-vCPU QEMU VM
>> on an Intel SPR CPU. In the VM, we run perf with all 48 events that Intel
>> VTune uses. In addition, we use SPEC2017 benchmark programs as the
>> workload, configured to use a single core and a single thread.
>>
>> At the host level, we probe the invocations to vmx_cancel_injection() with
>> the following command:
>>
>>      $ perf probe -a vmx_cancel_injection
>>      $ perf stat -a -e probe:vmx_cancel_injection -I 10000 # per 10 seconds
>>
>> The following is the result that we collected at the beginning of the
>> SPEC2017 benchmark run (so mostly for 500.perlbench_r in SPEC2017).
>> Kindly forgive the incompleteness.
>>
>> On kernel without the change:
>>      10.010018010              14254      probe:vmx_cancel_injection
>>      20.037646388              15207      probe:vmx_cancel_injection
>>      30.078739816              15261      probe:vmx_cancel_injection
>>      40.114033258              15085      probe:vmx_cancel_injection
>>      50.149297460              15112      probe:vmx_cancel_injection
>>      60.185103088              15104      probe:vmx_cancel_injection
>>
>> On kernel with the change:
>>      10.003595390                 40      probe:vmx_cancel_injection
>>      20.017855682                 31      probe:vmx_cancel_injection
>>      30.028355883                 34      probe:vmx_cancel_injection
>>      40.038686298                 31      probe:vmx_cancel_injection
>>      50.048795162                 20      probe:vmx_cancel_injection
>>      60.069057747                 19      probe:vmx_cancel_injection
>>
>> From the above, it is clear that we save about 1500 invocations of
>> vmx_cancel_injection() per vcpu per second for workloads like perlbench.
>>
>> Signed-off-by: Mingwei Zhang <mizhang@...gle.com>
>> ---
>>   arch/x86/kvm/x86.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 42a4e8f5e89a..302b6f8ddfb1 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10580,12 +10580,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>                  if (kvm_check_request(KVM_REQ_SMI, vcpu))
>>                          process_smi(vcpu);
>>   #endif
>> -               if (kvm_check_request(KVM_REQ_NMI, vcpu))
>> -                       process_nmi(vcpu);
>>                  if (kvm_check_request(KVM_REQ_PMU, vcpu))
>>                          kvm_pmu_handle_event(vcpu);
>>                  if (kvm_check_request(KVM_REQ_PMI, vcpu))
>>                          kvm_pmu_deliver_pmi(vcpu);
>> +               if (kvm_check_request(KVM_REQ_NMI, vcpu))
>> +                       process_nmi(vcpu);
>>                  if (kvm_check_request(KVM_REQ_IOAPIC_EOI_EXIT, vcpu)) {
>>                          BUG_ON(vcpu->arch.pending_ioapic_eoi > 255);
>>                          if (test_bit(vcpu->arch.pending_ioapic_eoi,
>>
>> base-commit: 73554b29bd70546c1a9efc9c160641ef1b849358
>> --
>> 2.42.0.515.g380fc7ccd1-goog
>>
> 

-- 
Thanks!
     Xin
