Message-ID: <ec014e8d-eb5f-03cc-3ed1-da58039ef034@bytedance.com>
Date:   Mon, 25 Oct 2021 11:14:13 +0800
From:   zhenwei pi <pizhenwei@...edance.com>
To:     Wanpeng Li <kernellwp@...il.com>,
        Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Wanpeng Li <wanpengli@...cent.com>,
        LKML <linux-kernel@...r.kernel.org>, linux-doc@...r.kernel.org
Subject: Re: [PATCH] x86/kvm: Introduce boot parameter no-kvm-pvipi

On 10/21/21 3:17 PM, zhenwei pi wrote:
> On 10/21/21 1:03 PM, Wanpeng Li wrote:
>> On Thu, 21 Oct 2021 at 11:05, zhenwei pi <pizhenwei@...edance.com> wrote:
>>>
>>>
>>> On 10/21/21 4:12 AM, Sean Christopherson wrote:
>>>> On Wed, Oct 20, 2021, Wanpeng Li wrote:
>>>>> On Wed, 20 Oct 2021 at 20:08, zhenwei pi <pizhenwei@...edance.com> 
>>>>> wrote:
>>>>>>
>>>>>> Although the host side exposes the KVM PV SEND IPI feature to the guest,
>>>>>> the guest should still have a chance to disable it.
>>>>>>
>>>>>> A typical case for this parameter:
>>>>>> if the host AMD server enables the AVIC feature, the flat APIC mode
>>>>>> gets better performance in the guest.
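
(For context: a guest-side switch like this is typically wired up as an
early_param() that skips installing the PV IPI hook before the APIC callbacks
are patched. A simplified sketch of the idea, not the patch verbatim, with the
variable name chosen for illustration:)

  /* sketch, in the spirit of arch/x86/kernel/kvm.c */
  static bool kvm_pvipi_disabled __initdata;

  static int __init parse_no_kvm_pvipi(char *arg)
  {
          kvm_pvipi_disabled = true;
          return 0;
  }
  early_param("no-kvm-pvipi", parse_no_kvm_pvipi);

  static void __init kvm_apic_init(void)
  {
          /* install kvm_send_ipi_mask() only if the user did not opt out */
          if (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI) && !kvm_pvipi_disabled)
                  kvm_setup_pv_ipi();
  }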
>>>>>
>>>>> Hmm, I didn't find enough valuable information in your posting. We have
>>>>> looked at AMD quite a bit before:
>>>>> https://lore.kernel.org/all/CANRm+Cx597FNRUCyVz1D=B6Vs2GX3Sw57X7Muk+yMpi_hb+v1w@mail.gmail.com/T/#u
>>>>>
>>>>
>>>> I too would like to see numbers.  I suspect the answer is going to be that
>>>> AVIC performs poorly in CPU overcommit scenarios because of the cost of
>>>> managing the tables and handling "failed delivery" exits, but that AVIC does
>>>> quite well when vCPUs are pinned 1:1 and IPIs rarely require an exit to the host.
>>>>
>>>
>>> Test env:
>>> CPU: AMD EPYC 7642 48-Core Processor
>>>
>>> Kmod args(enable avic and disable nested):
>>> modprobe kvm-amd nested=0 avic=1 npt=1
>>>
>>> QEMU args(disable x2apic):
>>> ... -cpu host,x2apic=off ...
>>>
>>> Benchmark tool:
>>> https://github.com/bytedance/kvm-utils/tree/master/microbenchmark/apic-ipi 
>>>
>>>
>>> ~# insmod apic_ipi.ko options=5 && dmesg -c
>>>
>>>    apic_ipi: 1 NUMA node(s)
>>>    apic_ipi: apic [flat]
>>>    apic_ipi: apic->send_IPI[default_send_IPI_single+0x0/0x40]
>>>    apic_ipi: apic->send_IPI_mask[kvm_send_ipi_mask+0x0/0x10]
>>>    apic_ipi:     IPI[kvm_send_ipi_mask] from CPU[0] to CPU[1]
>>>    apic_ipi:             total cycles 375671259, avg 3756
>>>    apic_ipi:     IPI[flat_send_IPI_mask] from CPU[0] to CPU[1]
>>>    apic_ipi:             total cycles 221961822, avg 2219
>>>
>>>
>>> apic->send_IPI_mask[kvm_send_ipi_mask+0x0/0x10]
>>>     -> This line shows that the current send_IPI_mask is kvm_send_ipi_mask
>>>        (because of the PV SEND IPI feature)
>>>
>>> apic_ipi:       IPI[kvm_send_ipi_mask] from CPU[0] to CPU[1]
>>> apic_ipi:               total cycles 375671259, avg 3756
>>>     --> These lines show the average cycles of each kvm_send_ipi_mask: 3756
>>>
>>> apic_ipi:       IPI[flat_send_IPI_mask] from CPU[0] to CPU[1]
>>> apic_ipi:               total cycles 221961822, avg 2219
>>>     --> These lines show the average cycles of each flat_send_IPI_mask: 2219
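
(The gap above is expected from how the two callbacks work: the PV path is a
hypercall, so every IPI takes a VM exit, while the flat path is a plain ICR
write that AVIC can deliver in hardware when the target vCPU is running.
Roughly, in simplified form; argument encoding trimmed, not the actual kernel
code:)

  #include <asm/apic.h>
  #include <linux/kvm_para.h>

  /* PV path (kvm_send_ipi_mask): one hypercall per IPI -> always a VM exit */
  static void pv_ipi_sketch(unsigned long dest_bitmap, unsigned int min_apicid,
                            int vector)
  {
          kvm_hypercall4(KVM_HC_SEND_IPI, dest_bitmap, 0, min_apicid,
                         APIC_DM_FIXED | vector);
  }

  /*
   * Flat xAPIC path (flat_send_IPI_mask): an ICR write, which AVIC can
   * virtualize so that a running target vCPU is kicked without a full exit.
   */
  static void flat_ipi_sketch(unsigned int logical_dest_mask, int vector)
  {
          apic_write(APIC_ICR2, SET_APIC_DEST_FIELD(logical_dest_mask));
          apic_write(APIC_ICR, APIC_DEST_LOGICAL | APIC_DM_FIXED | vector);
  }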
>>
>> Just a single-target IPI benchmark is not enough.
>>
>>      Wanpeng
>>
> 
> Benchmark smp_call_function_single 
> (https://github.com/bytedance/kvm-utils/blob/master/microbenchmark/ipi-bench/ipi_bench.c): 
> 
> 
>   Test env:
>   CPU: AMD EPYC 7642 48-Core Processor
> 
>   Kmod args(enable avic and disable nested):
>   modprobe kvm-amd nested=0 avic=1 npt=1
> 
>   QEMU args(disable x2apic):
>   ... -cpu host,x2apic=off ...
> 
> 1> without no-kvm-pvipi:
> ipi_bench_single wait[1], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
>       elapsed =        424945631 cycles, average =     4249 cycles
>       ipitime =        385246136 cycles, average =     3852 cycles
> ipi_bench_single wait[0], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
>       elapsed =        419057953 cycles, average =     4190 cycles
> 
> 2> with no-kvm-pvipi:
> ipi_bench_single wait[1], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
>       elapsed =        321756407 cycles, average =     3217 cycles
>       ipitime =        299433550 cycles, average =     2994 cycles
> ipi_bench_single wait[0], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
>       elapsed =        295382146 cycles, average =     2953 cycles
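
(For reference, wait[1] vs wait[0] is just the 'wait' argument of
smp_call_function_single(); the measurement boils down to something like the
following guest-side module, a simplified sketch rather than the actual
ipi_bench.c:)

  #include <linux/module.h>
  #include <linux/smp.h>
  #include <asm/msr.h>

  static void nop_func(void *info) { }

  static int __init ipi_bench_sketch_init(void)
  {
          const unsigned long loops = 100000;
          unsigned long i;
          u64 start, elapsed;

          /* wait = 1: return only after the target CPU has run nop_func() */
          start = rdtsc_ordered();
          for (i = 0; i < loops; i++)
                  smp_call_function_single(1, nop_func, NULL, 1);
          elapsed = rdtsc_ordered() - start;
          pr_info("wait[1]: avg %llu cycles\n", elapsed / loops);

          /* wait = 0: fire and forget; mostly measures the sending side */
          start = rdtsc_ordered();
          for (i = 0; i < loops; i++)
                  smp_call_function_single(1, nop_func, NULL, 0);
          elapsed = rdtsc_ordered() - start;
          pr_info("wait[0]: avg %llu cycles\n", elapsed / loops);

          return 0;
  }

  static void __exit ipi_bench_sketch_exit(void) { }

  module_init(ipi_bench_sketch_init);
  module_exit(ipi_bench_sketch_exit);
  MODULE_LICENSE("GPL");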
> 
> 
Hi, Wanpeng & Sean

I also benchmarked redis (over 127.0.0.1) in a guest (2 vCPUs); 'no-kvm-pvipi'
gets better performance.

Test env:
Host side: pin the 2 vCPUs to 2 cores in the same die.
Guest side: run the commands:
   taskset -c 1 ./redis-server --appendonly no
   taskset -c 0 ./redis-benchmark -h 127.0.0.1 -d 1024 -n 10000000 -t get

1> without no-kvm-pvipi:
redis QPS: 193203.12 requests per second
kvm_pv_send_ipi exit: ~18K/s

2> with no-kvm-pvipi:
redis QPS: 196028.47 requests per second
avic_incomplete_ipi_interception exit: ~5K/s

-- 
zhenwei pi
