Date:   Fri, 11 Aug 2017 09:43:12 +0800
From:   "Longpeng (Mike)" <longpeng2@...wei.com>
To:     Eric Farman <farman@...ux.vnet.ibm.com>
CC:     Cornelia Huck <cohuck@...hat.com>, <pbonzini@...hat.com>,
        <rkrcmar@...hat.com>, <agraf@...e.com>, <borntraeger@...ibm.com>,
        <christoffer.dall@...aro.org>, <marc.zyngier@....com>,
        <james.hogan@...tec.com>, <kvm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <weidong.huang@...wei.com>,
        <arei.gonglei@...wei.com>, <wangxinxin.wang@...wei.com>,
        <longpeng.mike@...il.com>, <david@...hat.com>
Subject: Re: [PATCH v2 0/4] KVM: optimize the kvm_vcpu_on_spin



On 2017/8/10 21:18, Eric Farman wrote:

> 
> 
> On 08/08/2017 04:14 AM, Longpeng (Mike) wrote:
>>
>>
>> On 2017/8/8 15:41, Cornelia Huck wrote:
>>
>>> On Tue, 8 Aug 2017 12:05:31 +0800
>>> "Longpeng(Mike)" <longpeng2@...wei.com> wrote:
>>>
>>>> This is a simple optimization for kvm_vcpu_on_spin. The
>>>> main idea is described in patch-1's commit message.
>>>
>>> I think this generally looks good now.
>>>
>>>>
>>>> I ran some tests based on the RFC version; the results show
>>>> that it improves performance slightly.
>>>
>>> Did you re-run tests on this version?
>>
>>
>> Hi Cornelia,
>>
>> I didn't re-run tests on V2. But the major difference between RFC and V2
>> is that V2 only caches the result for X86 (s390/arm don't need it) and V2
>> saves an expensive operation (440-1400 cycles on my test machine) for X86/VMX.
>>
>> So I think V2's performance is at least the same as RFC or even slightly
>> better. :)
>>
>>>
>>> I would also like to see some s390 numbers; unfortunately I only have a
>>> z/VM environment and any performance numbers would be nearly useless
>>> there. Maybe somebody within IBM with a better setup can run a quick
>>> test?
> 
> Won't swear I didn't screw something up, but here are some quick numbers. Host
> was 4.12.0 with and without this series, running QEMU 2.10.0-rc0. Created 4
> guests, each with 4 CPUs (unpinned) and 4GB RAM. VM1 did full kernel compiles
> with kernbench, which averaged 5 runs at each of several job sizes (I threw
> away the "-j 1" numbers). VM2-VM4 ran cpu burners on 2 of their 4 cpus.
> 
> Numbers from VM1 kernbench output, and the delta between runs:
> 
> load -j 3            before      after       delta
> Elapsed Time         183.178     182.58      -0.598
> User Time            534.19      531.52      -2.67
> System Time          32.538      33.37       0.832
> Percent CPU          308.8       309         0.2
> Context Switches     98484.6     99001       516.4
> Sleeps               227347      228752      1405
> 
> load -j 16           before      after       delta
> Elapsed Time         153.352     147.59      -5.762
> User Time            545.829     533.41      -12.419
> System Time          34.289      34.85       0.561
> Percent CPU          347.6       348         0.4
> Context Switches     160518      159120      -1398
> Sleeps               240740      240536      -204
> 


Thanks Eric!

The `Elapsed Time` is smaller with this series, which is consistent with my
numbers in the cover letter.
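
For anyone who wants to re-check the arithmetic, here is a rough sketch of how
the `Elapsed Time` deltas in the table above can be reproduced. The helper
names are hypothetical, and the relative-change percentage is an extra figure
computed here for illustration; it is not part of the kernbench output quoted
above.

```python
# Elapsed Time values (before, after), copied from the quoted kernbench table.
ELAPSED = {
    "-j 3": (183.178, 182.58),
    "-j 16": (153.352, 147.59),
}


def delta(before, after):
    """Absolute change (after - before), rounded to three decimals as in the table."""
    return round(after - before, 3)


def rel_change(before, after):
    """Relative change in percent. Illustrative only; not in the original output."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for load, (before, after) in ELAPSED.items():
        print(
            f"load {load}: delta {delta(before, after):+.3f}, "
            f"relative {rel_change(before, after):+.2f}%"
        )
```

A negative delta means the run with the series applied finished sooner.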

> 
>  - Eric
> 
> 
> .
> 


-- 
Regards,
Longpeng(Mike)
