Date:   Thu, 10 Aug 2017 09:18:09 -0400
From:   Eric Farman <farman@...ux.vnet.ibm.com>
To:     "Longpeng (Mike)" <longpeng2@...wei.com>,
        Cornelia Huck <cohuck@...hat.com>
Cc:     pbonzini@...hat.com, rkrcmar@...hat.com, agraf@...e.com,
        borntraeger@...ibm.com, christoffer.dall@...aro.org,
        marc.zyngier@....com, james.hogan@...tec.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, weidong.huang@...wei.com,
        arei.gonglei@...wei.com, wangxinxin.wang@...wei.com,
        longpeng.mike@...il.com, david@...hat.com
Subject: Re: [PATCH v2 0/4] KVM: optimize the kvm_vcpu_on_spin



On 08/08/2017 04:14 AM, Longpeng (Mike) wrote:
> 
> 
> On 2017/8/8 15:41, Cornelia Huck wrote:
> 
>> On Tue, 8 Aug 2017 12:05:31 +0800
>> "Longpeng(Mike)" <longpeng2@...wei.com> wrote:
>>
>>> This is a simple optimization for kvm_vcpu_on_spin; the main idea
>>> is described in patch 1's commit message.
>>
>> I think this generally looks good now.
>>
>>>
>>> I did some tests based on the RFC version; the results show that
>>> it improves the performance slightly.
>>
>> Did you re-run tests on this version?
> 
> 
> Hi Cornelia,
> 
> I didn't re-run the tests on V2. But the major difference between the
> RFC and V2 is that V2 only caches the result for x86 (s390/arm don't
> need it), and V2 saves an expensive operation (440-1400 cycles on my
> test machine) for x86/VMX.
> 
> So I think V2's performance is at least the same as the RFC's, or
> even slightly better. :)
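
(As I read it, the idea is roughly the following; the names below are
made up for illustration, not the actual patch code.)

/*
 * Sketch of the caching idea: pay the expensive guest-CPL read (a
 * VMREAD on VMX, the 440-1400 cycle operation mentioned above) once,
 * when the vCPU is scheduled out, so the candidate check inside
 * kvm_vcpu_on_spin() becomes a cheap flag test instead of a VMREAD
 * per candidate.
 */
#include <stdbool.h>

struct vcpu_sketch {
	bool preempted_in_kernel;	/* cached "was at CPL 0" result */
};

/* Stand-in for the real VMREAD-backed CPL read. */
static int read_guest_cpl_expensive(struct vcpu_sketch *vcpu)
{
	return 0;
}

/* Called once at sched-out time; this is the only expensive read. */
static void vcpu_sched_out(struct vcpu_sketch *vcpu)
{
	vcpu->preempted_in_kernel =
		(read_guest_cpl_expensive(vcpu) == 0);
}

/* Candidate check in the spin loop: now just a flag test. */
static bool candidate_in_kernel(struct vcpu_sketch *vcpu)
{
	return vcpu->preempted_in_kernel;
}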
> 
>>
>> I would also like to see some s390 numbers; unfortunately I only have a
>> z/VM environment and any performance numbers would be nearly useless
>> there. Maybe somebody within IBM with a better setup can run a quick
>> test?

Won't swear I didn't screw something up, but here are some quick
numbers.  Host was 4.12.0 with and without this series, running QEMU
2.10.0-rc0.  Created 4 guests, each with 4 CPUs (unpinned) and 4GB RAM.
VM1 did full kernel compiles with kernbench, which takes averages of 5
runs at different job sizes (I threw away the "-j 1" numbers).  VM2-VM4
ran CPU burners on 2 of their 4 CPUs.
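
(By "CPU burner" I just mean a process spinning a CPU at 100%; the
minimal version, for anyone reproducing something similar, is a plain
busy loop, one instance per burned CPU:)

int main(void)
{
	for (;;)
		;	/* spin forever, never sleep or yield */
}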

Numbers from VM1's kernbench output (times in seconds), and the delta
between runs:

load -j 3		before		after		delta
Elapsed Time		183.178		182.58		-0.598
User Time		534.19		531.52		-2.67
System Time		32.538		33.37		0.832
Percent CPU		308.8		309		0.2
Context Switches	98484.6		99001		516.4
Sleeps			227347		228752		1405

load -j 16		before		after		delta
Elapsed Time		153.352		147.59		-5.762
User Time		545.829		533.41		-12.419
System Time		34.289		34.85		0.561
Percent CPU		347.6		348		0.4
Context Switches	160518		159120		-1398
Sleeps			240740		240536		-204


  - Eric
