Message-ID: <25e566b9-a8f4-2d90-0ba3-725f1a215c1f@gmail.com>
Date: Thu, 14 Sep 2017 16:36:00 +0800
From: Quan Xu <quan.xu0@...il.com>
To: Yang Zhang <yang.zhang.wz@...il.com>, "Michael S. Tsirkin" <mst@...hat.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org, wanpeng.li@...mail.com,
    pbonzini@...hat.com, tglx@...utronix.de, rkrcmar@...hat.com,
    dmatlack@...gle.com, agraf@...e.de, peterz@...radead.org,
    linux-doc@...r.kernel.org
Subject: Re: [RFC PATCH v2 0/7] x86/idle: add halt poll support

On 2017/9/13 19:56, Yang Zhang wrote:
> On 2017/8/29 22:56, Michael S. Tsirkin wrote:
>> On Tue, Aug 29, 2017 at 11:46:34AM +0000, Yang Zhang wrote:
>>> Some latency-intensive workloads see an obvious performance drop
>>> when running inside a VM.
>>
>> But are we trading a lot of CPU for a bit of lower latency?
>>
>>> The main reason is that the overhead is amplified when running
>>> inside a VM. The biggest cost I have seen is in the idle path.
>>>
>>> This patch introduces a new mechanism to poll for a while before
>>> entering the idle state. If a reschedule is needed during the poll,
>>> we don't need to go through the heavy overhead path.
>>
>> Isn't it the job of an idle driver to find the best way to
>> halt the CPU?
>>
>> It looks like just by adding a cstate we can make it
>> halt at higher latencies only. And at lower latencies,
>> if it's doing a good job we can hopefully use mwait to
>> stop the CPU.
>>
>> In fact I have been experimenting with exactly that.
>> Some initial results are encouraging but I could use help
>> with testing and especially tuning. If you can help
>> pls let me know!
>
> Quan, can you help to test it and give the result? Thanks.
>

Hi, MST

I have tested the patch "intel_idle: add pv cstates when running on kvm"
on a recent host that allows guests to execute mwait without an exit.
I have also tested our patch "[RFC PATCH v2 0/7] x86/idle: add halt poll
support", upstream Linux, and idle=poll.

The following are the results (which look better than ever before, as I
ran the test cases on a more powerful machine):

For __netperf__, the first column is the transaction rate per second,
the second column is CPU utilization.

1. upstream Linux

   28371.7 trans/s --  76.6 %CPU

2. idle=poll

   34372   trans/s -- 999.3 %CPU

3. "[RFC PATCH v2 0/7] x86/idle: add halt poll support", with different
   values of the parameter 'halt_poll_threshold':

   28362.7 trans/s --  74.7 %CPU (halt_poll_threshold=10000)
   32949.5 trans/s --  82.5 %CPU (halt_poll_threshold=20000)
   39717.9 trans/s -- 104.1 %CPU (halt_poll_threshold=30000)
   40137.9 trans/s -- 104.4 %CPU (halt_poll_threshold=40000)
   40079.8 trans/s -- 105.6 %CPU (halt_poll_threshold=50000)

4. "intel_idle: add pv cstates when running on kvm"

   33041.8 trans/s -- 999.4 %CPU

For __ctxsw__, the first column is the time per process context switch,
the second column is CPU utilization.

1. upstream Linux

   3624.19 ns/ctxsw -- 191.9 %CPU

2. idle=poll

   3419.66 ns/ctxsw -- 999.2 %CPU

3. "[RFC PATCH v2 0/7] x86/idle: add halt poll support", with different
   values of the parameter 'halt_poll_threshold':

   1123.40 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=10000)
   1127.38 ns/ctxsw -- 199.7 %CPU (halt_poll_threshold=20000)
   1113.58 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=30000)
   1117.12 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=40000)
   1121.62 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=50000)

4. "intel_idle: add pv cstates when running on kvm"

   3427.59 ns/ctxsw -- 999.4 %CPU

-Quan
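For context, the halt-poll mechanism under test boils down to spinning
briefly before halting, so that work arriving within the poll window is
picked up without paying for a HLT-induced VM exit. A minimal sketch of
the idea, assuming kernel context (the function name halt_poll_then_idle
and the exact way halt_poll_threshold is consulted are illustrative, not
taken from the RFC patch itself):

    /* Sketch only, not the actual RFC patch. */
    #include <linux/ktime.h>
    #include <linux/sched.h>
    #include <asm/processor.h>
    #include <asm/irqflags.h>

    /* Poll window in ns; 10000-50000 are the values swept above. */
    static unsigned long halt_poll_threshold = 30000;

    static void halt_poll_then_idle(void)
    {
            u64 start = ktime_get_ns();

            /* Spin for up to halt_poll_threshold ns watching for work. */
            while (ktime_get_ns() - start < halt_poll_threshold) {
                    if (need_resched())
                            return;    /* work arrived: skip the halt */
                    cpu_relax();       /* be polite to the sibling thread */
            }

            /*
             * Nothing showed up during the poll window: halt as usual.
             * In a guest this is the expensive part (VM exit on HLT).
             */
            safe_halt();
    }

This is why larger thresholds trade CPU (more spinning) for lower
latency (fewer HLT exits), matching the trend in the netperf table.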
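For the netperf rows, the transaction rate is the kind of figure a
request/response test reports. The exact invocation used in this run is
not stated; a typical one is:

    netperf -H <server-ip> -t TCP_RR

which reports round-trip transactions per second between client and
server, a latency-bound metric that benefits directly from a cheaper
idle wakeup path.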
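The __ctxsw__ figures measure the cost of a process context switch. The
exact benchmark used here is not stated; a self-contained pipe ping-pong
sketch that produces the same kind of ns/ctxsw number:

    /* Illustrative context-switch microbenchmark. Build: cc -O2 ctxsw.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define ITERS 100000

    static long long now_ns(void)
    {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    int main(void)
    {
            int p1[2], p2[2];
            char c = 'x';
            pid_t pid;

            if (pipe(p1) || pipe(p2)) { perror("pipe"); return 1; }

            pid = fork();
            if (pid < 0) { perror("fork"); return 1; }
            if (pid == 0) {
                    /* Child: echo each byte back, forcing a switch per round. */
                    for (int i = 0; i < ITERS; i++) {
                            read(p1[0], &c, 1);
                            write(p2[1], &c, 1);
                    }
                    return 0;
            }

            long long start = now_ns();
            for (int i = 0; i < ITERS; i++) {
                    write(p1[1], &c, 1);   /* wake the child ... */
                    read(p2[0], &c, 1);    /* ... block until it replies */
            }
            long long elapsed = now_ns() - start;

            wait(NULL);
            /* Each iteration is two switches (parent->child->parent). */
            printf("%.2f ns/ctxsw\n", (double)elapsed / (ITERS * 2.0));
            return 0;
    }

Each blocking read puts the waiter to sleep; with halt polling the
partner usually becomes runnable while the idle vCPU is still spinning,
so the wakeup avoids the HLT exit, which is consistent with the roughly
3x drop in ns/ctxsw shown above.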