Message-ID: <25e566b9-a8f4-2d90-0ba3-725f1a215c1f@gmail.com>
Date: Thu, 14 Sep 2017 16:36:00 +0800
From: Quan Xu <quan.xu0@...il.com>
To: Yang Zhang <yang.zhang.wz@...il.com>, "Michael S. Tsirkin" <mst@...hat.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org, wanpeng.li@...mail.com,
    pbonzini@...hat.com, tglx@...utronix.de, rkrcmar@...hat.com,
    dmatlack@...gle.com, agraf@...e.de, peterz@...radead.org,
    linux-doc@...r.kernel.org
Subject: Re: [RFC PATCH v2 0/7] x86/idle: add halt poll support

On 2017/9/13 19:56, Yang Zhang wrote:
> On 2017/8/29 22:56, Michael S. Tsirkin wrote:
>> On Tue, Aug 29, 2017 at 11:46:34AM +0000, Yang Zhang wrote:
>>> Some latency-intensive workloads see an obvious performance drop
>>> when running inside a VM.
>>
>> But are we trading a lot of CPU for a bit of lower latency?
>>
>>> The main reason is that the overhead is amplified when running
>>> inside a VM. The biggest cost I have seen is in the idle path.
>>>
>>> This patch introduces a new mechanism to poll for a while before
>>> entering the idle state. If a reschedule is needed during the poll,
>>> we don't need to go through the heavy overhead path.
>>
>> Isn't it the job of an idle driver to find the best way to
>> halt the CPU?
>>
>> It looks like just by adding a cstate we can make it
>> halt at higher latencies only. And at lower latencies,
>> if it's doing a good job we can hopefully use mwait to
>> stop the CPU.
>>
>> In fact I have been experimenting with exactly that.
>> Some initial results are encouraging but I could use help
>> with testing and especially tuning. If you can help
>> pls let me know!
>
> Quan, can you help to test it and give the result? Thanks.
>

Hi, MST

I have tested the patch "intel_idle: add pv cstates when running on kvm"
on a recent host that allows guests to execute mwait without an exit.
I have also tested our patch "[RFC PATCH v2 0/7] x86/idle: add halt poll
support", upstream Linux, and idle=poll.

The following are the results (which look better than ever before, as I
ran the test cases on a more powerful machine):

For __netperf__, the first column is the transaction rate per second,
the second column is CPU utilization.

1. upstream Linux

   28371.7 trans/s --  76.6 %CPU

2. idle=poll

   34372   trans/s -- 999.3 %CPU

3. "[RFC PATCH v2 0/7] x86/idle: add halt poll support", with different
   values of the parameter 'halt_poll_threshold':

   28362.7 trans/s --  74.7 %CPU (halt_poll_threshold=10000)
   32949.5 trans/s --  82.5 %CPU (halt_poll_threshold=20000)
   39717.9 trans/s -- 104.1 %CPU (halt_poll_threshold=30000)
   40137.9 trans/s -- 104.4 %CPU (halt_poll_threshold=40000)
   40079.8 trans/s -- 105.6 %CPU (halt_poll_threshold=50000)

4. "intel_idle: add pv cstates when running on kvm"

   33041.8 trans/s -- 999.4 %CPU

For __ctxsw__, the first column is the time per process context switch,
the second column is CPU utilization.

1. upstream Linux

   3624.19 ns/ctxsw -- 191.9 %CPU

2. idle=poll

   3419.66 ns/ctxsw -- 999.2 %CPU

3. "[RFC PATCH v2 0/7] x86/idle: add halt poll support", with different
   values of the parameter 'halt_poll_threshold':

   1123.40 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=10000)
   1127.38 ns/ctxsw -- 199.7 %CPU (halt_poll_threshold=20000)
   1113.58 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=30000)
   1117.12 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=40000)
   1121.62 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=50000)

4. "intel_idle: add pv cstates when running on kvm"

   3427.59 ns/ctxsw -- 999.4 %CPU

-Quan
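For context, the halt-poll mechanism under test boils down to spinning
briefly before halting, so that work arriving within the poll window is
picked up without paying for a HLT-induced VM exit. A minimal sketch of
the idea, assuming kernel context (the function name halt_poll_then_idle
and the exact way halt_poll_threshold is consulted are illustrative, not
taken from the RFC patch itself):

    /* Sketch only, not the actual RFC patch. */
    #include <linux/ktime.h>
    #include <linux/sched.h>
    #include <asm/processor.h>
    #include <asm/irqflags.h>

    /* Poll window in ns; 10000-50000 are the values swept above. */
    static unsigned long halt_poll_threshold = 30000;

    static void halt_poll_then_idle(void)
    {
            u64 start = ktime_get_ns();

            /* Spin for up to halt_poll_threshold ns watching for work. */
            while (ktime_get_ns() - start < halt_poll_threshold) {
                    if (need_resched())
                            return;    /* work arrived: skip the halt */
                    cpu_relax();       /* be polite to the sibling thread */
            }

            /*
             * Nothing showed up during the poll window: halt as usual.
             * In a guest this is the expensive part (VM exit on HLT).
             */
            safe_halt();
    }

This is why larger thresholds trade CPU (more spinning) for lower
latency (fewer HLT exits), matching the trend in the netperf table.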
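For the netperf rows, the transaction rate is the kind of figure a
request/response test reports. The exact invocation used in this run is
not stated; a typical one is:

    netperf -H <server-ip> -t TCP_RR

which reports round-trip transactions per second between client and
server, a latency-bound metric that benefits directly from a cheaper
idle wakeup path.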
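The __ctxsw__ figures measure the cost of a process context switch. The
exact benchmark used here is not stated; a self-contained pipe ping-pong
sketch that produces the same kind of ns/ctxsw number:

    /* Illustrative context-switch microbenchmark. Build: cc -O2 ctxsw.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define ITERS 100000

    static long long now_ns(void)
    {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    int main(void)
    {
            int p1[2], p2[2];
            char c = 'x';
            pid_t pid;

            if (pipe(p1) || pipe(p2)) { perror("pipe"); return 1; }

            pid = fork();
            if (pid < 0) { perror("fork"); return 1; }
            if (pid == 0) {
                    /* Child: echo each byte back, forcing a switch per round. */
                    for (int i = 0; i < ITERS; i++) {
                            read(p1[0], &c, 1);
                            write(p2[1], &c, 1);
                    }
                    return 0;
            }

            long long start = now_ns();
            for (int i = 0; i < ITERS; i++) {
                    write(p1[1], &c, 1);   /* wake the child ... */
                    read(p2[0], &c, 1);    /* ... block until it replies */
            }
            long long elapsed = now_ns() - start;

            wait(NULL);
            /* Each iteration is two switches (parent->child->parent). */
            printf("%.2f ns/ctxsw\n", (double)elapsed / (ITERS * 2.0));
            return 0;
    }

Each blocking read puts the waiter to sleep; with halt polling the
partner usually becomes runnable while the idle vCPU is still spinning,
so the wakeup avoids the HLT exit, which is consistent with the roughly
3x drop in ns/ctxsw shown above.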