Message-ID: <ZjVba9wOiIlhqjfi@google.com>
Date: Fri, 3 May 2024 14:47:23 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Parshuram Sangle <parshuram.sangle@...el.com>
Cc: kvm@...r.kernel.org, pbonzini@...hat.com, linux-kernel@...r.kernel.org, 
	jaishankar.rajendran@...el.com
Subject: Re: [PATCH 0/2] KVM: enable halt poll shrink parameter

On Thu, Nov 02, 2023, Parshuram Sangle wrote:
> KVM halt polling interval growth and shrink behavior has evolved since its
> inception. The current mechanism adjusts the polling interval based on whether
> vcpu wakeup was received or not during polling interval using grow and shrink
> parameter values. Though grow parameter is logically set to 2 by default,
> shrink parameter is kept disabled (set to 0).
> 
> Disabled shrink has two issues:
> 1) Resets the polling interval to 0 on every unsuccessful poll, assuming it is
> less likely to receive a vcpu wakeup in further shrunk intervals.
> 2) Even on a successful poll, if the total block time is greater than or equal
> to the current poll_ns value, the polling interval is reset to 0 instead of
> shrinking gradually.
> 
> These aspects reduce the chances of receiving a valid wakeup during polling and
> lose potential performance benefits for VM workloads.
> 
> Below is the summary of experiments conducted to assess performance and power
> impact by enabling the halt_poll_ns_shrink parameter (value set to 2).
> 
> Performance Test Summary: (Higher is better)
> --------------------------------------------
> Platform Details: Chrome Brya platform
> CPU - Alder Lake (12th Gen Intel CPU i7-1255U)
> Host kernel version - 5.15.127-20371-g710a1611ad33
> 
> Android VM workload (Score)   Base      Shrink Enabled (value 2)    Delta
> ---------------------------------------------------------------------------
> GeekBench Multi-core(CPU)     5754      5856                        2%
> 3D Mark Slingshot(CPU+GPU)    15486     15885                       3%
> Stream (handopt)(Memory)      20566     21594                       5%
> fio seq-read (Storage)        727       747                         3%
> fio seq-write (Storage)       331       343                         3%
> fio rand-read (Storage)       690       732                         6%
> fio rand-write (Storage)      299       300                         1%
> 
> Steam Gaming VM (Avg FPS)     Base      Shrink Enabled (value 2)    Delta
> ---------------------------------------------------------------------------
> Metro Redux (OpenGL)          54.80     59.60                       9%
> Dota 2 (Open GL)              48.74     51.40                       5%
> Dota 2 (Vulkan)               20.80     21.10                       1%
> SpaceShip (Vulkan)            20.40     21.52                       6%
> 
> With shrink enabled, the majority of workloads show a higher percentage of
> successful polls. Reduced latency in returning control to the VM and the
> avoided vm_exit overhead contribute to these performance gains.
> 
> Power Impact Assessment Summary: (Lower is better)
> --------------------------------------------------
> Method : DAQ measurements of CPU and Memory rails
> 
> CPU+Memory (Watt)             Base      Shrink Enabled (value 2)    Delta
> ---------------------------------------------------------------------------
> Idle* (Host)                  0.636     0.631                       -0.8%
> Video Playback (Host)         2.225     2.210                       -0.7%
> Tomb Raider (VM)              17.261    17.175                      -0.5%
> SpaceShip Benchmark(VM)       17.079    17.123                       0.3%
> 
> *Idle power - Idle system with no application running, Android and Borealis
> VMs enabled running no workload. Duration 180 sec.
> 
> Power measurements done for Chrome idle scenario and active Gaming VM 
> workload show negligible power overhead since additional polling creates
> very short duration bursts which are less likely to have gone to a
> complete idle CPU state.
> 
> NOTE: No tests were conducted on non-x86 platforms with this changed config.
> 
> The default values of grow and shrink parameters get commonly used by
> various VM deployments unless specifically tuned for performance. Hence
> referring to performance and power measurements results shown above, it is
> recommended to have shrink enabled (with value 2) by default so that there
> is no need to explicitly set this parameter through kernel cmdline or by
> other means.

I am by no means an expert on halt polling or power management, but all of this
seems like a reasonable tradeoff.  And even without the numbers you provided,
starting from scratch after a single failure is rather odd.
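
For readers following along, here is a rough sketch of the shrink step as the
cover letter describes it. It is an illustrative model, not the kernel's code:
the function name and signature are hypothetical, and it leaves out any
interaction with the grow parameters.

```c
/*
 * Illustrative model only, not the kernel's implementation.
 * shrink_poll_ns() is a hypothetical helper; a shrink value of 0
 * stands for "shrink disabled", as described in the cover letter.
 */
unsigned int shrink_poll_ns(unsigned int poll_ns, unsigned int shrink)
{
	if (shrink == 0)
		return 0;		/* disabled: polling starts from scratch */

	return poll_ns / shrink;	/* enabled: interval backs off gradually */
}
```

Taking an arbitrary 200000 ns interval as an example, a single shrink step
returns 0 when shrink is 0 and 100000 when shrink is 2, so the next halt still
gets a polling window.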

So unless someone objects, I'll plan on applying this for 6.11 in a few weeks
(after the 6.10 merge window closes).
