Message-ID: <DM6PR12MB35005789AA6383B850DFC57ECA519@DM6PR12MB3500.namprd12.prod.outlook.com>
Date:   Tue, 11 Jan 2022 06:34:20 +0000
From:   Kechen Lu <kechenl@...dia.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
CC:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "seanjc@...gle.com" <seanjc@...gle.com>,
        "wanpengli@...cent.com" <wanpengli@...cent.com>,
        "vkuznets@...hat.com" <vkuznets@...hat.com>,
        Somdutta Roy <somduttar@...dia.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [RFC PATCH v2 0/3] KVM: x86: add per-vCPU exits disable
 capability

Hi Michael,

> -----Original Message-----
> From: Michael S. Tsirkin <mst@...hat.com>
> Sent: Monday, January 10, 2022 1:18 PM
> To: Kechen Lu <kechenl@...dia.com>
> Cc: kvm@...r.kernel.org; pbonzini@...hat.com; seanjc@...gle.com;
> wanpengli@...cent.com; vkuznets@...hat.com; Somdutta Roy
> <somduttar@...dia.com>; linux-kernel@...r.kernel.org
> Subject: Re: [RFC PATCH v2 0/3] KVM: x86: add per-vCPU exits disable
> capability
> 
> 
> On Tue, Dec 21, 2021 at 01:04:46AM -0800, Kechen Lu wrote:
> > Summary
> > ===========
> > Introduce support of vCPU-scoped ioctl with KVM_CAP_X86_DISABLE_EXITS
> > cap for disabling exits to enable finer-grained VM exits disabling on
> > per vCPU scales instead of whole guest. This patch series enabled the
> > vCPU-scoped exits control on HLT VM-exits.
> >
> > Motivation
> > ============
> > In use cases like Windows guest running heavy CPU-bound workloads,
> > disabling HLT VM-exits could mitigate host sched ctx switch overhead.
> > Simply HLT disabling on all vCPUs could bring performance benefits,
> > but if no pCPUs reserved for host threads, could happened to the
> > forced preemption as host does not know the time to do the schedule
> > for other host threads want to run. With this patch, we could only
> > disable part of vCPUs HLT exits for one guest, this still keeps
> > performance benefits, and also shows resiliency to host stressing
> > workload running at the same time.
> >
> > Performance and Testing
> > =========================
> > In the host stressing workload experiment with Windows guest heavy
> > CPU-bound workloads, it shows good resiliency and having the ~3%
> > performance improvement. E.g. Passmark running in a Windows guest with
> > this patch disabling HLT exits on only half of vCPUs still showing
> > 2.4% higher main score v/s baseline.
> >
> > Tested everything on AMD machines.
> >
> >
> > v1->v2 (Sean Christopherson) :
> > - Add explicit restriction for VM-scoped exits disabling to be called
> >   before vCPUs creation (patch 1)
> > - Use vCPU ioctl instead of 64bit vCPU bitmask (patch 3), and make exits
> >   disable flags check purely for vCPU instead of VM (patch 2)
> 
> This is still quite blunt and assumes a ton of configuration on the host exactly
> matching the workload within guest. Which seems a waste since guests
> actually have the smarts to know what's happening within them.
> 

For now we use a fixed configuration on the host for our guests, and it still
gives promising performance benefits on most workloads in our use case. But
yeah, it's not adaptive or flexible with respect to the workloads in the guest.
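
As a rough illustration of what such a fixed configuration amounts to, here is a
minimal sketch that picks which vCPUs get HLT exits disabled. The helper name
and the "first N vCPUs" policy are assumptions made for illustration only; in
the series itself the selection would be applied through the proposed
vCPU-scoped KVM_CAP_X86_DISABLE_EXITS ioctl, which this sketch does not call.

```python
from typing import List


def select_hlt_passthrough_vcpus(num_vcpus: int, fraction: float = 0.5) -> List[int]:
    """Return the vCPU indices whose HLT exits would be disabled.

    Hypothetical helper: the policy (the first `fraction` of the vCPUs) is an
    assumption, not something the patch series prescribes.
    """
    if num_vcpus < 0:
        raise ValueError("num_vcpus must be non-negative")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within [0, 1]")
    # Round down so we never select more vCPUs than requested.
    count = int(num_vcpus * fraction)
    return list(range(count))


if __name__ == "__main__":
    # Static choice made once on the host, e.g. half of an 8-vCPU guest.
    print(select_hlt_passthrough_vcpus(8))
```

The point of the sketch is that the choice is made once on the host and never
revisited, which is exactly why it cannot follow what the guest is doing.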

> If you are going to allow guest to halt a vCPU, how about working on
> exposing mwait to guest cleanly instead?
> The idea is to expose this in ACPI - linux guests ignore ACPI and go by CPUID
> but windows guests follow ACPI. Linux can be patched ;)
> 
> What we would have is a mirror of host ACPI states, such that lower states
> invoke HLT and exit, higher power states invoke mwait and wait within guest.
> 
> The nice thing with this approach is that it's already supported by the host
> kernel, so it's just a question of coding up ACPI.
> 

This idea looks really interesting! If, through ACPI config, we could have longer
idle periods (deeper power states) cause HLT and exit, and shorter idle periods
(higher power states) use mwait within the guest, that would indeed be a more
adaptive and cleaner approach. But especially for Windows guests, the idle process
execution and idle/sleep state switching logic does not seem to be well documented,
so I need to figure out the impact of this change on the idle process and OS PM
behavior.
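
To make the intended split concrete, here is a minimal sketch of the decision
the guest would effectively make once the ACPI states are exposed this way. The
threshold value and all names are hypothetical; a real guest picks the state
through its own idle governor and the ACPI tables, not through code like this.

```python
from enum import Enum


class IdleAction(Enum):
    MWAIT_IN_GUEST = "mwait"  # shallow state: wait inside the guest, no VM exit
    HLT_AND_EXIT = "hlt"      # deep state: HLT, exit to the host scheduler


# Hypothetical cut-off between "short" and "long" idle periods, in microseconds.
HYPOTHETICAL_THRESHOLD_US = 200


def choose_idle_action(predicted_idle_us: int,
                       threshold_us: int = HYPOTHETICAL_THRESHOLD_US) -> IdleAction:
    """Map a predicted idle duration to the idle mechanism discussed above."""
    if predicted_idle_us < 0:
        raise ValueError("predicted_idle_us must be non-negative")
    if predicted_idle_us < threshold_us:
        return IdleAction.MWAIT_IN_GUEST
    return IdleAction.HLT_AND_EXIT


if __name__ == "__main__":
    for idle_us in (50, 200, 5000):
        print(idle_us, choose_idle_action(idle_us).name)
```

Short idles stay in the guest and avoid the exit cost, while long idles still
hand the pCPU back to the host, which is the adaptive behavior a static per-vCPU
setting cannot provide.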

But many thanks for this suggestion. I will explore it a bit
and post updates.

Thanks!

Best Regards,
Kechen

> 
> 
> >
> > Best Regards,
> > Kechen
> >
> > Kechen Lu (3):
> >   KVM: x86: only allow exits disable before vCPUs created
> >   KVM: x86: move ()_in_guest checking to vCPU scope
> >   KVM: x86: add vCPU ioctl for HLT exits disable capability
> >
> >  Documentation/virt/kvm/api.rst     |  4 +++-
> >  arch/x86/include/asm/kvm-x86-ops.h |  1 +
> >  arch/x86/include/asm/kvm_host.h    |  7 +++++++
> >  arch/x86/kvm/cpuid.c               |  2 +-
> >  arch/x86/kvm/lapic.c               |  2 +-
> >  arch/x86/kvm/svm/svm.c             | 20 +++++++++++++++-----
> >  arch/x86/kvm/vmx/vmx.c             | 26 ++++++++++++++++++--------
> >  arch/x86/kvm/x86.c                 | 24 +++++++++++++++++++++++-
> >  arch/x86/kvm/x86.h                 | 16 ++++++++--------
> >  9 files changed, 77 insertions(+), 25 deletions(-)
> >
> > --
> > 2.30.2
