Date:   Fri, 10 Apr 2020 00:56:52 -0700
From:   Ankur Arora <ankur.a.arora@...cle.com>
To:     Jürgen Groß <jgross@...e.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org
Cc:     peterz@...radead.org, hpa@...or.com, jpoimboe@...hat.com,
        namit@...are.com, mhiramat@...nel.org, bp@...en8.de,
        vkuznets@...hat.com, pbonzini@...hat.com,
        boris.ostrovsky@...cle.com, mihai.carabas@...cle.com,
        kvm@...r.kernel.org, xen-devel@...ts.xenproject.org,
        virtualization@...ts.linux-foundation.org
Subject: Re: [RFC PATCH 00/26] Runtime paravirt patching

So first, thanks for the quick comments, even though some of my choices
drew straight NAKs (or maybe because of that!).

Second, I clearly did a bad job of motivating the series. Let me try
to address the motivation comments first and then I can address the
technical concerns separately.

[ I'm collating all the motivation comments below. ]


>> A KVM host (or another hypervisor) might advertise paravirtualized
>> features and optimization hints (ex KVM_HINTS_REALTIME) which might
>> become stale over the lifetime of the guest. For instance, the 

Thomas> If your host changes his advertised behaviour then you want to
Thomas> fix the host setup or find a competent admin.

Juergen> Then this hint is wrong if it can't be guaranteed.

I agree: the hint behaviour is wrong, and the host shouldn't be giving
hints it can only temporarily honor.

The host problem is hard to fix, though: the behaviour change is due
either to a guest migration or, in the case of a hosted guest, to
cloud economics -- customers want to go to a 2:1 or worse vCPU:CPU
ratio at times of low load.

I had an offline discussion with Paolo Bonzini where he agreed that
it makes sense to make KVM_HINTS_REALTIME a dynamic hint rather than
static as it is now. (That was really the starting point for this
series.)

>> host might go from being undersubscribed to being oversubscribed
>> (or the other way round) and it would make sense for the guest
>> to switch pv-ops based on that.

Juergen> I think using pvops for such a feature change is just wrong.
Juergen> What comes next? Using pvops for being able to migrate a guest
Juergen> from an Intel to an AMD machine?

My statement about switching pv-ops was too broadly worded. What I
meant to say was that KVM guests choose pv_lock_ops to be native or
paravirt at boot based on the undersubscribed/oversubscribed hint,
and this choice should be available at run-time as well.

KVM chooses between native/paravirt spinlocks at boot based on this
reasoning (from commit b2798ba0b8):
"Waiman Long mentioned that:
> Generally speaking, unfair lock performs well for VMs with a small
> number of vCPUs. Native qspinlock may perform better than pvqspinlock
> if there is vCPU pinning and there is no vCPU over-commitment.
"
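For reference, the boot-time selection amounts to roughly the
following (a simplified, self-contained sketch of the logic in
kvm_spinlock_init(); the feature/hint predicates are mocked here
rather than read from CPUID as the kernel does):

```c
#include <stdbool.h>

/* Mocked host-advertised bits; in the kernel these come from
 * the KVM paravirt CPUID leaves. */
static bool has_pv_unhalt = true;   /* KVM_FEATURE_PV_UNHALT */
static bool hints_realtime = false; /* KVM_HINTS_REALTIME    */

/* Decide once, at boot, whether to patch in paravirt spinlocks. */
static bool want_pv_spinlocks(void)
{
    if (!has_pv_unhalt)
        return false;  /* host can't kick a halted vCPU */
    if (hints_realtime)
        return false;  /* vCPUs pinned: native qspinlock is cheaper */
    return true;       /* possibly overcommitted: use pv spinlocks */
}
```

The point of the series is that this decision is currently made
exactly once, even though the hint it keys off can change later.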

PeterZ> So what, the paravirt spinlock stuff works just fine when
PeterZ> you're not oversubscribed.
Yeah, the paravirt spinlocks work fine for both under- and
oversubscribed hosts, but they are more expensive, and that extra
cost provides no benefit when vCPUs are pinned.
For instance, pvqueued spin_unlock() is a call plus a locked cmpxchg,
as opposed to just a movb $0, (%rdi).
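To make that cost difference concrete, here is a hedged userspace
sketch of the two unlock fast paths (the lock layout, constants, and
helper names are simplified stand-ins, not the kernel's exact
definitions):

```c
#include <stdatomic.h>
#include <stdint.h>

#define _Q_LOCKED_VAL 1
#define _Q_SLOW_VAL   3  /* set by a waiter that halted its vCPU */

struct qspinlock { _Atomic uint8_t locked; };

/* Native unlock: a single byte store (the movb $0, (%rdi)). */
static void native_unlock(struct qspinlock *lock)
{
    atomic_store_explicit(&lock->locked, 0, memory_order_release);
}

/* Slowpath stand-in: the kernel would locate the halted vCPU and
 * issue a kick hypercall here. */
static void pv_unlock_slowpath(struct qspinlock *lock)
{
    atomic_store_explicit(&lock->locked, 0, memory_order_release);
    /* ... kick the waiting vCPU ... */
}

/* PV unlock: a locked cmpxchg on every unlock, plus a call into the
 * slowpath if a waiter went to sleep. */
static void pv_unlock(struct qspinlock *lock)
{
    uint8_t expected = _Q_LOCKED_VAL;
    if (atomic_compare_exchange_strong_explicit(&lock->locked,
            &expected, 0, memory_order_release, memory_order_relaxed))
        return;                /* fast path: nobody was sleeping */
    pv_unlock_slowpath(lock);  /* a vCPU halted; it must be kicked */
}
```

Even on the fast path, the pv variant pays for the locked cmpxchg
(and, in the kernel, the indirect/patched call to reach it), which is
pure overhead when vCPUs are pinned and no waiter ever sleeps.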

This difference shows up in kernbench running on a KVM guest with native
and paravirt spinlocks. I ran with 8 and 64 CPU guests with CPUs pinned.

The native version performs the same or better.

8 CPU       Native  (std-dev)  Paravirt (std-dev)
             -----------------  -----------------
-j  4: sys  151.89  ( 0.2462)  160.14   ( 4.8366)    +5.4%
-j 32: sys  162.715 (11.4129)  170.225  (11.1138)    +4.6%
-j  0: sys  164.193 ( 9.4063)  170.843  ( 8.9651)    +4.0%


64 CPU       Native  (std-dev)  Paravirt (std-dev)
             -----------------  -----------------
-j  32: sys 209.448 (0.37009)  210.976   (0.4245)    +0.7%
-j 256: sys 267.401 (61.0928)  285.73   (78.8021)    +6.8%
-j   0: sys 286.313 (56.5978)  307.721  (70.9758)    +7.4%

In all cases the pv_kick, pv_wait numbers were minimal as expected.
The lock_slowpath counts were higher with PV but AFAICS the native
and paravirt lock_slowpath are not directly comparable.

Detailed kernbench numbers attached.

Thanks
Ankur

View attachment "8-cpus.txt" of type "text/plain" (1340 bytes)

View attachment "64-cpus.txt" of type "text/plain" (1414 bytes)
