Message-ID: <YK5gFUjh6MX6+vx3@hirez.programming.kicks-ass.net>
Date:   Wed, 26 May 2021 16:49:57 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Masanori Misono <m.misono760@...il.com>
Cc:     David Woodhouse <dwmw@...zon.co.uk>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Rohit Jain <rohit.k.jain@...cle.com>,
        Ingo Molnar <mingo@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 0/1] Make vCPUs that are HLT state candidates for
 load balancing

On Wed, May 26, 2021 at 10:37:26PM +0900, Masanori Misono wrote:
> Hi,
> 
> I observed performance degradation when running some parallel programs on a
> VM that has (1) KVM_FEATURE_PV_UNHALT, (2) KVM_FEATURE_STEAL_TIME, and (3)
> multi-core architecture. The benchmark results are shown at the bottom. An
> example of the libvirt XML used to create such a VM is:
> 
> ```
> [...]
>   <vcpu placement='static'>8</vcpu>
>   <cpu mode='host-model'>
>     <topology sockets='1' cores='8' threads='1'/>
>   </cpu>
>   <qemu:commandline>
>     <qemu:arg value='-cpu'/>
>     <qemu:arg value='host,l3-cache=on,+kvm-pv-unhalt,+kvm-steal-time'/>
>   </qemu:commandline>
> [...]
> ```
> 
> I investigated the cause and found that the problem occurs in the following
> way:
> 
> - vCPU1 schedules thread A, and vCPU2 schedules thread B. vCPU1 and vCPU2
>   share LLC.
> - Thread A tries to acquire a lock but fails, resulting in a sleep state
>   (via futex).
> - vCPU1 becomes idle because there are no runnable threads and does HLT,
>   which leads to HLT VMEXIT (if idle=halt, and KVM doesn't disable HLT
>   VMEXIT using KVM_CAP_X86_DISABLE_EXITS).
> - KVM sets vCPU1's st->preempted to 1 in kvm_steal_time_set_preempted().
> - Thread C wakes on vCPU2. vCPU2 tries to do load balancing in
>   select_idle_core(). Although vCPU1 is idle, vCPU1 is not a candidate for
>   load balancing because vcpu_is_preempted(vCPU1) is true, hence
>   available_idle_cpu(vCPU1) is false.
> - As a result, both thread B and thread C stay in the vCPU2's runqueue, and
>   vCPU1 is not utilized.
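
(For reference, the wake-up path check described above looks roughly like
this; simplified from kernel/sched/core.c and arch/x86/kernel/kvm.c, not
the verbatim source:)

```c
/* kernel/sched/core.c (simplified): an idle CPU only counts as
 * available for wake-up balancing if it is not marked preempted. */
int available_idle_cpu(int cpu)
{
	if (!idle_cpu(cpu))
		return 0;

	if (vcpu_is_preempted(cpu))
		return 0;

	return 1;
}

/* arch/x86/kernel/kvm.c (simplified): with KVM_FEATURE_STEAL_TIME,
 * the guest's vcpu_is_preempted() reads the flag the host wrote into
 * the shared steal-time area in kvm_steal_time_set_preempted(). */
static bool __kvm_vcpu_is_preempted(long cpu)
{
	struct kvm_steal_time *src = &per_cpu(steal_time, cpu);

	return !!(src->preempted & KVM_VCPU_PREEMPTED);
}
```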
> 
> The patch changes kvm_arch_vcpu_put() so that it does not set st->preempted
> to 1 when a vCPU does an HLT VMEXIT. As a result, vcpu_is_preempted(vCPU)
> becomes 0, and the vCPU becomes a candidate for CFS load balancing.
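
(A minimal sketch of that idea; hypothetical, and the mp_state test below
is an assumption about how a halted vCPU could be detected, not necessarily
what the RFC patch does:)

```c
/* Hypothetical sketch, not the actual RFC patch: skip publishing the
 * preempted flag when the vCPU stopped running because the guest
 * executed HLT, rather than because the host preempted it. */
static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
{
	/* Assumption: a voluntarily halted vCPU is identifiable via
	 * its mp_state. */
	if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
		return;

	/* ... existing code: map the guest's steal-time area and set
	 * st->preempted, which the guest later reads through
	 * vcpu_is_preempted() ... */
}
```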

I'm conflicted on this; the vcpu stops running, the pcpu can go do
anything, it might start the next task. There is no saying how quickly
the vcpu task can return to running.

I'm guessing your setup doesn't actually overload the system, and when
it doesn't have a vcpu thread to run, the pcpu actually goes idle too.
But for those 1:1 cases we already have knobs to disable much of this
IIRC.
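
(For reference, one such knob is KVM_CAP_X86_DISABLE_EXITS; a minimal
userspace sketch, assuming a VM fd on which no vCPUs have been created
yet:)

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Disable HLT exits for a VM so an idle vCPU halts in guest mode
 * instead of exiting to the host.  QEMU exposes this as
 * "-overcommit cpu-pm=on".  Must be set before any vCPU is created. */
static int disable_hlt_exits(int vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args = { KVM_X86_DISABLE_EXITS_HLT },
	};

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```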

So I'm tempted to say things are working as expected and you're just not
configured right.
