Message-ID: <20190730120939.GM31381@hirez.programming.kicks-ass.net>
Date: Tue, 30 Jul 2019 14:09:39 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Wanpeng Li <kernellwp@...il.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] KVM: Disable wake-affine vCPU process to mitigate lock holder preemption

On Tue, Jul 30, 2019 at 05:33:55PM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@...cent.com>
>
> Wake-affine is a scheduler feature that tries to keep related processes
> running close together; the benefit comes mostly from cache hits. When a
> waker wakes up a wakee, the scheduler must select a CPU to run the wakee on,
> and the wake-affine heuristic may select the CPU the waker is currently
> running on instead of the prev CPU the wakee last ran on.
>
> However, in an over-subscribed virtualization scenario with multiple VMs, this
> increases the probability of vCPU stacking, where sibling vCPUs from
> the same VM are stacked on one pCPU. I tested three 80-vCPU VMs running on
> one 80-pCPU Skylake server (PLE is supported); the ebizzy score increases by
> 17% after disabling wake-affine for vCPU processes.
>
> When qemu or another vCPU injects a virtual interrupt into the guest by waking
> up a sleeping vCPU, scheduler wake-affine increases the probability of stacking
> vCPUs/qemu. vCPU stacking can greatly increase lock synchronization
> latency in a virtualized environment. This patch disables wake-affine for vCPU
> processes to mitigate lock holder preemption.
>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Paolo Bonzini <pbonzini@...hat.com>
> Cc: Radim Krčmář <rkrcmar@...hat.com>
> Signed-off-by: Wanpeng Li <wanpengli@...cent.com>
> ---
> include/linux/sched.h | 1 +
> kernel/sched/fair.c | 3 +++
> virt/kvm/kvm_main.c | 1 +
> 3 files changed, 5 insertions(+)
> index 036be95..18eb1fa 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5428,6 +5428,9 @@ static int wake_wide(struct task_struct *p)
>  	unsigned int slave = p->wakee_flips;
>  	int factor = this_cpu_read(sd_llc_size);
>  
> +	if (unlikely(p->flags & PF_NO_WAKE_AFFINE))
> +		return 1;
> +
>  	if (master < slave)
>  		swap(master, slave);
>  	if (slave < factor || master < slave * factor)

I intensely dislike how you misrepresent this patch as a KVM patch.

Also the above is very much not the right place, even if this PF_flag
were to live.
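
For readers following the quoted hunk, here is a minimal user-space sketch of the decision wake_wide() makes once the proposed check is added. It is a model under stated assumptions, not kernel code: struct task_model, wake_wide_model() and the llc_size parameter are hypothetical stand-ins for task_struct, wake_wide() and this_cpu_read(sd_llc_size), and the value given to PF_NO_WAKE_AFFINE is a placeholder because the quoted hunk does not show the real bit. A return value of 1 means "wake wide", which makes the caller skip the wake-affine path.

```c
/* Placeholder bit: the real value is not shown in the quoted hunk. */
#define PF_NO_WAKE_AFFINE 0x1u

/* Hypothetical stand-in for the task_struct fields the hunk touches. */
struct task_model {
	unsigned int flags;
	unsigned int wakee_flips;
};

/*
 * Model of wake_wide() with the proposed early return.
 * Returns 1 to wake wide (skip wake-affine), 0 to allow wake-affine.
 * llc_size stands in for this_cpu_read(sd_llc_size).
 */
static int wake_wide_model(const struct task_model *waker,
			   const struct task_model *wakee,
			   unsigned int llc_size)
{
	unsigned int master = waker->wakee_flips;
	unsigned int slave = wakee->wakee_flips;
	unsigned int tmp;

	/* Proposed check: a flagged wakee always opts out of wake-affine. */
	if (wakee->flags & PF_NO_WAKE_AFFINE)
		return 1;

	/* Make master the larger flip count, as swap() does in the hunk. */
	if (master < slave) {
		tmp = master;
		master = slave;
		slave = tmp;
	}

	/* Few partner switches relative to the LLC size: stay affine. */
	if (slave < llc_size || master < slave * llc_size)
		return 0;

	return 1;
}
```

With llc_size = 4, an unflagged wakee with wakee_flips = 1 gets 0 (affine allowed), while the same wakee with the flag set gets 1 regardless of the flip counts, so the flag bypasses the flip heuristic entirely.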