Message-ID: <ZDSa9Bbqvh0btgQo@google.com>
Date: Mon, 10 Apr 2023 16:25:40 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Tianyu Lan <ltykernel@...il.com>,
Michael Kelley <mikelley@...rosoft.com>
Subject: Re: [PATCH] KVM: SVM: Disable TDP MMU when running on Hyper-V

On Wed, Apr 05, 2023, Jeremi Piotrowski wrote:
> On 3/7/2023 6:36 PM, Sean Christopherson wrote:
> > Thinking about this more, I would rather revert commit 1e0c7d40758b ("KVM: SVM:
> > hyper-v: Remote TLB flush for SVM") or fix the thing properly straightaway. KVM
> > doesn't magically handle the flushes correctly for the shadow/legacy MMU, KVM just
> > happens to get lucky and not run afoul of the underlying bugs. The revert appears
> > to be reasonably straightforward (see bottom).
>
> Hi Sean,
>
> I'm back, and I don't have good news. The fix for the missing Hyper-V TLB flushes has
> landed in Linus' tree, and I've now had the chance to test things outside Azure, in WSL on my
> AMD laptop.
>
> There is some seriously weird interaction going on between TDP MMU and Hyper-V, with
> or without enlightened TLB. My laptop has 16 logical CPUs, so the WSL VM also has 16 vCPUs.
> I have hardcoded the kernel to disable enlightened TLB (so we know that is not interfering).
> I'm running a Flatcar Linux VM inside the WSL VM using legacy BIOS, a single CPU
> and 4GB of RAM.
>
> If I run with `kvm.tdp_mmu=0`, I can boot and shutdown my VM consistently in 20 seconds.
>
> If I run with TDP MMU, the VM boot stalls for seconds at a time in various spots
> (loading grub, decompressing the kernel, during kernel boot); the boot output feels like
> it's happening in slow motion. The fastest I've seen it finish the same cycle is 2 minutes;
> I have also seen it take 4 minutes, and sometimes it doesn't finish at all. Same everything,
> the only difference being the value of `kvm.tdp_mmu`.
When a stall occurs, can you tell where the time is lost? E.g. is the CPU stuck
in L0, L1, or L2? L2 being a single vCPU rules out quite a few scenarios, e.g.
lock contention and whatnot.

If you can run perf in WSL, that might be the easiest way to suss out what's going
on.
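For example, something along these lines might do (a rough sketch; assumes perf
events actually work under the WSL2 kernel, and <vmm-pid> stands in for whatever
VMM process is running the Flatcar guest):

	# system-wide profile while a stall is in progress
	perf record -a -g -- sleep 10
	perf report --stdio

	# or, per-guest VM-exit statistics (counts and latencies)
	perf kvm stat record -p <vmm-pid>
	# ... let it run through a stall, then Ctrl-C and:
	perf kvm stat report

If the exit counts look sane but wall-clock time is still disappearing, that would
point at the time being lost below L1 rather than in KVM itself.
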
> So I would like to revisit disabling tdp_mmu on hyperv altogether for the time being, but it
> should probably be guarded by the following condition:
>
> tdp_mmu_enabled = tdp_mmu_allowed && tdp_enabled && !hypervisor_is_type(X86_HYPER_MS_HYPERV)
>
> Do you have an environment where you would be able to reproduce this? A Windows server, perhaps,
> or an AMD laptop?
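For reference, wiring that check in might look something like this (sketch only;
assumes the right spot is where tdp_mmu_enabled is currently computed in
kvm_configure_mmu(), and that hypervisor_is_type()/X86_HYPER_MS_HYPERV from
<asm/hypervisor.h> are usable there):

	/* Sketch: don't use the TDP MMU when KVM itself runs on Hyper-V. */
	tdp_mmu_enabled = tdp_mmu_allowed && tdp_enabled &&
			  !hypervisor_is_type(X86_HYPER_MS_HYPERV);

Whether a blanket "any Hyper-V" check is the right trigger, as opposed to only
disabling the TDP MMU when the enlightened TLB flush hooks are in use, is a
separate question.
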
Hrm, not easily, no. Can you try two things?

1. Linus' tree on Intel hardware
2. kvm-x86/next[*] on Intel hardware

Don't bother with #2 if #1 (Linus' tree) does NOT suffer the same stalls as AMD.
#2 is interesting iff Intel is also affected, as kvm-x86/next has an optimization
for CR0.WP toggling, which was the Achilles' heel of the TDP MMU. If Intel isn't
affected, then something other than CR0.WP is to blame.

I fully expect both experiments to show the same behavior as AMD, but if for some
reason they don't, the results should help narrow the search.

[*] https://github.com/kvm-x86/linux/tree/next