Message-ID: <21b1ee26-dfd4-923d-72da-d8ded3dd819c@linux.microsoft.com>
Date: Mon, 13 Feb 2023 19:05:12 +0100
From: Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Tianyu Lan <ltykernel@...il.com>,
"Michael Kelley (LINUX)" <mikelley@...rosoft.com>
Subject: Re: "KVM: x86/mmu: Overhaul TDP MMU zapping and flushing" breaks SVM
on Hyper-V
On 13/02/2023 13:50, Paolo Bonzini wrote:
>
> On 2/13/23 13:44, Jeremi Piotrowski wrote:
>> Just built a kernel from that tree, and it displays the same behavior. The problem
>> is not that the addresses are wrong, but that the flushes are issued at the wrong
>> time now. At least for what "enlightened NPT TLB flush" requires.
>
> It is not clear to me why HvCallFlushGuestPhysicalAddressSpace or HvCallFlushGuestPhysicalAddressList would have stricter requirements than a "regular" TLB shootdown using INVEPT.
>
> Can you clarify what you mean by wrong time, preferably with some kind of sequence of events?
>
> That is, something like
>
> CPU 0 Modify EPT from ... to ...
> CPU 0 call_rcu() to free page table
> CPU 1 ... which is invalid because ...
> CPU 0 HvCallFlushGuestPhysicalAddressSpace
>
> Paolo
So I looked at the ftrace output (all kvm & kvmmu events + hyperv_nested_* events) and I see the following:
With tdp_mmu=0 I see:
  kvm_exit
  a sequence of kvm_mmu_prepare_zap_page events
  hyperv_nested_flush_guest_mapping (one always follows each sequence of kvm_mmu_prepare_zap_page)
  kvm_entry
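For reference, the way I read the code, that trailing hyperv_nested_flush_guest_mapping comes out of the shadow MMU commit path. Roughly (paraphrased from memory and heavily trimmed, so don't hold me to the exact lines):

/* arch/x86/kvm/mmu/mmu.c: every batch of zapped shadow pages is
 * committed with a remote TLB flush */
static void kvm_mmu_commit_zap_page(struct kvm *kvm,
                                    struct list_head *invalid_list)
{
        if (list_empty(invalid_list))
                return;

        kvm_flush_remote_tlbs(kvm);
        /* ... free the zapped pages ... */
}

/* virt/kvm/kvm_main.c: prefer the arch-provided remote flush, and only
 * fall back to sending KVM_REQ_TLB_FLUSH to every vCPU if that hook is
 * absent or fails */
void kvm_flush_remote_tlbs(struct kvm *kvm)
{
        if (!kvm_arch_flush_remote_tlb(kvm) ||
            kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
                ++kvm->stat.generic.remote_tlb_flush;
}

/* arch/x86/include/asm/kvm_host.h: with the Hyper-V enlightenment the
 * tlb_remote_flush hook is hv_remote_flush_tlb() */
static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
{
        if (kvm_x86_ops.tlb_remote_flush &&
            !static_call(kvm_x86_tlb_remote_flush)(kvm))
                return 0;
        else
                return -ENOTSUPP;
}

So with tdp_mmu=0 every zap sequence funnels into that hook, which matches the trace.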
With tdp_mmu=1 I see:
kvm_mmu_prepare_zap_page and kvm_tdp_mmu_spte_changed events from a kworker context, but
they are not followed by hyperv_nested_flush_guest_mapping. The only hyperv_nested_flush_guest_mapping
events I see happen from the qemu process context.
Also, the number of flush hypercalls is significantly lower: a 7-second sequence through OVMF with
tdp_mmu=0 produces ~270 flush hypercalls, while in the traces with tdp_mmu=1 I now see at most 3.
So this might be easier to diagnose than I thought: the HvCallFlushGuestPhysicalAddressSpace calls
are missing now.
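Just to spell out where those hypercalls come from (again a simplified sketch from my reading of the code, not literal source): both the hypercall and the tracepoint sit at the bottom of the tlb_remote_flush hook, so if nothing in the zap path ends up calling that hook, neither will ever show up in the trace:

/* arch/x86/kvm/kvm_onhyperv.c: installed as kvm_x86_ops.tlb_remote_flush
 * when the enlightened TLB flush is enabled; flushes the nested NPT
 * root(s) via the hypercall below */
int hv_remote_flush_tlb(struct kvm *kvm)
{
        return hv_remote_flush_tlb_with_range(kvm, NULL);
}

/* arch/x86/hyperv/nested.c: the only place that issues
 * HvCallFlushGuestPhysicalAddressSpace and fires the
 * hyperv_nested_flush_guest_mapping tracepoint
 * (irq disabling etc. trimmed) */
int hyperv_flush_guest_mapping(u64 as)
{
        struct hv_guest_mapping_flush *flush;
        u64 status;
        int ret = -ENOTSUPP;

        /* per-cpu hypercall input page */
        flush = *(struct hv_guest_mapping_flush **)
                this_cpu_ptr(hyperv_pcpu_input_arg);

        flush->address_space = as;
        flush->flags = 0;

        status = hv_do_hypercall(HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE,
                                 flush, NULL);
        if (hv_result_success(status))
                ret = 0;

        trace_hyperv_nested_flush_guest_mapping(as, ret);
        return ret;
}

Whatever flushing the TDP MMU does for these zaps apparently never reaches hv_remote_flush_tlb(), which would explain both the missing tracepoints and the missing hypercalls.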