Message-ID: <87mtfguvhu.fsf@redhat.com>
Date: Tue, 17 May 2022 15:41:33 +0200
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
	Wanpeng Li <wanpengli@...cent.com>,
	Jim Mattson <jmattson@...gle.com>,
	Michael Kelley <mikelley@...rosoft.com>,
	Siddharth Chandrasekaran <sidcha@...zon.de>,
	linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 04/34] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently

Sean Christopherson <seanjc@...gle.com> writes:

> On Thu, Apr 14, 2022, Vitaly Kuznetsov wrote:
>> @@ -1862,15 +1890,58 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_vcpu_hv_tlb_flush_ring *tlb_flush_ring;
>>  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +	struct kvm_vcpu_hv_tlb_flush_entry *entry;
>> +	int read_idx, write_idx;
>> +	u64 address;
>> +	u32 count;
>> +	int i, j;
>>
>> -	kvm_vcpu_flush_tlb_guest(vcpu);
>> -
>> -	if (!hv_vcpu)
>> +	if (!tdp_enabled || !hv_vcpu) {
>> +		kvm_vcpu_flush_tlb_guest(vcpu);
>>  		return;
>> +	}
>>
>>  	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
>>
>> -	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
>> +	/*
>> +	 * TLB flush must be performed on the target vCPU so 'read_idx'
>> +	 * (AKA 'tail') cannot change underneath, the compiler is free
>> +	 * to re-read it.
>> +	 */
>> +	read_idx = tlb_flush_ring->read_idx;
>> +
>> +	/*
>> +	 * 'write_idx' (AKA 'head') can be concurrently updated by a different
>> +	 * vCPU so we must be sure it's read once.
>> +	 */
>> +	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
>> +
>> +	/* Pairs with smp_wmb() in hv_tlb_flush_ring_enqueue() */
>> +	smp_rmb();
>> +
>> +	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
>> +		entry = &tlb_flush_ring->entries[i];
>> +
>> +		if (entry->flush_all)
>> +			goto out_flush_all;
>> +
>> +		/*
>> +		 * Lower 12 bits of 'address' encode the number of additional
>> +		 * pages to flush.
>> +		 */
>> +		address = entry->addr & PAGE_MASK;
>> +		count = (entry->addr & ~PAGE_MASK) + 1;
>> +		for (j = 0; j < count; j++)
>> +			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
>> +	}
>> +	++vcpu->stat.tlb_flush;
>
> Bumping tlb_flush is inconsistent with how KVM handles INVLPG, and could be wrong
> if the ring is empty (might be impossible without a bug?). And if my math is right,
> or at least in the ballpark, tlb_flush will be incremented once regardless of whether
> the loop flushed 1 page or 64k pages (completely full ring, full count on every one).
>
> I'd prefer to either drop the stat adjustment entirely, or bump invlpg in the loop, e.g.
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 56f06cf85282..5654c9d56289 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1945,10 +1945,11 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>  	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
>  		address = tlb_flush_ring->entries[i] & PAGE_MASK;
>  		count = (tlb_flush_ring->entries[i] & ~PAGE_MASK) + 1;
> -		for (j = 0; j < count; j++)
> +		for (j = 0; j < count; j++) {
>  			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
> +			++vcpu->stat.invlpg;
> +		}
>  	}
> -	++vcpu->stat.tlb_flush;
>
>  out_empty_ring:
>  	tlb_flush_ring->read_idx = write_idx;
>
My idea was that flushing individual GVAs is always 'less intrusive'
than flushing the whole address space, which counts as '1' in
'stat.tlb_flush'. Yes, with my patch 'flush 1 GVA' counts the same as
'flush 64k', but on the other hand if we do the math your way we get:

- flush the whole address space: "stat.tlb_flush" is incremented by '1'.
- flush 100 individual GVAs: "stat.tlb_flush" is incremented by '100'.

What if we instead give 'stat.tlb_flush' the following meaning here:
"how many individual TLB flush requests were submitted", i.e.:

	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
		address = tlb_flush_ring->entries[i] & PAGE_MASK;
		count = (tlb_flush_ring->entries[i] & ~PAGE_MASK) + 1;
		for (j = 0; j < count; j++)
			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
		++vcpu->stat.tlb_flush;
	}

(something in between what I have now and what you suggest). What do you think?
--
Vitaly
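
For comparison, the three counting schemes discussed in this thread can be put side by side in a small userspace sketch. This is not kernel code: every `sketch_`/`SKETCH_` name is hypothetical, the ring is reduced to a plain array of 64-bit entries (no 'flush_all' flag, no barriers), 4 KiB pages are assumed, and the ring size of 16 is an assumption chosen only so that a ring of full-count entries lands near the "64k pages" figure mentioned above.

```c
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB pages, 16-slot ring. */
#define SKETCH_PAGE_SIZE  UINT64_C(4096)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_RING_SIZE  16

/* One counter per accounting scheme from the thread. */
struct sketch_stats {
	uint64_t per_drain; /* patch as posted: one bump per ring drain */
	uint64_t per_page;  /* review suggestion: one bump per flushed GVA */
	uint64_t per_entry; /* middle ground: one bump per ring entry */
};

/* Lower 12 bits of an entry encode the number of *additional* pages. */
static uint32_t sketch_entry_pages(uint64_t entry)
{
	return (uint32_t)(entry & ~SKETCH_PAGE_MASK) + 1;
}

/* Walk the ring from read_idx up to (not including) write_idx. */
static void sketch_drain(const uint64_t *ring, int read_idx, int write_idx,
			 struct sketch_stats *st)
{
	for (int i = read_idx; i != write_idx; i = (i + 1) % SKETCH_RING_SIZE) {
		uint32_t count = sketch_entry_pages(ring[i]);

		/* A real implementation would flush one GVA per iteration. */
		for (uint32_t j = 0; j < count; j++)
			st->per_page++;

		st->per_entry++;
	}

	/* Bumped even when the ring was empty, as noted in the review. */
	st->per_drain++;
}
```

Draining three entries that cover 1, 4096 and 3 pages leaves 'per_drain' at 1, 'per_entry' at 3 and 'per_page' at 4100, which shows how far apart the three interpretations can end up for the same request stream.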