Message-ID: <877d1voiuz.fsf@redhat.com>
Date: Thu, 22 Sep 2022 11:31:48 +0200
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Michael Kelley <mikelley@...rosoft.com>,
Siddharth Chandrasekaran <sidcha@...zon.de>,
Yuan Yao <yuan.yao@...ux.intel.com>,
Maxim Levitsky <mlevitsk@...hat.com>,
linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v10 02/39] KVM: x86: hyper-v: Resurrect dedicated
KVM_REQ_HV_TLB_FLUSH flag
Sean Christopherson <seanjc@...gle.com> writes:
> On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index f62d5799fcd7..86504a8bfd9a 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3418,11 +3418,17 @@ static inline void kvm_vcpu_flush_tlb_current(struct kvm_vcpu *vcpu)
>> */
>> void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
>> {
>> - if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
>> + if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) {
>> kvm_vcpu_flush_tlb_current(vcpu);
>> + kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>
> This isn't correct, flush_tlb_current() flushes "host" TLB entries, i.e. guest-physical
> mappings in Intel terminology, whereas flush_tlb_guest() and (IIUC) Hyper-V's paravirt
> TLB flush both flush "guest" TLB entries, i.e. linear and combined
> mappings.
(Honestly, I had been waiting for this comment ever since I first brought
this up; I even put it in a separate patch with the provocative name
"KVM: x86: KVM_REQ_TLB_FLUSH_CURRENT is a superset of KVM_REQ_HV_TLB_FLUSH
too", but AFAIR the only comment I got was "please merge with the patch
which clears KVM_REQ_TLB_FLUSH_GUEST", so I started thinking this was the
right thing to do :)

Jokes aside, this small optimization was done for the nSVM case. When
switching from L1
to L2 and vice versa, the code does nested_svm_transition_tlb_flush()
which is:

    kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
    kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
On AMD, KVM_REQ_TLB_FLUSH_CURRENT and KVM_REQ_TLB_FLUSH_GUEST are the
same thing (.flush_tlb_current == .flush_tlb_guest ==
svm_flush_tlb_current()), flushing the whole ASID, so processing Hyper-V
TLB flush requests on top of that is certainly redundant.
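That is, in svm.c the kvm_x86_ops callbacks are wired to one and the same
handler (other fields elided):

    .flush_tlb_current = svm_flush_tlb_current,
    .flush_tlb_guest   = svm_flush_tlb_current,

so once the ASID is flushed, there is nothing left for a pending Hyper-V
request to do.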
Now let's get to VMX and the point of my confusion (and thanks in
advance for educating me!):
AFAIU, when EPT is in use:
  KVM_REQ_TLB_FLUSH_CURRENT == invept
  KVM_REQ_TLB_FLUSH_GUEST   == invvpid
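For reference, a simplified paraphrase of the two vmx.c handlers (the
enable_ept/enable_vpid special cases are elided, and 'root_hpa' /
'root_level' stand in for the values taken from the vCPU's MMU):

    static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
    {
        /* INVEPT: zaps guest-physical and combined mappings for this EPT root */
        ept_sync_context(construct_eptp(vcpu, root_hpa, root_level));
    }

    static void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu)
    {
        /* INVVPID: zaps linear and combined mappings tagged with this VPID */
        vpid_sync_context(vmx_get_current_vpid(vcpu));
    }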
For "normal" mappings (which are mapped on both stages) this is the same
thing as they're 'tagged' with both VPID and 'EPT root'. The question is
what's left. Given your comment, do I understand correctly that in case
of an invalid mapping in the guest (GVA doesn't resolve to a GPA), this
will only be tagged with VPID but not with 'EPT root' (as the CPU never
reached to the second translation stage)? We certainly can't ignore
these. Another (probably pure theoretical question) is what are the
mappings which are tagged with 'EPT root' but don't have a VPID tag? Are
these the mapping which happen when e.g. vCPU has paging disabled? These
are probably unrelated to Hyper-V TLB flushing.
To preserve the 'small' optimization, we can probably move

    kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);

to nested_svm_transition_tlb_flush(), or, if this sounds too hackish, we
can drop it for now and add it to the (already overfull) bucket of
"optimize nested_svm_transition_tlb_flush()" ideas. A sketch of the former
follows.
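I.e. something like this (an untested sketch, merely combining the two
snippets quoted above):

    static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu)
    {
        /*
         * The forced KVM_REQ_TLB_FLUSH_CURRENT below flushes the whole
         * ASID on SVM, which covers any pending Hyper-V PV TLB flush
         * request as well, so the latter can be dropped up front.
         */
        kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);

        kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
        kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
    }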
--
Vitaly