Message-ID: <YRF+qzvH8jbLCuNE@google.com>
Date: Mon, 9 Aug 2021 19:14:51 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
Wanpeng Li <wanpengli@...cent.com>,
Thomas Gleixner <tglx@...utronix.de>,
Joerg Roedel <joro@...tes.org>, Borislav Petkov <bp@...en8.de>,
Jim Mattson <jmattson@...gle.com>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
<linux-kernel@...r.kernel.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v3 06/12] KVM: x86: don't disable APICv memslot when
inhibited
On Mon, Aug 09, 2021, Maxim Levitsky wrote:
> On Tue, 2021-08-03 at 10:44 +0200, Paolo Bonzini wrote:
> > Reviewing this patch and the next one together.
> >
> > On 02/08/21 20:33, Maxim Levitsky wrote:
> > > +static int avic_alloc_access_page(struct kvm *kvm)
> > > {
> > > void __user *ret;
> > > int r = 0;
> > >
> > > mutex_lock(&kvm->slots_lock);
> > > +
> > > + if (kvm->arch.apic_access_memslot_enabled)
> > > goto out;
> >
> > This variable is overloaded between "is access enabled" and "is the
> > memslot allocated". I think you should check
> > kvm->arch.apicv_inhibit_reasons instead in kvm_faultin_pfn.
> >
> >
> > > + if (!activate)
> > > + kvm_zap_gfn_range(kvm, gpa_to_gfn(APIC_DEFAULT_PHYS_BASE),
> > > + gpa_to_gfn(APIC_DEFAULT_PHYS_BASE + PAGE_SIZE));
> > > +
> >
> > Off by one, the last argument of kvm_zap_gfn_range is inclusive:
>
> Actually is it?

Nope.  The actual implementation is exclusive for both the legacy and TDP MMUs.  And
as you covered below, the fixed and variable MTRR helpers provide an exclusive
start+end, so there's no functional bug.  The "0 - ~0" use case is irrelevant
because there can't be physical memory at -4096.

The ~0ull case can be fixed by adding a helper to get the max GFN possible, e.g.
steal this code from kvm_tdp_mmu_put_root()

	gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);

and maybe add a comment saying it intentionally ignores guest.MAXPHYADDR (from
CPUID) so that the helper can be used even when CPUID is being modified.

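To make the exclusive-end point concrete, here is a minimal userspace sketch, not kernel code. The names max_gfn_exclusive() and clamp_to_slot() are hypothetical, PAGE_SHIFT is assumed to be 12, and phys_bits stands in for shadow_phys_bits. It models the max-GFN helper suggested above and the per-memslot clamp quoted below, and shows that an empty range (gfn_start == gfn_end) selects no pages.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define PAGE_SHIFT 12	/* assumption: 4KiB pages */

/*
 * Hypothetical helper: first GFN past the host-addressable range, i.e. an
 * exclusive upper bound.  Intentionally ignores the guest's MAXPHYADDR.
 */
static inline gfn_t max_gfn_exclusive(unsigned int phys_bits)
{
	assert(phys_bits >= PAGE_SHIFT && phys_bits < 64);
	return 1ULL << (phys_bits - PAGE_SHIFT);
}

/*
 * Model of the per-memslot clamp: intersect [gfn_start, gfn_end) with
 * [base_gfn, base_gfn + npages) and return the number of pages covered.
 */
static inline uint64_t clamp_to_slot(gfn_t gfn_start, gfn_t gfn_end,
				     gfn_t base_gfn, uint64_t npages)
{
	gfn_t start = gfn_start > base_gfn ? gfn_start : base_gfn;
	gfn_t end = gfn_end < base_gfn + npages ? gfn_end : base_gfn + npages;

	/* start >= end means the intersection is empty, nothing to zap. */
	return start >= end ? 0 : end - start;
}
```

With a one-page slot at the default APIC base GFN (0xfee00), passing the same GFN as both start and end covers zero pages, while passing end = start + 1 covers exactly one, which is why the patch's gpa_to_gfn(APIC_DEFAULT_PHYS_BASE + PAGE_SIZE) end argument is correct for an exclusive interface.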
> There are 3 uses of this function.
> Two of them (kvm_post_set_cr0 and one case in update_mtrr) use 0,~0ULL which is indeed inclusive,
> but for variable mtrrs I see that in var_mtrr_range this code:
>
> *end = (*start | ~mask) + 1;
>
> and the *end is passed to kvm_zap_gfn_range.
>
>
> Another thing I noticed is that I added calls to kvm_inc_notifier_count/kvm_dec_notifier_count
> in kvm_zap_gfn_range, but these do seem to have non-inclusive ends, so
> I need to fix them, sadly, if kvm_zap_gfn_range's end is inclusive.
> This depends on mmu_notifier_ops and it is not documented well.
>
> However at least mmu_notifier_retry_hva, does assume a non inclusive range since it checks
>
>
> hva >= kvm->mmu_notifier_range_start &&
> hva < kvm->mmu_notifier_range_end
>
>
> Also looking at the algorithm of the kvm_zap_gfn_range.
> Suppose that gfn_start == gfn_end and we have a memslot with one page at gfn_start
>
> Then:
>
>
> start = max(gfn_start, memslot->base_gfn); // start = memslot->base_gfn
> end = min(gfn_end, memslot->base_gfn + memslot->npages); // end = memslot->base_gfn
>
> if (start >= end)
> continue;
>
> In this case it seems that it will do nothing. So I suspect that kvm_zap_gfn_range
> actually needs a non-inclusive range, but because it wasn't used much,
> this didn't cause trouble.
>
> Another thing I found in kvm_zap_gfn_range:
>
> kvm_flush_remote_tlbs_with_address(kvm, gfn_start, gfn_end);
>
> But kvm_flush_remote_tlbs_with_address expects (struct kvm *kvm, u64 start_gfn, u64 pages)

Heh, surprise, surprise, a rare path with no architecturally visible effects is
busted :-)

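As an illustration of the argument mismatch, here is a small userspace sketch, not kernel code, of converting an exclusive [gfn_start, gfn_end) range into the (start_gfn, pages) pair that the quoted prototype expects. The struct and function names are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical container for the two arguments the flush helper takes. */
struct flush_range {
	gfn_t start_gfn;
	uint64_t pages;
};

/*
 * Convert an exclusive-end GFN range into a start + page count.  Passing
 * gfn_end directly as "pages" would request a flush of gfn_end pages,
 * which is far larger than intended for any range not starting at 0.
 */
static inline struct flush_range flush_range_from_gfns(gfn_t gfn_start,
							gfn_t gfn_end)
{
	struct flush_range r;

	assert(gfn_end >= gfn_start);
	r.start_gfn = gfn_start;
	r.pages = gfn_end - gfn_start;
	return r;
}
```

Under that assumption, the call in kvm_zap_gfn_range would presumably pass gfn_end - gfn_start as its third argument rather than gfn_end.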
> kvm_flush_remote_tlbs_with_address is also for some reason called twice with
> the same parameters.

It's called twice in the current code because mmu_lock is dropped between handling
the current MMU and the legacy MMU.

> Could you help with that? Am I missing something?