[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YtWPSILmAp/0m5eC@google.com>
Date: Mon, 18 Jul 2022 16:50:16 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Mingwei Zhang <mizhang@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] KVM: x86/mmu: Document the "rules" for using
host_pfn_mapping_level()
On Sat, Jul 16, 2022, Mingwei Zhang wrote:
> On Fri, Jul 15, 2022, Sean Christopherson wrote:
> > Add a comment to document how host_pfn_mapping_level() can be used safely,
> > as the line between safe and dangerous is quite thin. E.g. if KVM were
> > to ever support in-place promotion to create huge pages, consuming the
> > level is safe if the caller holds mmu_lock and checks that there's an
> > existing _leaf_ SPTE, but unsafe if the caller only checks that there's a
> > non-leaf SPTE.
> >
> > Opportunistically tweak the existing comments to explicitly document why
> > KVM needs to use READ_ONCE().
> >
> > No functional change intended.
> >
> > Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 42 +++++++++++++++++++++++++++++++++++-------
> > 1 file changed, 35 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index bebff1d5acd4..d5b644f3e003 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2919,6 +2919,31 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
> > __direct_pte_prefetch(vcpu, sp, sptep);
> > }
> >
> > +/*
> > + * Lookup the mapping level for @gfn in the current mm.
> > + *
> > + * WARNING! Use of host_pfn_mapping_level() requires the caller and the end
> > + * consumer to be tied into KVM's handlers for MMU notifier events!
> Since calling this function won't cause kernel crash now, I guess we can
> remove the warning sign here, but keep the remaining statement since it
> is necessary.
Calling this function won't _directly_ crash the kernel, but improper usage can
most definitely crash the host kernel, or even worse, silently corrupt host and
or guest data. E.g. if KVM were to race with an mmu_notifier event and incorrectly
map a stale huge page into the guest.
So yes, the function itself is robust, but usage is still very subtle and delicate.
> > + *
> > + * There are several ways to safely use this helper:
> > + *
> > + * - Check mmu_notifier_retry_hva() after grabbing the mapping level, before
> > + * consuming it. In this case, mmu_lock doesn't need to be held during the
> > + * lookup, but it does need to be held while checking the MMU notifier.
>
> but it does need to be held while checking the MMU notifier and
> consuming the result.
I didn't want to include "consuming the result" because arguably the result is
being consumed while running the guest, and obviously KVM doesn't hold mmu_lock
while running the guest (though I fully acknowledge the above effectively uses
"consume" in the sense of shoving the result into SPTEs).
> > + *
> > + * - Hold mmu_lock AND ensure there is no in-progress MMU notifier invalidation
> > + * event for the hva. This can be done by explicit checking the MMU notifier
s/explicit/explicitly
> > + * or by ensuring that KVM already has a valid mapping that covers the hva.
>
> Yes, more specifically, "mmu notifier sequence counter".
Heh, depends on what the reader interprets as "sequence counter". If the reader
interprets that as the literal sequence counter, mmu_notifier_seq, then this phrasing
is incorrect as mmu_notifier_seq isn't bumped until the invalidation completes,
i.e. it guards against _past_ invalidations, not in-progress validations.
My preference is to intentionally not be precise in describing how to check for an
in-progress invalidation, e.g. so that this comment doesn't need to be updated if
the details change, and to also to try and force developers to do more than copy
and paste if they want to use this helper.
Powered by blists - more mailing lists