lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 5 Jun 2023 10:17:27 -0700
From:   Ben Gardon <bgardon@...gle.com>
To:     Jim Mattson <jmattson@...gle.com>
Cc:     Mingwei Zhang <mizhang@...gle.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] KVM: x86/mmu: Remove KVM MMU write lock when accessing indirect_shadow_pages

On Mon, Jun 5, 2023 at 9:55 AM Jim Mattson <jmattson@...gle.com> wrote:
>
> On Sun, Jun 4, 2023 at 5:43 PM Mingwei Zhang <mizhang@...gle.com> wrote:
> >
> > Remove KVM MMU write lock when accessing indirect_shadow_pages counter when
> > page role is direct because this counter value is used as a coarse-grained
> > heuristics to check if there is nested guest active. Racing with this
> > heuristics without mmu lock will be harmless because the corresponding
> > indirect shadow sptes for the GPA will either be zapped by this thread or
> > some other thread who has previously zapped all indirect shadow pages and
> > makes the value to 0.
> >
> > Because of that, remove the KVM MMU write lock pair to potentially reduce
> > the lock contension and improve the performance of nested VM. In addition
> > opportunistically change the comment of 'direct mmu' to make the
> > description consistent with other places.
> >
> > Reported-by: Jim Mattson <jmattson@...gle.com>
> > Signed-off-by: Mingwei Zhang <mizhang@...gle.com>
> > ---
> >  arch/x86/kvm/x86.c | 10 ++--------
> >  1 file changed, 2 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 5ad55ef71433..97cfa5a00ff2 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8585,15 +8585,9 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> >
> >         kvm_release_pfn_clean(pfn);
> >
> > -       /* The instructions are well-emulated on direct mmu. */
> > +       /* The instructions are well-emulated on Direct MMUs. */

Nit: Internally within Google, on older kernels, we have the "Direct
MMU" which was the precursor to the TDP MMU we all know and love. This
comment however does not refer to the Direct MMU. Direct here just
refers to the direct role bit being set. Since it's just descriptive,
direct should not be capitalized in this comment, so no reason to
change this line.

> >         if (vcpu->arch.mmu->root_role.direct) {
> > -               unsigned int indirect_shadow_pages;
> > -
> > -               write_lock(&vcpu->kvm->mmu_lock);
> > -               indirect_shadow_pages = vcpu->kvm->arch.indirect_shadow_pages;
> > -               write_unlock(&vcpu->kvm->mmu_lock);
> > -
> > -               if (indirect_shadow_pages)
> > +               if (READ_ONCE(vcpu->kvm->arch.indirect_shadow_pages))
>
> I don't understand the need for READ_ONCE() here. That implies that
> there is something tricky going on, and I don't think that's the case.

Agree this doesn't need a READ_ONCE. Just a read is fine.
kvm_mmu_unprotect_page starts by acquiring the MMU lock, so there's
not much room to reorder anything anyway.

Thanks for sending a patch to fix this. The critical section of the
MMU lock here is small, but any lock acquisition in write mode can
mess up performance of otherwise happy read-mode uses.

>
> >                         kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa));
> >
> >                 return true;
> >
> > base-commit: 31b4fc3bc64aadd660c5bfa5178c86a7ba61e0f7
> > --
> > 2.41.0.rc0.172.g3f132b7071-goog
> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ