linux-kernel - Re: [PATCH 5/7] KVM: X86: Don't unsync pagetables when speculative

Message-ID: <CAJhGHyCpQDfon_RFefV_kRzeNBg0EvzyEh9KRogqTRrBQHpYeA@mail.gmail.com>
Date:   Sat, 18 Sep 2021 11:06:37 +0800
From:   Lai Jiangshan <jiangshanlai@...il.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Lai Jiangshan <laijs@...ux.alibaba.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        kvm@...r.kernel.org
Subject: Re: [PATCH 5/7] KVM: X86: Don't unsync pagetables when speculative

It is weird that I did not receive this email.

On Mon, Sep 13, 2021 at 7:02 PM Maxim Levitsky <mlevitsk@...hat.com> wrote:
>
> On Tue, 2021-08-24 at 15:55 +0800, Lai Jiangshan wrote:
> > From: Lai Jiangshan <laijs@...ux.alibaba.com>
> >
> > We should unsync a pagetable only when there was a real write fault
> > on a level-1 pagetable.
> >
> > Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c          | 6 +++++-
> >  arch/x86/kvm/mmu/mmu_internal.h | 3 ++-
> >  arch/x86/kvm/mmu/spte.c         | 2 +-
> >  3 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index a165eb8713bc..e5932af6f11c 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2600,7 +2600,8 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
> >   * were marked unsync (or if there is no shadow page), -EPERM if the SPTE must
> >   * be write-protected.
> >   */
> > -int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
> > +int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync,
> > +                         bool speculative)
> >  {
> >       struct kvm_mmu_page *sp;
> >       bool locked = false;
> > @@ -2626,6 +2627,9 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
> >               if (sp->unsync)
> >                       continue;
> >
> > +             if (speculative)
> > +                     return -EEXIST;
>
> Wouldn't it be better to ensure that callers set can_unsync = false when speculating?

I don't want to change the current behavior of "can_unsync".

For a gfn:
  case1: All sps for the gfn are synced
  case2: Some sps for the gfn are synced and the others are not
  case3: None of the sps for the gfn is synced

"!can_unsync" causes the function to return non-zero in all three cases.
"speculative" causes the function to return non-zero only in case1 and case2;
in case3 there is nothing left to unsync.

I don't think it would be a bug if the behavior of the old "!can_unsync" were
changed to the behavior of the new "speculative".  But then the meaning of
"!can_unsync" would have to change:

!can_unsync: no sp for @gfn may be unsync.  (derived from current code)
==>
!can_unsync: no unsync operation should be performed.
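
To make the three cases concrete, here is a minimal user-space sketch of
the loop with this patch applied.  It is not the kernel code: struct toy_sp
and toy_try_to_unsync() are hypothetical names, and write tracking, locking
and the real unsync work are left out.  Only the return-value logic
discussed above is modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_mmu_page: only the unsync flag. */
struct toy_sp {
	bool unsync;
};

/*
 * Toy model of the per-gfn loop in mmu_try_to_unsync_pages().
 * sps[0..n) play the role of the shadow pages found for one gfn.
 * Returns 0 if the SPTE may be writable, non-zero if it must be
 * write-protected.
 */
static int toy_try_to_unsync(struct toy_sp *sps, size_t n,
			     bool can_unsync, bool speculative)
{
	size_t i;

	assert(sps != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Current "!can_unsync": any sp at all forces write-protect. */
		if (!can_unsync)
			return -EPERM;

		/* Already unsync: nothing to do for this sp. */
		if (sps[i].unsync)
			continue;

		/* New check: never unsync a synced sp when speculative. */
		if (speculative)
			return -EEXIST;

		/* Stand-in for kvm_unsync_page(). */
		sps[i].unsync = true;
	}

	return 0;
}
```

In this model, can_unsync = false returns -EPERM in case1, case2 and case3,
while can_unsync = true with speculative = true returns -EEXIST in case1
and case2 and 0 in case3, leaving every unsync flag untouched.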

I have sent the patch in V2 without any change.  If the new meaning
is preferred, I will respin the patch, or I will send it separately
if no other patches in V2 need to be updated.

>
> Also if I understand correctly this is not fixing a bug, but an optimization?
>

It does not fix any bug.  But it is odd to unsync sps on a speculative
access, since that causes future overhead for no reason.

> Best regards,
>         Maxim Levitsky
>
>
> > +
> >               /*
> >                * TDP MMU page faults require an additional spinlock as they
> >                * run with mmu_lock held for read, not write, and the unsync
> > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> > index 658d8d228d43..f5d8be787993 100644
> > --- a/arch/x86/kvm/mmu/mmu_internal.h
> > +++ b/arch/x86/kvm/mmu/mmu_internal.h
> > @@ -116,7 +116,8 @@ static inline bool kvm_vcpu_ad_need_write_protect(struct kvm_vcpu *vcpu)
> >              kvm_x86_ops.cpu_dirty_log_size;
> >  }
> >
> > -int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync);
> > +int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync,
> > +                         bool speculative);
> >
> >  void kvm_mmu_gfn_disallow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
> >  void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
> > diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> > index 3e97cdb13eb7..b68a580f3510 100644
> > --- a/arch/x86/kvm/mmu/spte.c
> > +++ b/arch/x86/kvm/mmu/spte.c
> > @@ -159,7 +159,7 @@ int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
> >                * e.g. it's write-tracked (upper-level SPs) or has one or more
> >                * shadow pages and unsync'ing pages is not allowed.
> >                */
> > -             if (mmu_try_to_unsync_pages(vcpu, gfn, can_unsync)) {
> > +             if (mmu_try_to_unsync_pages(vcpu, gfn, can_unsync, speculative)) {
> >                       pgprintk("%s: found shadow page for %llx, marking ro\n",
> >                                __func__, gfn);
> >                       ret |= SET_SPTE_WRITE_PROTECTED_PT;
>
>