Message-ID: <CANgfPd-f+VXQJnz-LPuiy+rTDkSdw3zjUfozaqzgb8n0rv9STA@mail.gmail.com>
Date:   Thu, 18 Nov 2021 09:43:46 -0800
From:   Ben Gardon <bgardon@...gle.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Paolo Bonzini <pbonzini@...hat.com>,
        Peter Xu <peterx@...hat.com>, Peter Shier <pshier@...gle.com>,
        David Matlack <dmatlack@...gle.com>,
        Mingwei Zhang <mizhang@...gle.com>,
        Yulei Zhang <yulei.kernel@...il.com>,
        Wanpeng Li <kernellwp@...il.com>,
        Xiao Guangrong <xiaoguangrong.eric@...il.com>,
        Kai Huang <kai.huang@...el.com>,
        Keqian Zhu <zhukeqian1@...wei.com>,
        David Hildenbrand <david@...hat.com>
Subject: Re: [RFC 07/19] KVM: x86/mmu: Factor wrprot for nested PML out of make_spte

On Wed, Nov 17, 2021 at 6:12 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Wed, Nov 10, 2021, Ben Gardon wrote:
> > When running a nested VM, KVM write protects SPTEs in the EPT/NPT02
> > instead of using PML for dirty tracking. This avoids expensive
> > translation later, when emptying the Page Modification Log. In service
> > of removing the vCPU pointer from make_spte, factor the check for nested
> > PML out of the function.
>
> Aha!  The dependency on @vcpu can be avoided without having to take a flag from
> the caller.  The shadow page has everything we need.  The check is really "is this
> a page for L2 EPT".  The kvm_x86_ops.cpu_dirty_log_size gets us the EPT part, and
> kvm_mmu_page.guest_mode gets us the L2 part.

Haha, that's way cleaner than what I was doing! Seems like an obvious
solution in retrospect. I'll include this in the next version of the
series I send out, unless Paolo beats me to it and just merges it directly.
Happy to give this my Reviewed-by.

>
> Compile tested only...
>
> From 773414e4fd7010c38ac89221d16089f3dcc57467 Mon Sep 17 00:00:00 2001
> From: Sean Christopherson <seanjc@...gle.com>
> Date: Wed, 17 Nov 2021 18:08:42 -0800
> Subject: [PATCH] KVM: x86/mmu: Use shadow page role to detect PML-unfriendly
>  pages for L2
>
> Rework make_spte() to query the shadow page's role, specifically whether
> or not it's a guest_mode page, a.k.a. a page for L2, when determining if
> the SPTE is compatible with PML.  This eliminates a dependency on @vcpu,
> with a future goal of being able to create SPTEs without a specific vCPU.
>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>

Reviewed-by: Ben Gardon <bgardon@...gle.com>

> ---
>  arch/x86/kvm/mmu/mmu_internal.h | 7 +++----
>  arch/x86/kvm/mmu/spte.c         | 2 +-
>  2 files changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 8ede43a826af..03882b2624c8 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -109,7 +109,7 @@ static inline int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
>         return kvm_mmu_role_as_id(sp->role);
>  }
>
> -static inline bool kvm_vcpu_ad_need_write_protect(struct kvm_vcpu *vcpu)
> +static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp)
>  {
>         /*
>          * When using the EPT page-modification log, the GPAs in the CPU dirty
> @@ -117,10 +117,9 @@ static inline bool kvm_vcpu_ad_need_write_protect(struct kvm_vcpu *vcpu)
>          * on write protection to record dirty pages, which bypasses PML, since
>          * writes now result in a vmexit.  Note, the check on CPU dirty logging
>          * being enabled is mandatory as the bits used to denote WP-only SPTEs
> -        * are reserved for NPT w/ PAE (32-bit KVM).
> +        * are reserved for PAE paging (32-bit KVM).
>          */
> -       return vcpu->arch.mmu == &vcpu->arch.guest_mmu &&
> -              kvm_x86_ops.cpu_dirty_log_size;
> +       return kvm_x86_ops.cpu_dirty_log_size && sp->role.guest_mode;
>  }
>
>  int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 0c76c45fdb68..84e64dbdd89e 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -101,7 +101,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
>
>         if (sp->role.ad_disabled)
>                 spte |= SPTE_TDP_AD_DISABLED_MASK;
> -       else if (kvm_vcpu_ad_need_write_protect(vcpu))
> +       else if (kvm_mmu_page_ad_need_write_protect(sp))
>                 spte |= SPTE_TDP_AD_WRPROT_ONLY_MASK;
>
>         /*
> --
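
For readers following along outside the kernel tree, here is a rough
userspace sketch of the check discussed above: a page needs write
protection instead of PML only when CPU dirty logging is enabled and the
shadow page belongs to L2. Every name prefixed with demo_/DEMO_ is a
hypothetical stand-in invented for this illustration, and the mask values
are placeholders rather than the kernel's real bit assignments. Only the
shape of the logic mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
struct demo_role {
	unsigned int guest_mode : 1;	/* page maps memory for L2 */
	unsigned int ad_disabled : 1;	/* A/D bits unusable for this page */
};

struct demo_mmu_page {
	struct demo_role role;
};

/* Placeholder bit values chosen for the demo only. */
#define DEMO_AD_DISABLED_MASK		(1ULL << 0)
#define DEMO_AD_WRPROT_ONLY_MASK	(1ULL << 1)

/* Models kvm_x86_ops.cpu_dirty_log_size: nonzero means PML is in use. */
static int demo_cpu_dirty_log_size;

/* Same shape as the reworked helper: only the shadow page is consulted. */
static bool demo_page_ad_need_write_protect(const struct demo_mmu_page *sp)
{
	return demo_cpu_dirty_log_size && sp->role.guest_mode;
}

/* Mirrors the if/else-if in make_spte() that picks the A/D mode bits. */
static uint64_t demo_ad_bits(const struct demo_mmu_page *sp)
{
	if (sp->role.ad_disabled)
		return DEMO_AD_DISABLED_MASK;
	if (demo_page_ad_need_write_protect(sp))
		return DEMO_AD_WRPROT_ONLY_MASK;
	return 0;
}

/* Small self-check covering the interesting combinations. */
static void demo_selftest(void)
{
	struct demo_mmu_page l2 = { .role = { .guest_mode = 1 } };
	struct demo_mmu_page l2_noad = {
		.role = { .guest_mode = 1, .ad_disabled = 1 }
	};

	/* PML enabled (any nonzero size): L2 pages are write-protect only. */
	demo_cpu_dirty_log_size = 512;
	assert(demo_ad_bits(&l2) == DEMO_AD_WRPROT_ONLY_MASK);

	/* ad_disabled takes priority over the write-protect check. */
	assert(demo_ad_bits(&l2_noad) == DEMO_AD_DISABLED_MASK);

	/* PML not enabled: the write-protect-only bits are never set. */
	demo_cpu_dirty_log_size = 0;
	assert(demo_ad_bits(&l2) == 0);
}
```

The point of the sketch is that demo_ad_bits() takes no vCPU argument at
all, which is what lets the real helper drop its dependency on @vcpu.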