Message-ID: <Y5jAbS4kwRAdrWwM@google.com>
Date: Tue, 13 Dec 2022 18:11:57 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Lai Jiangshan <jiangshanlai@...il.com>
Cc: linux-kernel@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
Lai Jiangshan <jiangshan.ljs@...group.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org
Subject: Re: [PATCH 1/2] kvm: x86/mmu: Reduce the update to the spte in
FNAME(sync_page)
On Mon, Dec 12, 2022, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@...group.com>
>
> Sometimes when the guest updates its pagetable, it adds only new gptes
> to it without changing any existing one, so there is no point in updating
> the sptes for these existing gptes.
>
> Also when the sptes for these unchanged gptes are updated, the AD
> bits are also removed since make_spte() is called with prefetch=true
> which might result in unneeded TLB flushing.

If either of the proposed changes is kept, please move this to a separate patch.
Skipping updates for PTEs with the same protections is a separate logical change
from skipping updates when making the SPTE writable.

Actually, can't we just pass @prefetch=false to make_spte()? FNAME(prefetch_invalid_gpte)
has already verified the Accessed bit is set in the GPTE, so at least for guest
correctness there's no need to access-track the SPTE. Host page aging is already
fuzzy so I don't think there are problems there.
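
To make the @prefetch suggestion concrete, here is a toy C model. It is not the real make_spte(): toy_make_spte() and TOY_ACCESSED_BIT are hypothetical names, the bit position is only illustrative, and the single behavior modeled is the one the changelog describes, i.e. that building the SPTE with prefetch=true leaves the Accessed bit clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative Accessed bit; an assumption for this sketch, not KVM's mask. */
#define TOY_ACCESSED_BIT (1ull << 5)

/*
 * Toy stand-in for make_spte(): rebuild the SPTE from its other bits and
 * only pre-set Accessed when the mapping is not a prefetch.
 */
static uint64_t toy_make_spte(uint64_t other_bits, bool prefetch)
{
	uint64_t spte = other_bits & ~TOY_ACCESSED_BIT;

	if (!prefetch)
		spte |= TOY_ACCESSED_BIT;

	return spte;
}
```

In this model, re-syncing an unchanged entry with prefetch=true clears a previously set Accessed bit, while prefetch=false keeps it set, which is the effect the suggestion is after.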
> Do nothing if the permissions are unchanged or only write-access is
> being added.
I'm pretty sure skipping the "make writable" case is architecturally wrong. On a
#PF, any TLB entries for the faulting virtual address are required to be removed.
That means KVM _must_ refresh the SPTE if a vCPU takes a !WRITABLE fault on an
unsync page. E.g. see kvm_inject_emulated_page_fault().
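
For reference, the four cases the proposed check distinguishes can be written out as a standalone sketch. This is a simplified model, not kernel code: the toy_ names are hypothetical and the TOY_ACC_* values are assumptions chosen for illustration. The comparisons themselves mirror the ones in the quoted diff.

```c
/* Illustrative access masks; assumed values, named after KVM's ACC_* bits. */
#define TOY_ACC_EXEC_MASK  1u
#define TOY_ACC_WRITE_MASK 2u
#define TOY_ACC_USER_MASK  4u

enum toy_access_change {
	TOY_ACCESS_UNCHANGED,     /* patch: skip the SPTE update */
	TOY_ACCESS_WRITE_ADDED,   /* patch: skip the SPTE update */
	TOY_ACCESS_WRITE_REMOVED, /* patch: update the SPTE */
	TOY_ACCESS_OTHER_CHANGE,  /* patch: drop the SPTE and flush */
};

/* Classify the transition from the shadowed access bits to the new ones. */
static enum toy_access_change toy_classify_access(unsigned int old_access,
						  unsigned int new_access)
{
	if (old_access == new_access)
		return TOY_ACCESS_UNCHANGED;
	if ((old_access | TOY_ACC_WRITE_MASK) == new_access)
		return TOY_ACCESS_WRITE_ADDED;
	if (old_access == (new_access | TOY_ACC_WRITE_MASK))
		return TOY_ACCESS_WRITE_REMOVED;
	return TOY_ACCESS_OTHER_CHANGE;
}
```

TOY_ACCESS_WRITE_ADDED is the case in question: the patch treats it like "unchanged" and skips the update, but that is exactly the state a vCPU is in after taking a !WRITABLE fault on an unsync page, where the SPTE has to be refreshed.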
> Only update the spte when write-access is being removed. Drop the SPTE
> otherwise.
Correctness aside, there needs to be far more analysis and justification for a
change like this, e.g. performance numbers for various workloads.
> ---
> arch/x86/kvm/mmu/paging_tmpl.h | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index e5662dbd519c..613f043a3e9e 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -1023,7 +1023,7 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
> for (i = 0; i < SPTE_ENT_PER_PAGE; i++) {
> u64 *sptep, spte;
> struct kvm_memory_slot *slot;
> - unsigned pte_access;
> + unsigned old_pte_access, pte_access;
> pt_element_t gpte;
> gpa_t pte_gpa;
> gfn_t gfn;
> @@ -1064,6 +1064,23 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
> continue;
> }
>
> + /*
> + * Drop the SPTE if the new protections would result in access
> + * permissions other than write-access is changing. Do nothing
> + * if the permissions are unchanged or only write-access is
> + * being added. Only update the spte when write-access is being
> + * removed.
> + */
> + old_pte_access = kvm_mmu_page_get_access(sp, i);
> + if (old_pte_access == pte_access ||
> + (old_pte_access | ACC_WRITE_MASK) == pte_access)
> + continue;
> + if (old_pte_access != (pte_access | ACC_WRITE_MASK)) {
> + drop_spte(vcpu->kvm, &sp->spt[i]);
> + flush = true;
> + continue;
> + }
> +
> /* Update the shadowed access bits in case they changed. */
> kvm_mmu_page_set_access(sp, i, pte_access);
>
> --
> 2.19.1.6.gb485710b
>