[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230705101800.ut4c6topn6ylwczs@linux.intel.com>
Date: Wed, 5 Jul 2023 18:18:00 +0800
From: Yu Zhang <yu.c.zhang@...ux.intel.com>
To: David Stevens <stevensd@...omium.org>
Cc: Sean Christopherson <seanjc@...gle.com>,
Marc Zyngier <maz@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
Peter Xu <peterx@...hat.com>,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
kvm@...r.kernel.org
Subject: Re: [PATCH v7 5/8] KVM: x86/mmu: Don't pass FOLL_GET to
__kvm_follow_pfn
On Tue, Jul 04, 2023 at 04:50:50PM +0900, David Stevens wrote:
> From: David Stevens <stevensd@...omium.org>
>
> Stop passing FOLL_GET to __kvm_follow_pfn. This allows the host to map
> memory into the guest that is backed by un-refcounted struct pages - for
> example, higher order non-compound pages allocated by the amdgpu driver
> via ttm_pool_alloc_page.
I guess you mean the tail pages of the higher order non-compound pages?
And as to the head page, it is said to be set to one coincidentally[*],
and shall not be considered as refcounted. IIUC, refcount of this head
page will be increased and decreased soon in hva_to_pfn_remapped(), so
this may not be a problem(?). But treating this head page differently,
as a refcounted one(e.g., to set the A/D flags), is weired.
Or maybe I missed some context, e.g., can the head page be allocted to
guest at all?
>
> The bulk of this change is tracking the is_refcounted_page flag so that
> non-refcounted pages don't trigger page_count() == 0 warnings. This is
> done by storing the flag in an unused bit in the sptes.
Also, maybe we should mention this only works on x86-64.
>
> Signed-off-by: David Stevens <stevensd@...omium.org>
> ---
> arch/x86/kvm/mmu/mmu.c | 44 +++++++++++++++++++++------------
> arch/x86/kvm/mmu/mmu_internal.h | 1 +
> arch/x86/kvm/mmu/paging_tmpl.h | 9 ++++---
> arch/x86/kvm/mmu/spte.c | 4 ++-
> arch/x86/kvm/mmu/spte.h | 12 ++++++++-
> arch/x86/kvm/mmu/tdp_mmu.c | 22 ++++++++++-------
> 6 files changed, 62 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e44ab512c3a1..b1607e314497 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
...
> @@ -2937,6 +2943,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
> bool host_writable = !fault || fault->map_writable;
> bool prefetch = !fault || fault->prefetch;
> bool write_fault = fault && fault->write;
> + bool is_refcounted = !fault || fault->is_refcounted_page;
Just wonder, what if a non-refcounted page is prefetched? Or is it possible in
practice?
...
>
> @@ -883,7 +884,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> */
> static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int i)
> {
> - bool host_writable;
> + bool host_writable, is_refcounted;
> gpa_t first_pte_gpa;
> u64 *sptep, spte;
> struct kvm_memory_slot *slot;
> @@ -940,10 +941,12 @@ static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int
> sptep = &sp->spt[i];
> spte = *sptep;
> host_writable = spte & shadow_host_writable_mask;
> + // TODO: is this correct?
> + is_refcounted = spte & SPTE_MMU_PAGE_REFCOUNTED;
> slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> make_spte(vcpu, sp, slot, pte_access, gfn,
> spte_to_pfn(spte), spte, true, false,
> - host_writable, &spte);
> + host_writable, is_refcounted, &spte);
Could we restrict that a non-refcounted page shall not be used as shadow page?
[*] https://lore.kernel.org/all/8caf3008-dcf3-985a-631e-e019b277c6f0@amd.com/
B.R.
Yu
Powered by blists - more mailing lists