Message-ID: <7dd848e5735105ac3bf01b2f2db8b595045f47ad.camel@intel.com>
Date: Wed, 26 Nov 2025 20:47:07 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "linux-coco@...ts.linux.dev"
<linux-coco@...ts.linux.dev>, "Huang, Kai" <kai.huang@...el.com>, "Li,
Xiaoyao" <xiaoyao.li@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>, "Wu, Binbin" <binbin.wu@...el.com>,
"kas@...nel.org" <kas@...nel.org>, "seanjc@...gle.com" <seanjc@...gle.com>,
"mingo@...hat.com" <mingo@...hat.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>, "nik.borisov@...e.com"
<nik.borisov@...e.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"Annapurve, Vishal" <vannapurve@...gle.com>, "Gao, Chao"
<chao.gao@...el.com>, "bp@...en8.de" <bp@...en8.de>, "x86@...nel.org"
<x86@...nel.org>
CC: "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH v4 06/16] x86/virt/tdx: Improve PAMT refcounts allocation for sparse memory

Kiryl, curious if you have any comments on the below...

On Wed, 2025-11-26 at 16:45 +0200, Nikolay Borisov wrote:
> > +static int pamt_refcount_populate(pte_t *pte, unsigned long addr, void *data)
> > +{
> > +	struct page *page;
> > +	pte_t entry;
> > +
> > +	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > +	if (!page)
> >  		return -ENOMEM;
> >
> > +	entry = mk_pte(page, PAGE_KERNEL);
> > +
> > +	spin_lock(&init_mm.page_table_lock);
> > +	/*
> > +	 * PAMT refcount populations can overlap due to rounding of the
> > +	 * start/end pfn. Make sure the PAMT range is only populated once.
> > +	 */
> > +	if (pte_none(ptep_get(pte)))
> > +		set_pte_at(&init_mm, addr, pte, entry);
> > +	else
> > +		__free_page(page);
> > +	spin_unlock(&init_mm.page_table_lock);
>
> nit: Wouldn't it be better to perform the pte_none() check before doing
> the allocation thus avoiding needless allocations? I.e do the
> alloc/mk_pte only after we are 100% sure we are going to use this entry.
Yes, but I'm also wondering why it needs init_mm.page_table_lock at all. Here is
my reasoning for why it doesn't:

apply_to_page_range() takes init_mm.page_table_lock internally when it modifies
page tables in the address range (vmalloc). It needs to do this to avoid races
with other allocations that share the upper level page tables, which could be at
the ends of the area that TDX reserves.

But pamt_refcount_populate() only operates on the PTEs for the address
range that TDX code already controls. Vmalloc should not free the PMD underneath
the PTE operation because there is still an allocation in every page table the
operation covers.

So we can skip the lock and also do the pte_none() check before the page
allocation, as Nikolay suggests.

Same for the depopulate path.
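
To illustrate the reordering, here is a rough user-space model, not kernel code
and not tested against this series. Everything prefixed mock_ is a hypothetical
stand-in: mock_pte_t plays the role of a PTE slot, mock_alloc_page() the role of
alloc_page(GFP_KERNEL | __GFP_ZERO), and mock_populate_range() the role of the
per-PTE walk done by apply_to_page_range(). It is single-threaded, so it only
shows that checking for an empty slot first avoids the wasted allocation on
overlapping ranges. It says nothing about whether dropping the lock is safe.

```c
/* User-space model only. All mock_* names are hypothetical stand-ins. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MOCK_PAGE_SIZE 4096

/* A NULL page pointer models an empty entry, i.e. pte_none(). */
typedef struct {
	void *page;
} mock_pte_t;

/* Counts successful allocations so wasted ones would be visible. */
static int mock_pages_allocated;

static void *mock_alloc_page(void)
{
	void *page = calloc(1, MOCK_PAGE_SIZE);	/* zeroed, like __GFP_ZERO */

	if (page)
		mock_pages_allocated++;
	return page;
}

static int mock_pte_none(const mock_pte_t *pte)
{
	return pte->page == NULL;
}

/* Populate callback: check the slot first, allocate only if it is empty. */
static int mock_refcount_populate(mock_pte_t *pte)
{
	void *page;

	assert(pte != NULL);

	/* Overlapping ranges reach here with the slot already populated. */
	if (!mock_pte_none(pte))
		return 0;

	page = mock_alloc_page();
	if (!page)
		return -ENOMEM;

	pte->page = page;
	return 0;
}

/* Depopulate callback: release the backing page and clear the slot. */
static void mock_refcount_depopulate(mock_pte_t *pte)
{
	assert(pte != NULL);
	free(pte->page);	/* free(NULL) is a no-op for empty slots */
	pte->page = NULL;
}

/* Calls the populate callback for every slot in [start, end). */
static int mock_populate_range(mock_pte_t *table, size_t start, size_t end)
{
	size_t i;
	int ret;

	for (i = start; i < end; i++) {
		ret = mock_refcount_populate(&table[i]);
		if (ret)
			return ret;
	}
	return 0;
}
```

Populating slots [0, 5) and then [3, 8) of an empty 8-entry table in this model
allocates 8 pages in total, since the two overlapping slots are skipped before
any allocation happens.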