Message-ID: <21115e18-c68d-492d-9fd4-400452bd64c7@suse.com>
Date: Thu, 27 Nov 2025 14:29:18 +0200
From: Nikolay Borisov <nik.borisov@...e.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>, bp@...en8.de,
chao.gao@...el.com, dave.hansen@...el.com, isaku.yamahata@...el.com,
kai.huang@...el.com, kas@...nel.org, kvm@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-kernel@...r.kernel.org, mingo@...hat.com,
pbonzini@...hat.com, seanjc@...gle.com, tglx@...utronix.de,
vannapurve@...gle.com, x86@...nel.org, yan.y.zhao@...el.com,
xiaoyao.li@...el.com, binbin.wu@...el.com
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH v4 07/16] x86/virt/tdx: Add tdx_alloc/free_page() helpers
On 21.11.25 at 2:51, Rick Edgecombe wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
>
> Add helpers to use when allocating or preparing pages that need DPAMT
> backing. Make them handle races internally for the case of multiple
> callers trying to operate on the same 2MB range simultaneously.
>
> While the TDX initialization code in arch/x86 uses pages with 2MB
> alignment, KVM will need to hand 4KB pages for it to use. Under DPAMT,
> these pages will need DPAMT backing 4KB backing.
That paragraph is rather hard to parse. KVM will need to hand 4KB pages
to whom? The TDX init code? The last sentence with the two "backing"
words is also hard to parse. Does it mean that the 4KB pages KVM needs
to pass must themselves be backed by DPAMT pages, i.e. a
chicken-and-egg problem?
>
> Add tdx_alloc_page() and tdx_free_page() to handle both page allocation
> and DPAMT installation. Make them behave like normal alloc/free functions
> where allocation can fail in the case of no memory, but free (with any
> necessary DPAMT release) always succeeds. Do this so they can support the
> existing TDX flows that require cleanups to succeed. Also create
> tdx_pamt_put()/tdx_pamt_get() to handle installing DPAMT 4KB backing for
> pages that are already allocated (such as external page tables, or S-EPT
> pages).
>
<snip>
> +
> +/* Serializes adding/removing PAMT memory */
> +static DEFINE_SPINLOCK(pamt_lock);
> +
> +/* Bump PAMT refcount for the given page and allocate PAMT memory if needed */
> +int tdx_pamt_get(struct page *page)
> +{
> + u64 pamt_pa_array[MAX_TDX_ARG_SIZE(rdx)];
> + atomic_t *pamt_refcount;
> + u64 tdx_status;
> + int ret;
> +
> + if (!tdx_supports_dynamic_pamt(&tdx_sysinfo))
> + return 0;
> +
> + ret = alloc_pamt_array(pamt_pa_array);
> + if (ret)
> + goto out_free;
> +
> + pamt_refcount = tdx_find_pamt_refcount(page_to_pfn(page));
> +
> + scoped_guard(spinlock, &pamt_lock) {
> + /*
> + * If the pamt page is already added (i.e. refcount >= 1),
> + * then just increment the refcount.
> + */
> + if (atomic_read(pamt_refcount)) {
> + atomic_inc(pamt_refcount);
> + goto out_free;
> + }
Replace this pair of read/inc with a single call to atomic_inc_not_zero()
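I.e. something like the below (untested, keeps the same out_free flow):

	/* Only take a reference if the pamt page is already added. */
	if (atomic_inc_not_zero(pamt_refcount))
		goto out_free;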
> +
> + /* Try to add the pamt page and take the refcount 0->1. */
> +
> + tdx_status = tdh_phymem_pamt_add(page, pamt_pa_array);
> + if (!IS_TDX_SUCCESS(tdx_status)) {
> + pr_err("TDH_PHYMEM_PAMT_ADD failed: %#llx\n", tdx_status);
> + goto out_free;
> + }
> +
> + atomic_inc(pamt_refcount);
> + }
> +
> + return ret;
> +out_free:
> + /*
> + * pamt_pa_array is populated or zeroed up to tdx_dpamt_entry_pages()
> + * above. free_pamt_array() can handle either case.
> + */
> + free_pamt_array(pamt_pa_array);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdx_pamt_get);
> +
> +/*
> + * Drop PAMT refcount for the given page and free PAMT memory if it is no
> + * longer needed.
> + */
> +void tdx_pamt_put(struct page *page)
> +{
> + u64 pamt_pa_array[MAX_TDX_ARG_SIZE(rdx)];
> + atomic_t *pamt_refcount;
> + u64 tdx_status;
> +
> + if (!tdx_supports_dynamic_pamt(&tdx_sysinfo))
> + return;
> +
> + pamt_refcount = tdx_find_pamt_refcount(page_to_pfn(page));
> +
> + scoped_guard(spinlock, &pamt_lock) {
> + /*
> + * If there is more than one reference on the pamt page,
> + * don't remove it yet. Just decrement the refcount.
> + */
> + if (atomic_read(pamt_refcount) > 1) {
> + atomic_dec(pamt_refcount);
> + return;
> + }
nit: Could be replaced with: atomic_add_unless(pamt_refcount, -1, 1);
Probably it would have been better to simply use atomic_dec_and_test()
and do the phymem_pamt_remove only when it returns true. I suspect you
can't do that because when the remove fails you don't want to have
dropped the last refcount, though that could be remedied by an extra
atomic_inc() in the failure path. I guess it might be worth
simplifying, since the extra inc will only be needed in exceptional
cases (we don't expect failure to be the usual path) and freeing is
not a fast path.
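Something along these lines (untested sketch, still under pamt_lock):

	/* Drop our reference; only the 1->0 transition removes the page. */
	if (!atomic_dec_and_test(pamt_refcount))
		return;

	tdx_status = tdh_phymem_pamt_remove(page, pamt_pa_array);
	if (!IS_TDX_SUCCESS(tdx_status)) {
		pr_err("TDH_PHYMEM_PAMT_REMOVE failed: %#llx\n", tdx_status);
		/* Remove failed, so restore the reference dropped above. */
		atomic_inc(pamt_refcount);
		return;
	}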
> +
> + /* Try to remove the pamt page and take the refcount 1->0. */
> +
> + tdx_status = tdh_phymem_pamt_remove(page, pamt_pa_array);
> + if (!IS_TDX_SUCCESS(tdx_status)) {
> + pr_err("TDH_PHYMEM_PAMT_REMOVE failed: %#llx\n", tdx_status);
> +
> + /*
> + * Don't free pamt_pa_array as it could hold garbage
> + * when tdh_phymem_pamt_remove() fails.
> + */
> + return;
> + }
> +
> + atomic_dec(pamt_refcount);
> + }
> +
> + /*
> + * pamt_pa_array is populated up to tdx_dpamt_entry_pages() by the TDX
> + * module with pages, or remains zero inited. free_pamt_array() can
> + * handle either case. Just pass it unconditionally.
> + */
> + free_pamt_array(pamt_pa_array);
> +}
> +EXPORT_SYMBOL_GPL(tdx_pamt_put);
<snip>