linux-kernel - Re: [RFC PATCH 08/21] KVM: TDX: Increase/decrease folio ref for huge pages

Message-ID: <CAGtprH9wi6zHJ5JeuAnjZThMAzxxibJGo=XN1G1Nx8txZRg8_w@mail.gmail.com>
Date: Mon, 5 May 2025 22:08:24 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: pbonzini@...hat.com, seanjc@...gle.com, linux-kernel@...r.kernel.org, 
	kvm@...r.kernel.org, x86@...nel.org, rick.p.edgecombe@...el.com, 
	dave.hansen@...el.com, kirill.shutemov@...el.com, tabba@...gle.com, 
	ackerleytng@...gle.com, quic_eberman@...cinc.com, michael.roth@....com, 
	david@...hat.com, vbabka@...e.cz, jroedel@...e.de, thomas.lendacky@....com, 
	pgonda@...gle.com, zhiquan1.li@...el.com, fan.du@...el.com, 
	jun.miao@...el.com, ira.weiny@...el.com, isaku.yamahata@...el.com, 
	xiaoyao.li@...el.com, binbin.wu@...ux.intel.com, chao.p.peng@...el.com
Subject: Re: [RFC PATCH 08/21] KVM: TDX: Increase/decrease folio ref for huge pages

On Mon, May 5, 2025 at 5:56 PM Yan Zhao <yan.y.zhao@...el.com> wrote:
>
> Sorry for the late reply, I was on leave last week.
>
> On Tue, Apr 29, 2025 at 06:46:59AM -0700, Vishal Annapurve wrote:
> > On Mon, Apr 28, 2025 at 5:52 PM Yan Zhao <yan.y.zhao@...el.com> wrote:
> > > So, we plan to remove folio_ref_add()/folio_put_refs() in future, only invoking
> > > folio_ref_add() in the event of a removal failure.
> >
> > In my opinion, the above scheme can be deployed with this series
> > itself. guest_memfd will not take away memory from TDX VMs without an
> I initially intended to add a separate patch at the end of this series to
> implement invoking folio_ref_add() only upon a removal failure. However, I
> decided against it since it's not a must before guest_memfd supports in-place
> conversion.
>
> We can include it in the next version if you think it's better.

Ackerley is planning to send out a series for 1G HugeTLB support with
guest_memfd soon, hopefully this week. Also, I don't see any reason to
hold extra refcounts in the TDX stack, so it would be good to clean up
this logic.
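
To make the proposed scheme concrete, here is a minimal userspace sketch
of "take an extra reference only when removal fails". It is a toy model,
not kernel code: struct toy_folio, toy_folio_ref_add() and
toy_remove_private_page() are hypothetical stand-ins, the 'fail' flag
simulates an error from the removal step, and the number of references
taken on failure (one per page) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a folio: only the reference count is modelled. */
struct toy_folio {
	int refcount;
};

/* Hypothetical analogue of folio_ref_add(): bump the count by nr. */
static void toy_folio_ref_add(struct toy_folio *f, int nr)
{
	/* Adding references to an already-freed folio would be a bug. */
	assert(f->refcount > 0);
	f->refcount += nr;
}

/*
 * Hypothetical removal path. No reference is taken at map time, so a
 * successful removal leaves the count untouched. Only when removal
 * fails (simulated by 'fail') are extra references taken, so that the
 * memory is not handed back for reuse while it may still be mapped.
 */
static int toy_remove_private_page(struct toy_folio *f, int nr_pages,
				   bool fail)
{
	if (!fail)
		return 0;

	toy_folio_ref_add(f, nr_pages);
	return -1;
}
```

In this model the common path never touches the refcount; the extra
references appear only in the error case.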

>
> > invalidation. folio_ref_add() will not work for memory not backed by
> > page structs, but that problem can be solved in future possibly by
> With current TDX code, all memory must be backed by a page struct.
> Both tdh_mem_page_add() and tdh_mem_page_aug() require a "struct page *" rather
> than a pfn.
>
> > notifying guest_memfd of certain ranges being in use even after
> > invalidation completes.
> A curious question:
> To support memory not backed by page structs in future, is there any counterpart
> to the page struct to hold ref count and map count?
>

I imagine the needed support will have semantics similar to those of
VM_PFNMAP [1] memory. There is no need to maintain refcounts/map counts
for such physical memory ranges, as all users will be notified when
mappings are changed/removed.

Any guest_memfd range update will result in invalidations/updates of
the userspace, guest, IOMMU or any other page tables referring to
guest_memfd-backed pfns. This story will become clearer once support
for a PFN range allocator backing guest_memfd starts being discussed.
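
As a rough illustration of the notification-based model, here is a toy
userspace sketch in which a pfn range keeps no refcount or map count and
instead notifies every registered user when the range is invalidated.
All names (toy_pfn_range, toy_register_user(), toy_invalidate_range(),
toy_page_table) are hypothetical and do not correspond to any existing
guest_memfd or mm interface.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_USERS 4

/* Callback a user supplies to drop its mappings of a pfn range. */
typedef void (*toy_invalidate_fn)(void *user, unsigned long start_pfn,
				  unsigned long nr);

/* Toy pfn range: no refcount or map count, only a list of users. */
struct toy_pfn_range {
	unsigned long start_pfn;
	unsigned long nr;
	toy_invalidate_fn cb[TOY_MAX_USERS];
	void *user[TOY_MAX_USERS];
	size_t nr_users;
};

/* Register a user (e.g. a page table) to be notified on invalidation. */
static int toy_register_user(struct toy_pfn_range *r, toy_invalidate_fn cb,
			     void *user)
{
	assert(cb != NULL);
	if (r->nr_users == TOY_MAX_USERS)
		return -1;

	r->cb[r->nr_users] = cb;
	r->user[r->nr_users] = user;
	r->nr_users++;
	return 0;
}

/* Notify every registered user that the range is going away. */
static void toy_invalidate_range(struct toy_pfn_range *r)
{
	for (size_t i = 0; i < r->nr_users; i++)
		r->cb[i](r->user[i], r->start_pfn, r->nr);
}

/* Example user: a "page table" reduced to a single mapped flag. */
struct toy_page_table {
	int mapped;
};

static void toy_pt_invalidate(void *user, unsigned long start_pfn,
			      unsigned long nr)
{
	(void)start_pfn;
	(void)nr;
	((struct toy_page_table *)user)->mapped = 0;
}
```

The point of the sketch is only that correctness comes from every user
dropping its mapping on notification, not from counting references.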

[1] https://elixir.bootlin.com/linux/v6.14.5/source/mm/memory.c#L6543