[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aR7blxIx6tKD2xiQ@yzhao56-desk.sh.intel.com>
Date: Thu, 20 Nov 2025 17:12:55 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Michael Roth <michael.roth@....com>
CC: <kvm@...r.kernel.org>, <linux-coco@...ts.linux.dev>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, <thomas.lendacky@....com>,
<pbonzini@...hat.com>, <seanjc@...gle.com>, <vbabka@...e.cz>,
<ashish.kalra@....com>, <liam.merwick@...cle.com>, <david@...hat.com>,
<vannapurve@...gle.com>, <ackerleytng@...gle.com>, <aik@....com>,
<ira.weiny@...el.com>
Subject: Re: [PATCH 1/3] KVM: guest_memfd: Remove preparation tracking
On Thu, Nov 13, 2025 at 05:07:57PM -0600, Michael Roth wrote:
> @@ -797,19 +782,25 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
> {
> pgoff_t index = kvm_gmem_get_index(slot, gfn);
> struct folio *folio;
> - bool is_prepared = false;
> int r = 0;
>
> CLASS(gmem_get_file, file)(slot);
> if (!file)
> return -EFAULT;
>
> - folio = __kvm_gmem_get_pfn(file, slot, index, pfn, &is_prepared, max_order);
> + folio = __kvm_gmem_get_pfn(file, slot, index, pfn, max_order);
> if (IS_ERR(folio))
> return PTR_ERR(folio);
>
> - if (!is_prepared)
> - r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
> + if (!folio_test_uptodate(folio)) {
> + unsigned long i, nr_pages = folio_nr_pages(folio);
> +
> + for (i = 0; i < nr_pages; i++)
> + clear_highpage(folio_page(folio, i));
> + folio_mark_uptodate(folio);
Here, the entire folio is cleared only when the folio is not marked uptodate.
Then, please check my questions at the bottom
> + }
> +
> + r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
>
> folio_unlock(folio);
>
> @@ -852,7 +843,6 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
> struct folio *folio;
> gfn_t gfn = start_gfn + i;
> pgoff_t index = kvm_gmem_get_index(slot, gfn);
> - bool is_prepared = false;
> kvm_pfn_t pfn;
>
> if (signal_pending(current)) {
> @@ -860,19 +850,12 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
> break;
> }
>
> - folio = __kvm_gmem_get_pfn(file, slot, index, &pfn, &is_prepared, &max_order);
> + folio = __kvm_gmem_get_pfn(file, slot, index, &pfn, &max_order);
> if (IS_ERR(folio)) {
> ret = PTR_ERR(folio);
> break;
> }
>
> - if (is_prepared) {
> - folio_unlock(folio);
> - folio_put(folio);
> - ret = -EEXIST;
> - break;
> - }
> -
> folio_unlock(folio);
> WARN_ON(!IS_ALIGNED(gfn, 1 << max_order) ||
> (npages - i) < (1 << max_order));
TDX could hit this warning easily when npages == 1, max_order == 9.
> @@ -889,7 +872,7 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
> p = src ? src + i * PAGE_SIZE : NULL;
> ret = post_populate(kvm, gfn, pfn, p, max_order, opaque);
> if (!ret)
> - kvm_gmem_mark_prepared(folio);
> + folio_mark_uptodate(folio);
As also asked in [1], why is the entire folio marked as uptodate here? Why does
kvm_gmem_get_pfn() clear all pages of a huge folio when the folio isn't marked
uptodate?
It's possible (at least for TDX) that a huge folio is only partially populated
by kvm_gmem_populate(). Then kvm_gmem_get_pfn() faults in another part of the
huge folio. For example, in TDX, GFN 0x81f belongs to the init memory region,
while GFN 0x820 is faulted after TD is running. However, these two GFNs can
belong to the same folio of order 9.
Note: the current code should not impact TDX. I'm just asking out of curiosity:)
[1] https://lore.kernel.org/all/aQ3uj4BZL6uFQzrD@yzhao56-desk.sh.intel.com/
Powered by blists - more mailing lists