Message-ID: <CAGtprH9foQx=XLXXMqYnga27jWjCSkqj5QHVnAM_Akv7CLNmbw@mail.gmail.com>
Date: Tue, 9 Dec 2025 17:30:54 -0800
From: Vishal Annapurve <vannapurve@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: pbonzini@...hat.com, seanjc@...gle.com, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, x86@...nel.org, rick.p.edgecombe@...el.com,
dave.hansen@...el.com, kas@...nel.org, tabba@...gle.com,
ackerleytng@...gle.com, quic_eberman@...cinc.com, michael.roth@....com,
david@...hat.com, vbabka@...e.cz, thomas.lendacky@....com, pgonda@...gle.com,
zhiquan1.li@...el.com, fan.du@...el.com, jun.miao@...el.com,
ira.weiny@...el.com, isaku.yamahata@...el.com, xiaoyao.li@...el.com,
binbin.wu@...ux.intel.com, chao.p.peng@...el.com
Subject: Re: [RFC PATCH v2 03/23] x86/tdx: Enhance tdh_phymem_page_wbinvd_hkid()
to invalidate huge pages
On Tue, Dec 9, 2025 at 5:20 PM Yan Zhao <yan.y.zhao@...el.com> wrote:
>
> On Tue, Dec 09, 2025 at 05:14:22PM -0800, Vishal Annapurve wrote:
> > On Thu, Aug 7, 2025 at 2:42 AM Yan Zhao <yan.y.zhao@...el.com> wrote:
> > >
> > > index 0a2b183899d8..8eaf8431c5f1 100644
> > > --- a/arch/x86/kvm/vmx/tdx.c
> > > +++ b/arch/x86/kvm/vmx/tdx.c
> > > @@ -1694,6 +1694,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
> > > {
> > > int tdx_level = pg_level_to_tdx_sept_level(level);
> > > struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > > + struct folio *folio = page_folio(page);
> > > gpa_t gpa = gfn_to_gpa(gfn);
> > > u64 err, entry, level_state;
> > >
> > > @@ -1728,8 +1729,9 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
> > > return -EIO;
> > > }
> > >
> > > - err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
> > > -
> > > + err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, folio,
> > > + folio_page_idx(folio, page),
> > > + KVM_PAGES_PER_HPAGE(level));
> >
> > This code seems to assume that folio_order() always matches the level
> > at which the page is mapped in the EPT entries.
> I don't think so.
> Please check the implementation of tdh_phymem_page_wbinvd_hkid() [1].
> Only npages = KVM_PAGES_PER_HPAGE(level) pages will be invalidated, and
> npages is required to be <= folio_nr_pages(folio).
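
If I'm reading [1] right, the helper is roughly the sketch below (the
SEAMCALL plumbing is elided, and wbinvd_page() is a made-up stand-in
for the per-page TDH.PHYMEM.PAGE.WBINVD invocation; the parts that
matter here are the range check and the loop bounds):

u64 tdh_phymem_page_wbinvd_hkid(u16 hkid, struct folio *folio,
				unsigned long start_idx,
				unsigned long npages)
{
	u64 err = 0;
	unsigned long i;

	/* The requested range must fall entirely inside the folio. */
	if (start_idx + npages > folio_nr_pages(folio))
		return TDX_OPERAND_INVALID;

	/* One invalidation per 4K page in the range. */
	for (i = start_idx; i < start_idx + npages; i++) {
		/*
		 * wbinvd_page(): hypothetical stand-in for the SEAMCALL
		 * on folio_page(folio, i), keyed with @hkid.
		 */
		err = wbinvd_page(hkid, folio_page(folio, i));
		if (err)
			break;
	}

	return err;
}
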
Is the gfn passed to tdx_sept_drop_private_spte() always huge-page
aligned when the mapping is at huge page granularity?

If the gfn/pfn is not aligned, then once the folio has been split to
4K, page_folio() will return the 4K page itself, so folio_order() and
folio_page_idx() will both be zero. That can cause
tdh_phymem_page_wbinvd_hkid() to return failure.

If the expectation is that page_folio() always points to the head page
of a folio matching the hugepage mapping granularity, then that logic
will not work correctly once the folio has been split, IMO.
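
To make that concrete, for a 2M-level zap of a page whose backing folio
has already been split to 4K (assuming the range check sketched above):

	struct folio *folio = page_folio(page);	/* the 4K page itself */
	unsigned long start = folio_page_idx(folio, page);	/* == 0 */
	unsigned long npages = KVM_PAGES_PER_HPAGE(PG_LEVEL_2M);	/* == 512 */

	/* start + npages == 512 > folio_nr_pages(folio) == 1 -> error */
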
>
> [1] https://lore.kernel.org/all/20250807094202.4481-1-yan.y.zhao@intel.com/
>
> > IIUC guest_memfd can decide to split the complete huge folio down to
> > 4K before zapping the hugepage EPT mappings. I think it's better to
> > just round the pfn down to the hugepage boundary based on the level it
> > was mapped at, instead of relying on the folio order.
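
Something along these lines is what I had in mind (a minimal sketch;
align_pfn_to_level() is a made-up name):

	/* Round the pfn down to the boundary of the mapping level. */
	static kvm_pfn_t align_pfn_to_level(kvm_pfn_t pfn, int level)
	{
		/* KVM_PAGES_PER_HPAGE(level) is a power of two. */
		return pfn & ~(KVM_PAGES_PER_HPAGE(level) - 1);
	}

The invalidation could then always start from pfn_to_page() of the
aligned pfn and cover KVM_PAGES_PER_HPAGE(level) 4K pages, independent
of how guest_memfd has split the backing folio.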