Message-ID: <aHjDIxxbv0DnqI6S@yilunxu-OptiPlex-7050>
Date: Thu, 17 Jul 2025 17:32:19 +0800
From: Xu Yilun <yilun.xu@...ux.intel.com>
To: Ackerley Tng <ackerleytng@...gle.com>
Cc: Yan Zhao <yan.y.zhao@...el.com>,
	Vishal Annapurve <vannapurve@...gle.com>,
	Jason Gunthorpe <jgg@...pe.ca>, Alexey Kardashevskiy <aik@....com>,
	Fuad Tabba <tabba@...gle.com>, kvm@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, x86@...nel.org,
	linux-fsdevel@...r.kernel.org, ajones@...tanamicro.com,
	akpm@...ux-foundation.org, amoorthy@...gle.com,
	anthony.yznaga@...cle.com, anup@...infault.org,
	aou@...s.berkeley.edu, bfoster@...hat.com,
	binbin.wu@...ux.intel.com, brauner@...nel.org,
	catalin.marinas@....com, chao.p.peng@...el.com,
	chenhuacai@...nel.org, dave.hansen@...el.com, david@...hat.com,
	dmatlack@...gle.com, dwmw@...zon.co.uk, erdemaktas@...gle.com,
	fan.du@...el.com, fvdl@...gle.com, graf@...zon.com,
	haibo1.xu@...el.com, hch@...radead.org, hughd@...gle.com,
	ira.weiny@...el.com, isaku.yamahata@...el.com, jack@...e.cz,
	james.morse@....com, jarkko@...nel.org, jgowans@...zon.com,
	jhubbard@...dia.com, jroedel@...e.de, jthoughton@...gle.com,
	jun.miao@...el.com, kai.huang@...el.com, keirf@...gle.com,
	kent.overstreet@...ux.dev, kirill.shutemov@...el.com,
	liam.merwick@...cle.com, maciej.wieczor-retman@...el.com,
	mail@...iej.szmigiero.name, maz@...nel.org, mic@...ikod.net,
	michael.roth@....com, mpe@...erman.id.au, muchun.song@...ux.dev,
	nikunj@....com, nsaenz@...zon.es, oliver.upton@...ux.dev,
	palmer@...belt.com, pankaj.gupta@....com, paul.walmsley@...ive.com,
	pbonzini@...hat.com, pdurrant@...zon.co.uk, peterx@...hat.com,
	pgonda@...gle.com, pvorel@...e.cz, qperret@...gle.com,
	quic_cvanscha@...cinc.com, quic_eberman@...cinc.com,
	quic_mnalajal@...cinc.com, quic_pderrin@...cinc.com,
	quic_pheragu@...cinc.com, quic_svaddagi@...cinc.com,
	quic_tsoni@...cinc.com, richard.weiyang@...il.com,
	rick.p.edgecombe@...el.com, rientjes@...gle.com,
	roypat@...zon.co.uk, rppt@...nel.org, seanjc@...gle.com,
	shuah@...nel.org, steven.price@....com, steven.sistare@...cle.com,
	suzuki.poulose@....com, thomas.lendacky@....com,
	usama.arif@...edance.com, vbabka@...e.cz, viro@...iv.linux.org.uk,
	vkuznets@...hat.com, wei.w.wang@...el.com, will@...nel.org,
	willy@...radead.org, xiaoyao.li@...el.com, yilun.xu@...el.com,
	yuzenghui@...wei.com, zhiquan1.li@...el.com
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce
 KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls

On Wed, Jul 16, 2025 at 03:22:06PM -0700, Ackerley Tng wrote:
> Yan Zhao <yan.y.zhao@...el.com> writes:
> 
> > On Tue, Jun 24, 2025 at 07:10:38AM -0700, Vishal Annapurve wrote:
> >> On Tue, Jun 24, 2025 at 6:08 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
> >> >
> >> > On Tue, Jun 24, 2025 at 06:23:54PM +1000, Alexey Kardashevskiy wrote:
> >> >
> >> > > Now, I am rebasing my RFC on top of this patchset and it fails in
> >> > > kvm_gmem_has_safe_refcount() as IOMMU holds references to all these
> >> > > folios in my RFC.
> >> > >
> >> > > So what is the expected sequence here? The userspace unmaps a DMA
> >> > > page and maps it back right away, all from the userspace? The end
> >> > > result will be the exactly same which seems useless. And IOMMU TLB
> >> 
> >>  As Jason described, ideally IOMMU just like KVM, should just:
> >> 1) Directly rely on guest_memfd for pinning -> no page refcounts taken
> >> by IOMMU stack
> > In TDX connect, TDX module and TDs do not trust VMM. So, it's the TDs to inform
> > TDX module about which pages are used by it for DMAs purposes.
> > So, if a page is regarded as pinned by TDs for DMA, the TDX module will fail the
> > unmap of the pages from S-EPT.
> >
> > If IOMMU side does not increase refcount, IMHO, some way to indicate that
> > certain PFNs are used by TDs for DMA is still required, so guest_memfd can
> > reject the request before attempting the actual unmap.
> > Otherwise, the unmap of TD-DMA-pinned pages will fail.
> >
> > Upon this kind of unmapping failure, it also doesn't help for host to retry
> > unmapping without unpinning from TD.
> >
> >
> 
> Yan, Yilun, would it work if, on conversion,
> 
> 1. guest_memfd notifies IOMMU that a conversion is about to happen for a
>    PFN range

It is a guest fw call that releases the pinning. By the time the VMM gets
the conversion request, the page is already physically unpinned. So I
agree with Jason that, from a SW POV, the pinning doesn't have to reach
the IOMMU.

> 2. IOMMU forwards the notification to TDX code in the kernel
> 3. TDX code in kernel tells TDX module to stop thinking of any PFNs in
>    the range as pinned for DMA?

The TDX host can't stop the pinning. This mechanism actually exists to
prevent the host from unpinning/unmapping DMA pages against the guest's
expectations.
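
To make the ordering concrete, here is a toy model of the rule discussed
in this thread: only the guest can release a DMA pin, and a host unmap of
a pinned page fails no matter how often it is retried. All names
(ToyTdxModule, guest_pin, guest_unpin, host_unmap) are made up for
illustration and are not TDX module or KVM interfaces.

```python
class ToyTdxModule:
    """Toy state model; every name here is hypothetical, not a real API."""

    def __init__(self):
        self.mapped = set()      # PFNs mapped in the modelled S-EPT
        self.dma_pinned = set()  # PFNs the guest has pinned for DMA

    def map(self, pfn):
        self.mapped.add(pfn)

    def guest_pin(self, pfn):
        # Only the guest decides which pages are used for DMA.
        self.dma_pinned.add(pfn)

    def guest_unpin(self, pfn):
        # Only the guest can release the pin (assumed to happen before
        # the host ever sees the conversion request).
        self.dma_pinned.discard(pfn)

    def host_unmap(self, pfn):
        # The host's unmap fails while the guest still holds the pin.
        if pfn in self.dma_pinned:
            return False
        self.mapped.discard(pfn)
        return True

    # Deliberately no host_unpin(): the host has no way to drop the pin.
```

In this model, a conversion that arrives after guest_unpin() unmaps
cleanly, while one that arrives earlier fails on every retry until the
guest itself unpins.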

Thanks,
Yilun

> 
> If the above is possible then by the time we get to unmapping from
> S-EPTs, TDX module would already consider the PFNs in the range "not
> pinned for DMA".
