Message-ID: <687a6483506f2_3c6f1d2945a@iweiny-mobl.notmuch>
Date: Fri, 18 Jul 2025 10:13:07 -0500
From: Ira Weiny <ira.weiny@...el.com>
To: Xu Yilun <yilun.xu@...ux.intel.com>, Ackerley Tng <ackerleytng@...gle.com>
CC: Yan Zhao <yan.y.zhao@...el.com>, Vishal Annapurve <vannapurve@...gle.com>,
	Jason Gunthorpe <jgg@...pe.ca>, Alexey Kardashevskiy <aik@....com>, "Fuad
 Tabba" <tabba@...gle.com>, <kvm@...r.kernel.org>, <linux-mm@...ck.org>,
	<linux-kernel@...r.kernel.org>, <x86@...nel.org>,
	<linux-fsdevel@...r.kernel.org>, <ajones@...tanamicro.com>,
	<akpm@...ux-foundation.org>, <amoorthy@...gle.com>,
	<anthony.yznaga@...cle.com>, <anup@...infault.org>, <aou@...s.berkeley.edu>,
	<bfoster@...hat.com>, <binbin.wu@...ux.intel.com>, <brauner@...nel.org>,
	<catalin.marinas@....com>, <chao.p.peng@...el.com>, <chenhuacai@...nel.org>,
	<dave.hansen@...el.com>, <david@...hat.com>, <dmatlack@...gle.com>,
	<dwmw@...zon.co.uk>, <erdemaktas@...gle.com>, <fan.du@...el.com>,
	<fvdl@...gle.com>, <graf@...zon.com>, <haibo1.xu@...el.com>,
	<hch@...radead.org>, <hughd@...gle.com>, <ira.weiny@...el.com>,
	<isaku.yamahata@...el.com>, <jack@...e.cz>, <james.morse@....com>,
	<jarkko@...nel.org>, <jgowans@...zon.com>, <jhubbard@...dia.com>,
	<jroedel@...e.de>, <jthoughton@...gle.com>, <jun.miao@...el.com>,
	<kai.huang@...el.com>, <keirf@...gle.com>, <kent.overstreet@...ux.dev>,
	<kirill.shutemov@...el.com>, <liam.merwick@...cle.com>,
	<maciej.wieczor-retman@...el.com>, <mail@...iej.szmigiero.name>,
	<maz@...nel.org>, <mic@...ikod.net>, <michael.roth@....com>,
	<mpe@...erman.id.au>, <muchun.song@...ux.dev>, <nikunj@....com>,
	<nsaenz@...zon.es>, <oliver.upton@...ux.dev>, <palmer@...belt.com>,
	<pankaj.gupta@....com>, <paul.walmsley@...ive.com>, <pbonzini@...hat.com>,
	<pdurrant@...zon.co.uk>, <peterx@...hat.com>, <pgonda@...gle.com>,
	<pvorel@...e.cz>, <qperret@...gle.com>, <quic_cvanscha@...cinc.com>,
	<quic_eberman@...cinc.com>, <quic_mnalajal@...cinc.com>,
	<quic_pderrin@...cinc.com>, <quic_pheragu@...cinc.com>,
	<quic_svaddagi@...cinc.com>, <quic_tsoni@...cinc.com>,
	<richard.weiyang@...il.com>, <rick.p.edgecombe@...el.com>,
	<rientjes@...gle.com>, <roypat@...zon.co.uk>, <rppt@...nel.org>,
	<seanjc@...gle.com>, <shuah@...nel.org>, <steven.price@....com>,
	<steven.sistare@...cle.com>, <suzuki.poulose@....com>,
	<thomas.lendacky@....com>, <usama.arif@...edance.com>, <vbabka@...e.cz>,
	<viro@...iv.linux.org.uk>, <vkuznets@...hat.com>, <wei.w.wang@...el.com>,
	<will@...nel.org>, <willy@...radead.org>, <xiaoyao.li@...el.com>,
	<yilun.xu@...el.com>, <yuzenghui@...wei.com>, <zhiquan1.li@...el.com>
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce
 KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls

Xu Yilun wrote:
> On Thu, Jul 17, 2025 at 09:56:01AM -0700, Ackerley Tng wrote:
> > Xu Yilun <yilun.xu@...ux.intel.com> writes:
> > 
> > > On Wed, Jul 16, 2025 at 03:22:06PM -0700, Ackerley Tng wrote:
> > >> Yan Zhao <yan.y.zhao@...el.com> writes:
> > >> 
> > >> > On Tue, Jun 24, 2025 at 07:10:38AM -0700, Vishal Annapurve wrote:
> > >> >> On Tue, Jun 24, 2025 at 6:08 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
> > >> >> >
> > >> >> > On Tue, Jun 24, 2025 at 06:23:54PM +1000, Alexey Kardashevskiy wrote:
> > >> >> >
> > >> >> > > Now, I am rebasing my RFC on top of this patchset and it fails in
> > >> >> > > kvm_gmem_has_safe_refcount() as IOMMU holds references to all these
> > >> >> > > folios in my RFC.
> > >> >> > >
> > >> >> > > So what is the expected sequence here? The userspace unmaps a DMA
> > >> >> > > page and maps it back right away, all from the userspace? The end
> > >> >> > > result will be the exactly same which seems useless. And IOMMU TLB
> > >> >> 
> > >> >>  As Jason described, ideally IOMMU just like KVM, should just:
> > >> >> 1) Directly rely on guest_memfd for pinning -> no page refcounts taken
> > >> >> by IOMMU stack
> > >> > In TDX connect, the TDX module and TDs do not trust the VMM. So, it's up to the
> > >> > TDs to inform the TDX module about which pages they use for DMA purposes.
> > >> > So, if a page is regarded as pinned by TDs for DMA, the TDX module will fail the
> > >> > unmap of the pages from S-EPT.
> > >> >
> > >> > If IOMMU side does not increase refcount, IMHO, some way to indicate that
> > >> > certain PFNs are used by TDs for DMA is still required, so guest_memfd can
> > >> > reject the request before attempting the actual unmap.
> > >> > Otherwise, the unmap of TD-DMA-pinned pages will fail.
> > >> >
> > >> > Upon this kind of unmapping failure, it also doesn't help for host to retry
> > >> > unmapping without unpinning from TD.
> > >> >
> > >> >
> > >> 
> > >> Yan, Yilun, would it work if, on conversion,
> > >> 
> > >> 1. guest_memfd notifies IOMMU that a conversion is about to happen for a
> > >>    PFN range
> > >
> > > It is the Guest fw call to release the pinning.
> > 
> > I see, thanks for explaining.
> > 
> > > By the time the VMM gets the
> > > conversion request, the page is already physically unpinned. So I
> > > agree with Jason that the pinning doesn't have to reach the IOMMU from a SW POV.
> > >
> > 
> > If by the time KVM gets the conversion request, the page is unpinned,
> > then we're all good, right?
> 
> Yes, unless guest doesn't unpin the page first by mistake.

Or maliciously?  :-(

My initial response was that this is a guest bug and we don't need to be
concerned with it.  However, can't this be a DoS, where one TD crashes the
system if the host reuses the private page for something else and the
machine takes an #MC?

Ira

> Guest would
> invoke a fw call tdg.mem.page.release to unpin the page before
> KVM_HC_MAP_GPA_RANGE.
> 
> > 
> > When guest_memfd gets the conversion request, as part of conversion
> > handling it will request to zap the page from stage-2 page tables. TDX
> > module would see that the page is unpinned and the unmapping will
> > proceed fine. Is that understanding correct?
> 
> Yes, again unless the guest doesn't unpin.
> 
> > 
> > >> 2. IOMMU forwards the notification to TDX code in the kernel
> > >> 3. TDX code in kernel tells TDX module to stop thinking of any PFNs in
> > >>    the range as pinned for DMA?
> > >
> > > > The TDX host can't stop the pinning. Actually, this mechanism exists to prevent
> > > > the host from unpinning/unmapping DMA pages contrary to the guest's expectation.
> > >
> > 
> > On this note, I'd also like to check something else. Putting TDX connect
> > and IOMMUs aside, if the host unmaps a guest private page today without
> > the guest requesting it, the unmapping will work and the guest will be
> > broken, right?
> 
> Correct. The unmapping will work, the guest can't continue anymore.
> 
> Thanks,
> Yilun
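
The ordering discussed in this thread (guest releases the DMA pin, then
requests conversion, then the host unmaps) can be sketched as a toy state
model. This is only an illustration under stated assumptions, not kernel or
TDX module code: the class ToyTdxModule, the exception UnmapError, and the
function convert_to_shared are hypothetical names, and guest_release merely
stands in for the tdg.mem.page.release call mentioned above. The one
behaviour taken from the thread is that the unmap fails while the guest
still has the page pinned for DMA.

```python
from dataclasses import dataclass, field


class UnmapError(Exception):
    """Raised when the modelled TDX module refuses to unmap a pinned page."""


@dataclass
class ToyTdxModule:
    # PFNs the guest has pinned for DMA. Only guest_* methods change this set,
    # modelling that the host cannot drop a pin on the guest's behalf.
    dma_pinned: set = field(default_factory=set)
    # PFNs currently mapped as private (a stand-in for S-EPT entries).
    mapped: set = field(default_factory=set)

    def guest_pin(self, pfn: int) -> None:
        """Guest marks a private page as in use for DMA (hypothetical helper)."""
        self.dma_pinned.add(pfn)

    def guest_release(self, pfn: int) -> None:
        """Guest drops the pin; stands in for tdg.mem.page.release."""
        self.dma_pinned.discard(pfn)

    def host_unmap(self, pfn: int) -> None:
        """Host asks to remove the private mapping; fails while pinned."""
        if pfn in self.dma_pinned:
            raise UnmapError(f"pfn {pfn} is still pinned for DMA by the guest")
        self.mapped.discard(pfn)


def convert_to_shared(tdx: ToyTdxModule, pfn: int) -> bool:
    """Model the host side of a private-to-shared conversion request.

    Returns True if the page was unmapped, False if the unmap was refused.
    Retrying is pointless in this model until the guest releases the pin.
    """
    try:
        tdx.host_unmap(pfn)
    except UnmapError:
        return False
    return True
```

In this model, a well-behaved guest calls guest_release(pfn) before asking
for conversion and convert_to_shared succeeds; a buggy or malicious guest
that skips the release leaves the host with a failed unmap that it cannot
resolve by itself, which is the case being questioned above.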


