lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <diqzwm803xa4.fsf@ackerleytng-ctop.c.googlers.com>
Date: Tue, 22 Jul 2025 11:17:39 -0700
From: Ackerley Tng <ackerleytng@...gle.com>
To: Xu Yilun <yilun.xu@...ux.intel.com>, Ira Weiny <ira.weiny@...el.com>
Cc: Yan Zhao <yan.y.zhao@...el.com>, Vishal Annapurve <vannapurve@...gle.com>, 
	Jason Gunthorpe <jgg@...pe.ca>, Alexey Kardashevskiy <aik@....com>, Fuad Tabba <tabba@...gle.com>, kvm@...r.kernel.org, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, x86@...nel.org, 
	linux-fsdevel@...r.kernel.org, ajones@...tanamicro.com, 
	akpm@...ux-foundation.org, amoorthy@...gle.com, anthony.yznaga@...cle.com, 
	anup@...infault.org, aou@...s.berkeley.edu, bfoster@...hat.com, 
	binbin.wu@...ux.intel.com, brauner@...nel.org, catalin.marinas@....com, 
	chao.p.peng@...el.com, chenhuacai@...nel.org, dave.hansen@...el.com, 
	david@...hat.com, dmatlack@...gle.com, dwmw@...zon.co.uk, 
	erdemaktas@...gle.com, fan.du@...el.com, fvdl@...gle.com, graf@...zon.com, 
	haibo1.xu@...el.com, hch@...radead.org, hughd@...gle.com, 
	isaku.yamahata@...el.com, jack@...e.cz, james.morse@....com, 
	jarkko@...nel.org, jgowans@...zon.com, jhubbard@...dia.com, jroedel@...e.de, 
	jthoughton@...gle.com, jun.miao@...el.com, kai.huang@...el.com, 
	keirf@...gle.com, kent.overstreet@...ux.dev, kirill.shutemov@...el.com, 
	liam.merwick@...cle.com, maciej.wieczor-retman@...el.com, 
	mail@...iej.szmigiero.name, maz@...nel.org, mic@...ikod.net, 
	michael.roth@....com, mpe@...erman.id.au, muchun.song@...ux.dev, 
	nikunj@....com, nsaenz@...zon.es, oliver.upton@...ux.dev, palmer@...belt.com, 
	pankaj.gupta@....com, paul.walmsley@...ive.com, pbonzini@...hat.com, 
	pdurrant@...zon.co.uk, peterx@...hat.com, pgonda@...gle.com, pvorel@...e.cz, 
	qperret@...gle.com, quic_cvanscha@...cinc.com, quic_eberman@...cinc.com, 
	quic_mnalajal@...cinc.com, quic_pderrin@...cinc.com, quic_pheragu@...cinc.com, 
	quic_svaddagi@...cinc.com, quic_tsoni@...cinc.com, richard.weiyang@...il.com, 
	rick.p.edgecombe@...el.com, rientjes@...gle.com, roypat@...zon.co.uk, 
	rppt@...nel.org, seanjc@...gle.com, shuah@...nel.org, steven.price@....com, 
	steven.sistare@...cle.com, suzuki.poulose@....com, thomas.lendacky@....com, 
	usama.arif@...edance.com, vbabka@...e.cz, viro@...iv.linux.org.uk, 
	vkuznets@...hat.com, wei.w.wang@...el.com, will@...nel.org, 
	willy@...radead.org, xiaoyao.li@...el.com, yilun.xu@...el.com, 
	yuzenghui@...wei.com, zhiquan1.li@...el.com
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce
 KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls

Xu Yilun <yilun.xu@...ux.intel.com> writes:

>> > > >> Yan, Yilun, would it work if, on conversion,
>> > > >> 
>> > > >> 1. guest_memfd notifies IOMMU that a conversion is about to happen for a
>> > > >>    PFN range
>> > > >
>> > > > It is the Guest fw call to release the pinning.
>> > > 
>> > > I see, thanks for explaining.
>> > > 
>> > > > By the time VMM get the
>> > > > conversion requirement, the page is already physically unpinned. So I
>> > > > agree with Jason the pinning doesn't have to reach to iommu from SW POV.
>> > > >
>> > > 
>> > > If by the time KVM gets the conversion request, the page is unpinned,
>> > > then we're all good, right?
>> > 
>> > Yes, unless guest doesn't unpin the page first by mistake.
>> 
>> Or maliciously?  :-(
>
> Yes.
>
>> 
>> My initial response to this was that this is a bug and we don't need to be
>> concerned with it.  However, can't this be a DOS from one TD to crash the
>> system if the host uses the private page for something else and the
>> machine #MC's?
>
> I think we are already doing something to prevent vcpus from executing
> then destroy VM, so no further TD accessing. But I assume there is
> concern a TD could just leak a lot of resources, and we are
> investigating if host can reclaim them.
>
> Thanks,
> Yilun

Sounds like a malicious guest could skip unpinning private memory, and
guest_memfd's unmap will fail, leading to a KVM_BUG_ON() as Yan/Rick
suggested here [1].

Actually it seems like a legacy guest would also lead to unmap failures
and the KVM_BUG_ON(), since when TDX connect is enabled, the pinning
mode is enforced, even for non-IO private pages?

I hope your team's investigations find a good way for the host to
reclaim memory, at least from dead TDs! Otherwise this would be an open
hole for guests to leak a host's memory.

Circling back to the original topic [2], it sounds like we're okay for
IOMMU to *not* take any refcounts on pages and can rely on guest_memfd
to keep the page around on behalf of the VM?

[1] https://lore.kernel.org/all/diqzcya13x2j.fsf@ackerleytng-ctop.c.googlers.com/
[2] https://lore.kernel.org/all/CAGtprH_qh8sEY3s-JucW3n1Wvoq7jdVZDDokvG5HzPf0HV2=pg@mail.gmail.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ