Message-ID: <aXqx3_eE0rNh6nP0@google.com>
Date: Wed, 28 Jan 2026 17:03:27 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Ackerley Tng <ackerleytng@...gle.com>, Alexey Kardashevskiy <aik@....com>, cgroups@...r.kernel.org, 
	kvm@...r.kernel.org, linux-doc@...r.kernel.org, linux-fsdevel@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org, 
	linux-mm@...ck.org, linux-trace-kernel@...r.kernel.org, x86@...nel.org, 
	akpm@...ux-foundation.org, binbin.wu@...ux.intel.com, bp@...en8.de, 
	brauner@...nel.org, chao.p.peng@...el.com, chenhuacai@...nel.org, 
	corbet@....net, dave.hansen@...el.com, dave.hansen@...ux.intel.com, 
	david@...hat.com, dmatlack@...gle.com, erdemaktas@...gle.com, 
	fan.du@...el.com, fvdl@...gle.com, haibo1.xu@...el.com, hannes@...xchg.org, 
	hch@...radead.org, hpa@...or.com, hughd@...gle.com, ira.weiny@...el.com, 
	isaku.yamahata@...el.com, jack@...e.cz, james.morse@....com, 
	jarkko@...nel.org, jgowans@...zon.com, jhubbard@...dia.com, jroedel@...e.de, 
	jthoughton@...gle.com, jun.miao@...el.com, kai.huang@...el.com, 
	keirf@...gle.com, kent.overstreet@...ux.dev, liam.merwick@...cle.com, 
	maciej.wieczor-retman@...el.com, mail@...iej.szmigiero.name, 
	maobibo@...ngson.cn, mathieu.desnoyers@...icios.com, maz@...nel.org, 
	mhiramat@...nel.org, mhocko@...nel.org, mic@...ikod.net, michael.roth@....com, 
	mingo@...hat.com, mlevitsk@...hat.com, mpe@...erman.id.au, 
	muchun.song@...ux.dev, nikunj@....com, nsaenz@...zon.es, 
	oliver.upton@...ux.dev, palmer@...belt.com, pankaj.gupta@....com, 
	paul.walmsley@...ive.com, pbonzini@...hat.com, peterx@...hat.com, 
	pgonda@...gle.com, prsampat@....com, pvorel@...e.cz, qperret@...gle.com, 
	richard.weiyang@...il.com, rick.p.edgecombe@...el.com, rientjes@...gle.com, 
	rostedt@...dmis.org, roypat@...zon.co.uk, rppt@...nel.org, 
	shakeel.butt@...ux.dev, shuah@...nel.org, steven.price@....com, 
	steven.sistare@...cle.com, suzuki.poulose@....com, tabba@...gle.com, 
	tglx@...utronix.de, thomas.lendacky@....com, vannapurve@...gle.com, 
	vbabka@...e.cz, viro@...iv.linux.org.uk, vkuznets@...hat.com, 
	wei.w.wang@...el.com, will@...nel.org, willy@...radead.org, wyihan@...gle.com, 
	xiaoyao.li@...el.com, yan.y.zhao@...el.com, yilun.xu@...el.com, 
	yuzenghui@...wei.com, zhiquan1.li@...el.com
Subject: Re: [RFC PATCH v1 05/37] KVM: guest_memfd: Wire up
 kvm_get_memory_attributes() to per-gmem attributes

On Wed, Jan 28, 2026, Jason Gunthorpe wrote:
> On Wed, Jan 28, 2026 at 01:47:50PM -0800, Ackerley Tng wrote:
> > Alexey Kardashevskiy <aik@....com> writes:
> > 
> > >
> > > [...snip...]
> > >
> > >
> > 
> > Thanks for bringing this up!
> > 
> > > I am trying to make it work with TEE-IO where fd of VFIO MMIO is a dmabuf
> > > fd while the rest (guest RAM) is gmemfd. The above suggests that if there
> > > is gmemfd - then the memory attributes are handled by gmemfd which is...
> > > expected?
> > >
> > 
> > I think this is not expected.
> > 
> > IIUC MMIO guest physical addresses don't have an associated memslot, but
> > if you managed to get to that line in kvm_gmem_get_memory_attributes(),
> > then there is an associated memslot (slot != NULL)?
> 
> I think they should have a memslot, shouldn't they? I imagine creating
> a memslot from a FD and the FD can be memfd, guestmemfd, dmabuf, etc,
> etc ?

Yeah, there are two flavors of MMIO for KVM guests.  Emulated MMIO, which is
what Ackerley is thinking of, and "host" MMIO (for lack of a better term), which
is what I assume "fd of VFIO MMIO" is referring to.

Emulated MMIO does NOT have memslots[*].  There are some wrinkles and technical
exceptions, e.g. read-only memslots for emulating option ROMs, but by and large,
lack of a memslot means Emulated MMIO.

Host MMIO isn't something KVM really cares about, in the sense that, for the most
part, it's "just another memslot".  KVM x86 does need to identify host MMIO for
vendor specific reasons, e.g. to ensure UC memory stays UC when using EPT (MTRRs
are ignored), to create shared mappings when SME is enabled, and to mitigate the
lovely MMIO Stale Data vulnerability.

But those Host MMIO edge cases are almost entirely confined to make_spte() (see
the kvm_is_mmio_pfn() calls).  And so the vast, vast majority of "MMIO" code in
KVM is dealing with Emulated MMIO, and when most people talk about MMIO in KVM,
they're also talking about Emulated MMIO.
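
To make the distinction concrete, here is a minimal userspace sketch, not KVM
code.  Every toy_* name is hypothetical, the backed_by_host_mmio flag merely
stands in for what kvm_is_mmio_pfn() would report, and the read-only memslot
wrinkle is deliberately ignored:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are NOT the real KVM structures. */
typedef uint64_t toy_gfn_t;

struct toy_memslot {
	toy_gfn_t base_gfn;
	uint64_t npages;
	bool backed_by_host_mmio;	/* assumption: stand-in for a kvm_is_mmio_pfn() style check */
};

enum toy_gfn_kind {
	TOY_EMULATED_MMIO,	/* no memslot covers the gfn */
	TOY_HOST_MMIO,		/* memslot exists, backing is host MMIO */
	TOY_RAM,		/* memslot exists, backing is ordinary memory */
};

/* Linear search for the memslot containing @gfn, NULL if there is none. */
static const struct toy_memslot *
toy_gfn_to_memslot(const struct toy_memslot *slots, size_t nr_slots, toy_gfn_t gfn)
{
	assert(slots || nr_slots == 0);

	for (size_t i = 0; i < nr_slots; i++) {
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	}
	return NULL;
}

/* By and large: no memslot means Emulated MMIO, anything else is "just a memslot". */
static enum toy_gfn_kind
toy_classify_gfn(const struct toy_memslot *slots, size_t nr_slots, toy_gfn_t gfn)
{
	const struct toy_memslot *slot = toy_gfn_to_memslot(slots, nr_slots, gfn);

	if (!slot)
		return TOY_EMULATED_MMIO;

	return slot->backed_by_host_mmio ? TOY_HOST_MMIO : TOY_RAM;
}
```

The only point of the sketch is that the Emulated MMIO decision falls out of
the memslot lookup, while the Host MMIO decision is a property of the backing
behind an otherwise ordinary memslot.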

> > Either way, guest_memfd shouldn't store attributes for guest physical
> > addresses that don't belong to some guest_memfd memslot.
> > 
> > I think we need a broader discussion for this on where to store memory
> > attributes for MMIO addresses.
> > 
> > I think we should at least have line of sight to storing memory
> > attributes for MMIO addresses, in case we want to design something else,
> > since we're putting vm_memory_attributes on a deprecation path with this
> > series.
> 
> I don't know where you want to store them in KVM long term, but they
> need to come from the dmabuf itself (probably via a struct
> p2pdma_provider) and currently it is OK to assume all DMABUFs are
> uncachable MMIO that is safe for the VM to convert into "write
> combining" (eg Normal-NC on ARM)

+1.  For guest_memfd, we initially defined per-VM memory attributes to track
private vs. shared.  But as Ackerley noted, we are in the process of deprecating
that support, e.g. by making it incompatible with various guest_memfd features,
in favor of having each guest_memfd instance track the state of a given page.

The original guest_memfd design was that it would _only_ hold private pages, and
so tracking private vs. shared in guest_memfd didn't make any sense.  As we've
pivoted to in-place conversion, tracking private vs. shared in the guest_memfd
has basically become mandatory.  We could maaaaaybe make it work with per-VM
attributes, but it would be insanely complex.

For a dmabuf fd, the story is the same as guest_memfd.  Unless private vs. shared
is all or nothing and can never change, the only entity that can track that info
is the owner of the dmabuf.  And even if the private vs. shared attributes are
constant, tracking them outside of KVM makes sense, because then the provider can
simply hardcode %true/%false.

As for _how_ to do that, no matter where the attributes are stored, we're going
to have to teach KVM to play nice with a non-guest_memfd provider of private
memory.
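
As a rough illustration of what "the provider tracks private vs. shared" could
look like, below is a userspace sketch under loudly stated assumptions: the
toy_private_mem_ops interface and every other toy_* name are hypothetical, no
such API is proposed in this thread, and the dmabuf provider returning true
(rather than false) is an arbitrary choice for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback table: each provider answers for its own pages. */
struct toy_private_mem_ops {
	bool (*is_private)(void *provider, uint64_t index);
};

#define TOY_GMEM_NPAGES 64

/* guest_memfd-like provider: per-page state that can change (in-place conversion). */
struct toy_gmem {
	uint64_t private_bitmap;	/* bit i set => page i is private */
};

static bool toy_gmem_is_private(void *provider, uint64_t index)
{
	const struct toy_gmem *gmem = provider;

	assert(index < TOY_GMEM_NPAGES);
	return (gmem->private_bitmap >> index) & 1;
}

static void toy_gmem_convert(struct toy_gmem *gmem, uint64_t index, bool to_private)
{
	assert(index < TOY_GMEM_NPAGES);
	if (to_private)
		gmem->private_bitmap |= 1ULL << index;
	else
		gmem->private_bitmap &= ~(1ULL << index);
}

/* dmabuf-like provider whose attribute never changes: simply hardcode it. */
struct toy_dmabuf {
	int unused;
};

static bool toy_dmabuf_is_private(void *provider, uint64_t index)
{
	(void)provider;
	(void)index;
	return true;	/* arbitrary for this sketch; could equally be false */
}

static const struct toy_private_mem_ops toy_gmem_ops = {
	.is_private = toy_gmem_is_private,
};

static const struct toy_private_mem_ops toy_dmabuf_ops = {
	.is_private = toy_dmabuf_is_private,
};

/* What a memslot would hold: an opaque provider plus its callbacks. */
struct toy_slot_backing {
	void *provider;
	const struct toy_private_mem_ops *ops;
};

/* The consumer never stores the attribute itself, it always asks the owner. */
static bool toy_slot_is_private(const struct toy_slot_backing *backing, uint64_t index)
{
	return backing->ops->is_private(backing->provider, index);
}
```

In this shape the consumer holds no per-VM attribute storage at all, so a
provider with constant attributes and a provider with convertible pages look
identical from the caller's side.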
