Message-ID: <a0129a912e21c5f3219b382f2f51571ab2709460.camel@intel.com>
Date: Tue, 8 Jul 2025 14:52:33 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "pvorel@...e.cz" <pvorel@...e.cz>, "kvm@...r.kernel.org"
<kvm@...r.kernel.org>, "catalin.marinas@....com" <catalin.marinas@....com>,
"Miao, Jun" <jun.miao@...el.com>, "Shutemov, Kirill"
<kirill.shutemov@...el.com>, "pdurrant@...zon.co.uk" <pdurrant@...zon.co.uk>,
"vbabka@...e.cz" <vbabka@...e.cz>, "peterx@...hat.com" <peterx@...hat.com>,
"x86@...nel.org" <x86@...nel.org>, "amoorthy@...gle.com"
<amoorthy@...gle.com>, "jack@...e.cz" <jack@...e.cz>,
"quic_svaddagi@...cinc.com" <quic_svaddagi@...cinc.com>, "keirf@...gle.com"
<keirf@...gle.com>, "palmer@...belt.com" <palmer@...belt.com>,
"vkuznets@...hat.com" <vkuznets@...hat.com>, "mail@...iej.szmigiero.name"
<mail@...iej.szmigiero.name>, "Annapurve, Vishal" <vannapurve@...gle.com>,
"anthony.yznaga@...cle.com" <anthony.yznaga@...cle.com>, "Wang, Wei W"
<wei.w.wang@...el.com>, "tabba@...gle.com" <tabba@...gle.com>,
"Wieczor-Retman, Maciej" <maciej.wieczor-retman@...el.com>, "Zhao, Yan Y"
<yan.y.zhao@...el.com>, "ajones@...tanamicro.com" <ajones@...tanamicro.com>,
"willy@...radead.org" <willy@...radead.org>, "rppt@...nel.org"
<rppt@...nel.org>, "quic_mnalajal@...cinc.com" <quic_mnalajal@...cinc.com>,
"aik@....com" <aik@....com>, "usama.arif@...edance.com"
<usama.arif@...edance.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"fvdl@...gle.com" <fvdl@...gle.com>, "paul.walmsley@...ive.com"
<paul.walmsley@...ive.com>, "bfoster@...hat.com" <bfoster@...hat.com>,
"nsaenz@...zon.es" <nsaenz@...zon.es>, "anup@...infault.org"
<anup@...infault.org>, "quic_eberman@...cinc.com" <quic_eberman@...cinc.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "mic@...ikod.net"
<mic@...ikod.net>, "oliver.upton@...ux.dev" <oliver.upton@...ux.dev>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"quic_cvanscha@...cinc.com" <quic_cvanscha@...cinc.com>,
"steven.price@....com" <steven.price@....com>, "binbin.wu@...ux.intel.com"
<binbin.wu@...ux.intel.com>, "hughd@...gle.com" <hughd@...gle.com>, "Li,
Zhiquan1" <zhiquan1.li@...el.com>, "rientjes@...gle.com"
<rientjes@...gle.com>, "mpe@...erman.id.au" <mpe@...erman.id.au>, "Aktas,
Erdem" <erdemaktas@...gle.com>, "david@...hat.com" <david@...hat.com>,
"jgg@...pe.ca" <jgg@...pe.ca>, "jhubbard@...dia.com" <jhubbard@...dia.com>,
"Xu, Haibo1" <haibo1.xu@...el.com>, "Du, Fan" <fan.du@...el.com>,
"maz@...nel.org" <maz@...nel.org>, "muchun.song@...ux.dev"
<muchun.song@...ux.dev>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
"jthoughton@...gle.com" <jthoughton@...gle.com>, "steven.sistare@...cle.com"
<steven.sistare@...cle.com>, "quic_pheragu@...cinc.com"
<quic_pheragu@...cinc.com>, "jarkko@...nel.org" <jarkko@...nel.org>,
"chenhuacai@...nel.org" <chenhuacai@...nel.org>, "Huang, Kai"
<kai.huang@...el.com>, "shuah@...nel.org" <shuah@...nel.org>,
"dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "Peng, Chao P"
<chao.p.peng@...el.com>, "pankaj.gupta@....com" <pankaj.gupta@....com>,
"Graf, Alexander" <graf@...zon.com>, "nikunj@....com" <nikunj@....com>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "yuzenghui@...wei.com" <yuzenghui@...wei.com>,
"jroedel@...e.de" <jroedel@...e.de>, "suzuki.poulose@....com"
<suzuki.poulose@....com>, "jgowans@...zon.com" <jgowans@...zon.com>, "Xu,
Yilun" <yilun.xu@...el.com>, "liam.merwick@...cle.com"
<liam.merwick@...cle.com>, "michael.roth@....com" <michael.roth@....com>,
"quic_tsoni@...cinc.com" <quic_tsoni@...cinc.com>, "Li, Xiaoyao"
<xiaoyao.li@...el.com>, "aou@...s.berkeley.edu" <aou@...s.berkeley.edu>,
"Weiny, Ira" <ira.weiny@...el.com>, "richard.weiyang@...il.com"
<richard.weiyang@...il.com>, "kent.overstreet@...ux.dev"
<kent.overstreet@...ux.dev>, "qperret@...gle.com" <qperret@...gle.com>,
"dmatlack@...gle.com" <dmatlack@...gle.com>, "james.morse@....com"
<james.morse@....com>, "brauner@...nel.org" <brauner@...nel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"ackerleytng@...gle.com" <ackerleytng@...gle.com>, "pgonda@...gle.com"
<pgonda@...gle.com>, "quic_pderrin@...cinc.com" <quic_pderrin@...cinc.com>,
"hch@...radead.org" <hch@...radead.org>, "linux-mm@...ck.org"
<linux-mm@...ck.org>, "will@...nel.org" <will@...nel.org>,
"roypat@...zon.co.uk" <roypat@...zon.co.uk>
Subject: Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd
On Tue, 2025-07-08 at 07:20 -0700, Sean Christopherson wrote:
> > For TDX if we don't zero on conversion from private->shared we will be
> > dependent on behavior of the CPU when reading memory with keyid 0, which
> > was previously encrypted and has some protection bits set. I don't *think*
> > the behavior is architectural. So it might be prudent to either make it so,
> > or zero it in the kernel in order to not make non-architectural behavior
> > into userspace ABI.
>
> Ya, by "vendor specific", I was also lumping in cases where the kernel would
> need to zero memory in order to not end up with effectively undefined
> behavior.
Yea, that was more of an answer to Vishal's question about whether CC VMs need
zeroing. The answer is sort of yes, even though TDX itself doesn't require it.
But we don't want to zero memory when reclaiming it, so the TDX KVM code needs
to know that the operation is a to-shared conversion and not another type of
private zap. That could be a callback from gmem, or maybe more simply a
kernel-internal flag set in gmem so that it knows to zero the memory.
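To make the flag idea concrete, here is a rough userspace C sketch, not kernel
code. Every name in it (gmem_zap_reason, gmem_zap_private_page,
SKETCH_PAGE_SIZE) is made up for illustration; it only shows the zap path
zeroing on a to-shared conversion and leaving the contents alone on reclaim.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical: why a private page is being zapped. */
enum gmem_zap_reason {
	GMEM_ZAP_RECLAIM,	/* memory is being reclaimed, zeroing not wanted */
	GMEM_ZAP_TO_SHARED,	/* private->shared conversion, zero the contents */
};

/*
 * Hypothetical helper standing in for the zap path. Real code would also
 * remove the page from the guest; this sketch only models the zeroing
 * decision. Returns true if the page was zeroed.
 */
static bool gmem_zap_private_page(unsigned char *page,
				  enum gmem_zap_reason reason)
{
	if (reason != GMEM_ZAP_TO_SHARED)
		return false;

	memset(page, 0, SKETCH_PAGE_SIZE);
	return true;
}
```

Whether the reason would be passed as an argument, set as a flag in gmem, or
delivered via a callback is exactly the open question; the sketch just picks
the simplest form.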
>
> > Up the thread Vishal says we need to support operations that use in-place
> > conversion (overloaded term now I think, btw). Why exactly is pKVM using
> > private/shared conversion for this private data provisioning?
>
> Because it's literally converting memory from shared to private? And IIUC,
> it's not a one-time provisioning, e.g. memory can go:
>
> shared => fill => private => consume => shared => fill => private => consume
>
> > Instead of a special provisioning operation like the others? (Xiaoyao's
> > suggestion)
>
> Are you referring to this suggestion?
Yea, in general to make it a specific, content-preserving operation.
>
> : And maybe a new flag for KVM_GMEM_CONVERT_PRIVATE for user space to
> : explicitly request that the page range is converted to private and the
> : content needs to be retained. So that TDX can identify which case needs
> : to call in-place TDH.PAGE.ADD.
>
> If so, I agree with that idea, e.g. add a PRESERVE flag or whatever. That way
> userspace has explicit control over what happens to the data during
> conversion, and KVM can reject unsupported conversions, e.g. PRESERVE is only
> allowed for shared => private and only for select VM types.
Ok, we should POC how it works with TDX.
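As a strawman for the flag check, here is a small userspace C sketch of the
rule "PRESERVE is only allowed for shared => private and only for select VM
types". None of this is existing uAPI: the flag name, the enums and the
function names are all hypothetical, and treating TDX and pKVM as the VM types
that support PRESERVE is an assumption for illustration only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: keep page contents across the conversion. */
#define SKETCH_CONVERT_FLAG_PRESERVE	(1u << 0)

enum sketch_convert_dir { SKETCH_TO_PRIVATE, SKETCH_TO_SHARED };
enum sketch_vm_type { SKETCH_VM_DEFAULT, SKETCH_VM_TDX, SKETCH_VM_PKVM };

/* Assumption: only these VM types can retain contents when going private. */
static bool sketch_vm_supports_preserve(enum sketch_vm_type type)
{
	return type == SKETCH_VM_TDX || type == SKETCH_VM_PKVM;
}

/* Returns 0 if the conversion request is acceptable, -errno otherwise. */
static int sketch_check_convert(enum sketch_vm_type type,
				enum sketch_convert_dir dir, uint32_t flags)
{
	/* Reject unknown flags so they stay available for future use. */
	if (flags & ~SKETCH_CONVERT_FLAG_PRESERVE)
		return -EINVAL;

	/* Without PRESERVE there is nothing extra to validate. */
	if (!(flags & SKETCH_CONVERT_FLAG_PRESERVE))
		return 0;

	/* PRESERVE only makes sense for shared => private. */
	if (dir != SKETCH_TO_PRIVATE)
		return -EINVAL;

	if (!sketch_vm_supports_preserve(type))
		return -EOPNOTSUPP;

	return 0;
}
```

For TDX, a request that passes this check would presumably be the case that
needs the in-place TDH.PAGE.ADD mentioned above.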