Message-ID: <diqzikju4ko7.fsf@ackerleytng-ctop.c.googlers.com>
Date: Mon, 14 Jul 2025 12:49:44 -0700
From: Ackerley Tng <ackerleytng@...gle.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>
Cc: "quic_eberman@...cinc.com" <quic_eberman@...cinc.com>, "Li, Xiaoyao" <xiaoyao.li@...el.com>,
"Du, Fan" <fan.du@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"david@...hat.com" <david@...hat.com>, "thomas.lendacky@....com" <thomas.lendacky@....com>,
"vbabka@...e.cz" <vbabka@...e.cz>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"Shutemov, Kirill" <kirill.shutemov@...el.com>, "michael.roth@....com" <michael.roth@....com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "Weiny, Ira" <ira.weiny@...el.com>,
"Peng, Chao P" <chao.p.peng@...el.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>, "tabba@...gle.com" <tabba@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, "Annapurve, Vishal" <vannapurve@...gle.com>,
"jroedel@...e.de" <jroedel@...e.de>, "Miao, Jun" <jun.miao@...el.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pgonda@...gle.com" <pgonda@...gle.com>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH 08/21] KVM: TDX: Increase/decrease folio ref for huge pages
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com> writes:
> On Fri, 2025-07-11 at 13:12 +0800, Yan Zhao wrote:
>> > Yan, is that your recollection? I guess the other points were that although
>> > TDX
>> I'm ok if KVM_BUG_ON() is considered loud enough to warn about the rare
>> potential corruption, thereby making TDX less special.
>>
>> > doesn't need it today, for long term, userspace ABI around invalidations
>> > should
>> > support failure. But the actual gmem/kvm interface for this can be figured
>> > out
>> Could we elaborate what're included in userspace ABI around invalidations?
>
> Let's see what Ackerley says.
>
There's no dedicated invalidation ioctl, but I assume you're referring to
the conversion ioctl?

There is a conversion ioctl planned for guest_memfd, and that ioctl can
return an error. Conversion involves invalidating the memory that is to be
converted. For now, guest_memfd assumes unmapping always succeeds (as Yan
says), but that can be changed.
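
To illustrate what "conversion can return an error" could look like once
unmapping is allowed to fail, here is a minimal userspace sketch. It is a
toy model only: toy_range, toy_unmap() and toy_convert() are made-up names,
not guest_memfd or KVM interfaces, and the -EIO error code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these are real KVM or guest_memfd APIs. */
enum toy_state { TOY_PRIVATE, TOY_SHARED };

struct toy_range {
	enum toy_state state;
	bool mapped;       /* still mapped in the (toy) secure page tables */
	bool unmap_fails;  /* test knob: force the unmap to fail */
};

/* Toy unmap that is allowed to fail, unlike today's gmem/KVM unmap. */
static int toy_unmap(struct toy_range *r)
{
	if (r->unmap_fails)
		return -EIO;   /* assumed error code, for illustration only */
	r->mapped = false;
	return 0;
}

/*
 * Toy conversion: invalidate first, and flip the state only if the
 * invalidation succeeded. On failure the error is propagated to the
 * caller and the range is left exactly as it was.
 */
static int toy_convert(struct toy_range *r, enum toy_state target)
{
	int ret;

	assert(r != NULL);
	if (r->state == target)
		return 0;

	ret = toy_unmap(r);
	if (ret)
		return ret;

	r->state = target;
	return 0;
}
```

The point of the sketch is the ordering: the state change is committed only
after the unmap has succeeded, so a failed conversion leaves nothing
half-done for userspace to clean up.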
>>
>> I'm a bit confused as I think the userspace ABI today supports failure
>> already.
>>
>> Currently, the unmap API between gmem and KVM does not support failure.
>
> Great. I'm just trying to summarize the internal conversations. I think the
> point was for a future looking user ABI, supporting failure is important. But we
> don't need the KVM/gmem interface figured out yet.
>
I'm on board here. So "do nothing" means that if there is a TDX unmap failure,
+ KVM_BUG_ON() and hence the TD in question stops running,
+ No more conversions will be possible for this TD since the TD
stops running.
+ Other TDs can continue running?
+ No refcounts will be taken for the folio/page where the memory failure
happened.
+ No other indication (including HWpoison) anywhere in folio/page to
indicate this happened.
+ To round this topic up, do we do anything else as part of "do nothing"
that I missed? Is there any record in the TDX module (TDX module
itself, not within the kernel)?
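
To make the list above concrete, here is a toy userspace model of "do
nothing" as I currently understand it. It encodes exactly the bullets above,
including the ones I'm still asking about, so it is not a confirmed
description of KVM or the TDX module. All names (toy_vm, toy_folio,
toy_unmap_failed() and so on) are made up, and returning -EIO from a bugged
VM is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not struct kvm or struct folio. */
struct toy_folio {
	int refcount;
	bool hwpoison;
};

struct toy_vm {
	bool bugged;  /* set by the toy equivalent of KVM_BUG_ON() */
};

/*
 * Toy handling of an unmap failure under "do nothing": the VM is marked
 * bugged, and the folio is deliberately left untouched (no extra
 * refcount, no HWpoison marker).
 */
static int toy_unmap_failed(struct toy_vm *vm, struct toy_folio *folio)
{
	assert(vm != NULL);
	(void)folio;        /* intentionally not modified */
	vm->bugged = true;
	return -EIO;
}

/* A bugged toy VM refuses to run. */
static int toy_vm_run(const struct toy_vm *vm)
{
	return vm->bugged ? -EIO : 0;
}

/* A bugged toy VM also refuses further conversions. */
static int toy_vm_convert(const struct toy_vm *vm)
{
	return vm->bugged ? -EIO : 0;
}
```

In this model the only record of the failure lives in the VM that hit it.
Once that VM is gone, nothing attached to the folio says anything went
wrong, which is what the reuse questions below are about.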
I'll probably be okay with an answer like "won't know what will happen",
but just checking - what might happen if this page that had an unmap
failure gets reused? Suppose the KVM_BUG_ON() is noted but somehow we
couldn't get to the machine in time and the machine continues to serve,
and the memory is used by
1. Some other non-VM user, something else entirely, say a database?
2. Some new non-TDX VM?
3. Some new TD?
>>
>> In the future, we hope gmem can check if KVM allows a page to be unmapped
>> before
>> triggering the actual unmap.