Message-ID: <2b714bb6e547e2505a83c97fdad79e5dda687d05.camel@intel.com>
Date: Wed, 10 Dec 2025 19:49:26 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "sagis@...gle.com" <sagis@...gle.com>, "Zhao, Yan Y"
	<yan.y.zhao@...el.com>
CC: "Du, Fan" <fan.du@...el.com>, "Li, Xiaoyao" <xiaoyao.li@...el.com>,
	"quic_eberman@...cinc.com" <quic_eberman@...cinc.com>, "Hansen, Dave"
	<dave.hansen@...el.com>, "david@...hat.com" <david@...hat.com>,
	"thomas.lendacky@....com" <thomas.lendacky@....com>, "tabba@...gle.com"
	<tabba@...gle.com>, "vbabka@...e.cz" <vbabka@...e.cz>, "kvm@...r.kernel.org"
	<kvm@...r.kernel.org>, "michael.roth@....com" <michael.roth@....com>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "Weiny, Ira" <ira.weiny@...el.com>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>, "binbin.wu@...ux.intel.com"
	<binbin.wu@...ux.intel.com>, "ackerleytng@...gle.com"
	<ackerleytng@...gle.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
	"Peng, Chao P" <chao.p.peng@...el.com>, "kas@...nel.org" <kas@...nel.org>,
	"Annapurve, Vishal" <vannapurve@...gle.com>, "Miao, Jun"
	<jun.miao@...el.com>, "zhiquan1.li@...el.com" <zhiquan1.li@...el.com>,
	"x86@...nel.org" <x86@...nel.org>, "pgonda@...gle.com" <pgonda@...gle.com>
Subject: Re: [RFC PATCH v2 10/23] KVM: TDX: Enable huge page splitting under
 write kvm->mmu_lock

On Wed, 2025-12-10 at 11:16 -0600, Sagi Shahar wrote:
> Thanks. I don't have access to the 1.5.28.04 module and we need the
> code to work with the 1.5.24 module as well based on our timeline so I
> guess we can just add the retries locally for now.
> 
> Do you see any issue with retrying the operation in case of
> TDX_INTERRUPTED_RESTARTABLE? 
> 

Yan has been testing with a similar workaround. See "[DROP ME] x86/virt/tdx:
Loop for TDX_INTERRUPTED_RESTARTABLE in tdh_mem_page_demote()".

With TDX_INTERRUPTED_RESTARTABLE compared to RESUMABLE, the problem is that
there is no guarantee it will make forward progress. So looping during an
interrupt storm would halt the process context in an unusual way.

So the two kernel-side options we discussed were to loop forever, or to loop a
bounded number of times and then KVM_BUG_ON()/warn (like you had). They have
different problems - unbounded loop vs potentially killing the TD for unrelated
host behavior. So that is how we came to the decision to rely on TDX module
changes for the long term upstream solution.
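
For illustration only, here is a minimal userspace sketch of the bounded-retry
option. It is not the kernel code and not the "[DROP ME]" patch: the status
values, demote_bounded(), fake_demote() and struct retry_result are all
hypothetical stand-ins, and the real status codes and SEAMCALL wrapper live in
the kernel. It only shows the shape of the tradeoff: the loop terminates, but
when the budget runs out the caller is left to decide whether to warn or kill
the TD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status values for this sketch; NOT the real TDX encodings. */
#define SKETCH_SUCCESS                 0ULL
#define SKETCH_INTERRUPTED_RESTARTABLE 1ULL

/* Stand-in for an operation that may be interrupted and must be restarted. */
typedef uint64_t (*demote_fn)(void *ctx);

struct retry_result {
	uint64_t status;       /* last status returned by the operation */
	unsigned int attempts; /* how many times the operation was invoked */
	bool gave_up;          /* true if the retry budget was exhausted */
};

/*
 * Retry while the operation reports "interrupted, restartable", up to
 * max_attempts calls. A restartable operation gives no forward-progress
 * guarantee, so the bound is what keeps this loop finite.
 */
static struct retry_result demote_bounded(demote_fn op, void *ctx,
					  unsigned int max_attempts)
{
	struct retry_result r = { SKETCH_INTERRUPTED_RESTARTABLE, 0, false };

	while (r.attempts < max_attempts) {
		r.attempts++;
		r.status = op(ctx);
		if (r.status != SKETCH_INTERRUPTED_RESTARTABLE)
			return r;
	}

	/* Budget exhausted: the caller would warn or mark the VM as bugged. */
	r.gave_up = true;
	return r;
}

/* Fake operation: reports "interrupted" for the first *ctx calls, then succeeds. */
static uint64_t fake_demote(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return SKETCH_INTERRUPTED_RESTARTABLE;
	}
	return SKETCH_SUCCESS;
}
```

With a short burst of interruptions the call succeeds within its budget. With
a sustained storm it gives up, which is the "killing the TD for unrelated host
behavior" case. Dropping the bound turns it into the unbounded-loop case
instead.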

You could also see this thread that touches on disabling interrupts around the
seamcall:
https://lore.kernel.org/kvm/99f5585d759328db973403be0713f68e492b492a.camel@intel.com/

However it does not help the NMI case. Do you know which you are hitting?

> From what I saw this is not just a
> theoretical race but happens every time I try to boot a VM
> 

Oh, interesting.

> , even for a
> small VM with 4 VCPUs and 8GB of memory.

What probably matters more is what else is happening on the system to cause a
host interrupt.
