Message-ID: <a7affba9-0cea-4493-b868-392158b59d83@paulmck-laptop>
Date: Mon, 14 Jul 2025 17:21:47 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Chao Gao <chao.gao@...el.com>
Cc: linux-coco@...ts.linux.dev, x86@...nel.org, kvm@...r.kernel.org,
	seanjc@...gle.com, pbonzini@...hat.com, eddie.dong@...el.com,
	kirill.shutemov@...el.com, dave.hansen@...el.com,
	dan.j.williams@...el.com, kai.huang@...el.com,
	isaku.yamahata@...el.com, elena.reshetova@...el.com,
	rick.p.edgecombe@...el.com, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH 00/20] TD-Preserving updates

On Fri, Jul 11, 2025 at 04:04:48PM +0800, Chao Gao wrote:
> On Fri, May 23, 2025 at 02:52:23AM -0700, Chao Gao wrote:
> >Hi Reviewers,
> >
> >This series adds support for runtime TDX module updates that preserve
> >running TDX guests (a.k.a, TD-Preserving updates). The goal is to gather
> >feedback on the feature design. Please pay attention to the following items:
> >
> >1. TD-Preserving updates are done in stop_machine() context. It copy-pastes
> >   part of multi_cpu_stop() to guarantee step-locked progress on all CPUs.
> >   But, there are a few differences between them. I am wondering whether
> >   these differences have reached a point where abstracting a common
> >   function might do more harm than good. See more details in patch 10.

Please note that multi_cpu_stop() is used by a number of functions,
so it is a good example of common code.  But you are within your rights
to create your own function to pass to stop_machine(), and quite a
few call sites do just that.  Most of them expect this function to be
executed on only one CPU, but these run on multiple CPUs:

o	__apply_alternatives_multi_stop(), which has CPU 0 do the
	work and the rest wait on it.

o	cpu_enable_non_boot_scope_capabilities(), which works on
	a per-CPU basis.

o	do_join(), which is similar to your do_seamldr_install_module().
	Somewhat similar, anyway.

o	__ftrace_modify_code(), of which there are several, some of
	which have some vague resemblance to your code.

o	cache_rendezvous_handler(), which works on a per-CPU basis.

o	panic_stop_irqoff_fn(), which is a simple barrier-wait, with
	the last CPU to arrive doing the work.

I strongly recommend looking at these functions.  They might
suggest an improved way to do what you are trying to accomplish with
do_seamldr_install_module().
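
As background for that comparison, here is a rough userspace model of the
step-locked progress that multi_cpu_stop() provides.  It is a sketch only,
not kernel code: pthreads stand in for CPUs, C11 atomics stand in for the
kernel's atomic_t, and struct lockstep, the STEP_* names, lockstep_fn(),
and run_lockstep() are all made up for illustration.  Each thread acks
the current step, and the last thread to ack moves everyone to the next one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_THREADS 16

enum step { STEP_NONE, STEP_PREPARE, STEP_RUN, STEP_EXIT };

struct lockstep {
	atomic_int state;	/* current step, shared by all threads */
	atomic_int thread_ack;	/* threads yet to ack the current step */
	atomic_int steps_seen;	/* total (thread, step) visits */
	atomic_int violations;	/* visits that were out of lock-step */
	int num_threads;
};

static void set_state(struct lockstep *ls, int newstate)
{
	/* Re-arm the ack counter before publishing the new step. */
	atomic_store(&ls->thread_ack, ls->num_threads);
	atomic_store(&ls->state, newstate);
}

static void ack_state(struct lockstep *ls)
{
	/* The last thread to ack advances everyone to the next step. */
	if (atomic_fetch_sub(&ls->thread_ack, 1) == 1)
		set_state(ls, atomic_load(&ls->state) + 1);
}

static void *lockstep_fn(void *arg)
{
	struct lockstep *ls = arg;
	int cur = STEP_NONE, next, seen;

	do {
		next = atomic_load(&ls->state);
		if (next != cur) {
			cur = next;
			/* Per-step work would go here. */
			seen = atomic_fetch_add(&ls->steps_seen, 1);
			/* Everyone must have finished step cur - 1. */
			if (seen / ls->num_threads != cur - 1)
				atomic_fetch_add(&ls->violations, 1);
			ack_state(ls);
		}
	} while (cur != STEP_EXIT);
	return NULL;
}

/* Returns the number of (thread, step) visits, or -1 on a violation. */
static int run_lockstep(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	struct lockstep ls;
	int i;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	ls.num_threads = nthreads;
	atomic_init(&ls.state, STEP_NONE);
	atomic_init(&ls.thread_ack, 0);
	atomic_init(&ls.steps_seen, 0);
	atomic_init(&ls.violations, 0);
	set_state(&ls, STEP_PREPARE);
	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tid[i], NULL, lockstep_fn, &ls))
			abort();
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	if (atomic_load(&ls.violations))
		return -1;
	return atomic_load(&ls.steps_seen);
}
```

With three steps (PREPARE, RUN, EXIT), run_lockstep(n) should return 3 * n
when no thread ever gets ahead of the others.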

> >2. P-SEAMLDR seamcalls (specifically SEAMRET from P-SEAMLDR) clear current
> >   VMCS pointers, which may disrupt KVM. To prevent VMX instructions in IRQ
> >   context from encountering NULL current-VMCS pointers, P-SEAMLDR
> >   seamcalls are called with IRQ disabled. I'm uncertain if NMIs could
> >   cause a problem, but I believe they won't. See more information in patch 3.
> >
> >3. Two helpers, cpu_vmcs_load() and cpu_vmcs_store(), are added in patch 3
> >   to save and restore the current VMCS. KVM has a variant of cpu_vmcs_load(),
> >   i.e., vmcs_load(). Extracting KVM's version would cause a lot of code
> >   churn, and I don't think that can be justified for reducing ~16 LoC
> >   duplication. Please let me know if you disagree.
> 
> Gentle ping!

I do not believe that I was CCed on the original.  Just in case you
were wondering why I did not respond.  ;-)

> There are three open issues: one regarding stop_machine() and two related to
> interactions with KVM.
> 
> Sean and Paul, do you have any preferences or insights on these matters?

Again, you are within your rights to create a new function and pass
it to stop_machine().  But it seems quite likely that there is a much
simpler way to get your job done.

Either way, please add a header comment stating what your function
is trying to do, which appears to be to wait for all CPUs to enter
do_seamldr_install_module() and then just leave?  Sort of like
multi_cpu_stop(), except leaving interrupts enabled and not executing a
"msdata->fn(msdata->data);", correct?

If so, something like panic_stop_irqoff_fn() might be a simpler model,
perhaps with the touch_nmi_watchdog() and rcu_momentary_eqs() added.
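
To make the barrier-wait idea concrete, here is a minimal userspace sketch
of "everyone waits, the last one to arrive does the work", following the
description of that pattern earlier in this message rather than the kernel
source.  The assumptions are loud ones: pthreads stand in for CPUs, the
spin loop stands in for cpu_relax(), and struct rendezvous,
rendezvous_fn(), and run_rendezvous() are hypothetical names.  The
touch_nmi_watchdog() and rcu_momentary_eqs() calls have no userspace
equivalent and are only marked by a comment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_THREADS 16

struct rendezvous {
	atomic_int remaining;	/* threads that have not yet arrived */
	atomic_int done;	/* set once the last arriver has finished */
	int work_runs;		/* how many times the work was executed */
};

/* Each thread calls this; only the last one to arrive does the work. */
static void *rendezvous_fn(void *arg)
{
	struct rendezvous *r = arg;

	if (atomic_fetch_sub(&r->remaining, 1) == 1) {
		/* Last arriver: all other threads are spinning below. */
		r->work_runs++;
		atomic_store(&r->done, 1);
	} else {
		while (!atomic_load(&r->done))
			;	/* kernel: cpu_relax(), watchdog/RCU hooks */
	}
	return NULL;
}

/* Run the rendezvous on nthreads threads; return how often the work ran. */
static int run_rendezvous(int nthreads)
{
	pthread_t tid[MAX_THREADS];
	struct rendezvous r;
	int i;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	atomic_init(&r.remaining, nthreads);
	atomic_init(&r.done, 0);
	r.work_runs = 0;
	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tid[i], NULL, rendezvous_fn, &r))
			abort();
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	return r.work_runs;
}
```

However many threads take part, run_rendezvous() should report that the
work ran exactly once, and no thread returns before it has run.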

Oh, and one bug:  You must have interrupts disabled when you call
rcu_momentary_eqs().  Please fix this.

							Thanx, Paul
