Message-ID: <25ca73c9-e4ba-4a95-82c8-0d6cf8d0ff78@redhat.com>
Date: Wed, 4 Sep 2024 12:29:47 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: "Kalra, Ashish" <ashish.kalra@....com>,
 Sean Christopherson <seanjc@...gle.com>
Cc: dave.hansen@...ux.intel.com, tglx@...utronix.de, mingo@...hat.com,
 bp@...en8.de, x86@...nel.org, hpa@...or.com, peterz@...radead.org,
 linux-kernel@...r.kernel.org, kvm@...r.kernel.org, thomas.lendacky@....com,
 michael.roth@....com, kexec@...ts.infradead.org, linux-coco@...ts.linux.dev
Subject: Re: [PATCH v2] x86/sev: Fix host kdump support for SNP

On 9/4/24 00:58, Kalra, Ashish wrote:
> The issue here is that the panic path will ensure that all (other)
> CPUs have been shut down via NMI by checking that they have executed
> the NMI shutdown callback.
> 
> But this synchronization is specifically required for the SNP case:
> we don't want to execute the SNP_DECOMMISSION command (which destroys
> the SNP guest context) while one or more CPUs are still in the NMI
> VMEXIT path, still saving vCPU state (and possibly still modifying
> the SNP guest context?). Therefore, we ensure that all the CPUs have
> saved their vCPU state and entered NMI context before issuing
> SNP_DECOMMISSION. The point is that this is an SNP-specific
> requirement (hence the specific handling in sev_emergency_disable()),
> and I don't know how we would be able to enforce it in the generic
> panic path.
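
So the constraint is basically an ordering one: SNP_DECOMMISSION must
not run until every other CPU has finished saving vCPU state in its
NMI/VMEXIT path. Roughly like this (hypothetical names, just to
illustrate the ordering, not actual code):

	/* on each remote CPU, from the NMI shutdown callback */
	save_vcpu_state();
	atomic_dec(&cpus_in_vmexit);

	/* on the CPU driving the panic */
	while (atomic_read(&cpus_in_vmexit) > 0)
		cpu_relax();
	snp_decommission_all();		/* strictly after the wait */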

I think a simple way to do this is to _first_ kick the other CPUs
out through NMI, and only then disable virtualization on the CPU
that is executing emergency_reboot_disable_virtualization().  This
also makes emergency_reboot_disable_virtualization() and
native_machine_crash_shutdown() more similar, in that the latter
already stops the other CPUs before disabling virtualization on
the one that orchestrates the shutdown.

Something like this (incomplete: it also has to add the bool argument
to the cpu_emergency_virt_cb type and to the actual callbacks):

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 340af8155658..3df25fbe969d 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -111,7 +111,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 
 	crash_smp_send_stop();
 
-	cpu_emergency_disable_virtualization();
+	cpu_emergency_disable_virtualization(true);
 
 	/*
 	 * Disable Intel PT to stop its logging
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 0e0a4cf6b5eb..7a86ec786987 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -558,7 +558,7 @@ EXPORT_SYMBOL_GPL(cpu_emergency_unregister_virt_callback);
  * reboot.  VMX blocks INIT if the CPU is post-VMXON, and SVM blocks INIT if
  * GIF=0, i.e. if the crash occurred between CLGI and STGI.
  */
-void cpu_emergency_disable_virtualization(void)
+void cpu_emergency_disable_virtualization(bool last)
 {
 	cpu_emergency_virt_cb *callback;
 
@@ -572,7 +572,7 @@ void cpu_emergency_disable_virtualization(void)
 	rcu_read_lock();
 	callback = rcu_dereference(cpu_emergency_virt_callback);
 	if (callback)
-		callback();
+		callback(last);
 	rcu_read_unlock();
 }
 
@@ -591,11 +591,11 @@ static void emergency_reboot_disable_virtualization(void)
 	 * other CPUs may have virtualization enabled.
 	 */
 	if (rcu_access_pointer(cpu_emergency_virt_callback)) {
-		/* Safely force _this_ CPU out of VMX/SVM operation. */
-		cpu_emergency_disable_virtualization();
-
 		/* Disable VMX/SVM and halt on other CPUs. */
 		nmi_shootdown_cpus_on_restart();
+
+		/* Safely force _this_ CPU out of VMX/SVM operation. */
+		cpu_emergency_disable_virtualization(true);
 	}
 }
 #else
@@ -877,7 +877,7 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
 	 * Prepare the CPU for reboot _after_ invoking the callback so that the
 	 * callback can safely use virtualization instructions, e.g. VMCLEAR.
 	 */
-	cpu_emergency_disable_virtualization();
+	cpu_emergency_disable_virtualization(false);
 
 	atomic_dec(&waiting_for_crash_ipi);
 
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 18266cc3d98c..9a863348d1a7 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -124,7 +124,7 @@ static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
 	if (raw_smp_processor_id() == atomic_read(&stopping_cpu))
 		return NMI_HANDLED;
 
-	cpu_emergency_disable_virtualization();
+	cpu_emergency_disable_virtualization(false);
 	stop_this_cpu(NULL);
 
 	return NMI_HANDLED;
@@ -136,7 +136,7 @@ static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
 DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
 {
 	apic_eoi();
-	cpu_emergency_disable_virtualization();
+	cpu_emergency_disable_virtualization(false);
 	stop_this_cpu(NULL);
 }
 

And then a second patch adds sev_emergency_disable() and makes it
issue SNP_DECOMMISSION only when last == true.
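
For illustration only, that callback might look roughly like this
(a minimal sketch assuming the bool argument above; the helper name
is hypothetical, this is not the actual patch):

static void sev_emergency_disable(bool last)
{
	/*
	 * Only the last CPU, i.e. the one that ran the NMI shootdown,
	 * may destroy the SNP guest contexts: by this point every other
	 * CPU has saved its vCPU state and halted.
	 */
	if (!last)
		return;

	snp_decommission_all();		/* hypothetical helper */
}

registered through the existing cpu_emergency_register_virt_callback()
mechanism.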

Paolo

