Message-ID: <20250807165950.14953-1-kim.phillips@amd.com>
Date: Thu, 7 Aug 2025 11:59:49 -0500
From: Kim Phillips <kim.phillips@....com>
To: <linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
	<linux-coco@...ts.linux.dev>, <x86@...nel.org>
CC: Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>, Dave Hansen
	<dave.hansen@...ux.intel.com>, Sean Christopherson <seanjc@...gle.com>,
	"Paolo Bonzini" <pbonzini@...hat.com>, Ingo Molnar <mingo@...hat.com>, "H.
 Peter Anvin" <hpa@...or.com>, Thomas Gleixner <tglx@...utronix.de>, K Prateek
 Nayak <kprateek.nayak@....com>, "Nikunj A . Dadhania" <nikunj@....com>, "Tom
 Lendacky" <thomas.lendacky@....com>, Michael Roth <michael.roth@....com>,
	Ashish Kalra <ashish.kalra@....com>, Borislav Petkov
	<borislav.petkov@....com>, Borislav Petkov <bp@...en8.de>, Nathan Fontenot
	<nathan.fontenot@....com>, Dhaval Giani <Dhaval.Giani@....com>, "Santosh
 Shukla" <santosh.shukla@....com>, Naveen Rao <naveen.rao@....com>, "Gautham R
 . Shenoy" <gautham.shenoy@....com>, Ananth Narayan <ananth.narayan@....com>,
	Pankaj Gupta <pankaj.gupta@....com>, David Kaplan <david.kaplan@....com>,
	"Jon Grimm" <Jon.Grimm@....com>, Kim Phillips <kim.phillips@....com>
Subject: [RFC PATCH 0/1] KVM: SEV: Add support for SMT Protection

On an SMT-enabled system, the SMT Protection feature allows an
SNP guest to demand that its hardware vCPU thread run alone on
the physical core.  A guest opts in to protect itself against
possible side-channel attacks through shared core resources.
Hardware supports this by requiring the sibling of the vCPU thread
to be in the idle state while the vCPU is running: if hardware detects
that the sibling has not entered the idle state, or has exited it, then
the vCPU's VMRUN exits with a new "IDLE_REQUIRED" status, and the
hypervisor should schedule the idle process on the sibling thread
simultaneously with resuming the vCPU's VMRUN.

There is a new HLT_WAKEUP_ICR MSR that the hypervisor programs
for each SMT thread in the system, such that if an idle sibling of
an SMT Protected guest vCPU receives an interrupt, hardware writes
the HLT_WAKEUP_ICR value to the APIC ICR to 'kick' the vCPU
thread out of its VMRUN state.  Hardware then allows the sibling
to exit the idle state and service its interrupt.

The feature is supported on EPYC Zen 4 and later CPUs.

For more information, see "15.36.17 Side-Channel Protection",
"SMT Protection", in:

"AMD64 Architecture Programmer's Manual Volume 2: System Programming Part 2,
Pub. 24593 Rev. 3.42 - March 2024"

available here:

https://bugzilla.kernel.org/attachment.cgi?id=306250

See the end of this message for the qemu hack that calls the
Linux Core Scheduler prctl syscall to create a unique per-vCPU
cookie to ensure the vCPU process will not be scheduled if
there is anything else running on the sibling thread of the
core.

As it turns out, this approach is less than efficient because
existing Core Scheduling semantics only prevent other userspace
processes from running on the sibling thread that hardware requires
to be in the idle state.

Because of this, the vCPU's VMRUN frequently exits with
"IDLE_REQUIRED" whenever the scheduler runs its "OS noise" (softirq
work, etc.) on the sibling CPU, instead of the hardware idle state
being enforced there for the duration of the VMRUN.
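
When experimenting with this, it helps to know which host CPU is the
sibling that hardware expects to be idle for a given pinned vCPU.  The
sketch below looks it up from the sysfs topology file
thread_siblings_list.  The helper names parse_cpu_list() and
smt_sibling_of() are hypothetical, and the code assumes at most two
threads per core are of interest (it returns the first listed CPU that
is not the one asked about).

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a sysfs CPU list such as "3,67" or "0-1" into out[].
 * Returns the number of CPUs stored (at most max), or -1 if malformed.
 */
static int parse_cpu_list(const char *s, int *out, int max)
{
    int n = 0;

    assert(s != NULL && out != NULL);

    while (*s != '\0' && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        long hi = lo;

        if (end == s)
            return -1;                  /* no digits where expected */
        if (*end == '-') {              /* range "lo-hi" */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (lo < 0 || hi < lo || hi > INT_MAX)
            return -1;
        for (long c = lo; c <= hi && n < max; c++)
            out[n++] = (int)c;

        if (*end == ',')
            s = end + 1;                /* another entry follows */
        else if (*end == '\0' || *end == '\n')
            s = end;                    /* end of list */
        else
            return -1;                  /* unexpected character */
    }
    return n;
}

/*
 * Return the SMT sibling of `cpu`, or -1 if it has none or sysfs
 * cannot be read (for example, on a non-Linux or SMT-off host).
 */
static int smt_sibling_of(int cpu)
{
    char path[96], buf[256];
    int cpus[16];
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
             cpu);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    int n = parse_cpu_list(buf, cpus, 16);
    for (int i = 0; i < n; i++)
        if (cpus[i] != cpu)
            return cpus[i];
    return -1;
}
```

With the sibling known, per-CPU softirq counts (e.g. from /proc/softirqs)
can be watched on that CPU while the guest runs, to correlate host "OS
noise" with IDLE_REQUIRED exits.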

Mild testing yields eventual CPU stalls in the guest (minutes after
boot):

[    C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[    C0] rcu: 	1-...!: (0 ticks this GP) idle=8d58/0/0x0 softirq=12830/12830 fqs=0 (false positive?)
[    C0] rcu: 	(detected by 0, t=16253 jiffies, g=12377, q=12 ncpus=2)
[    C0] rcu: rcu_preempt kthread timer wakeup didn't happen for 16252 jiffies! g12377 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[    C0] rcu: 	Possible timer handling issue on cpu=1 timer-softirq=15006
[    C0] rcu: rcu_preempt kthread starved for 16253 jiffies! g12377 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[    C0] rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.

...with the occasional "NOHZ tick-stop error: local softirq work is
pending, handler #200!!!" on the host.

However, this RFC represents only one of three approaches attempted:

 - Another brute-force approach simply called remove_cpu() on the sibling
   before, and add_cpu() after __svm_sev_es_vcpu_run() in
   svm_vcpu_enter_exit().  The effort was quickly abandoned since
   it led to insurmountable lock contention issues:
   BUG: scheduling while atomic: qemu-system-x86/6743/0x00000002
    4 locks held by qemu-system-x86/6743:
    #0: ff160079b2dd80b8 (&vcpu->mutex){....}-{3:3}, at: kvm_vcpu_ioctl+0x94/0xa40 [kvm]
    #1: ffffffffba3c5410 (device_hotplug_lock){....}-{3:3}, at: lock_device_hotplug+0x1b/0x30
    #2: ff16009838ff5398 (&dev->mutex){....}-{3:3}, at: device_offline+0x9c/0x120
    #3: ffffffffb9e7e6b0 (cpu_add_remove_lock){....}-{3:3}, at: cpu_device_down+0x24/0x50

 - The third approach attempted to forward port vCPU Core Scheduling
   from the original 4.18 based work by Peter Z.:

   https://github.com/pdxChen/gang/commits/sched_1.23-base

   K. Prateek Nayak provided enough guidance to get me past host lockups
   from "kvm,sched: Track VCPU threads", but the following "sched: Add VCPU
   aware SMT scheduling" commit proved insurmountable to forward-port
   given the complex changes to scheduler internals since then.

Comments welcome:

- Are any of these three approaches even close to an
  upstream-acceptable solution to support SMT Protection?

- Given the feature's strict sibling idle state constraints,
  should SMT Protection even be supported at all?

This RFC applies to kvm-x86/next kvm-x86-next-2025.07.21 (33f843444e28).

Qemu hack:

From 0278a4078933d9bce16a8e80f415466b44244a59 Mon Sep 17 00:00:00 2001
From: Kim Phillips <kim.phillips@....com>
Date: Wed, 2 Apr 2025 16:02:50 -0500
Subject: [RFC PATCH] system/cpus: Affine and Core-Schedule vCPUs onto pCPUs

DO NOT MERGE.

Hack to experiment with supporting the SEV-SNP "SMT Protection" feature.  It:

 1. Affines vCPUs to individual core pCPUs (as cpu_index increments
    over single-core threads 1, 2, etc.),

 2. Calls the Linux Core Scheduler prctl syscall to create a per-vCPU
    unique cookie to ensure the vCPU process will not be scheduled
    if there is anything else on the sibling thread of the pCPU core.

Note: It contains POSIX-specific code that really belongs in
util/qemu-thread-posix.c, and other hackery.

Signed-off-by: Kim Phillips <kim.phillips@....com>
---
 accel/kvm/kvm-accel-ops.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index c239dfc87a..4b853d3024 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -26,9 +26,12 @@
 #include <linux/kvm.h>
 #include "kvm-cpus.h"

+#include <sys/prctl.h> /* PR_SCHED_CORE_CREATE */
+
 static void *kvm_vcpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
+    cpu_set_t cpuset;
     int r;

     rcu_register_thread();
@@ -38,6 +41,16 @@ static void *kvm_vcpu_thread_fn(void *arg)
     cpu->thread_id = qemu_get_thread_id();
     current_cpu = cpu;

+    CPU_ZERO(&cpuset);
+    CPU_SET(cpu->cpu_index, &cpuset);
+    pthread_setaffinity_np(cpu->thread->thread, sizeof(cpu_set_t), &cpuset);
+
+    r = prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, 0, 0, 0);
+    if (r) {
+        printf("%s %d: CORE CREATE ret %d \r\n", __func__, __LINE__, r);
+        exit(1);
+    }
+
     r = kvm_init_vcpu(cpu, &error_fatal);
     kvm_init_cpu_signals(cpu);

--
2.43.0

Kim Phillips (1):
  KVM: SEV: Add support for SMT Protection

 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/msr-index.h   |  1 +
 arch/x86/include/asm/svm.h         |  1 +
 arch/x86/include/uapi/asm/svm.h    |  1 +
 arch/x86/kvm/svm/sev.c             | 17 +++++++++++++++++
 arch/x86/kvm/svm/svm.c             |  3 +++
 6 files changed, 24 insertions(+)

base-commit: 33f843444e28920d6e624c6c24637b4bb5d3c8de
--
2.43.0

