Message-ID: <20231206163550.1454453-1-vkuznets@redhat.com>
Date:   Wed,  6 Dec 2023 17:35:50 +0100
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Tom Lendacky <thomas.lendacky@....com>,
        Michael Roth <michael.roth@....com>,
        Brijesh Singh <brijesh.singh@....com>,
        Alexander Graf <graf@...zon.de>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        Joerg Roedel <jroedel@...e.de>,
        Dionna Glaze <dionnaglaze@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>
Subject: [PATCH RFC] x86/sev: Temporary disable CPU re-onlining for SEV-SNP

It was discovered that an attempt to re-online a CPU in a SEV-SNP enabled
instance in AWS leads to an immediate reboot upon the SVM_VMGEXIT_AP_CREATE
VMGEXIT. Since support for SEV-SNP in KVM is not yet upstream, it is
unclear whether the problem is guest related or whether the hypervisor is
not handling the case correctly. Note that Linux currently doesn't issue
SVM_VMGEXIT_AP_DESTROY upon CPU offlining, and it is not entirely clear
from the specification whether this is mandatory or merely nice to have.
When it is issued prior to SVM_VMGEXIT_AP_CREATE on AWS, the guest reboot
is no longer observed. Unfortunately, the CPU still fails to come up
("CPU1 failed to report alive state").

Note that the SEV-SNP feature on Hyper-V uses a different CPU wakeup
path (see hv_snp_boot_ap() in arch/x86/hyperv/ivm.c), which relies on a
hypercall. That path does not seem to have any issues with CPU re-onlining,
at least on publicly available Azure instances.

Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
---
 RFC: I'm using this silly patch (which does make the problem a bit less
 severe) to ask whether there are plans to make this work, either on the
 host or on the guest side. Thanks!
---
 arch/x86/kernel/sev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 70472eebe719..f7e56cae05c5 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1005,6 +1005,10 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
 
 	cur_vmsa = per_cpu(sev_vmsa, cpu);
 
+	/* Re-onlining CPUs is currently unsupported */
+	if (cur_vmsa)
+		return -EOPNOTSUPP;
+
 	/*
 	 * A new VMSA is created each time because there is no guarantee that
 	 * the current VMSA is the kernels or that the vCPU is not running. If
-- 
2.43.0
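
For readers who want to see the effect of the added check in isolation,
below is a minimal userspace sketch of the guard in
wakeup_cpu_via_vmgexit(). It is not kernel code: sev_vmsa_model[],
vmsa_pages[], NR_CPUS_MODEL and wakeup_cpu_model() are hypothetical
stand-ins, and the sketch assumes that the per-CPU sev_vmsa pointer is
recorded on the first successful wakeup and is not cleared when the CPU
goes offline.

```c
#include <errno.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for the model */

/* Stand-in for the per-CPU sev_vmsa pointer: NULL until the first wakeup. */
static void *sev_vmsa_model[NR_CPUS_MODEL];

/* Dummy storage representing one VMSA page per CPU. */
static char vmsa_pages[NR_CPUS_MODEL];

/*
 * Hypothetical model of the guard: the first wakeup of a CPU records a
 * VMSA and succeeds; any later wakeup of the same CPU finds the recorded
 * VMSA and is refused with -EOPNOTSUPP.
 */
static int wakeup_cpu_model(unsigned int cpu)
{
	if (cpu >= NR_CPUS_MODEL)
		return -EINVAL;

	/* Re-onlining CPUs is currently unsupported */
	if (sev_vmsa_model[cpu])
		return -EOPNOTSUPP;

	/* Record the VMSA, as is assumed to happen on a successful wakeup. */
	sev_vmsa_model[cpu] = &vmsa_pages[cpu];
	return 0;
}
```

In this model, calling wakeup_cpu_model(1) twice returns 0 the first time
and -EOPNOTSUPP the second time, while other CPUs can still be brought up
once each. The re-online attempt thus fails with an error instead of
reaching SVM_VMGEXIT_AP_CREATE.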
