linux-kernel - Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241112014644.3p2a6te3sbh5x55c@jpoimboe>
Date: Mon, 11 Nov 2024 17:46:44 -0800
From: Josh Poimboeuf <jpoimboe@...nel.org>
To: Andrew Cooper <andrew.cooper3@...rix.com>
Cc: Amit Shah <amit@...nel.org>, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, x86@...nel.org, linux-doc@...r.kernel.org,
	amit.shah@....com, thomas.lendacky@....com, bp@...en8.de,
	tglx@...utronix.de, peterz@...radead.org,
	pawan.kumar.gupta@...ux.intel.com, corbet@....net, mingo@...hat.com,
	dave.hansen@...ux.intel.com, hpa@...or.com, seanjc@...gle.com,
	pbonzini@...hat.com, daniel.sneddon@...ux.intel.com,
	kai.huang@...el.com, sandipan.das@....com,
	boris.ostrovsky@...cle.com, Babu.Moger@....com,
	david.kaplan@....com, dwmw@...zon.co.uk
Subject: Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for
 AMD

On Tue, Nov 12, 2024 at 12:29:28AM +0000, Andrew Cooper wrote:
> This is my take.  On AMD CPUs, there are two unrelated issues to take
> into account:
> 
> 1) SRSO
> 
> Affects anything which doesn't enumerate SRSO_NO, which is all parts to
> date including Zen5.
> 
> SRSO ends up overflowing the RAS with arbitrary BTB targets, such that a
> subsequent genuine RET follows a prediction which never came from a real
> CALL instruction.
> 
> Mitigations for SRSO are either safe-ret, or IBPB-on-entry.  Parts
> without IBPB_RET using IBPB-on-entry need to manually flush the RAS.
> 
> Importantly, SMEP does not protection you against SRSO across the
> user->kernel boundary, because the bad RAS entries are arbitrary.  New
> in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
> more.  So on Zen5, you can in principle get away without a RAS flush on
> entry.

Updated to mention SRSO:

	/*
	 * In general there are two types of RSB attacks:
	 *
	 * 1) RSB underflow ("Intel Retbleed")
	 *
	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
	 *    speculated return targets may come from the branch predictor,
	 *    which could have a user-poisoned BTB or BHB entry.
	 *
	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
	 *    mitigated by the IBRS branch prediction isolation properties, so
	 *    the RSB buffer filling wouldn't be necessary to protect against
	 *    this type of attack.
	 *
	 *    The "user -> user" attack is mitigated by RSB filling on context
	 *    switch.
	 *
	 *    The "guest -> host" attack is mitigated by IBRS or eIBRS.
	 *
	 * 2) Poisoned RSB entry
	 *
	 *    If the 'next' in-kernel return stack is shorter than 'prev',
	 *    'next' could be tricked into speculating with a user-poisoned RSB
	 *    entry.  Poisoned RSB entries can also be created by Branch Type
	 *    Confusion ("AMD retbleed") or SRSO.
	 *
	 *    The "user -> kernel" attack is mitigated by SMEP and eIBRS.  AMD
	 *    without SRSO_NO also needs the SRSO mitigation.
	 *
	 *    The "user -> user" attack, also known as SpectreBHB, requires RSB
	 *    clearing.
	 *
	 *    The "guest -> host" attack is mitigated by either eIBRS (not
	 *    IBRS!) or RSB clearing on vmexit.  Note that eIBRS
	 *    implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
	 *    clearing which retires a single CALL before the first RET.
	 */


---8<---

From: Josh Poimboeuf <jpoimboe@...nel.org>
Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks

The long comment above the setting of X86_FEATURE_RSB_CTXSW is a bit
confusing.  It starts out being about context switching specifically,
but then goes on to describe "user -> kernel" mitigations, which aren't
necessarily limited to context switches.

Clarify that it's about *all* RSB attacks and their mitigations.

For consistency, add the "guest -> host" mitigations as well.  Then the
comment above spectre_v2_determine_rsb_fill_type_at_vmexit() can be
removed and the overall line count is reduced.

Signed-off-by: Josh Poimboeuf <jpoimboe@...nel.org>
---
 arch/x86/kernel/cpu/bugs.c | 60 +++++++++++++-------------------------
 1 file changed, 20 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..3dd1e504d706 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1581,26 +1581,6 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
 
 static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
 {
-	/*
-	 * Similar to context switches, there are two types of RSB attacks
-	 * after VM exit:
-	 *
-	 * 1) RSB underflow
-	 *
-	 * 2) Poisoned RSB entry
-	 *
-	 * When retpoline is enabled, both are mitigated by filling/clearing
-	 * the RSB.
-	 *
-	 * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
-	 * prediction isolation protections, RSB still needs to be cleared
-	 * because of #2.  Note that SMEP provides no protection here, unlike
-	 * user-space-poisoned RSB entries.
-	 *
-	 * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
-	 * bug is present then a LITE version of RSB protection is required,
-	 * just a single call needs to retire before a RET is executed.
-	 */
 	switch (mode) {
 	case SPECTRE_V2_NONE:
 		return;
@@ -1818,43 +1798,43 @@ static void __init spectre_v2_select_mitigation(void)
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
 	/*
-	 * If Spectre v2 protection has been enabled, fill the RSB during a
-	 * context switch.  In general there are two types of RSB attacks
-	 * across context switches, for which the CALLs/RETs may be unbalanced.
+	 * In general there are two types of RSB attacks:
 	 *
-	 * 1) RSB underflow
+	 * 1) RSB underflow ("Intel Retbleed")
 	 *
 	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
 	 *    speculated return targets may come from the branch predictor,
 	 *    which could have a user-poisoned BTB or BHB entry.
 	 *
-	 *    AMD has it even worse: *all* returns are speculated from the BTB,
-	 *    regardless of the state of the RSB.
+	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
+	 *    mitigated by the IBRS branch prediction isolation properties, so
+	 *    the RSB buffer filling wouldn't be necessary to protect against
+	 *    this type of attack.
 	 *
-	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
-	 *    scenario is mitigated by the IBRS branch prediction isolation
-	 *    properties, so the RSB buffer filling wouldn't be necessary to
-	 *    protect against this type of attack.
+	 *    The "user -> user" attack is mitigated by RSB filling on context
+	 *    switch.
 	 *
-	 *    The "user -> user" attack scenario is mitigated by RSB filling.
+	 *    The "guest -> host" attack is mitigated by IBRS or eIBRS.
 	 *
 	 * 2) Poisoned RSB entry
 	 *
 	 *    If the 'next' in-kernel return stack is shorter than 'prev',
 	 *    'next' could be tricked into speculating with a user-poisoned RSB
-	 *    entry.
+	 *    entry.  Poisoned RSB entries can also be created by Branch Type
+	 *    Confusion ("AMD retbleed") or SRSO.
 	 *
-	 *    The "user -> kernel" attack scenario is mitigated by SMEP and
-	 *    eIBRS.
+	 *    The "user -> kernel" attack is mitigated by SMEP and eIBRS.  AMD
+	 *    without SRSO_NO also needs the SRSO mitigation.
 	 *
-	 *    The "user -> user" scenario, also known as SpectreBHB, requires
-	 *    RSB clearing.
+	 *    The "user -> user" attack, also known as SpectreBHB, requires RSB
+	 *    clearing.
 	 *
-	 * So to mitigate all cases, unconditionally fill RSB on context
-	 * switches.
-	 *
-	 * FIXME: Is this pointless for retbleed-affected AMD?
+	 *    The "guest -> host" attack is mitigated by either eIBRS (not
+	 *    IBRS!) or RSB clearing on vmexit.  Note that eIBRS
+	 *    implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
+	 *    clearing which retires a single CALL before the first RET.
 	 */
+
 	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
 
-- 
2.47.0