[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250304120449.GHZ8bsYYyEBOKQIxBm@fat_crate.local>
Date: Tue, 4 Mar 2025 13:04:49 +0100
From: Borislav Petkov <bp@...en8.de>
To: Rik van Riel <riel@...riel.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, peterz@...radead.org,
dave.hansen@...ux.intel.com, zhengqi.arch@...edance.com,
nadav.amit@...il.com, thomas.lendacky@....com, kernel-team@...a.com,
linux-mm@...ck.org, akpm@...ux-foundation.org, jackmanb@...gle.com,
jannh@...gle.com, mhklinux@...look.com, andrew.cooper3@...rix.com,
Manali.Shukla@....com, mingo@...nel.org
Subject: [PATCH] x86/mm: Always set the ASID valid bit for the INVLPGB
instruction
On Tue, Feb 25, 2025 at 10:00:35PM -0500, Rik van Riel wrote:
> Add support for broadcast TLB invalidation using AMD's INVLPGB instruction.
One more patch ontop from Tom.
Now lemme test this a bit again...
From: Tom Lendacky <thomas.lendacky@....com>
Date: Tue, 4 Mar 2025 12:59:56 +0100
Subject: [PATCH] x86/mm: Always set the ASID valid bit for the INVLPGB instruction
When executing the INVLPGB instruction on a bare-metal host or hypervisor, if
the ASID valid bit is not set, the instruction will flush the TLB entries that
match the specified criteria for any ASID, not just the those of the host. If
virtual machines are running on the system, this may result in inadvertent
flushes of guest TLB entries.
When executing the INVLPGB instruction in a guest and the INVLPGB instruction is
not intercepted by the hypervisor, the hardware will replace the requested ASID
with the guest ASID and set the ASID valid bit before doing the broadcast
invalidation. Thus a guest is only able to flush its own TLB entries.
So to limit the host TLB flushing reach, always set the ASID valid bit using an
ASID value of 0 (which represents the host/hypervisor). This will will result in
the desired effect in both host and guest.
Signed-off-by: Tom Lendacky <thomas.lendacky@....com>
Signed-off-by: Borislav Petkov (AMD) <bp@...en8.de>
---
arch/x86/include/asm/tlb.h | 58 +++++++++++++++++++++-----------------
1 file changed, 32 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index e8561a846754..56fe331fb797 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -33,6 +33,27 @@ enum addr_stride {
PMD_STRIDE = 1
};
+/*
+ * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
+ * of the three. For example:
+ * - FLAG_VA | FLAG_INCLUDE_GLOBAL: invalidate all TLB entries at the address
+ * - FLAG_PCID: invalidate all TLB entries matching the PCID
+ *
+ * The first is used to invalidate (kernel) mappings at a particular
+ * address across all processes.
+ *
+ * The latter invalidates all TLB entries matching a PCID.
+ */
+#define INVLPGB_FLAG_VA BIT(0)
+#define INVLPGB_FLAG_PCID BIT(1)
+#define INVLPGB_FLAG_ASID BIT(2)
+#define INVLPGB_FLAG_INCLUDE_GLOBAL BIT(3)
+#define INVLPGB_FLAG_FINAL_ONLY BIT(4)
+#define INVLPGB_FLAG_INCLUDE_NESTED BIT(5)
+
+/* The implied mode when all bits are clear: */
+#define INVLPGB_MODE_ALL_NONGLOBALS 0UL
+
#ifdef CONFIG_BROADCAST_TLB_FLUSH
/*
* INVLPGB does broadcast TLB invalidation across all the CPUs in the system.
@@ -40,14 +61,20 @@ enum addr_stride {
* The INVLPGB instruction is weakly ordered, and a batch of invalidations can
* be done in a parallel fashion.
*
- * The instruction takes the number of extra pages to invalidate, beyond
- * the first page, while __invlpgb gets the more human readable number of
- * pages to invalidate.
+ * The instruction takes the number of extra pages to invalidate, beyond the
+ * first page, while __invlpgb gets the more human readable number of pages to
+ * invalidate.
*
* The bits in rax[0:2] determine respectively which components of the address
* (VA, PCID, ASID) get compared when flushing. If neither bits are set, *any*
* address in the specified range matches.
*
+ * Since it is desired to only flush TLB entries for the ASID that is executing
+ * the instruction (a host/hypervisor or a guest), the ASID valid bit should
+ * always be set. On a host/hypervisor, the hardware will use the ASID value
+ * specified in EDX[15:0] (which should be 0). On a guest, the hardware will
+ * use the actual ASID value of the guest.
+ *
* TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
* this CPU have completed.
*/
@@ -55,9 +82,9 @@ static inline void __invlpgb(unsigned long asid, unsigned long pcid,
unsigned long addr, u16 nr_pages,
enum addr_stride stride, u8 flags)
{
- u32 edx = (pcid << 16) | asid;
+ u64 rax = addr | flags | INVLPGB_FLAG_ASID;
u32 ecx = (stride << 31) | (nr_pages - 1);
- u64 rax = addr | flags;
+ u32 edx = (pcid << 16) | asid;
/* The low bits in rax are for flags. Verify addr is clean. */
VM_WARN_ON_ONCE(addr & ~PAGE_MASK);
@@ -87,27 +114,6 @@ static inline void __invlpgb(unsigned long asid, unsigned long pcid,
static inline void __tlbsync(void) { }
#endif
-/*
- * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
- * of the three. For example:
- * - FLAG_VA | FLAG_INCLUDE_GLOBAL: invalidate all TLB entries at the address
- * - FLAG_PCID: invalidate all TLB entries matching the PCID
- *
- * The first is used to invalidate (kernel) mappings at a particular
- * address across all processes.
- *
- * The latter invalidates all TLB entries matching a PCID.
- */
-#define INVLPGB_FLAG_VA BIT(0)
-#define INVLPGB_FLAG_PCID BIT(1)
-#define INVLPGB_FLAG_ASID BIT(2)
-#define INVLPGB_FLAG_INCLUDE_GLOBAL BIT(3)
-#define INVLPGB_FLAG_FINAL_ONLY BIT(4)
-#define INVLPGB_FLAG_INCLUDE_NESTED BIT(5)
-
-/* The implied mode when all bits are clear: */
-#define INVLPGB_MODE_ALL_NONGLOBALS 0UL
-
static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid,
unsigned long addr,
u16 nr, bool stride)
--
2.43.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists