linux-kernel - [PATCH 1/3] x86/membarrier: Get rid of a dubious optimization

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <9b8ecd9a017960c9e56ef3f3e733f32aa1194366.1606758530.git.luto@kernel.org>
Date:   Mon, 30 Nov 2020 09:50:33 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     x86@...nel.org, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Arnd Bergmann <arnd@...db.de>,
        Anton Blanchard <anton@...abs.org>,
        Andy Lutomirski <luto@...nel.org>
Subject: [PATCH 1/3] x86/membarrier: Get rid of a dubious optimization

sync_core_before_usermode() had an incorrect optimization.  If we're
in an IRQ, we can get to usermode without IRET -- we just have to
schedule to a different task in the same mm and do SYSRET.
Fortunately, there were no callers of sync_core_before_usermode()
that could have had in_irq() or in_nmi() equal to true, because it's
only ever called from the scheduler.

While we're at it, clarify a related comment.

Signed-off-by: Andy Lutomirski <luto@...nel.org>
---
 arch/x86/include/asm/sync_core.h | 9 +++++----
 arch/x86/mm/tlb.c                | 6 ++++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
index 0fd4a9dfb29c..ab7382f92aff 100644
--- a/arch/x86/include/asm/sync_core.h
+++ b/arch/x86/include/asm/sync_core.h
@@ -98,12 +98,13 @@ static inline void sync_core_before_usermode(void)
 	/* With PTI, we unconditionally serialize before running user code. */
 	if (static_cpu_has(X86_FEATURE_PTI))
 		return;
+
 	/*
-	 * Return from interrupt and NMI is done through iret, which is core
-	 * serializing.
+	 * Even if we're in an interrupt, we might reschedule before returning,
+	 * in which case we could switch to a different thread in the same mm
+	 * and return using SYSRET or SYSEXIT.  Instead of trying to keep
+	 * track of our need to sync the core, just sync right away.
 	 */
-	if (in_irq() || in_nmi())
-		return;
 	sync_core();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 11666ba19b62..dabe683ab076 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -474,8 +474,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	/*
 	 * The membarrier system call requires a full memory barrier and
 	 * core serialization before returning to user-space, after
-	 * storing to rq->curr. Writing to CR3 provides that full
-	 * memory barrier and core serializing instruction.
+	 * storing to rq->curr, when changing mm.  This is because membarrier()
+	 * sends IPIs to all CPUs that are in the target mm, but another
+	 * CPU switch to the target mm in the mean time.  Writing to CR3
+	 * provides that full memory barrier and core serializing instruction.
 	 */
 	if (real_prev == next) {
 		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
-- 
2.28.0