Message-ID: <20241113095557.2d60a073@imladris.surriel.com>
Date: Wed, 13 Nov 2024 09:55:57 -0500
From: Rik van Riel <riel@...riel.com>
To: Borislav Petkov <bp@...en8.de>
Cc: linux-kernel@...r.kernel.org, dave.hansen@...ux.intel.com,
 luto@...nel.org, peterz@...radead.org, tglx@...utronix.de,
 mingo@...hat.com, x86@...nel.org, kernel-team@...a.com, hpa@...or.com,
 bigeasy@...utronix.de
Subject: Re: [PATCH 0/3] x86,tlb: context switch optimizations

On Wed, 13 Nov 2024 10:55:50 +0100
Borislav Petkov <bp@...en8.de> wrote:

> On Fri, Nov 08, 2024 at 07:27:47PM -0500, Rik van Riel wrote:
> > While profiling switch_mm_irqs_off with several workloads,
> > it appears there are two hot spots that probably don't need
> > to be there.  
> 
> One of those three is causing the below here, zapping them from tip.
> 

TL;DR: __text_poke ends up sending IPIs with interrupts disabled.

> [    3.186469]  on_each_cpu_cond_mask+0x50/0x90
> [    3.186469]  flush_tlb_mm_range+0x1a8/0x1f0
> [    3.186469]  ? cpu_bugs_smt_update+0x14/0x1f0
> [    3.186469]  __text_poke+0x366/0x5d0

Here is an alternative that keeps __text_poke() from calling
on_each_cpu_cond_mask() with IRQs disabled:

---8<---
From e872edeaad14c793036f290afc28000281e1b76a Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel@...riel.com>
Date: Wed, 13 Nov 2024 09:51:16 -0500
Subject: [PATCH] x86/alternatives: defer poking_mm TLB flush to next use

Instead of doing a TLB flush of the poking_mm after we have
already switched back to the prev mm, we can simply increment
the tlb_gen for the poking_mm at unuse time.

This will cause switch_mm_irqs_off() to flush the TLB the next
time it loads the poking_mm, in the unlikely case that poking_mm
still has an ASID on that CPU by then.

Signed-off-by: Rik van Riel <riel@...riel.com>
---
 arch/x86/kernel/alternative.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index d17518ca19b8..f3caf5bc4df9 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1830,6 +1830,9 @@ static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
 	lockdep_assert_irqs_disabled();
 	switch_mm_irqs_off(NULL, prev_state.mm, current);
 
+	/* Force a TLB flush next time poking_mm is used. */
+	inc_mm_tlb_gen(poking_mm);
+
 	/*
 	 * Restore the breakpoints if they were disabled before the temporary mm
 	 * was loaded.
@@ -1940,14 +1943,6 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
 	 */
 	unuse_temporary_mm(prev);
 
-	/*
-	 * Flushing the TLB might involve IPIs, which would require enabled
-	 * IRQs, but not if the mm is not used, as it is in this point.
-	 */
-	flush_tlb_mm_range(poking_mm, poking_addr, poking_addr +
-			   (cross_page_boundary ? 2 : 1) * PAGE_SIZE,
-			   PAGE_SHIFT, false);
-
 	if (func == text_poke_memcpy) {
 		/*
 		 * If the text does not match what we just wrote then something is
-- 
2.45.2

