lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250626163457.08bbb2e1@gandalf.local.home>
Date: Thu, 26 Jun 2025 16:34:57 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 bpf@...r.kernel.org, x86@...nel.org
Cc: Masami Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, Josh Poimboeuf <jpoimboe@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, Jiri
 Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Thomas
 Gleixner <tglx@...utronix.de>, Andrii Nakryiko <andrii@...nel.org>, Indu
 Bhagat <indu.bhagat@...cle.com>, "Jose E. Marchesi" <jemarch@....org>, Beau
 Belgrave <beaub@...ux.microsoft.com>, Jens Remus <jremus@...ux.ibm.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>, Andrew Morton
 <akpm@...ux-foundation.org>, Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH v11 06/14] unwind_user/deferred: Add deferred unwinding
 interface

On Thu, 26 Jun 2025 12:48:55 -0400
Steven Rostedt <rostedt@...dmis.org> wrote:

> >  static __always_inline void unwind_reset_info(void)
> >  {
> > -	if (unlikely(current->unwind_info.cache))
> > +	/* Exit out early if this was never used */
> > +	if (likely(!current->unwind_info.timestamp))
> > +		return;  
> 
> I found that this breaks the use of perf using the unwind_user_faultable()
> directly and not relying on the deferred infrastructure (which it does when
> it traces a single task and also needs to remove the separate in_nmi()
> code). Because this still requires the nr_entries to be set to zero.
> 
> The clearing of the nr_entries has to be separate from the timestamp. To
> prevent unneeded writes after the cache is allocated, should we check the
> nr_entries is set before writing zero?
> 
> 	if (current->unwind_info.cache && current->unwind_info.cache->nr_entries)
>   		current->unwind_info.cache->nr_entries = 0;
> 
> ?

I just made this into:

 	if (current->unwind_info.cache)
   		current->unwind_info.cache->nr_entries = 0;

As later patches will add more here and I added a new patch that added a
USED bit to the info->unwind_mask that gets set whenever the stack trace is
used and this code needs to be executed. That makes it so that the
unwind_mask is the only condition that needs to be checked when it was
never used.

-- Steve

From: Steven Rostedt <rostedt@...dmis.org>
Subject: [PATCH] unwind: Add USED bit to only have one conditional on way back
 to user space

On the way back to user space, the function unwind_reset_info() is called
unconditionally (but always inlined). It currently has two conditionals.
One that checks the unwind_mask which is set whenever a deferred trace is
called and is used to know that the mask needs to be cleared. The other
checks if the cache has been allocated, and if so, it resets the
nr_entries so that the unwinder knows it needs to do the work to get a new
user space stack trace again (it only does it once per entering the
kernel).

Use one of the bits in the unwind mask as a "USED" bit that gets set
whenever a trace is created. This will make it possible to only check the
unwind_mask in the unwind_reset_info() to know if it needs to do work or
not and eliminates a conditional that happens every time the task goes
back to user space.

Signed-off-by: Steven Rostedt (Google) <rostedt@...dmis.org>
---
 include/linux/unwind_deferred.h | 14 +++++++-------
 kernel/unwind/deferred.c        |  5 ++++-
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/linux/unwind_deferred.h b/include/linux/unwind_deferred.h
index e7bf133c5a20..4786acc0f087 100644
--- a/include/linux/unwind_deferred.h
+++ b/include/linux/unwind_deferred.h
@@ -21,6 +21,10 @@ struct unwind_work {
 #define UNWIND_PENDING_BIT	(BITS_PER_LONG - 1)
 #define UNWIND_PENDING		BIT(UNWIND_PENDING_BIT)
 
+/* Set if the unwinding was used (directly or deferred) */
+#define UNWIND_USED_BIT		(UNWIND_PENDING_BIT - 1)
+#define UNWIND_USED		BIT(UNWIND_USED_BIT)
+
 enum {
 	UNWIND_ALREADY_PENDING	= 1,
 	UNWIND_ALREADY_EXECUTED	= 2,
@@ -51,14 +55,10 @@ static __always_inline void unwind_reset_info(void)
 				return;
 		} while (!try_cmpxchg(&info->unwind_mask, &bits, 0UL));
 		local64_set(&current->unwind_info.timestamp, 0);
+
+		if (unlikely(info->cache))
+			info->cache->nr_entries = 0;
 	}
-	/*
-	 * As unwind_user_faultable() can be called directly and
-	 * depends on nr_entries being cleared on exit to user,
-	 * this needs to be a separate conditional.
-	 */
-	if (unlikely(info->cache))
-		info->cache->nr_entries = 0;
 }
 
 #else /* !CONFIG_UNWIND_USER */
diff --git a/kernel/unwind/deferred.c b/kernel/unwind/deferred.c
index c783d273a2dc..9ec1e74c6469 100644
--- a/kernel/unwind/deferred.c
+++ b/kernel/unwind/deferred.c
@@ -131,6 +131,9 @@ int unwind_user_faultable(struct unwind_stacktrace *trace)
 
 	cache->nr_entries = trace->nr;
 
+	/* Clear nr_entries on way back to user space */
+	set_bit(UNWIND_USED_BIT, &info->unwind_mask);
+
 	return 0;
 }
 
@@ -325,7 +328,7 @@ int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func)
 	guard(mutex)(&callback_mutex);
 
 	/* See if there's a bit in the mask available */
-	if (unwind_mask == ~(UNWIND_PENDING))
+	if (unwind_mask == ~(UNWIND_PENDING|UNWIND_USED))
 		return -EBUSY;
 
 	work->bit = ffz(unwind_mask);
-- 
2.47.2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ