Message-ID: <20250424192456.851953422@goodmis.org>
Date: Thu, 24 Apr 2025 15:24:56 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org
Cc: Masami Hiramatsu <mhiramat@...nel.org>,
 Mark Rutland <mark.rutland@....com>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Josh Poimboeuf <jpoimboe@...nel.org>,
 x86@...nel.org,
 Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...nel.org>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Indu Bhagat <indu.bhagat@...cle.com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>,
 Ian Rogers <irogers@...gle.com>,
 Adrian Hunter <adrian.hunter@...el.com>,
 linux-perf-users@...r.kernel.org,
 Mark Brown <broonie@...nel.org>,
 linux-toolchains@...r.kernel.org,
 Jordan Rome <jordalgo@...a.com>,
 Sam James <sam@...too.org>,
 Andrii Nakryiko <andrii.nakryiko@...il.com>,
 Jens Remus <jremus@...ux.ibm.com>,
 Florian Weimer <fweimer@...hat.com>,
 Andy Lutomirski <luto@...nel.org>,
 Weinan Liu <wnliu@...gle.com>,
 Blake Jones <blakejones@...gle.com>,
 Beau Belgrave <beaub@...ux.microsoft.com>,
 "Jose E. Marchesi" <jemarch@....org>,
 Alexander Aring <aahringo@...hat.com>
Subject: [PATCH v5 0/9] tracing: Deferred unwinding of user space stack traces


I'm currently working on getting sframe support into the kernel.
Josh Poimboeuf did a lot of the hard work already, but he told me he doesn't
have time to continue it, so I'm picking it up where he left off.

His last series (v4) is here:

  https://lore.kernel.org/all/cover.1737511963.git.jpoimboe@kernel.org/

It covers a lot of topics as he found issues with other aspects of
the kernel that needed to be fixed for sframes to work properly.

This series focuses on implementing the deferred unwinding for ftrace
(and LTTng could use it).

This implements the three API functions that Josh had in his series:

  unwind_deferred_init()
  unwind_deferred_request()
  unwind_deferred_cancel()

The difference is that it does not add the task_work to the tracer's
unwind_work structure. Instead, it uses a global bitmask where each
registered tracer gets a bit. That means it can have at most 32 tracers
registered at a time on a 32-bit system, and 64 tracers on a 64-bit
system. Ideally, there should not be more than 10, and even that is a lot.

This is also why perf does not use this method, as it would register
a callback for pretty much every event or task or CPU, and that goes
into the hundreds.

But for generic tracers that have a single entity tracing multiple
tasks, this works out well.

When a tracer registers with unwind_deferred_init(), an available bit
in the global mask is assigned to that tracer. If there are no more
bits available, -EBUSY is returned.
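
As a rough illustration of the bit assignment described above (this is a
user-space model, not code from this series), the sketch below claims the
first free bit of a word-sized mask and returns -EBUSY once every bit is
taken. The names reserved_mask, tracer_bit_alloc() and tracer_bit_free()
are hypothetical, and the real code needs locking that is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical user-space model; these names are not from the series. */
#define TRACER_BITS (sizeof(unsigned long) * CHAR_BIT)

/* One bit per registered tracer: 32 on 32-bit, 64 on 64-bit systems. */
static unsigned long reserved_mask;

/* Claim the lowest free bit; return its index, or -EBUSY if none is left. */
static int tracer_bit_alloc(void)
{
	for (unsigned int bit = 0; bit < TRACER_BITS; bit++) {
		if (!(reserved_mask & (1UL << bit))) {
			reserved_mask |= 1UL << bit;
			return (int)bit;
		}
	}
	return -EBUSY;
}

/* Release a bit so that a later registration can reuse it. */
static void tracer_bit_free(int bit)
{
	assert(bit >= 0 && (unsigned int)bit < TRACER_BITS);
	reserved_mask &= ~(1UL << bit);
}
```

In this model a tracer would call tracer_bit_alloc() once when it
registers and tracer_bit_free() when it unregisters.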

When a tracer requests a stacktrace, to be taken when the task exits
back to user space, it is given a cookie that is associated with that
stacktrace. The tracer can save that cookie into its buffer and use it
to attach the stacktrace when it comes back. Its bit is set in the task
structure's unwind_mask, and when the task returns to user space, it
iterates over all the registered tracers, and if a tracer's bit is set,
it calls that tracer's callback and clears the bit.
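
The request and callback flow above can be modeled in user space roughly
as follows. This is only a sketch under assumptions: struct model_task,
model_request(), model_return_to_user() and record_cb() are hypothetical
names, the cookie is assumed to be shared by all requests made before the
task returns to user space, and the NMI safety and locking of the real
code are left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TRACERS 4	/* kept small for illustration */

typedef void (*unwind_cb_t)(int bit, uint64_t cookie);

/* Hypothetical stand-in for the per-task state. */
struct model_task {
	unsigned long unwind_mask;	/* bits of tracers awaiting a trace */
	uint64_t cookie;		/* pending trace id, 0 means none */
};

static unwind_cb_t callbacks[MAX_TRACERS];
static uint64_t next_cookie = 1;
static uint64_t seen_cookie[MAX_TRACERS];

/* Example callback: remember which cookie each tracer was handed. */
static void record_cb(int bit, uint64_t cookie)
{
	seen_cookie[bit] = cookie;
}

/* A tracer requests a deferred trace and gets the cookie to store. */
static uint64_t model_request(struct model_task *t, int bit)
{
	assert(bit >= 0 && bit < MAX_TRACERS);
	if (!t->cookie)
		t->cookie = next_cookie++;
	t->unwind_mask |= 1UL << bit;
	return t->cookie;
}

/* On return to user space: call each requesting tracer, clear its bit. */
static void model_return_to_user(struct model_task *t)
{
	for (int bit = 0; bit < MAX_TRACERS; bit++) {
		if (!(t->unwind_mask & (1UL << bit)))
			continue;
		if (callbacks[bit])
			callbacks[bit](bit, t->cookie);
		t->unwind_mask &= ~(1UL << bit);
	}
	t->cookie = 0;	/* the next request starts a new trace */
}
```

The cookie is what lets post processing pair the event recorded at
request time with the stacktrace that is delivered later.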

The last patches implement the tracing subsystem to use this for
its global user space stack tracing per event (individual events are
not supported yet). It creates two new events, where one is to
record the cookie when the stack trace is requested, and the other is
for the user space stacktrace itself.

Since the callback is called in faultable context, it uses this opportunity
to look at the addresses in the stacktrace and convert them to where
they would be in the executable file (if found). It also records
the inode and device major/minor numbers into the trace, so that post
processing can find the exact location where the stacks are.
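
The address conversion can be pictured with the following user-space
sketch. It assumes the usual rule that the file offset of an address is
its distance from the start of the mapping plus the file offset at which
the mapping begins. struct model_mapping and addr_to_file_offset() are
hypothetical names, not the ones used in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one file-backed mapping of a task. */
struct model_mapping {
	uint64_t start;		/* first mapped address */
	uint64_t end;		/* one past the last mapped address */
	uint64_t file_offset;	/* offset in the file of 'start' */
};

/*
 * Convert a user space address to an offset within its backing file.
 * Returns false when no mapping covers the address (the "if found" case).
 */
static bool addr_to_file_offset(const struct model_mapping *maps, size_t n,
				uint64_t addr, uint64_t *off)
{
	assert(maps != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (addr >= maps[i].start && addr < maps[i].end) {
			*off = addr - maps[i].start + maps[i].file_offset;
			return true;
		}
	}
	return false;
}
```

For example, with a mapping that starts at 0x7f0000001000 and begins at
file offset 0x2000, the address 0x7f0000001234 converts to 0x2234.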

Josh Poimboeuf (3):
      unwind_user/deferred: Add deferred unwinding interface
      unwind_user/deferred: Make unwind deferral requests NMI-safe
      mm: Add guard for mmap_read_lock

Steven Rostedt (6):
      unwind deferred: Use bitmask to determine which callbacks to call
      tracing: Do not bother getting user space stacktraces for kernel threads
      tracing: Rename __dynamic_array() to __dynamic_field() for ftrace events
      tracing: Implement deferred user space stacktracing
      tracing: Have deferred user space stacktrace show file offsets
      tracing: Show inode and device major:minor in deferred user space stacktrace

----
 include/linux/entry-common.h          |   2 +-
 include/linux/mmap_lock.h             |   2 +
 include/linux/sched.h                 |   1 +
 include/linux/unwind_deferred.h       |  23 ++-
 include/linux/unwind_deferred_types.h |   4 +
 kernel/trace/trace.c                  | 138 +++++++++++++++++
 kernel/trace/trace.h                  |  14 +-
 kernel/trace/trace_entries.h          |  38 ++++-
 kernel/trace/trace_export.c           |  25 +++-
 kernel/trace/trace_output.c           |  99 ++++++++++++
 kernel/unwind/deferred.c              | 275 +++++++++++++++++++++++++++++++++-
 11 files changed, 610 insertions(+), 11 deletions(-)
