Message-ID: <20251007214008.080852573@kernel.org>
Date: Tue, 07 Oct 2025 17:40:08 -0400
From: Steven Rostedt <rostedt@...nel.org>
To: linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org,
 bpf@...r.kernel.org,
 x86@...nel.org
Cc: Masami Hiramatsu <mhiramat@...nel.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Josh Poimboeuf <jpoimboe@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...nel.org>,
 Jiri Olsa <jolsa@...nel.org>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>,
 Thomas Gleixner <tglx@...utronix.de>,
 Andrii Nakryiko <andrii@...nel.org>,
 Indu Bhagat <indu.bhagat@...cle.com>,
 "Jose E. Marchesi" <jemarch@....org>,
 Beau Belgrave <beaub@...ux.microsoft.com>,
 Jens Remus <jremus@...ux.ibm.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Florian Weimer <fweimer@...hat.com>,
 Sam James <sam@...too.org>,
 Kees Cook <kees@...nel.org>,
 "Carlos O'Donell" <codonell@...hat.com>
Subject: [PATCH v16 0/4] perf: Support the deferred unwinding infrastructure

This series is based on tip/perf/core commit 6d48436560e91be85, with the
following patches from Peter Zijlstra applied on top:

    https://lore.kernel.org/all/20250924075948.579302904@infradead.org/

This series implements the perf interface to use deferred user space stack
tracing.

The patches for the user space side should still work with this series:

  https://lore.kernel.org/linux-trace-kernel/20250908175319.841517121@kernel.org

Patch 1 updates the deferred unwinding infrastructure. It adds a new
function, unwind_deferred_task_init(), for use when a tracer (perf) only
needs to follow a single task. The descriptor returned can be used the same
way as the descriptor returned by unwind_deferred_init(), but the tracer
must only use it on one task at a time.

Patch 2 adds the per task deferred stack traces to perf. It adds a new event
type called PERF_RECORD_CALLCHAIN_DEFERRED that is recorded when a task is
about to go back to user space, at a point where pages may be faulted in. It
also adds a new callchain context called PERF_CONTEXT_USER_DEFERRED that is
used as a placeholder in a kernel callchain, marking where the deferred user
space stack trace is to be appended.

Patch 3 adds the user stack trace context cookie in the kernel callchain right
after the PERF_CONTEXT_USER_DEFERRED context so that the user space side can
map the request to the deferred user space stack trace.
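
To illustrate how a user space tool might use the cookie described in
patches 2 and 3, here is a minimal sketch of stitching a deferred user stack
trace into a kernel callchain. It is not the perf tool's implementation. The
names ex_stitch and struct ex_deferred are hypothetical, and the marker value
EX_CONTEXT_USER_DEFERRED is a placeholder; the real value is whatever the
series defines in include/uapi/linux/perf_event.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder marker value for illustration only (assumption). */
#define EX_CONTEXT_USER_DEFERRED ((uint64_t)-640)

/* A deferred user stack trace as a tool might keep it after parsing. */
struct ex_deferred {
	uint64_t cookie;	/* identifies the request it answers */
	const uint64_t *ips;	/* user space frames */
	size_t nr;		/* number of frames in ips[] */
};

/*
 * Copy kchain[] into out[]. Where the deferred marker is followed by a
 * cookie equal to def->cookie, write the deferred user frames in place of
 * the marker and cookie. A marker with a different cookie is copied
 * unchanged so it can be matched against a later record.
 *
 * Returns the number of entries written, or 0 if out[] is too small.
 */
size_t ex_stitch(const uint64_t *kchain, size_t knr,
		 const struct ex_deferred *def,
		 uint64_t *out, size_t out_max)
{
	size_t n = 0;

	assert(kchain && def && out);

	for (size_t i = 0; i < knr; i++) {
		int has_cookie = kchain[i] == EX_CONTEXT_USER_DEFERRED &&
				 i + 1 < knr;

		if (has_cookie && kchain[i + 1] == def->cookie) {
			if (def->nr > out_max - n)
				return 0;
			for (size_t j = 0; j < def->nr; j++)
				out[n++] = def->ips[j];
			i++;	/* skip the cookie entry */
			continue;
		}
		if (n == out_max)
			return 0;
		out[n++] = kchain[i];
	}
	return n;
}
```

A tool would call this once per sample that carries the marker, after the
deferred record with the same cookie has been read from the buffer.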

Patch 4 adds support for per CPU perf events by allowing the kernel to
associate each of the per CPU perf event buffers with a single application.
This is needed so that when a request for a deferred stack trace happens on a
task that then migrates to another CPU, the kernel knows which CPU buffer to
record the stack trace in. More than one perf user tool may be running, and a
request made by one perf tool should have its deferred trace go to that same
tool's per CPU event buffer. A global list of all the descriptors
representing each perf tool that is using deferred stack tracing is created
to manage this.
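
The lookup that such a global list enables can be modeled in a few lines of
plain C. This is only a user space model under stated assumptions, not the
kernel code from patch 4: struct ex_tool, ex_tool_register() and
ex_find_buffer() are hypothetical names, buffers are represented by
non-negative integer handles, and locking is left out. Which CPU is passed
in is up to the caller; the sketch only shows that the buffer is selected by
the pair (requesting tool, CPU).

```c
#include <stddef.h>

#define EX_NR_CPUS 4	/* fixed CPU count for the model (assumption) */

/* One descriptor per perf tool using deferred stack tracing. */
struct ex_tool {
	int id;			/* identifies the tool that made a request */
	int buf[EX_NR_CPUS];	/* one buffer handle per CPU */
	struct ex_tool *next;
};

/* Global list of all registered descriptors. */
static struct ex_tool *ex_tools;

/* Add a descriptor to the head of the global list. */
void ex_tool_register(struct ex_tool *t)
{
	t->next = ex_tools;
	ex_tools = t;
}

/*
 * Return the buffer handle that tool 'tool_id' owns for 'cpu', or -1 if
 * the CPU is out of range or the tool is not registered.
 */
int ex_find_buffer(int tool_id, int cpu)
{
	if (cpu < 0 || cpu >= EX_NR_CPUS)
		return -1;

	for (struct ex_tool *t = ex_tools; t; t = t->next) {
		if (t->id == tool_id)
			return t->buf[cpu];
	}
	return -1;
}
```

With two tools registered, a request made by one tool never resolves to a
buffer owned by the other, whichever CPU is passed in.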

Changes since v15: https://lore.kernel.org/linux-trace-kernel/20250825180638.877627656@kernel.org/

- The main update was that I moved the code to do single task deferred
  stack tracing into the unwind code. That allowed the code for tracing
  all tasks to be reused, and simplified the perf code in doing so.

  The first patch updates the unwind deferred code to have this
  infrastructure. It only adds one new function:
    unwind_deferred_task_init()
  This is the same as unwind_deferred_init() but it is used when the
  tracer will only trace a single task. The descriptor returned has
  its own task_work callback, and it allows for any number of callers,
  unlike the "all task" deferred unwinding, which supports only a
  limited set.

- The new code also removed the need to expose the generation of the
  cookie.

Josh Poimboeuf (1):
      perf: Support deferred user callchains

Steven Rostedt (3):
      unwind: Add interface to allow tracing a single task
      perf: Have the deferred request record the user context cookie
      perf: Support deferred user callchains for per CPU events

----
 include/linux/perf_event.h            |   9 +-
 include/linux/unwind_deferred.h       |  15 ++
 include/uapi/linux/perf_event.h       |  25 ++-
 kernel/bpf/stackmap.c                 |   4 +-
 kernel/events/callchain.c             |  14 +-
 kernel/events/core.c                  | 362 +++++++++++++++++++++++++++++++++-
 kernel/unwind/deferred.c              | 283 ++++++++++++++++++++++----
 tools/include/uapi/linux/perf_event.h |  25 ++-
 8 files changed, 686 insertions(+), 51 deletions(-)
