Message-ID: <20250710163522.3195293-1-jremus@linux.ibm.com>
Date: Thu, 10 Jul 2025 18:35:06 +0200
From: Jens Remus <jremus@...ux.ibm.com>
To: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
bpf@...r.kernel.org, x86@...nel.org,
Steven Rostedt <rostedt@...nel.org>
Cc: Jens Remus <jremus@...ux.ibm.com>, Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Ilya Leoshkevich <iii@...ux.ibm.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrii Nakryiko <andrii@...nel.org>,
Indu Bhagat <indu.bhagat@...cle.com>,
"Jose E. Marchesi" <jemarch@....org>,
Beau Belgrave <beaub@...ux.microsoft.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>, Florian Weimer <fweimer@...hat.com>,
Sam James <sam@...too.org>
Subject: [RFC PATCH v1 00/16] s390: SFrame user space unwinding
This RFC series adds s390 support for unwinding of user space using
SFrame. It is based on Josh's and Steven's work (see prerequisites
below). The generic unwind user (sframe) frameworks are extended to
enable support for a few s390-particularities (see patches 6-9),
including unwinding of user space using back chain (see patch 12).
The latter could be broken apart as a separate patch series.
Posting as RFC so that the s390-particularities could be taken into
account in any of the prerequisite series from Steve and to obtain
early feedback to improve my patches. I would also be fine with any
of the required infrastructure changes being integrated into the
prerequisite series.
Hopefully it was the right thing to use the distribution list from
Steve's prerequisite series.
Motivation:
On s390 unwinding using frame pointer (FP) is unsupported, because of
the lack of a proper s390 64-bit (s390x) ABI specification and compiler
support. The ABI only specifies a "preferred" FP register. Both GCC
and Clang, regardless of compiler option -fno-omit-frame-pointer, set up
the preferred FP register as late as possible, which usually is after
static stack allocation, so that the CFA cannot be deduced from the FP
without any further data, such as provided by DWARF CFI or SFrame.
In theory there is an s390-specific alternative of unwinding using
back chain (compiler option -mbackchain), but this has its own
limitations and there is currently no distribution that builds user
space with back chain.
As a consequence the kernel stack tracer cannot unwind user space
(except if it is built with back chain). Recording call graphs of user
space using perf is limited to stack sampling (i.e. perf record
--call-graph dwarf), which generates a fairly large amount of data and
has limitations.
Initial testing of recording call graphs with perf, using the s390
support for SFrame provided by this series (on top of Josh's and
Steve's), shows that both the sampling rate and data size notably
improve:
perf record data size is greatly reduced (smaller perf.data):
SFrame (--call-graph fp):
# perf record -F 9999 --call-graph fp objdump -wdWF objdump
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.498 MB perf.data (10891 samples) ]
Stack sampling (--call-graph dwarf) with a default stack size of 8192:
# perf record -F 9999 --call-graph dwarf objdump -wdWF objdump
[ perf record: Woken up 270 times to write data ]
[ perf record: Captured and wrote 67.467 MB perf.data (8241 samples) ]
perf record sampling rate is a lot higher (higher number of events):
SFrame (--call-graph fp):
# perf record -F 99999 --call-graph fp objdump -wdWF objdump
[ perf record: Woken up 213 times to write data ]
[ perf record: Captured and wrote 53.167 MB perf.data (283993 samples) ]
Stack sampling (--call-graph dwarf) with a default stack size of 8192:
# perf record -F 99999 --call-graph dwarf objdump -wdWF objdump
[ perf record: Woken up 2678 times to write data ]
Warning:
Processed 91458 events and lost 45 chunks!
Check IO/CPU overload!
Warning:
Processed 102157 samples and lost 19.24%!
[ perf record: Captured and wrote 675.513 MB perf.data (82497 samples) ]
Prerequisites:
This RFC series applies on top of Josh's and Steve's series
"[PATCH v8 00/12] unwind_deferred: Implement sframe handling":
https://lore.kernel.org/all/20250708021115.894007410@kernel.org/
Note that this series depends on others.
It is based on top of Steve's branch available at:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git unwind/main
It depends on my Binutils series "[PATCH v3 00/11] s390: Support to
generate .sframe in assembler and linker":
https://inbox.sourceware.org/binutils/20250627110849.1198336-1-jremus@linux.ibm.com/
Note that my latest v4 of that series is already based on SFrame V2
format changes (i.e. SFRAME_F_FDE_FUNC_START_PCREL), that require
changes to the generic unwind user sframe implementation.
Josh's and Steve's series depends on a Glibc patch from Josh, that adds
support for the prctls introduced in the Kernel:
https://lore.kernel.org/all/20250122023517.lmztuocecdjqzfhc@jpoimboe/
Note that Josh's Glibc patch needs to be adjusted for the updated prctl
numbers from "[PATCH v8 12/12] unwind_user/sframe: Add prctl() interface
for registering .sframe sections":
https://lore.kernel.org/all/20250708021200.397301537@kernel.org/
Overview:
Patch 1 adds and rewords a few comments to Josh's and Steve's user
unwind framework.
Patch 2 aligns asm/dwarf.h to x86 asm/dwarf2.h.
Patch 3 replicates Josh's x86 patch "x86/asm: Avoid emitting DWARF
CFI for non-VDSO" for s390.
Patch 4 replicates Josh's patch "x86/vdso: Enable sframe generation
in VDSO" for s390. It enables generation of SFrame stack trace
information (.sframe section) for the vDSO if the assembler supports it.
Note that this depends on a new config option CONFIG_AS_SFRAME that is
introduced by a separate series by Josh/Steven, from which I have
included the required patches as PREREQ.
Patch 5 changes the build of the vDSO on s390 to keep the function
symbols for stack tracing purposes. Note that Josh does this in his
patch "x86/vdso: Enable sframe generation in VDSO", by changing objcopy
option -S to -g.
Patches 6-9 enable Josh's generic unwind user (sframe) frameworks to
support the following s390 particularities:
- Patch 6 adds support for architectures that define their CFA as SP at
callsite + offset.
- Patch 7 adds support for architectures that do not necessarily
save the RA on the stack (or in another register) in the topmost
frame (e.g. in the prologue or in leaf functions).
- Patch 8 adds support for architectures that save RA/FP in other
registers.
- Patch 9 adds support for architectures that store the CFA offset
from the CFA base register (e.g. SP or FP) in an encoded form in
SFrame. For instance on s390 the CFA offset is stored adjusted by -160
and then scaled down by 8 to enable and improve the use of signed 8-bit
SFrame offsets (i.e. CFA, RA, and FP offset).
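To illustrate the encoding described for patch 9, here is a minimal
user space sketch of the arithmetic (adjust by -160, then scale down by
8, and the reverse on decode). The macro and function names are
hypothetical and chosen for illustration only; they are not the names
used in this series or in Binutils. Only the constants 160 and 8 and the
signed 8-bit target range are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values 160 and 8 come from the text above. */
#define CFA_OFFSET_ADJUST	160
#define CFA_OFFSET_SCALE	8

/*
 * Encode a CFA offset: subtract 160, then divide by 8.
 * Returns false if the offset is not a multiple of 8 after the
 * adjustment or if the result does not fit into a signed 8-bit value.
 */
static bool cfa_offset_encode(int32_t offset, int8_t *encoded)
{
	int32_t adjusted = offset - CFA_OFFSET_ADJUST;

	if (adjusted % CFA_OFFSET_SCALE != 0)
		return false;
	adjusted /= CFA_OFFSET_SCALE;
	if (adjusted < INT8_MIN || adjusted > INT8_MAX)
		return false;
	*encoded = (int8_t)adjusted;
	return true;
}

/* Decode is the inverse: multiply by 8, then add 160. */
static int32_t cfa_offset_decode(int8_t encoded)
{
	return (int32_t)encoded * CFA_OFFSET_SCALE + CFA_OFFSET_ADJUST;
}

/* Round-trip example: a CFA offset of 320 encodes to 20. */
static void cfa_offset_example(void)
{
	int8_t enc;

	assert(cfa_offset_encode(320, &enc));
	assert(enc == 20);
	assert(cfa_offset_decode(enc) == 320);
}
```

With this scheme a CFA offset of 160 encodes to 0, and offsets up to
1176 (in steps of 8) still fit into a signed 8-bit value, which would
not be the case for the unencoded offsets.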
Patch 10 introduces frame_pointer() and user_return_address() in
ptrace on s390. Both are prerequisites for the subsequent patch.
Patch 11 adds support for unwinding of user space using SFrame on
s390. It leverages the extensions of the generic unwind user
framework from patches 6-9.
Patch 12 introduces unwinding of user space using back chain to the
unwind user framework.
Patch 13 adds support for unwinding of user space using back chain on
s390.
Patches 14-15 are pre-requisite patches from Josh's and Steve's
series "[PATCH v6 0/6] x86/vdso: VDSO updates and fixes for sframes":
https://lore.kernel.org/all/20250425023750.669174660@goodmis.org/
They introduce the config option CONFIG_AS_SFRAME required by patch 4.
Patch 16 is a WIP fixup for user unwind sframe on s390 that uses macros
instead of magic numbers. I would like to get some feedback on whether
that would be the correct approach.
Initially I had a patch on top that uses the unwind user framework in
stack trace on s390 in arch_stack_walk_user_common(), now that it can
unwind user space using back chain. But a recent change made the
macro for_each_user_frame() private, so that it can no longer be used.
Note that this would still not enable stack traces of user space to be
generated. The reason is that the stack tracer does not allow for page
faults, causing the unwind user framework's attempt to unwind using
SFrame to fail and fall back to unwinding using back chain, which usually
also fails, as user space is not built with back chain (see motivation).
Limitations:
Unwinding of user space using back chain cannot - by design - restore
the FP. Therefore unwinding of subsequent frames using e.g. SFrame may
fail, if the FP is the CFA base register.
Thanks and regards,
Jens
Jens Remus (14):
fixup! unwind_user: Add frame pointer support
s390: asm/dwarf.h should only be included in assembly files
s390/vdso: Avoid emitting DWARF CFI for non-vDSO
s390/vdso: Enable SFrame generation in vDSO
s390/vdso: Keep function symbols in vDSO
unwind_user: Enable archs that define CFA = SP_callsite + offset
unwind_user: Enable archs that do not necessarily save RA
unwind_user: Enable archs that save RA/FP in other registers
unwind_user/sframe: Enable archs with encoded SFrame CFA offsets
s390/ptrace: Enable HAVE_USER_RA_REG
s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
unwind_user/backchain: Introduce back chain user space unwinding
s390/unwind_user/backchain: Enable HAVE_UNWIND_USER_BACKCHAIN
WIP: fixup! s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
Josh Poimboeuf (2):
PREREQ: x86/asm: Avoid emitting DWARF CFI for non-VDSO
PREREQ: x86/vdso: Enable sframe generation in VDSO
arch/Kconfig | 21 +++
arch/s390/Kconfig | 4 +
arch/s390/include/asm/dwarf.h | 53 +++++---
arch/s390/include/asm/ptrace.h | 25 +++-
arch/s390/include/asm/unwind_user.h | 83 ++++++++++++
arch/s390/include/asm/unwind_user_backchain.h | 127 ++++++++++++++++++
arch/s390/include/asm/unwind_user_sframe.h | 37 +++++
arch/s390/kernel/vdso64/Makefile | 9 +-
arch/s390/kernel/vdso64/vdso64.lds.S | 5 +
arch/x86/entry/vdso/Makefile | 10 +-
arch/x86/entry/vdso/vdso-layout.lds.S | 3 +
arch/x86/include/asm/dwarf2.h | 54 +++++---
arch/x86/include/asm/unwind_user.h | 26 +++-
include/asm-generic/Kbuild | 1 +
include/asm-generic/unwind_user.h | 20 +++
include/asm-generic/unwind_user_sframe.h | 65 +++++++++
include/linux/ptrace.h | 8 ++
include/linux/sframe.h | 4 +-
include/linux/unwind_user_backchain.h | 17 +++
include/linux/unwind_user_types.h | 21 ++-
kernel/unwind/Makefile | 1 +
kernel/unwind/sframe.c | 28 ++--
kernel/unwind/sframe.h | 16 +++
kernel/unwind/user.c | 101 +++++++++++---
kernel/unwind/user_backchain.c | 13 ++
25 files changed, 671 insertions(+), 81 deletions(-)
create mode 100644 arch/s390/include/asm/unwind_user.h
create mode 100644 arch/s390/include/asm/unwind_user_backchain.h
create mode 100644 arch/s390/include/asm/unwind_user_sframe.h
create mode 100644 include/asm-generic/unwind_user_sframe.h
create mode 100644 include/linux/unwind_user_backchain.h
create mode 100644 kernel/unwind/user_backchain.c
--
2.48.1