Message-ID: <20250701192619.20eb2a58@gandalf.local.home>
Date: Tue, 1 Jul 2025 19:26:19 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Kees Cook <kees@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
bpf@...r.kernel.org, x86@...nel.org, Masami Hiramatsu
<mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Josh Poimboeuf <jpoimboe@...nel.org>, Peter Zijlstra
<peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, Jiri Olsa
<jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Thomas Gleixner
<tglx@...utronix.de>, Andrii Nakryiko <andrii@...nel.org>, Indu Bhagat
<indu.bhagat@...cle.com>, "Jose E. Marchesi" <jemarch@....org>, Beau
Belgrave <beaub@...ux.microsoft.com>, Jens Remus <jremus@...ux.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>, Jens Axboe <axboe@...nel.dk>,
Florian Weimer <fweimer@...hat.com>
Subject: Re: [PATCH v12 00/14] unwind_user: x86: Deferred unwinding infrastructure

On Tue, 1 Jul 2025 15:49:23 -0700
Kees Cook <kees@...nel.org> wrote:
> On Mon, Jun 30, 2025 at 10:45:39PM -0400, Steven Rostedt wrote:
> > On Mon, 30 Jun 2025 19:06:12 -0700
> > Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> >
> > > On Mon, 30 Jun 2025 at 17:54, Steven Rostedt <rostedt@...dmis.org> wrote:
> > > >
> > > > This is the first patch series of a set that will make it possible to be able
> > > > to use SFrames[1] in the Linux kernel. A quick recap of the motivation for
> > > > doing this.
> > >
> > > You have a '[1]' to indicate there's a link to what SFrames are.
> > [...]
> > [1] https://sourceware.org/binutils/wiki/sframe
>
> Okay, I've read the cover letter and this wiki page, but I am dense: why
> does the _kernel_ want to do this? Shouldn't it only be userspace that
> cares about userspace unwinding? I don't use perf, ftrace, and ebpf
> enough to make this obvious to me, I guess. ;)
>
It's how perf does profiling. It needs to walk the user space stack to see
what functions are being called. Ftrace can do the same thing, but is not
used as much because it doesn't have the tooling (yet) to figure out what the
user space addresses mean (but I'm working on fixing that).

BPF can do this as well, but I don't know BPF well enough to comment.

The big user is perf with profiling. It currently uses frame pointers, but
because of the way frame pointers are set up, it misses a lot of the leaf
functions when the interrupt triggers (SFrames does not have that
problem). Also, if frame pointers are not available, perf may just copy
thousands of bytes of the user space stack into the kernel ring buffer and
then parse it later (this isn't used often due to the overhead).

Then there's s390, which doesn't have frame pointers and can only copy
thousands of bytes of the stack to do any meaningful user space profiling.

Note, this has been a long-standing issue. In 2022, we had a BoF on
this, looking for something like ORC in user space, as it would solve lots
of our issues. Then in December of that same year, we heard about SFrames.

At Kernel Recipes in 2023, Brendan Gregg said during his talk that
there needs to be a better way to do profiling of user space from the
kernel without frame pointers. I mentioned SFrames and he was quite excited
to hear about it. That's also when Josh, who was in attendance, asked
if he could do the implementation of it in the kernel!

Anyway, yeah, it's something that has a ton of interest, as it's the way
for tools like perf to give nice graphs of where user space bottlenecks
exist.
-- Steve