Message-ID: <20250520195549.17f6c2c7@gandalf.local.home>
Date: Tue, 20 May 2025 19:55:49 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Namhyung Kim <namhyung@...nel.org>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org, x86@...nel.org,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Josh Poimboeuf
<jpoimboe@...nel.org>, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar
<mingo@...nel.org>, Jiri Olsa <jolsa@...nel.org>, Thomas Gleixner
<tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, Andrii
Nakryiko <andrii@...nel.org>
Subject: Re: [PATCH v9 00/13] unwind_user: x86: Deferred unwinding
infrastructure
On Wed, 21 May 2025 08:26:05 +0900
Masami Hiramatsu (Google) <mhiramat@...nel.org> wrote:
> > Maybe I asked this before but I don't remember if I got the answer. :)
> > How does it handle task exits as it won't go to userspace? I guess it'll
> > lose user callstacks for exit syscalls and other termination paths.

I just checked, and the good news is that task_work does indeed get called
when a task exits. The bad news is that it happens after do_exit() cleans
up the task's "mm" structure via exit_mm(), which means that current->mm is
NULL by then :-p

There's a proposal to move trace_sched_process_exit() to before exit_mm().
If that happens, we could make that tracepoint a "faultable" tracepoint,
and the unwind infrastructure could then attach to it and do the unwinding
from there.

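
To make the ordering issue concrete, here is a toy userspace model of the
exit path. It is only a sketch and not kernel code: every toy_* name is made
up for illustration, and the "hook before exit_mm" branch stands in for the
proposed (not yet existing) faultable tracepoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; NOT the real definitions. */
struct toy_mm { int dummy; };

struct toy_task {
	struct toy_mm *mm;      /* models current->mm */
	bool unwind_pending;    /* a deferred user unwind was requested */
	bool unwound;           /* set only if user memory was still reachable */
};

/* Models the deferred unwind callback: it needs a live mm to read the user stack. */
static void toy_unwind_user(struct toy_task *t)
{
	if (!t->unwind_pending)
		return;
	t->unwind_pending = false;
	if (!t->mm)
		return;         /* address space already gone: the trace is lost */
	t->unwound = true;
}

/*
 * Models the exit path. With hook_before_exit_mm == false the unwind only
 * runs from the late task_work step, after the mm has been dropped. With
 * true, a hypothetical early hook gets a chance to unwind first.
 */
static bool toy_do_exit(struct toy_task *t, bool hook_before_exit_mm)
{
	if (hook_before_exit_mm)
		toy_unwind_user(t);     /* hypothetical tracepoint before exit_mm() */

	t->mm = NULL;                   /* models exit_mm() */
	toy_unwind_user(t);             /* models task_work running afterwards */
	return t->unwound;
}
```

Calling toy_do_exit() with the flag clear models today's ordering, where the
pending unwind finds no mm; with the flag set, the unwind completes before
the mm is dropped.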
> >
> > Similarly, it will miss user callstacks in the samples at the end of
> > profiling if the target tasks remain in the kernel (or they sleep).
> > It looks like a fundamental limitation of the deferred callchains.

Yes, that is a limitation.

>
> Can we use a hybrid approach for this case?
> It might be more balanced (from the performance point of view) to save
> the full stack in a classic way only in this case, rather than faulting
> on process exit or doing file access just to load the sframe.

Another approach is for the tool (perf, for example) to request a user
space stack trace every time a task enters the kernel via a system call.

-- Steve